API reference

This page is generated directly from the package docstrings.

High-level model fitting

hapc.hapc(X: ndarray, Y: ndarray, family: str = 'gaussian', max_degree: int = 1, npcs: int | None = None, lambda_: float = 0.01, norm: str = 'sv', predict: ndarray | None = None, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, verbose: bool = False, crit: str = 'grad', center: bool = True, approx: bool = False, ini: str = '1')[source]

Single-λ HAPC fit (Python counterpart of R hapc()).

Parameters:
  • X (np.ndarray, shape (n, p)) – Features.

  • Y (np.ndarray, shape (n,)) – Response. For family="binomial": hard labels in {0,1} or {-1,+1}, or soft labels in [0,1] (e.g. EM-HAL E-step posteriors). Soft labels are supported only for norm in {"1","2"}; norm="sv" requires hard labels.

  • family ({"gaussian", "binomial"}, default "gaussian") – Loss family.

  • max_degree (int, default 1) – Maximum interaction order for the HAL basis.

  • npcs (int, optional) – Number of principal components. Defaults to n.

  • lambda (float, default 0.01) – Regularisation parameter.

  • norm ({"sv", "1", "2"}, default "sv") – Norm constraint. "sv" runs PGD, "1" is closed-form LASSO, "2" is closed-form ridge.

  • predict (np.ndarray, optional) – Test features for prediction.

  • max_iter (int, default 5000) – Maximum PGD iterations (norm="sv" only).

  • tol (float, default 1e-3) – Convergence tolerance (norm="sv" only).

  • step_factor (float, default 0.8) – PGD step factor (norm="sv" only).

  • verbose (bool, default False) – Print progress (norm="sv" only).

  • crit ({"grad", "risk"}, default "grad") – PGD stopping criterion.

  • center (bool, default True) – Centre H and Y.

  • approx (bool, default False) – Use approximate eigendecomposition for norm in {"1","2"} (currently ignored — exact eigendecomposition is always used).

  • ini ({"1", "2"}, default "1") – PGD initialiser for norm="sv": "1" = LASSO (default, matches R), "2" = ridge.

Returns:

object – One of SinglePcghalResult (gaussian norm="sv"), SingleLambdaResult (gaussian norm in {"1","2"}), or SinglePcghalClassificationResult (binomial).

Examples

>>> import numpy as np
>>> from hapc import hapc
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 3))
>>> Y = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(50) * 0.1
>>> fit = hapc(X, Y, max_degree=2, npcs=20, lambda_=0.05, norm="sv")
>>> fit.alpha.shape
(...,)
hapc.cv_hapc(X: ndarray, Y: ndarray, family: str = 'gaussian', max_degree: int = 1, npcs: int | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, norm: str = 'sv', predict: ndarray | None = None, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, verbose: bool = False, crit: str = 'grad', center: bool = True, approx: bool = False, ini: str = '1', Delta: ndarray | None = None, time_grid: ndarray | None = None) CVResult[source]

k-fold cross-validated HAPC fit (Python counterpart of R cv.hapc()).

Parameters:
  • X (np.ndarray, shape (n, p)) – Features.

  • Y (np.ndarray, shape (n,)) – Response.

  • family ({"gaussian", "binomial", "logit-hazard"}, default "gaussian") – Loss family. For "binomial" the loss is always logistic (deviance CV, not MSE). norm="sv" runs logistic ridge + PGD per fold; norm="2" runs logistic ridge only; norm="1" runs logistic LASSO via sklearn.linear_model.LogisticRegression with solver="liblinear" (mirrors R glmnet for the binomial+norm=”1” path). "logit-hazard" is a discrete-time logistic hazard model: Y is then the observed time T = min(T_event, C), Delta (event indicator) is required, and the call forwards to hapc.hazard_hapc() (returning a hapc.hazard.HazardResult, not a CVResult). norm="sv" is not supported for this family.

  • Delta (np.ndarray, optional) – Event indicator (0/1), required only for family="logit-hazard".

  • time_grid (np.ndarray, optional) – Discrete time grid for family="logit-hazard" (see hapc.hazard_hapc()).

  • max_degree (int, default 1) – Maximum interaction order.

  • npcs (int, optional) – Number of PCs. Defaults to n.

  • log_lambda_min (float) – Bounds of the log-λ grid (defaults -5, -3).

  • log_lambda_max (float) – Bounds of the log-λ grid (defaults -5, -3).

  • grid_length (int, default 10) – Number of grid points.

  • nfolds (int, default 5) – Number of CV folds. Identical fold partitions to R.

  • norm ({"sv", "1", "2"}, default "sv") – Norm constraint. See hapc.single.hapc().

  • predict (np.ndarray, optional) – Test features.

  • max_iter (optional) – PGD parameters used when norm="sv".

  • tol (optional) – PGD parameters used when norm="sv".

  • step_factor (optional) – PGD parameters used when norm="sv".

  • crit (optional) – PGD parameters used when norm="sv".

  • center (optional) – See hapc.single.hapc().

  • approx (optional) – See hapc.single.hapc().

  • verbose (optional) – See hapc.single.hapc().

  • ini (optional) – See hapc.single.hapc().

Returns:

CVResult

Examples

>>> import numpy as np
>>> from hapc import cv_hapc
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 3))
>>> Y = np.sin(np.pi * X[:, 0]) + rng.standard_normal(100) * 0.1
>>> cv = cv_hapc(X, Y, max_degree=2, npcs=99, nfolds=5)
>>> bool(cv.best_lambda > 0)
True
hapc.hazard_hapc(X: ndarray, T: ndarray, Delta: ndarray, norm: str = '1', max_degree: int = 1, npcs: int | None = None, time_grid: ndarray | None = None, log_lambda_min: float = -4, log_lambda_max: float = -1, grid_length: int = 15, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8) HazardResult[source]

Discrete-time logistic hazard HAPC fit (family="logit-hazard").

Convenience wrapper around hapc.cv_hapc() with family="binomial". The right-censored survival data (X, T, Delta) are expanded into a person-period table (one row per subject-per-interval-at-risk) whose binary response is the discrete hazard indicator, the visit time is prepended as the first HAL covariate, and the regularisation parameter lambda is chosen by cross-validated logistic deviance. R counterpart: hapc::hazard.hapc().

Parameters:
  • X (np.ndarray, shape (n, p)) – Baseline covariates, one row per subject.

  • T (np.ndarray, shape (n,)) – Observed times T_i = min(T_event_i, C_i). Assumed discrete.

  • Delta (np.ndarray, shape (n,)) – Event indicators Delta_i in {0,1} (1 = event, 0 = right-censored).

  • norm ({"1", "2"}, default "1") – "1" = logistic LASSO, "2" = logistic ridge. "sv" raises NotImplementedError.

  • max_degree (int, default 1) – HAL interaction order over [time, X].

  • npcs (int, optional) – Number of principal components. Defaults to the number of person-period rows (capped internally as in hapc.cv_hapc()).

  • time_grid (np.ndarray, optional) – Discrete time grid (risk-set grid). Defaults to min(T):max(T) when T is integer-valued, else np.unique(T). Subjects are assumed at risk from min(time_grid) onward.

  • log_lambda_min (float / int) – Log-λ CV grid (defaults -4, -1, 15).

  • log_lambda_max (float / int) – Log-λ CV grid (defaults -4, -1, 15).

  • grid_length (float / int) – Log-λ CV grid (defaults -4, -1, 15).

  • nfolds (int, default 5) – Number of CV folds.

  • predict (np.ndarray, optional, shape (m, p)) – Baseline covariates for new subjects. When supplied the model (refit at the CV-selected λ) is evaluated on the full time grid for each new subject, returning the hazard surface and the implied survival curves.

  • center – Passed through to hapc.cv_hapc() / hapc.hapc().

  • verbose – Passed through to hapc.cv_hapc() / hapc.hapc().

  • max_iter – Passed through to hapc.cv_hapc() / hapc.hapc().

  • tol – Passed through to hapc.cv_hapc() / hapc.hapc().

  • step_factor – Passed through to hapc.cv_hapc() / hapc.hapc().

Returns:

HazardResult – See HazardResult. hazard holds the cross-validated discrete hazard for each person-period row; predict_hazard / predict_survival are populated when predict is supplied.

Notes

Model. The discrete hazard is the conditional event probability in interval t given survival up to t,

lambda(t | x) = P(T_event = t | T_event >= t, X = x),

modelled on the logit scale by a Highly Adaptive Principal Components fit f of the augmented covariate (t, x):

logit lambda(t | x) = f(t, x).

The HAL basis spans indicator (and, for max_degree > 1, interaction) tensor products in (t, x), so the time effect, the covariate effects and their interactions are estimated nonparametrically; the L1/L2 penalty (norm) controls smoothness and is tuned by cross-validation.

Person-period likelihood. Under independent right-censoring the observed-data likelihood factorises over the at-risk intervals,

prod_i prod_{t <= T_i}

lambda(t | x_i) ** Y_it * (1 - lambda(t | x_i)) ** (1 - Y_it),

with Y_it = 1(T_event_i = t). This is exactly the Bernoulli (logistic) likelihood of the expanded person-period table, so fitting a binomial HAPC model to Y_it against (t, x_i) estimates the discrete hazard (Cox 1972; Brown 1975; Allison 1982).

Survival. The conditional survival function follows from the estimated hazard by the product-limit relation

S(t | x) = P(T_event > t | x) = prod_{s <= t} (1 - lambda(s | x)),

returned in predict_survival for new subjects when predict is given.

References

Cox, D. R. (1972). Regression models and life-tables. JRSS B, 34(2), 187-220.

Brown, C. C. (1975). On the use of indicator variables for studying the time-dependence of parameters in a response-time model. Biometrics, 31(4), 863-872.

Allison, P. D. (1982). Discrete-time methods for the analysis of event histories. Sociological Methodology, 13, 61-98.

Singer, J. D. and Willett, J. B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.

Benkeser, D. and van der Laan, M. (2016). The Highly Adaptive Lasso estimator. IEEE DSAA, 689-696.

Examples

>>> import numpy as np
>>> from hapc import hazard_hapc
>>> rng = np.random.default_rng(0)
>>> n = 200
>>> X = np.column_stack([rng.uniform(size=n), rng.integers(0, 2, n)]).astype(float)
>>> grid = np.arange(1, 7)
>>> def haz(t, x): return 1 / (1 + np.exp(-(-2.6 + 0.3 * t + 1.3 * x[0] - 0.9 * x[1])))
>>> Tev = np.full(n, grid.max())
>>> for i in range(n):
...     for t in grid:
...         if rng.random() < haz(t, X[i]):
...             Tev[i] = t; break
>>> C = rng.choice(grid, n)
>>> Tobs = np.minimum(Tev, C); Delta = (Tev <= C).astype(float)
>>> fit = hazard_hapc(X, Tobs, Delta, norm="1", max_degree=2, time_grid=grid)
>>> bool(fit.best_lambda > 0)
True
hapc.ate_hapc(X: ndarray, Y: ndarray, A: ndarray, alpha: float = 0.05, max_degree: int = 1, npcs: int | None = None, log_lambda_prop_min: float = -5, log_lambda_prop_max: float = 2, grid_length_prop: int = 14, log_lambda_out_min: float = -10, log_lambda_out_max: float = 2, grid_length_out: int = 22, nfolds: int = 5, norm: str = 'sv', predict: ndarray | None = None, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, verbose: bool = False, crit: str = 'grad', center: bool = True, approx: bool = False, ini: str = '1', plot_diagnostics: bool = False, method: str = 'undersmooth', cf_folds: int = 5, cf_seed: int = 0, report_undersmoothing: bool = False) ATEResult[source]

ATE estimate with HAPC nuisances and outcome undersmoothing.

Parameters:
  • X (np.ndarray, shape (n, p)) – Covariate matrix W (do NOT include the treatment column).

  • Y (np.ndarray, shape (n,)) – Continuous outcome.

  • A (np.ndarray, shape (n,)) – Binary treatment in {0,1} or {-1,+1}.

  • alpha (float, default 0.05) – Significance level. The returned interval has confidence 1 - alpha.

  • max_degree – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • npcs – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • nfolds – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • norm – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • predict – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • max_iter – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • tol – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • step_factor – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • verbose – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • crit – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • center – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • approx – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • ini – Same meaning and defaults as in hapc.cv_hapc() (except λ grids, see below). predict is accepted for signature parity with cv_hapc() and is currently ignored (ate_hapc always evaluates the EIF on the training sample).

  • log_lambda_prop_min – Equally spaced log-λ grid for propensity cross-validation (A ~ W, binomial), same rule as cv_hapc().

  • log_lambda_prop_max – Equally spaced log-λ grid for propensity cross-validation (A ~ W, binomial), same rule as cv_hapc().

  • grid_length_prop – Equally spaced log-λ grid for propensity cross-validation (A ~ W, binomial), same rule as cv_hapc().

  • log_lambda_out_min – Log-λ grid (defaults -10, 2, 22) for outcome cross-validation Y ~ (A, W) (gaussian) and for the undersmoothing scan over λ λ_CV. The grid deliberately reaches very small λ: for method="undersmooth" the (nearly) unbiased regime typically sits well below the CV-optimal λ, so a grid that stops too high (e.g. the old [e^{-5}, e^{-3}]) never reaches it and no undersmoothing effectively occurs. Undersmoothing only removes bias when the PC basis is rich enough to represent the outcome surface, so use the full basis (npcs = n, the default); a truncated basis floors the bias at an approximation level that no λ can remove.

  • log_lambda_out_max – Log-λ grid (defaults -10, 2, 22) for outcome cross-validation Y ~ (A, W) (gaussian) and for the undersmoothing scan over λ λ_CV. The grid deliberately reaches very small λ: for method="undersmooth" the (nearly) unbiased regime typically sits well below the CV-optimal λ, so a grid that stops too high (e.g. the old [e^{-5}, e^{-3}]) never reaches it and no undersmoothing effectively occurs. Undersmoothing only removes bias when the PC basis is rich enough to represent the outcome surface, so use the full basis (npcs = n, the default); a truncated basis floors the bias at an approximation level that no λ can remove.

  • grid_length_out – Log-λ grid (defaults -10, 2, 22) for outcome cross-validation Y ~ (A, W) (gaussian) and for the undersmoothing scan over λ λ_CV. The grid deliberately reaches very small λ: for method="undersmooth" the (nearly) unbiased regime typically sits well below the CV-optimal λ, so a grid that stops too high (e.g. the old [e^{-5}, e^{-3}]) never reaches it and no undersmoothing effectively occurs. Undersmoothing only removes bias when the PC basis is rich enough to represent the outcome surface, so use the full basis (npcs = n, the default); a truncated basis floors the bias at an approximation level that no λ can remove.

  • plot_diagnostics (bool, default False) – If True, open a matplotlib figure with (1) propensity CV curve (logistic deviance vs λ), (2) outcome CV curve (MSE vs λ), and (3) the undersmoothing path: |mean(EIF)| vs outcome λ with the threshold line and vertical markers for the CV and selected undersmoothed λ. Requires matplotlib (pip install matplotlib). Only meaningful for method="undersmooth" (ignored otherwise).

  • method ({"undersmooth", "crossfit"}, default "undersmooth") –

    Bias-control strategy.

    • "undersmooth" — the original single-sample procedure: fit the nuisances on the full sample, evaluate the EIF in-sample, and pick the outcome λ by the undersmoothing gate (Notes). Kept as the default for backward compatibility.

    • "crossfit" — DML-style K-fold cross-fitting: CV-select and fit both nuisances on the training folds and evaluate the EIF only on the held-out fold. No undersmoothing is performed (CV-optimal λ is used per fold); the outcome λ grid is still used for the per-fold CV. This makes the EIF mean and variance honest, so the AIPW estimate is √n-unbiased and the Wald CI attains nominal coverage without relying on the undersmoothing gate. Prefer this unless you specifically need the single-sample undersmoothed estimator.

  • cf_folds (int, default 5) – Number of cross-fitting folds (only used when method="crossfit"). Folds are stratified by treatment so both arms appear in every split.

  • cf_seed (int, default 0) – Seed for the (deterministic) cross-fitting fold assignment.

  • report_undersmoothing (bool, default False) – If True (and method="undersmooth"), print a table to stdout of |mean(EIF)| versus outcome λ over the undersmoothing region (λ λ_CV), with the threshold τ = σ_CV / (√n · log n), so the user can confirm the undersmoothing gate fired correctly. Ignored for method="crossfit" (which evaluates the EIF out-of-fold).

Returns:

ATEResult – Named tuple with three fields (estimate, lower, upper).

Notes

The procedure is:

  1. Cross-validate the propensity A ~ W (binomial) on its grid and the outcome Y ~ (A, W) (gaussian) on the outcome grid (independently specified).

  2. Fix the propensity at its CV-best λ; refit on the full sample to obtain π̂(W_i) = P(A=1 | W_i).

  3. At the CV-best outcome λ, compute a plugin-centered influence vector (same mean as the DR EIF at \(\psi=\overline{\mu}_1-\overline{\mu}_0\)) and let σ = std(·).

  4. Threshold τ = σ / (√n · log n).

  5. Among the outcome λ in the undersmoothing region λ λ_CV, pick the smallest λ for which |mean(EIF_diff)| τ (the most undersmoothed fit still inside the band) — call it λ_u. If none qualifies, λ_u minimises |mean(EIF_diff)| (with a warning).

  6. Doubly robust point estimate at the undersmoothed outcome fit (π̂, μ̂₁^{λ_u}, μ̂₀^{λ_u}): ψ̂ = mean(A/π̂·(Y-μ̂₁)+μ̂₁ - (1-A)/(1-π̂)·(Y-μ̂₀) - μ̂₀). The CI uses the one-step influence function evaluated at the CV outcome fit (μ̂₁^{λ_CV}, μ̂₀^{λ_CV}) (centered at ψ̂): φ_i = A_i/π̂_i·(Y_i-μ̂_{1i}) + μ̂_{1i} - (1-A_i)/(1-π̂_i)·(Y_i-μ̂_{0i}) - μ̂_{0i} - ψ̂, and CI: ψ̂ ± z_{1-α/2} · std(φ) / √n. The CV (rather than undersmoothed) fit is used for std(φ) because a heavily undersmoothed, near-interpolating fit has artificially small in-sample residuals that collapse the SE and destroy coverage; the CV fit yields honest residuals. (For method="crossfit" no such issue arises — the EIF is evaluated out-of-fold.)

    This contrasts with plug-in G-computation mean(μ̂₁(W)-μ̂₀(W)), which can be materially biased when both nuisances are estimated on the same sample and the outcome regressions are regularized. The DR ψ̂ is consistent if either the propensity or the pair (μ̂₁, μ̂₀) is correctly specified (standard double robustness).

Examples

>>> import numpy as np
>>> from hapc import ate_hapc
>>> rng = np.random.default_rng(0)
>>> n = 200
>>> W = np.column_stack([rng.uniform(-2, 2, n), rng.normal(0, 0.5, n)])
>>> p = 1.0 / (1.0 + np.exp(-(W[:, 0] + 0.5 * W[:, 1])))
>>> A = rng.binomial(1, p, n)
>>> Y = 2 * W[:, 0] + 0.5 + rng.normal(0, 0.5, n)  # truth: ATE=0
>>> res = ate_hapc(W, Y, A, alpha=0.05, max_degree=2, npcs=50,
...                grid_length_prop=4, grid_length_out=4, nfolds=3,
...                norm="2")
>>> bool(res.lower <= res.estimate <= res.upper)
True

Result types

class hapc.HazardResult(hazard: ndarray, ids: ndarray, times: ndarray, Y: ndarray, time_grid: ndarray, lambdas: ndarray, risk: ndarray, best_lambda: float, interior: bool, cv: CVResult, predict_hazard: ndarray | None, predict_survival: ndarray | None)[source]

Output of hazard_hapc().

hazard

Estimated discrete hazard for each person-period row (aligned with ids/times); the cross-validated predictions at the winning λ.

Type:

np.ndarray, shape (N,)

ids

Subject index (0-based) for each person-period row.

Type:

np.ndarray, shape (N,)

times

Grid time for each person-period row.

Type:

np.ndarray, shape (N,)

Y

Binary hazard label for each person-period row.

Type:

np.ndarray, shape (N,)

time_grid

The discrete time grid used.

Type:

np.ndarray, shape (K,)

lambdas

CV λ grid.

Type:

np.ndarray, shape (L,)

risk

Mean cross-validated logistic deviance per λ.

Type:

np.ndarray, shape (L,)

best_lambda

Deviance-minimising λ.

Type:

float

interior

True when best_lambda is strictly inside the grid (not at either endpoint) — a basic check that the grid brackets the optimum.

Type:

bool

cv

The full underlying hapc.cv_hapc() result.

Type:

CVResult

predict_hazard

Hazard surface for new subjects (only when predict is supplied).

Type:

np.ndarray or None, shape (m, K)

predict_survival

Survival curves S(t|x) = prod_{g<=t}(1 - hazard(g|x)) for new subjects (only when predict is supplied).

Type:

np.ndarray or None, shape (m, K)

class hapc.CVResult(mses: ndarray, lambdas: ndarray, best_lambda: float, best_model_alpha: ndarray, predictions: ndarray | None)[source]

Output of all CV variants.

mses

Mean cross-validated risk per λ. For binomial CV this is logistic deviance (the field is still called mses for API uniformity).

Type:

np.ndarray, shape (L,)

lambdas

λ grid that was searched.

Type:

np.ndarray, shape (L,)

best_lambda

λ minimising mses.

Type:

float

best_model_alpha

α refit on the full training data at best_lambda.

Type:

np.ndarray, shape (k,)

predictions

Predictions on predict (if supplied). For pcghal_cv_classi these are probabilities; for pcghal_cv / fasthal_cv they match the underlying regression head (continuous).

Type:

np.ndarray or None

class hapc.ATEResult(estimate: float, lower: float, upper: float)[source]

Output of ate_hapc().

estimate

Doubly robust (AIPW-style) ATE at the undersmoothed outcome model: mean(A/π̂·(Y-μ̂₁)+μ̂₁ - (1-A)/(1-π̂)·(Y-μ̂₀) - μ̂₀), matching the efficient influence function used for the Wald interval (see Notes).

Type:

float

lower

Lower endpoint of the (1 - alpha) Wald confidence interval.

Type:

float

upper

Upper endpoint of the (1 - alpha) Wald confidence interval.

Type:

float

Design & kernels

hapc.design_hapc(X: ndarray, max_degree: int, npcs: int, center: bool = True) DesignOutput[source]

Compute the PC-HAL design ingredients (H, U, d, V).

Counterpart to R design.hapc().

Parameters:
  • X (np.ndarray, shape (n, p)) – Feature matrix.

  • max_degree (int) – Maximum interaction order for the HAL basis.

  • npcs (int) – Number of principal components to keep.

  • center (bool, default True) – Whether to centre H before SVD.

Returns:

DesignOutput – Named tuple (H, U, d, V).

Examples

>>> import numpy as np
>>> from hapc import design_hapc
>>> X = np.random.randn(50, 3)
>>> des = design_hapc(X, max_degree=2, npcs=20)
>>> des.U.shape
(50, 20)
hapc.kernel_hapc(X: ndarray, max_degree: int = 1, center: bool = True) ndarray[source]

HAL kernel matrix K = H Hᵀ (computed implicitly).

Counterpart to R kernel.hapc().

Parameters:
  • X (np.ndarray, shape (n, p)) – Feature matrix.

  • max_degree (int, default 1) – Maximum interaction order.

  • center (bool, default True) – Whether to centre H.

Returns:

np.ndarray, shape (n, n) – Kernel matrix.

hapc.cross_kernel_hapc(X: ndarray, Xte: ndarray, max_degree: int = 1, center: bool = True) ndarray[source]

Cross-kernel K = H_te H_trᵀ for prediction.

Counterpart to R cross_kernel.hapc().

Parameters:
  • X (np.ndarray, shape (n_tr, p)) – Training features (defines the basis).

  • Xte (np.ndarray, shape (n_te, p)) – Test features.

  • max_degree (int, default 1) – Maximum interaction order.

  • center (bool, default True) – Whether to apply the same column centring as in training.

Returns:

np.ndarray, shape (n_te, n_tr) – Cross-kernel matrix.

Solvers & single-λ fits

hapc.ridge_regression(Y: ndarray, U: ndarray, D2: ndarray, lambda_: float) ndarray[source]

Closed-form ridge in the rotated basis: α = (D² + λ)⁻¹ D Uᵀ Y.

Counterpart to R ridge_call().

Parameters:
  • Y (np.ndarray, shape (n,)) – Response.

  • U (np.ndarray, shape (n, k)) – Left singular vectors.

  • D2 (np.ndarray, shape (k,)) – Squared singular values.

  • lambda (float) – Regularisation parameter.

Returns:

np.ndarray, shape (k,) – Ridge coefficients on the PC basis.

hapc.fast_pchal(U: ndarray, D2: ndarray, Y: ndarray, lambda_: float) ndarray[source]

Closed-form LASSO in the rotated basis (soft-thresholding).

Counterpart to R fast_pchal_call().

Parameters:
  • U (np.ndarray, shape (n, k)) – Left singular vectors.

  • D2 (np.ndarray, shape (k,)) – Squared singular values.

  • Y (np.ndarray, shape (n,)) – Response.

  • lambda (float) – Regularisation parameter.

Returns:

np.ndarray, shape (k,) – LASSO coefficients on the PC basis.

hapc.single_pcghal(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, approx: bool = False, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, crit: str = 'grad', ini: str = '1') SinglePcghalResult[source]

Single-λ gaussian fit with PGD (norm="sv").

Calls C++ single_pcghal_fit, which initialises α with LASSO (ini="1", default) or ridge (ini="2"), then runs projected gradient descent.

Parameters:
hapc.single_lambda_fit(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, approx: bool = False, l1: bool = False) SingleLambdaResult[source]

Closed-form gaussian fit for norm="1" (l1=True) or norm="2".

Parameters:
  • X (np.ndarray, shape (n, p)) – Features.

  • Y (np.ndarray, shape (n,)) – Response.

  • max_degree (int) – Max interaction order.

  • npcs (int) – Number of PCs.

  • lambda (float) – Regularisation parameter.

  • predict (np.ndarray, optional) – Test features. If supplied, predictions are returned.

  • center (bool, default True) – Centre H and Y.

  • approx (bool, default False) – (Currently ignored; reserved for future power-iteration eigensolver.)

  • l1 (bool, default False) – True for LASSO (norm=”1”), False for ridge (norm=”2”).

Returns:

SingleLambdaResult

hapc.single_pcghal_classification(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8) SinglePcghalClassificationResult[source]

Single-λ binomial HAPC with norm="sv" (logistic loss + PGD).

Matches R hapc(..., family="binomial", norm="sv"): logistic ridge initialisation for α, then the same projected-gradient optimiser as pcghal_classi_call.

Parameters:
  • X (np.ndarray, shape (n, p)) – Features.

  • Y (np.ndarray, shape (n,)) – Binary labels in {0,1} or {-1,+1}.

  • max_degree (int) – HAL interaction order.

  • npcs (int) – Number of PCs (same role as in design_hapc()).

  • lambda (float) – Regularisation strength (passed to the ridge initialiser and CV-style full-data refit used in the shared C++ layer).

  • predict (np.ndarray, optional, shape (m, p)) – Test features. When given, returns log-odds, probabilities, and thresholded classes in {-1,+1}.

  • center (bool, default True) – Centre the HAL design.

  • verbose (bool, default False) – Print PGD diagnostics.

  • max_iter (int, default 5000) – Maximum PGD iterations.

  • tol (float, default 1e-3) – Convergence tolerance for PGD.

  • step_factor (float, default 0.8) – PGD step-size factor.

Returns:

SinglePcghalClassificationResult – Fields alpha, risk, iter, and optional predictions / probabilities / predicted_classes.

hapc.single_pcghal_classification_ridge_only(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, verbose: bool = False) SinglePcghalClassificationResult[source]

Binomial norm="2": logistic ridge initialiser only (no PGD).

Mirrors R hapc(..., family="binomial", norm="2") and is intended as a fast linear-in-kernel baseline.

Parameters:
Returns:

SinglePcghalClassificationResult

hapc.single_pcghal_classification_lasso(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 1000) SinglePcghalClassificationResult[source]

Single-λ binomial HAPC with norm="1" — logistic LASSO.

Solves the L1-penalised logistic regression in the PC basis using sklearn.linear_model.LogisticRegression with the liblinear solver. Mirrors R hapc(family="binomial", norm="1"), which uses glmnet(..., family="binomial", alpha=1, intercept=FALSE) on the same Xtilde = U * d design.

Lambda parameterisation. Sklearn’s L1 logistic objective with liblinear is

min_w   ||w||_1  +  C * sum_i log(1 + exp(-y_i x_i^T w))      (1)

while the package convention (matching glmnet’s binomial loss) is

min_w   (1/n) * sum_i log(1 + exp(-y_i x_i^T w))  +  lambda * ||w||_1   (2)

Equating (1) and (2) up to scaling yields C = 1 / (n * lambda), which is what we use here so that the same lambda produces matching coefficients in R (glmnet) and Python (sklearn).

Parameters:
  • X (np.ndarray, shape (n, p)) – Features.

  • Y (np.ndarray, shape (n,)) – Binary labels in {0,1} or {-1,+1}.

  • max_degree (int) – HAL interaction order.

  • npcs (int) – Number of PCs.

  • lambda (float) – L1 regularisation strength on the per-sample loss; must be > 0.

  • predict (np.ndarray, optional, shape (m, p)) – Test features.

  • center (bool, default True) – Centre the HAL design.

  • verbose (bool, default False) – Reserved for parity with sibling routines (sklearn is silent).

  • max_iter (int, default 1000) – Maximum iterations for liblinear.

Returns:

SinglePcghalClassificationResultalpha is the LASSO coefficient on the PC basis, risk is the training logistic loss, iter is the iteration count returned by liblinear, and predictions / probabilities / predicted_classes are populated when predict is supplied.

Cross-validation

hapc.pcghal_cv(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambdas: ndarray | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, approx: bool = False, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, crit: str = 'grad', ini: str = '1') CVResult[source]

k-fold CV for gaussian HAPC with norm="sv".

Per fold: initialise α with LASSO (ini="1", default) or ridge (ini="2"), then projected gradient descent on squared error — same pipeline as R cv.hapc(family="gaussian", norm="sv").

Delegates to C++ pcghal_cv_fit (shared with R), including fold IDs.

Parameters:
  • X (np.ndarray, shape (n, p)) – Features.

  • Y (np.ndarray, shape (n,)) – Continuous response.

  • max_degree (int) – HAL interaction order.

  • npcs (int) – Number of PCs.

  • lambdas (np.ndarray, optional) – Explicit λ grid. If None, built from log_* and grid_length.

  • log_lambda_min (float) – Log-space grid endpoints when lambdas is None.

  • log_lambda_max (float) – Log-space grid endpoints when lambdas is None.

  • grid_length (int, default 10) – Number of grid points when lambdas is None.

  • nfolds (int, default 5) – CV folds.

  • predict (np.ndarray, optional, shape (m, p)) – Test matrix for predictions at the winning λ.

  • center (bool) – Passed through to design / kernel (approx reserved).

  • approx (bool) – Passed through to design / kernel (approx reserved).

  • verbose (bool) – Print PGD progress.

  • max_iter (float / int) – PGD budget and step-size factor.

  • tol (float / int) – PGD budget and step-size factor.

  • step_factor (float / int) – PGD budget and step-size factor.

  • crit ({"grad", "risk"}, default "grad") – PGD stopping rule (matches R default).

  • ini ({"1", "2"}, default "1") – "1" = LASSO initialisation, "2" = ridge.

Returns:

CVResultmses holds mean CV MSE per λ; best_model_alpha is α refit on all data at best_lambda.

hapc.pcghal_cv_classi(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambdas: ndarray | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, with_pgd: bool = True) CVResult[source]

k-fold logistic-loss CV for binomial HAPC.

Calls the shared C++ logistic CV used by R’s cv.hapc(family="binomial"). Risk is always logistic deviance; we never use squared error on a binary response.

Parameters:
  • X (np.ndarray) – Features and 0/1 response.

  • Y (np.ndarray) – Features and 0/1 response.

  • with_pgd (bool, default True) – Truenorm="sv" (logistic ridge initialiser + projected gradient descent on logistic loss, per fold). Falsenorm="2" (logistic ridge only, no PGD), still scored with logistic deviance.

  • max_degree – See cv_hapc().

  • npcs – See cv_hapc().

  • lambdas – See cv_hapc().

  • log_lambda_min – See cv_hapc().

  • log_lambda_max – See cv_hapc().

  • grid_length – See cv_hapc().

  • nfolds – See cv_hapc().

  • predict – See cv_hapc().

  • center – See cv_hapc().

  • verbose – See cv_hapc().

  • max_iter – See cv_hapc().

  • tol – See cv_hapc().

  • step_factor – See cv_hapc().

Returns:

CVResultmses holds mean per-fold logistic deviance. predictions are probabilities on predict.

hapc.pcghal_cv_classi_lasso(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambdas: ndarray | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 1000) CVResult[source]

k-fold logistic-LASSO CV for binomial HAPC, norm="1".

Per fold, fits hapc.single_pcghal_classification_lasso() (sklearn LogisticRegression with penalty="l1", solver="liblinear", fit_intercept=False) on the training portion and scores the held-out portion with logistic deviance. Mirrors R cv.hapc(family="binomial", norm="1"), which uses glmnet per fold.

:param See cv_hapc().:

Returns:

CVResultmses holds mean per-fold logistic deviance. best_model_alpha is the LASSO α refit on the full data at best_lambda. predictions are probabilities at best_lambda on predict.

hapc.fasthal_cv(X: ndarray, Y: ndarray, npcs: int, lambdas: ndarray, nfolds: int = 5, predict: ndarray | None = None, max_degree: int = 1, center: bool = True, approx: bool = False, l1: bool = True) CVResult[source]

k-fold CV with closed-form LASSO or ridge in the PC basis per fold.

Used for gaussian norm="1" / norm="2" and, in R, also for family="binomial" with those norms (squared-error CV on Y).

Calls the same C++ routine as R fasthal_cv_call with identical fold construction (std::mt19937(12345) shuffle), so CV scores match R.

Parameters:
  • X (np.ndarray, shape (n, p)) – Training features.

  • Y (np.ndarray, shape (n,)) – Response (continuous, or binary labels if using the binomial branch in cv_hapc()).

  • npcs (int) – Number of principal components.

  • lambdas (np.ndarray, shape (L,)) – Penalty grid (must be supplied explicitly; cv_hapc() builds it).

  • nfolds (int, default 5) – Number of CV folds.

  • predict (np.ndarray, optional, shape (m, p)) – Test features for out-of-sample predictions at the CV-winning λ.

  • max_degree (int, default 1) – HAL interaction order.

  • center (bool, default True) – Centre kernel / response like training.

  • approx (bool, default False) – If True, approximate top eigenvectors (same flag as R).

  • l1 (bool, default True) – True for LASSO (norm="1"), False for ridge (norm="2").

Returns:

CVResultmses are mean squared errors on held-out folds.