API reference¶
This page is generated directly from the package docstrings.
High-level model fitting¶
- hapc.hapc(X: ndarray, Y: ndarray, family: str = 'gaussian', max_degree: int = 1, npcs: int | None = None, lambda_: float = 0.01, norm: str = 'sv', predict: ndarray | None = None, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, verbose: bool = False, crit: str = 'grad', center: bool = True, approx: bool = False, ini: str = '1')[source]¶
Single-λ HAPC fit (Python counterpart of R
hapc()).- Parameters:
X (np.ndarray, shape (n, p)) – Features.
Y (np.ndarray, shape (n,)) – Response. For
family="binomial": hard labels in{0,1}or{-1,+1}, or soft labels in[0,1](e.g. EM-HAL E-step posteriors). Soft labels are supported only fornormin{"1","2"};norm="sv"requires hard labels.family ({"gaussian", "binomial"}, default "gaussian") – Loss family.
max_degree (int, default 1) – Maximum interaction order for the HAL basis.
npcs (int, optional) – Number of principal components. Defaults to
n.lambda (float, default 0.01) – Regularisation parameter.
norm ({"sv", "1", "2"}, default "sv") – Norm constraint.
"sv"runs PGD,"1"is closed-form LASSO,"2"is closed-form ridge.predict (np.ndarray, optional) – Test features for prediction.
max_iter (int, default 5000) – Maximum PGD iterations (
norm="sv"only).tol (float, default 1e-3) – Convergence tolerance (
norm="sv"only).step_factor (float, default 0.8) – PGD step factor (
norm="sv"only).verbose (bool, default False) – Print progress (
norm="sv"only).crit ({"grad", "risk"}, default "grad") – PGD stopping criterion.
center (bool, default True) – Centre H and Y.
approx (bool, default False) – Use approximate eigendecomposition for
norm in {"1","2"}(currently ignored — exact eigendecomposition is always used).ini ({"1", "2"}, default "1") – PGD initialiser for
norm="sv":"1"= LASSO (default, matches R),"2"= ridge.
- Returns:
object – One of
SinglePcghalResult(gaussiannorm="sv"),SingleLambdaResult(gaussiannormin{"1","2"}), orSinglePcghalClassificationResult(binomial).
Examples
>>> import numpy as np >>> from hapc import hapc >>> rng = np.random.default_rng(0) >>> X = rng.standard_normal((50, 3)) >>> Y = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(50) * 0.1 >>> fit = hapc(X, Y, max_degree=2, npcs=20, lambda_=0.05, norm="sv") >>> fit.alpha.shape (...,)
- hapc.cv_hapc(X: ndarray, Y: ndarray, family: str = 'gaussian', max_degree: int = 1, npcs: int | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, norm: str = 'sv', predict: ndarray | None = None, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, verbose: bool = False, crit: str = 'grad', center: bool = True, approx: bool = False, ini: str = '1', Delta: ndarray | None = None, time_grid: ndarray | None = None) CVResult[source]¶
k-fold cross-validated HAPC fit (Python counterpart of R
cv.hapc()).- Parameters:
X (np.ndarray, shape (n, p)) – Features.
Y (np.ndarray, shape (n,)) – Response.
family ({"gaussian", "binomial", "logit-hazard"}, default "gaussian") – Loss family. For
"binomial"the loss is always logistic (deviance CV, not MSE).norm="sv"runs logistic ridge + PGD per fold;norm="2"runs logistic ridge only;norm="1"runs logistic LASSO viasklearn.linear_model.LogisticRegressionwithsolver="liblinear"(mirrors Rglmnetfor the binomial+norm=”1” path)."logit-hazard"is a discrete-time logistic hazard model:Yis then the observed timeT = min(T_event, C),Delta(event indicator) is required, and the call forwards tohapc.hazard_hapc()(returning ahapc.hazard.HazardResult, not aCVResult).norm="sv"is not supported for this family.Delta (np.ndarray, optional) – Event indicator (0/1), required only for
family="logit-hazard".time_grid (np.ndarray, optional) – Discrete time grid for
family="logit-hazard"(seehapc.hazard_hapc()).max_degree (int, default 1) – Maximum interaction order.
npcs (int, optional) – Number of PCs. Defaults to
n.log_lambda_min (float) – Bounds of the log-λ grid (defaults
-5,-3).log_lambda_max (float) – Bounds of the log-λ grid (defaults
-5,-3).grid_length (int, default 10) – Number of grid points.
nfolds (int, default 5) – Number of CV folds. Identical fold partitions to R.
norm ({"sv", "1", "2"}, default "sv") – Norm constraint. See
hapc.single.hapc().predict (np.ndarray, optional) – Test features.
max_iter (optional) – PGD parameters used when
norm="sv".tol (optional) – PGD parameters used when
norm="sv".step_factor (optional) – PGD parameters used when
norm="sv".crit (optional) – PGD parameters used when
norm="sv".center (optional) – See
hapc.single.hapc().approx (optional) – See
hapc.single.hapc().verbose (optional) – See
hapc.single.hapc().ini (optional) – See
hapc.single.hapc().
- Returns:
CVResult
Examples
>>> import numpy as np >>> from hapc import cv_hapc >>> rng = np.random.default_rng(0) >>> X = rng.standard_normal((100, 3)) >>> Y = np.sin(np.pi * X[:, 0]) + rng.standard_normal(100) * 0.1 >>> cv = cv_hapc(X, Y, max_degree=2, npcs=99, nfolds=5) >>> bool(cv.best_lambda > 0) True
- hapc.hazard_hapc(X: ndarray, T: ndarray, Delta: ndarray, norm: str = '1', max_degree: int = 1, npcs: int | None = None, time_grid: ndarray | None = None, log_lambda_min: float = -4, log_lambda_max: float = -1, grid_length: int = 15, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8) HazardResult[source]
Discrete-time logistic hazard HAPC fit (
family="logit-hazard").Convenience wrapper around
hapc.cv_hapc()withfamily="binomial". The right-censored survival data(X, T, Delta)are expanded into a person-period table (one row per subject-per-interval-at-risk) whose binary response is the discrete hazard indicator, the visit time is prepended as the first HAL covariate, and the regularisation parameterlambdais chosen by cross-validated logistic deviance. R counterpart:hapc::hazard.hapc().- Parameters:
X (np.ndarray, shape (n, p)) – Baseline covariates, one row per subject.
T (np.ndarray, shape (n,)) – Observed times
T_i = min(T_event_i, C_i). Assumed discrete.Delta (np.ndarray, shape (n,)) – Event indicators
Delta_i in {0,1}(1 = event, 0 = right-censored).norm ({"1", "2"}, default "1") –
"1"= logistic LASSO,"2"= logistic ridge."sv"raisesNotImplementedError.max_degree (int, default 1) – HAL interaction order over
[time, X].npcs (int, optional) – Number of principal components. Defaults to the number of person-period rows (capped internally as in
hapc.cv_hapc()).time_grid (np.ndarray, optional) – Discrete time grid (risk-set grid). Defaults to
min(T):max(T)whenTis integer-valued, elsenp.unique(T). Subjects are assumed at risk frommin(time_grid)onward.log_lambda_min (float / int) – Log-λ CV grid (defaults
-4,-1,15).log_lambda_max (float / int) – Log-λ CV grid (defaults
-4,-1,15).grid_length (float / int) – Log-λ CV grid (defaults
-4,-1,15).nfolds (int, default 5) – Number of CV folds.
predict (np.ndarray, optional, shape (m, p)) – Baseline covariates for new subjects. When supplied the model (refit at the CV-selected λ) is evaluated on the full time grid for each new subject, returning the hazard surface and the implied survival curves.
center – Passed through to
hapc.cv_hapc()/hapc.hapc().verbose – Passed through to
hapc.cv_hapc()/hapc.hapc().max_iter – Passed through to
hapc.cv_hapc()/hapc.hapc().tol – Passed through to
hapc.cv_hapc()/hapc.hapc().step_factor – Passed through to
hapc.cv_hapc()/hapc.hapc().
- Returns:
HazardResult – See
HazardResult.hazardholds the cross-validated discrete hazard for each person-period row;predict_hazard/predict_survivalare populated whenpredictis supplied.
Notes
Model. The discrete hazard is the conditional event probability in interval
tgiven survival up tot,lambda(t | x) = P(T_event = t | T_event >= t, X = x),
modelled on the logit scale by a Highly Adaptive Principal Components fit
fof the augmented covariate(t, x):logit lambda(t | x) = f(t, x).
The HAL basis spans indicator (and, for
max_degree > 1, interaction) tensor products in(t, x), so the time effect, the covariate effects and their interactions are estimated nonparametrically; the L1/L2 penalty (norm) controls smoothness and is tuned by cross-validation.Person-period likelihood. Under independent right-censoring the observed-data likelihood factorises over the at-risk intervals,
- prod_i prod_{t <= T_i}
lambda(t | x_i) ** Y_it * (1 - lambda(t | x_i)) ** (1 - Y_it),
with
Y_it = 1(T_event_i = t). This is exactly the Bernoulli (logistic) likelihood of the expanded person-period table, so fitting a binomial HAPC model toY_itagainst(t, x_i)estimates the discrete hazard (Cox 1972; Brown 1975; Allison 1982).Survival. The conditional survival function follows from the estimated hazard by the product-limit relation
S(t | x) = P(T_event > t | x) = prod_{s <= t} (1 - lambda(s | x)),
returned in
predict_survivalfor new subjects whenpredictis given.References
Cox, D. R. (1972). Regression models and life-tables. JRSS B, 34(2), 187-220.
Brown, C. C. (1975). On the use of indicator variables for studying the time-dependence of parameters in a response-time model. Biometrics, 31(4), 863-872.
Allison, P. D. (1982). Discrete-time methods for the analysis of event histories. Sociological Methodology, 13, 61-98.
Singer, J. D. and Willett, J. B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.
Benkeser, D. and van der Laan, M. (2016). The Highly Adaptive Lasso estimator. IEEE DSAA, 689-696.
Examples
>>> import numpy as np >>> from hapc import hazard_hapc >>> rng = np.random.default_rng(0) >>> n = 200 >>> X = np.column_stack([rng.uniform(size=n), rng.integers(0, 2, n)]).astype(float) >>> grid = np.arange(1, 7) >>> def haz(t, x): return 1 / (1 + np.exp(-(-2.6 + 0.3 * t + 1.3 * x[0] - 0.9 * x[1]))) >>> Tev = np.full(n, grid.max()) >>> for i in range(n): ... for t in grid: ... if rng.random() < haz(t, X[i]): ... Tev[i] = t; break >>> C = rng.choice(grid, n) >>> Tobs = np.minimum(Tev, C); Delta = (Tev <= C).astype(float) >>> fit = hazard_hapc(X, Tobs, Delta, norm="1", max_degree=2, time_grid=grid) >>> bool(fit.best_lambda > 0) True
- hapc.ate_hapc(X: ndarray, Y: ndarray, A: ndarray, alpha: float = 0.05, max_degree: int = 1, npcs: int | None = None, log_lambda_prop_min: float = -5, log_lambda_prop_max: float = 2, grid_length_prop: int = 14, log_lambda_out_min: float = -10, log_lambda_out_max: float = 2, grid_length_out: int = 22, nfolds: int = 5, norm: str = 'sv', predict: ndarray | None = None, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, verbose: bool = False, crit: str = 'grad', center: bool = True, approx: bool = False, ini: str = '1', plot_diagnostics: bool = False, method: str = 'undersmooth', cf_folds: int = 5, cf_seed: int = 0, report_undersmoothing: bool = False) ATEResult[source]¶
ATE estimate with HAPC nuisances and outcome undersmoothing.
- Parameters:
X (np.ndarray, shape (n, p)) – Covariate matrix
W(do NOT include the treatment column).Y (np.ndarray, shape (n,)) – Continuous outcome.
A (np.ndarray, shape (n,)) – Binary treatment in
{0,1}or{-1,+1}.alpha (float, default 0.05) – Significance level. The returned interval has confidence
1 - alpha.max_degree – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).npcs – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).nfolds – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).norm – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).predict – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).max_iter – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).tol – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).step_factor – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).verbose – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).crit – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).center – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).approx – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).ini – Same meaning and defaults as in
hapc.cv_hapc()(except λ grids, see below).predictis accepted for signature parity withcv_hapc()and is currently ignored (ate_hapcalways evaluates the EIF on the training sample).log_lambda_prop_min – Equally spaced log-λ grid for propensity cross-validation (
A ~ W, binomial), same rule ascv_hapc().log_lambda_prop_max – Equally spaced log-λ grid for propensity cross-validation (
A ~ W, binomial), same rule ascv_hapc().grid_length_prop – Equally spaced log-λ grid for propensity cross-validation (
A ~ W, binomial), same rule ascv_hapc().log_lambda_out_min – Log-λ grid (defaults
-10,2,22) for outcome cross-validationY ~ (A, W)(gaussian) and for the undersmoothing scan overλ ≤ λ_CV. The grid deliberately reaches very small λ: formethod="undersmooth"the (nearly) unbiased regime typically sits well below the CV-optimal λ, so a grid that stops too high (e.g. the old[e^{-5}, e^{-3}]) never reaches it and no undersmoothing effectively occurs. Undersmoothing only removes bias when the PC basis is rich enough to represent the outcome surface, so use the full basis (npcs = n, the default); a truncated basis floors the bias at an approximation level that no λ can remove.log_lambda_out_max – Log-λ grid (defaults
-10,2,22) for outcome cross-validationY ~ (A, W)(gaussian) and for the undersmoothing scan overλ ≤ λ_CV. The grid deliberately reaches very small λ: formethod="undersmooth"the (nearly) unbiased regime typically sits well below the CV-optimal λ, so a grid that stops too high (e.g. the old[e^{-5}, e^{-3}]) never reaches it and no undersmoothing effectively occurs. Undersmoothing only removes bias when the PC basis is rich enough to represent the outcome surface, so use the full basis (npcs = n, the default); a truncated basis floors the bias at an approximation level that no λ can remove.grid_length_out – Log-λ grid (defaults
-10,2,22) for outcome cross-validationY ~ (A, W)(gaussian) and for the undersmoothing scan overλ ≤ λ_CV. The grid deliberately reaches very small λ: formethod="undersmooth"the (nearly) unbiased regime typically sits well below the CV-optimal λ, so a grid that stops too high (e.g. the old[e^{-5}, e^{-3}]) never reaches it and no undersmoothing effectively occurs. Undersmoothing only removes bias when the PC basis is rich enough to represent the outcome surface, so use the full basis (npcs = n, the default); a truncated basis floors the bias at an approximation level that no λ can remove.plot_diagnostics (bool, default False) – If True, open a matplotlib figure with (1) propensity CV curve (logistic deviance vs λ), (2) outcome CV curve (MSE vs λ), and (3) the undersmoothing path:
|mean(EIF)|vs outcome λ with the threshold line and vertical markers for the CV and selected undersmoothed λ. Requiresmatplotlib(pip install matplotlib). Only meaningful formethod="undersmooth"(ignored otherwise).method ({"undersmooth", "crossfit"}, default "undersmooth") –
Bias-control strategy.
"undersmooth"— the original single-sample procedure: fit the nuisances on the full sample, evaluate the EIF in-sample, and pick the outcome λ by the undersmoothing gate (Notes). Kept as the default for backward compatibility."crossfit"— DML-style K-fold cross-fitting: CV-select and fit both nuisances on the training folds and evaluate the EIF only on the held-out fold. No undersmoothing is performed (CV-optimal λ is used per fold); the outcome λ grid is still used for the per-fold CV. This makes the EIF mean and variance honest, so the AIPW estimate is √n-unbiased and the Wald CI attains nominal coverage without relying on the undersmoothing gate. Prefer this unless you specifically need the single-sample undersmoothed estimator.
cf_folds (int, default 5) – Number of cross-fitting folds (only used when
method="crossfit"). Folds are stratified by treatment so both arms appear in every split.cf_seed (int, default 0) – Seed for the (deterministic) cross-fitting fold assignment.
report_undersmoothing (bool, default False) – If True (and
method="undersmooth"), print a table to stdout of|mean(EIF)|versus outcome λ over the undersmoothing region (λ ≤ λ_CV), with the thresholdτ = σ_CV / (√n · log n), so the user can confirm the undersmoothing gate fired correctly. Ignored formethod="crossfit"(which evaluates the EIF out-of-fold).
- Returns:
ATEResult – Named tuple with three fields
(estimate, lower, upper).
Notes
The procedure is:
Cross-validate the propensity
A ~ W(binomial) on its grid and the outcomeY ~ (A, W)(gaussian) on the outcome grid (independently specified).Fix the propensity at its CV-best λ; refit on the full sample to obtain
π̂(W_i) = P(A=1 | W_i).At the CV-best outcome λ, compute a plugin-centered influence vector (same mean as the DR EIF at \(\psi=\overline{\mu}_1-\overline{\mu}_0\)) and let
σ = std(·).Threshold
τ = σ / (√n · log n).Among the outcome λ in the undersmoothing region
λ ≤ λ_CV, pick the smallest λ for which|mean(EIF_diff)| ≤ τ(the most undersmoothed fit still inside the band) — call itλ_u. If none qualifies,λ_uminimises|mean(EIF_diff)|(with a warning).Doubly robust point estimate at the undersmoothed outcome fit
(π̂, μ̂₁^{λ_u}, μ̂₀^{λ_u}):ψ̂ = mean(A/π̂·(Y-μ̂₁)+μ̂₁ - (1-A)/(1-π̂)·(Y-μ̂₀) - μ̂₀). The CI uses the one-step influence function evaluated at the CV outcome fit(μ̂₁^{λ_CV}, μ̂₀^{λ_CV})(centered atψ̂):φ_i = A_i/π̂_i·(Y_i-μ̂_{1i}) + μ̂_{1i} - (1-A_i)/(1-π̂_i)·(Y_i-μ̂_{0i}) - μ̂_{0i} - ψ̂, andCI: ψ̂ ± z_{1-α/2} · std(φ) / √n. The CV (rather than undersmoothed) fit is used forstd(φ)because a heavily undersmoothed, near-interpolating fit has artificially small in-sample residuals that collapse the SE and destroy coverage; the CV fit yields honest residuals. (Formethod="crossfit"no such issue arises — the EIF is evaluated out-of-fold.)This contrasts with plug-in G-computation
mean(μ̂₁(W)-μ̂₀(W)), which can be materially biased when both nuisances are estimated on the same sample and the outcome regressions are regularized. The DRψ̂is consistent if either the propensity or the pair(μ̂₁, μ̂₀)is correctly specified (standard double robustness).
Examples
>>> import numpy as np >>> from hapc import ate_hapc >>> rng = np.random.default_rng(0) >>> n = 200 >>> W = np.column_stack([rng.uniform(-2, 2, n), rng.normal(0, 0.5, n)]) >>> p = 1.0 / (1.0 + np.exp(-(W[:, 0] + 0.5 * W[:, 1]))) >>> A = rng.binomial(1, p, n) >>> Y = 2 * W[:, 0] + 0.5 + rng.normal(0, 0.5, n) # truth: ATE=0 >>> res = ate_hapc(W, Y, A, alpha=0.05, max_degree=2, npcs=50, ... grid_length_prop=4, grid_length_out=4, nfolds=3, ... norm="2") >>> bool(res.lower <= res.estimate <= res.upper) True
Result types¶
- class hapc.HazardResult(hazard: ndarray, ids: ndarray, times: ndarray, Y: ndarray, time_grid: ndarray, lambdas: ndarray, risk: ndarray, best_lambda: float, interior: bool, cv: CVResult, predict_hazard: ndarray | None, predict_survival: ndarray | None)[source]
Output of
hazard_hapc().- hazard
Estimated discrete hazard for each person-period row (aligned with
ids/times); the cross-validated predictions at the winning λ.- Type:
np.ndarray, shape (N,)
- ids
Subject index (0-based) for each person-period row.
- Type:
np.ndarray, shape (N,)
- times
Grid time for each person-period row.
- Type:
np.ndarray, shape (N,)
- Y
Binary hazard label for each person-period row.
- Type:
np.ndarray, shape (N,)
- time_grid
The discrete time grid used.
- Type:
np.ndarray, shape (K,)
- lambdas
CV λ grid.
- Type:
np.ndarray, shape (L,)
- risk
Mean cross-validated logistic deviance per λ.
- Type:
np.ndarray, shape (L,)
- best_lambda
Deviance-minimising λ.
- Type:
- interior
Truewhenbest_lambdais strictly inside the grid (not at either endpoint) — a basic check that the grid brackets the optimum.- Type:
- cv
The full underlying
hapc.cv_hapc()result.- Type:
- predict_hazard
Hazard surface for new subjects (only when
predictis supplied).- Type:
np.ndarray or None, shape (m, K)
- predict_survival
Survival curves
S(t|x) = prod_{g<=t}(1 - hazard(g|x))for new subjects (only whenpredictis supplied).- Type:
np.ndarray or None, shape (m, K)
- class hapc.CVResult(mses: ndarray, lambdas: ndarray, best_lambda: float, best_model_alpha: ndarray, predictions: ndarray | None)[source]¶
Output of all CV variants.
- mses¶
Mean cross-validated risk per λ. For binomial CV this is logistic deviance (the field is still called
msesfor API uniformity).- Type:
np.ndarray, shape (L,)
- lambdas¶
λ grid that was searched.
- Type:
np.ndarray, shape (L,)
- best_model_alpha¶
α refit on the full training data at
best_lambda.- Type:
np.ndarray, shape (k,)
- predictions¶
Predictions on
predict(if supplied). Forpcghal_cv_classithese are probabilities; forpcghal_cv/fasthal_cvthey match the underlying regression head (continuous).- Type:
np.ndarray or None
- class hapc.ATEResult(estimate: float, lower: float, upper: float)[source]¶
Output of
ate_hapc().- estimate¶
Doubly robust (AIPW-style) ATE at the undersmoothed outcome model:
mean(A/π̂·(Y-μ̂₁)+μ̂₁ - (1-A)/(1-π̂)·(Y-μ̂₀) - μ̂₀), matching the efficient influence function used for the Wald interval (see Notes).- Type:
Design & kernels¶
- hapc.design_hapc(X: ndarray, max_degree: int, npcs: int, center: bool = True) DesignOutput[source]¶
Compute the PC-HAL design ingredients
(H, U, d, V).Counterpart to R
design.hapc().- Parameters:
- Returns:
DesignOutput – Named tuple
(H, U, d, V).
Examples
>>> import numpy as np >>> from hapc import design_hapc >>> X = np.random.randn(50, 3) >>> des = design_hapc(X, max_degree=2, npcs=20) >>> des.U.shape (50, 20)
- hapc.kernel_hapc(X: ndarray, max_degree: int = 1, center: bool = True) ndarray[source]¶
HAL kernel matrix
K = H Hᵀ(computed implicitly).Counterpart to R
kernel.hapc().
Solvers & single-λ fits¶
- hapc.ridge_regression(Y: ndarray, U: ndarray, D2: ndarray, lambda_: float) ndarray[source]¶
Closed-form ridge in the rotated basis:
α = (D² + λ)⁻¹ D Uᵀ Y.Counterpart to R
ridge_call().- Parameters:
Y (np.ndarray, shape (n,)) – Response.
U (np.ndarray, shape (n, k)) – Left singular vectors.
D2 (np.ndarray, shape (k,)) – Squared singular values.
lambda (float) – Regularisation parameter.
- Returns:
np.ndarray, shape (k,) – Ridge coefficients on the PC basis.
- hapc.fast_pchal(U: ndarray, D2: ndarray, Y: ndarray, lambda_: float) ndarray[source]¶
Closed-form LASSO in the rotated basis (soft-thresholding).
Counterpart to R
fast_pchal_call().- Parameters:
U (np.ndarray, shape (n, k)) – Left singular vectors.
D2 (np.ndarray, shape (k,)) – Squared singular values.
Y (np.ndarray, shape (n,)) – Response.
lambda (float) – Regularisation parameter.
- Returns:
np.ndarray, shape (k,) – LASSO coefficients on the PC basis.
- hapc.single_pcghal(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, approx: bool = False, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, crit: str = 'grad', ini: str = '1') SinglePcghalResult[source]¶
Single-λ gaussian fit with PGD (
norm="sv").Calls C++
single_pcghal_fit, which initialises α with LASSO (ini="1", default) or ridge (ini="2"), then runs projected gradient descent.- Parameters:
X – See
hapc().Y – See
hapc().max_degree – See
hapc().npcs – See
hapc().lambda – See
hapc().predict – See
hapc().center – See
hapc().verbose – See
hapc().max_iter – See
hapc().tol – See
hapc().step_factor – See
hapc().crit – See
hapc().ini – See
hapc().approx (bool, default False) – Currently ignored (the C++ path always uses exact eigendecomposition).
- hapc.single_lambda_fit(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, approx: bool = False, l1: bool = False) SingleLambdaResult[source]¶
Closed-form gaussian fit for
norm="1"(l1=True) ornorm="2".- Parameters:
X (np.ndarray, shape (n, p)) – Features.
Y (np.ndarray, shape (n,)) – Response.
max_degree (int) – Max interaction order.
npcs (int) – Number of PCs.
lambda (float) – Regularisation parameter.
predict (np.ndarray, optional) – Test features. If supplied, predictions are returned.
center (bool, default True) – Centre H and Y.
approx (bool, default False) – (Currently ignored; reserved for future power-iteration eigensolver.)
l1 (bool, default False) –
Truefor LASSO (norm=”1”),Falsefor ridge (norm=”2”).
- Returns:
SingleLambdaResult
- hapc.single_pcghal_classification(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8) SinglePcghalClassificationResult[source]¶
Single-λ binomial HAPC with
norm="sv"(logistic loss + PGD).Matches R
hapc(..., family="binomial", norm="sv"): logistic ridge initialisation for α, then the same projected-gradient optimiser aspcghal_classi_call.- Parameters:
X (np.ndarray, shape (n, p)) – Features.
Y (np.ndarray, shape (n,)) – Binary labels in
{0,1}or{-1,+1}.max_degree (int) – HAL interaction order.
npcs (int) – Number of PCs (same role as in
design_hapc()).lambda (float) – Regularisation strength (passed to the ridge initialiser and CV-style full-data refit used in the shared C++ layer).
predict (np.ndarray, optional, shape (m, p)) – Test features. When given, returns log-odds, probabilities, and thresholded classes in
{-1,+1}.center (bool, default True) – Centre the HAL design.
verbose (bool, default False) – Print PGD diagnostics.
max_iter (int, default 5000) – Maximum PGD iterations.
tol (float, default 1e-3) – Convergence tolerance for PGD.
step_factor (float, default 0.8) – PGD step-size factor.
- Returns:
SinglePcghalClassificationResult – Fields
alpha,risk,iter, and optionalpredictions/probabilities/predicted_classes.
- hapc.single_pcghal_classification_ridge_only(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, verbose: bool = False) SinglePcghalClassificationResult[source]¶
Binomial
norm="2": logistic ridge initialiser only (no PGD).Mirrors R
hapc(..., family="binomial", norm="2")and is intended as a fast linear-in-kernel baseline.- Parameters:
X – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.Y – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.max_degree – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.npcs – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.lambda – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.predict – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.center – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.verbose – Same meaning as in
single_pcghal_classification(), except there is no iterative refinement anditeris always0.
- Returns:
SinglePcghalClassificationResult
- hapc.single_pcghal_classification_lasso(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambda_: float, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 1000) SinglePcghalClassificationResult[source]¶
Single-λ binomial HAPC with
norm="1"— logistic LASSO.Solves the L1-penalised logistic regression in the PC basis using
sklearn.linear_model.LogisticRegressionwith theliblinearsolver. Mirrors Rhapc(family="binomial", norm="1"), which usesglmnet(..., family="binomial", alpha=1, intercept=FALSE)on the sameXtilde = U * ddesign.Lambda parameterisation. Sklearn’s L1 logistic objective with
liblinearismin_w ||w||_1 + C * sum_i log(1 + exp(-y_i x_i^T w)) (1)
while the package convention (matching
glmnet’s binomial loss) ismin_w (1/n) * sum_i log(1 + exp(-y_i x_i^T w)) + lambda * ||w||_1 (2)
Equating (1) and (2) up to scaling yields
C = 1 / (n * lambda), which is what we use here so that the samelambdaproduces matching coefficients in R (glmnet) and Python (sklearn).- Parameters:
X (np.ndarray, shape (n, p)) – Features.
Y (np.ndarray, shape (n,)) – Binary labels in
{0,1}or{-1,+1}.max_degree (int) – HAL interaction order.
npcs (int) – Number of PCs.
lambda (float) – L1 regularisation strength on the per-sample loss; must be > 0.
predict (np.ndarray, optional, shape (m, p)) – Test features.
center (bool, default True) – Centre the HAL design.
verbose (bool, default False) – Reserved for parity with sibling routines (sklearn is silent).
max_iter (int, default 1000) – Maximum iterations for
liblinear.
- Returns:
SinglePcghalClassificationResult –
alphais the LASSO coefficient on the PC basis,riskis the training logistic loss,iteris the iteration count returned byliblinear, andpredictions/probabilities/predicted_classesare populated whenpredictis supplied.
Cross-validation¶
- hapc.pcghal_cv(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambdas: ndarray | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, approx: bool = False, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, crit: str = 'grad', ini: str = '1') CVResult[source]¶
k-fold CV for gaussian HAPC with
norm="sv".Per fold: initialise α with LASSO (
ini="1", default) or ridge (ini="2"), then projected gradient descent on squared error — same pipeline as Rcv.hapc(family="gaussian", norm="sv").Delegates to C++
pcghal_cv_fit(shared with R), including fold IDs.- Parameters:
X (np.ndarray, shape (n, p)) – Features.
Y (np.ndarray, shape (n,)) – Continuous response.
max_degree (int) – HAL interaction order.
npcs (int) – Number of PCs.
lambdas (np.ndarray, optional) – Explicit λ grid. If
None, built fromlog_*andgrid_length.log_lambda_min (float) – Log-space grid endpoints when
lambdasisNone.log_lambda_max (float) – Log-space grid endpoints when
lambdasisNone.grid_length (int, default 10) – Number of grid points when
lambdasisNone.nfolds (int, default 5) – CV folds.
predict (np.ndarray, optional, shape (m, p)) – Test matrix for predictions at the winning λ.
center (bool) – Passed through to design / kernel (
approxreserved).approx (bool) – Passed through to design / kernel (
approxreserved).verbose (bool) – Print PGD progress.
max_iter (float / int) – PGD budget and step-size factor.
tol (float / int) – PGD budget and step-size factor.
step_factor (float / int) – PGD budget and step-size factor.
crit ({"grad", "risk"}, default "grad") – PGD stopping rule (matches R default).
ini ({"1", "2"}, default "1") –
"1"= LASSO initialisation,"2"= ridge.
- Returns:
CVResult –
msesholds mean CV MSE per λ;best_model_alphais α refit on all data atbest_lambda.
- hapc.pcghal_cv_classi(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambdas: ndarray | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 5000, tol: float = 0.001, step_factor: float = 0.8, with_pgd: bool = True) CVResult[source]¶
k-fold logistic-loss CV for binomial HAPC.
Calls the shared C++ logistic CV used by R’s
cv.hapc(family="binomial"). Risk is always logistic deviance; we never use squared error on a binary response.- Parameters:
X (np.ndarray) – Features and 0/1 response.
Y (np.ndarray) – Features and 0/1 response.
with_pgd (bool, default True) –
True→norm="sv"(logistic ridge initialiser + projected gradient descent on logistic loss, per fold).False→norm="2"(logistic ridge only, no PGD), still scored with logistic deviance.max_degree – See
cv_hapc().npcs – See
cv_hapc().lambdas – See
cv_hapc().log_lambda_min – See
cv_hapc().log_lambda_max – See
cv_hapc().grid_length – See
cv_hapc().nfolds – See
cv_hapc().predict – See
cv_hapc().center – See
cv_hapc().verbose – See
cv_hapc().max_iter – See
cv_hapc().tol – See
cv_hapc().step_factor – See
cv_hapc().
- Returns:
CVResult –
msesholds mean per-fold logistic deviance.predictionsare probabilities onpredict.
- hapc.pcghal_cv_classi_lasso(X: ndarray, Y: ndarray, max_degree: int, npcs: int, lambdas: ndarray | None = None, log_lambda_min: float = -5, log_lambda_max: float = -3, grid_length: int = 10, nfolds: int = 5, predict: ndarray | None = None, center: bool = True, verbose: bool = False, max_iter: int = 1000) CVResult[source]¶
k-fold logistic-LASSO CV for binomial HAPC,
norm="1".Per fold, fits
hapc.single_pcghal_classification_lasso()(sklearn LogisticRegression withpenalty="l1",solver="liblinear",fit_intercept=False) on the training portion and scores the held-out portion with logistic deviance. Mirrors Rcv.hapc(family="binomial", norm="1"), which usesglmnetper fold.:param See
cv_hapc().:- Returns:
CVResult –
msesholds mean per-fold logistic deviance.best_model_alphais the LASSO α refit on the full data atbest_lambda.predictionsare probabilities atbest_lambdaonpredict.
- hapc.fasthal_cv(X: ndarray, Y: ndarray, npcs: int, lambdas: ndarray, nfolds: int = 5, predict: ndarray | None = None, max_degree: int = 1, center: bool = True, approx: bool = False, l1: bool = True) CVResult[source]¶
k-fold CV with closed-form LASSO or ridge in the PC basis per fold.
Used for gaussian
norm="1"/norm="2"and, in R, also forfamily="binomial"with those norms (squared-error CV onY).Calls the same C++ routine as R
fasthal_cv_callwith identical fold construction (std::mt19937(12345)shuffle), so CV scores match R.- Parameters:
X (np.ndarray, shape (n, p)) – Training features.
Y (np.ndarray, shape (n,)) – Response (continuous, or binary labels if using the binomial branch in
cv_hapc()).npcs (int) – Number of principal components.
lambdas (np.ndarray, shape (L,)) – Penalty grid (must be supplied explicitly;
cv_hapc()builds it).nfolds (int, default 5) – Number of CV folds.
predict (np.ndarray, optional, shape (m, p)) – Test features for out-of-sample predictions at the CV-winning λ.
max_degree (int, default 1) – HAL interaction order.
center (bool, default True) – Centre kernel / response like training.
approx (bool, default False) – If True, approximate top eigenvectors (same flag as R).
l1 (bool, default True) –
Truefor LASSO (norm="1"),Falsefor ridge (norm="2").
- Returns:
CVResult –
msesare mean squared errors on held-out folds.