tlars-python

A Python port of the tlars R package for Terminating-LARS (T-LARS) algorithm.

Overview

The Terminating-LARS (T-LARS) algorithm is a modification of the Least Angle Regression (LARS) algorithm that allows for early termination of the forward selection process. This is particularly useful for high-dimensional data where the number of predictors is much larger than the number of observations.

This Python package provides a port of the original R implementation by Jasin Machkour, maintaining the same functionality while providing a more Pythonic interface. The Python port was created by Arnau Vilella (avp@connect.ust.hk).

Installation

The package is available on PyPI for Windows, macOS, and Linux:

pip install tlars

This is the recommended installation method as it will automatically install pre-built wheels for your platform with all required dependencies.

Usage

import numpy as np
from tlars import TLARS, generate_gaussian_data

# Generate some example data using the built-in function
n, p = 100, 20
X, y, beta = generate_gaussian_data(n=n, p=p, seed=42)

# Alternatively, create your own data
X = np.random.randn(n, p)
beta = np.zeros(p)
beta[:5] = np.array([1.5, 0.8, 2.0, -1.0, 1.2])
y = X @ beta + 0.5 * np.random.randn(n)

# Create dummy variables
num_dummies = p
dummies = np.random.randn(n, num_dummies)
XD = np.hstack([X, dummies])

# Create and fit the model
model = TLARS(XD, y, num_dummies=num_dummies)
model.fit(T_stop=3, early_stop=True)

# Get the coefficients
print(model.coef_)

# Get other properties
print(f"Number of active predictors: {model.n_active_}")
print(f"Number of active dummies: {model.n_active_dummies_}")
print(f"R² values: {model.r2_}")

# Plot the solution path
model.plot(include_dummies=True, show_actions=True)

Library Reference

TLARS Class

Constructor

TLARS(X=None, y=None, verbose=False, intercept=True, standardize=True, 
      num_dummies=0, type='lar', lars_state=None, info=True)

X: numpy.ndarray - Real valued predictor matrix.
y: numpy.ndarray - Response vector.
verbose: bool - If True, progress in computations is shown.
intercept: bool - If True, an intercept is included.
standardize: bool - If True, the predictors are standardized and the response is centered.
num_dummies: int - Number of dummies that are appended to the predictor matrix.
type: str - Type of used algorithm (currently possible choices: 'lar' or 'lasso').
lars_state: object - Previously saved TLARS state to resume from.
info: bool - If True, information about the initialization is printed.

Methods

fit(T_stop=None, early_stop=True, info=True): Fit the TLARS model.
- T_stop: int - Number of included dummies after which the random experiments are stopped.
- early_stop: bool - If True, then the forward selection process is stopped after T_stop dummies have been included.
- info: bool - If True, informational messages are displayed during fitting.
plot(xlabel="# Included dummies", ylabel="Coefficients", include_dummies=True, show_actions=True, col_selected="black", col_dummies="red", ls_selected="-", ls_dummies="--", legend_pos="best", figsize=(10, 6)): Plot the T-LARS solution path.
- xlabel: str - Label for the x-axis.
- ylabel: str - Label for the y-axis.
- include_dummies: bool - If True, solution paths of dummies are added to the plot.
- show_actions: bool - If True, marks for added variables are shown above the plot.
- col_selected: str - Color of lines corresponding to selected variables.
- col_dummies: str - Color of lines corresponding to dummy variables.
- ls_selected: str - Line style for selected variables.
- ls_dummies: str - Line style for dummy variables.
- legend_pos: str - Position of the legend.
- figsize: tuple - Figure size.
get_all(): Returns a dictionary with all the results and properties.

Properties

coef_: numpy.ndarray - The coefficients of the model.
coef_path_: list - A list of coefficient vectors at each step.
n_active_: int - The number of active predictors.
n_active_dummies_: int - The number of active dummy variables.
n_dummies_: int - The total number of dummy variables.
actions_: list - The indices of added/removed variables along the solution path.
df_: list - The degrees of freedom at each step.
r2_: list - The R² statistic at each step.
rss_: list - The residual sum of squares at each step.
cp_: numpy.ndarray - The Cp-statistic at each step.
lambda_: numpy.ndarray - The lambda-values (penalty parameters) at each step.
entry_: list - The first entry/selection steps of the predictors.

Helper Functions

generate_gaussian_data(n=50, p=100, seed=789): Generate synthetic Gaussian data for testing.
- n: int - Number of observations.
- p: int - Number of variables.
- seed: int - Random seed for reproducibility.
- Returns: tuple - (X, y, beta) where X is the design matrix, y is the response, and beta is the true coefficient vector.

License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0).

Acknowledgments

The original R package tlars was created by Jasin Machkour. This Python port was developed by Arnau Vilella (avp@connect.ust.hk).

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.github/workflows		.github/workflows
carma @ 9c5d79c		carma @ 9c5d79c
examples		examples
pybind11 @ d422fda		pybind11 @ d422fda
src		src
tests		tests
tlars		tlars
.gitmodules		.gitmodules
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

tlars-python

Overview

Installation

Usage

Library Reference

TLARS Class

Constructor

Methods

Properties

Helper Functions

License

Acknowledgments

About

Releases 4

Packages

Languages

License

ArnauVilella/tlars-python

Folders and files

Latest commit

History

Repository files navigation

tlars-python

Overview

Installation

Usage

Library Reference

TLARS Class

Constructor

Methods

Properties

Helper Functions

License

Acknowledgments

About

Resources

License

Stars

Watchers

Forks

Releases 4

Packages 0

Languages

Packages