Commit 9d68ff6: Version 1.6.0
pbreheny committed Apr 21, 2024
1 parent d47a2cc commit 9d68ff6
Showing 34 changed files with 651 additions and 725 deletions.
2 changes: 1 addition & 1 deletion .version.json
@@ -1,6 +1,6 @@
 {
   "schemaVersion": 1,
   "label": "GitHub",
-  "message": "1.5.2.1",
+  "message": "1.6.0",
   "color": "blue"
 }
9 changes: 5 additions & 4 deletions DESCRIPTION
@@ -1,6 +1,6 @@
 Package: biglasso
-Version: 1.5.2.1
-Date: 2024-03-19
+Version: 1.6.0
+Date: 2024-04-21
 Title: Extending Lasso Model Fitting to Big Data
 Authors@R: c(
     person("Yaohui", "Zeng", role = c("aut")),
@@ -15,8 +15,8 @@ Description: Extend lasso and elastic-net model fitting for ultra
     lasso-fitting packages like 'glmnet' and 'ncvreg', thus allowing
     the user to analyze big data analysis even on an ordinary laptop.
 License: GPL-3
-URL: https://yaohuizeng.github.io/biglasso/index.html, https://github.com/YaohuiZeng/biglasso, https://arxiv.org/abs/1701.05936
-BugReports: https://github.com/YaohuiZeng/biglasso/issues
+URL: https://pbreheny.github.io/biglasso/index.html, https://github.com/pbreheny/biglasso, https://arxiv.org/abs/1701.05936
+BugReports: https://github.com/pbreheny/biglasso/issues
 Depends: R (>= 3.2.0), bigmemory (>= 4.5.0), Matrix, ncvreg
 Imports: Rcpp (>= 0.12.1), methods
 LinkingTo: Rcpp, RcppArmadillo (>= 0.8.600), bigmemory, BH
@@ -28,5 +28,6 @@ Suggests:
     survival,
     knitr,
     rmarkdown
+Roxygen: list(markdown = TRUE)
 RoxygenNote: 7.3.1
 Encoding: UTF-8
21 changes: 15 additions & 6 deletions NEWS.md
@@ -1,10 +1,15 @@
+# biglasso 1.6.0
+* New: functions biglasso_fit() and biglasso_path(), which allow users to turn
+  off standardization and intercept
+
 # biglasso 1.5.2
 * Update coercion for compatibility with Matrix 1.5
 * Now using GitHub Actions instead of Travis for CI
 
 # biglasso 1.5.1
 * Internal Cpp changes: initialize Xty, remove unused cutoff variable (#48)
-* Eliminate CV test against ncvreg (the two packages no longer use the same approach (#47)
+* Eliminate CV test against ncvreg (the two packages no longer use the same
+  approach (#47)
 
 # biglasso 1.5.0
 * Update headers to maintain compatibility with new version of Rcpp (#40)
@@ -13,14 +18,17 @@
 * changed R package maintainer to Chuyi Wang (wwaa0208@gmail.com)
 * fixed bugs
 * Add 'auc', 'class' options to cv.biglasso eval.metric
-* predict.cv now predicts standard error over CV folds by default; set 'grouped' argument to FALSE for old behaviour.
-* predict.cv.biglasso accepts 'lambda.min', 'lambda.1se' argument, similar to predict.cv.glmnet()
+* predict.cv now predicts standard error over CV folds by default; set
+  'grouped' argument to FALSE for old behaviour.
+* predict.cv.biglasso accepts 'lambda.min', 'lambda.1se' argument, similar to
+  predict.cv.glmnet()
 
 # biglasso 1.4-0
 * adaptive screening methods were implemented and set as default when applicable
 * added sparse Cox regression
-* removed uncompetitive screening methods and combined naming of screening methods
+* removed uncompetitive screening methods and combined naming of screening
+  methods
 * version 1.4-0 for CRAN submission
 
 # biglasso 1.3-7
 * update email to personal email
@@ -30,7 +38,8 @@
 
 # biglasso 1.3-6
 * optimized the code for computing the slores rule.
-* added Slores screening without active cycling (-NAC) for logistic regression, research usage only.
+* added Slores screening without active cycling (-NAC) for logistic
+  regression, research usage only.
 * corrected BEDPP for elastic net.
 * fixed a bug related to "exporting SSR-BEDPP".
 
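The headline change in this release is the pair of new functions noted at the top of NEWS.md. A minimal, hedged sketch of how they might be called (the data, the lambda values, and any argument beyond X, y, and lambda are illustrative; biglasso_fit() fits at a single user-supplied lambda without standardizing X or adding an intercept, and biglasso_path() does the same along a user-supplied lambda sequence):

```r
## Hedged sketch (not part of this commit): the 1.6.0 additions
## biglasso_fit() and biglasso_path() skip standardization and the
## intercept, so X should be prepared by the user beforehand.
library(biglasso)
library(bigmemory)

set.seed(1)
X <- as.big.matrix(scale(matrix(rnorm(100 * 10), 100, 10)))  # pre-standardized
y <- rnorm(100)

fit  <- biglasso_fit(X, y, lambda = 0.05)                 # single lambda value
path <- biglasso_path(X, y, lambda = c(0.10, 0.05, 0.01)) # user-chosen sequence
```

Exact signatures and defaults may differ from this sketch; consult the help pages for biglasso_fit() and biglasso_path() in the released package.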
47 changes: 22 additions & 25 deletions R/biglasso-package.R
@@ -26,7 +26,7 @@
 #' Data in R. Version >= 1.2-3 represents a major redesign where the source
 #' code is converted into C++ (previously in C), and new feature screening
 #' rules, as well as OpenMP parallel computing, are implemented. Some key
-#' features of \code{biglasso} are summarized as below: \enumerate{ \item it
+#' features of `biglasso` are summarized as below: \enumerate{ \item it
 #' utilizes memory-mapped files to store the massive data on the disk, only
 #' loading data into memory when necessary during model fitting. Consequently,
 #' it's able to seamlessly data-larger-than-RAM cases. \item it is built upon
@@ -38,57 +38,54 @@
 #' additional 1.5x to 4x speedup. \item the implementation is designed to be as
 #' memory-efficient as possible by eliminating extra copies of the data created
 #' by other R packages, making it at least 2x more memory-efficient than
-#' \code{glmnet}. \item the underlying computation is implemented in C++, and
+#' `glmnet`. \item the underlying computation is implemented in C++, and
 #' parallel computing with OpenMP is also supported. }
 #'
-#' \strong{For more information:} \itemize{ \item Benchmarking results:
-#' \url{https://github.com/YaohuiZeng/biglasso}.
-#' \item Tutorial:
-#' \url{http://yaohuizeng.github.io/biglasso/articles/biglasso.html}
-#' \item Technical paper:
-#' \url{https://arxiv.org/abs/1701.05936} }
+#' **For more information:**
+#' * Benchmarking results: \url{https://github.com/pbreheny/biglasso}
+#' * Tutorial: \url{https://pbreheny.github.io/biglasso/articles/biglasso.html}
+#' * Technical paper: \url{https://arxiv.org/abs/1701.05936}
 #'
 #' @name biglasso-package
 #'
-#' @note The input design matrix X must be a \code{\link[bigmemory]{big.matrix}} object.
-#' This can be created by the function \code{as.big.matrix} in the R package
+#' @note The input design matrix X must be a [bigmemory::big.matrix()] object.
+#' This can be created by the function `as.big.matrix` in the R package
 #' \href{https://CRAN.R-project.org//package=bigmemory}{bigmemory}.
 #' If the data (design matrix) is very large (e.g. 10 GB) and stored in an external
 #' file, which is often the case for big data, X can be created by calling the
-#' function \code{\link{setupX}}.
+#' function [setupX()].
 #' \strong{In this case, there are several restrictions about the data file:}
 #' \enumerate{ \item the data file must be a well-formated ASCII-file, with
 #' each row corresponding to an observation and each column a variable; \item
 #' the data file must contain only one single type. Current version only
-#' supports \code{double} type; \item the data file must contain only numeric
+#' supports `double` type; \item the data file must contain only numeric
 #' variables. If there are categorical variables, the user needs to create
 #' dummy variables for each categorical varable (by adding additional columns).}
 #' Future versions will try to address these restrictions.
 #'
-#' Denote the number of observations and variables be, respectively, \code{n}
-#' and \code{p}. It's worth noting that the package is more suitable for wide
-#' data (ultrahigh-dimensional, \code{p >> n}) as compared to long data
-#' (\code{n >> p}). This is because the model fitting algorithm takes advantage
+#' Denote the number of observations and variables be, respectively, `n`
+#' and `p`. It's worth noting that the package is more suitable for wide
+#' data (ultrahigh-dimensional, `p >> n`) as compared to long data
+#' (`n >> p`). This is because the model fitting algorithm takes advantage
 #' of sparsity assumption of high-dimensional data. To just give the user some
 #' ideas, below are some benchmarking results of the total computing time (in
 #' seconds) for solving lasso-penalized linear regression along a sequence of
 #' 100 values of the tuning parameter. In all cases, assume 20 non-zero
 #' coefficients equal +/- 2 in the true model. (Based on Version 1.2-3,
 #' screening rule "SSR-BEDPP" is used)
-#' \itemize{ \item For wide data case (\code{p > n}), \code{n = 1,000}:
-#' \tabular{ccccc}{ \code{p} \tab 1,000 \tab 10,000 \tab 100,000 \tab 1,000,000
-#' \cr Size of \code{X} \tab 9.5 MB \tab 95 MB \tab 950 MB \tab 9.5 GB \cr
+#' \itemize{ \item For wide data case (`p > n`), `n = 1,000`:
+#' \tabular{ccccc}{ `p` \tab 1,000 \tab 10,000 \tab 100,000 \tab 1,000,000
+#' \cr Size of `X` \tab 9.5 MB \tab 95 MB \tab 950 MB \tab 9.5 GB \cr
 #' Elapsed time (s) \tab 0.11 \tab 0.83 \tab 8.47 \tab 85.50 \cr }
-#' %\item For long data case (\code{n >> p}), \code{p = 1,000}:
+#' %\item For long data case (`n >> p`), `p = 1,000`:
 #' %\tabular{ccccc}{
-#' %\code{n} \tab 1,000 \tab 10,000 \tab 100,000 \tab 1,000,000 \cr
-#' %Size of \code{X} \tab 9.5 MB \tab 95 MB \tab 950 MB \tab 9.5 GB \cr
+#' %`n` \tab 1,000 \tab 10,000 \tab 100,000 \tab 1,000,000 \cr
+#' %Size of `X` \tab 9.5 MB \tab 95 MB \tab 950 MB \tab 9.5 GB \cr
 #' %Elapsed time (s) \tab 2.50 \tab 11.43 \tab 83.69 \tab 1090.62 \cr %}
 #' }
 #'
-#' @author Yaohui Zeng, Chuyi Wang and Patrick Breheny
+#' @author Yaohui Zeng, Chuyi Wang, Tabitha Peter, and Patrick Breheny
 #'
-#' Maintainer: Yaohui Zeng <yaohui.zeng@@gmail.com> and Chuyi Wang <wwaa0208@@gmail.com>
 #' @references \itemize{ \item Zeng, Y., and Breheny, P. (2017). The biglasso
 #' Package: A Memory- and Computation-Efficient Solver for Lasso Model Fitting
 #' with Big Data in R. \url{https://arxiv.org/abs/1701.05936}. \item
@@ -104,7 +101,7 @@
 #' 2137-2140). IEEE. \item Wang, J., Zhou, J., Liu, J., Wonka, P., and Ye, J.
 #' (2014). A safe screening rule for sparse logistic regression. \emph{In
 #' Advances in Neural Information Processing Systems}, pp. 1053-1061. }
-#' @keywords package
+#'
 #' @examples
 #' \dontrun{
 #' ## Example of reading data from external big data file, fit lasso model,
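The roxygen @note above describes the file-backed workflow for data too large for RAM. A hedged sketch of that workflow, assuming a hypothetical comma-separated data file; file names, separators, and the exact form of the prediction arguments are illustrative, not taken from this commit:

```r
## Hedged sketch of the workflow described in the @note above.
## "big_data.txt" and "response.csv" are hypothetical files.
library(biglasso)

X <- setupX("big_data.txt", sep = ",")  # memory-maps the file as a big.matrix
y <- scan("response.csv")               # numeric response vector

cvfit <- cv.biglasso(X, y, penalty = "lasso", nfolds = 5)
## Per the 1.4.0 NEWS entry, predictions can be requested at the CV-selected
## lambda; the exact argument form may differ from this sketch.
pred <- predict(cvfit, X, type = "response", lambda = cvfit$lambda.min)
```

Note the @note's restrictions: the file must be well-formatted ASCII, a single (double) type, and numeric-only, with categorical variables expanded to dummy columns beforehand.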