Commit
cran 0.5.2
dselivanov committed Jul 3, 2024
1 parent d8a5393 commit 695d4eb
Showing 6 changed files with 18 additions and 27 deletions.
14 changes: 7 additions & 7 deletions DESCRIPTION
@@ -1,15 +1,15 @@
Package: rsparse
Type: Package
Title: Statistical Learning on Sparse Matrices
-Version: 0.5.1
+Version: 0.5.2
Authors@R: c(
person("Dmitriy", "Selivanov", role=c("aut", "cre", "cph"), email="[email protected]",
person("Dmitriy", "Selivanov", role=c("aut", "cre", "cph"), email="[email protected]",
comment = c(ORCID = "0000-0001-5413-1506")),
person("David", "Cortes", role="ctb"),
person("Drew", "Schmidt", role="ctb", comment="configure script for BLAS, LAPACK detection"),
person("Wei-Chen", "Chen", role="ctb", comment="configure script and work on linking to float package")
)
-Maintainer: Dmitriy Selivanov <[email protected]>
+Maintainer: Dmitriy Selivanov <[email protected]>
Description: Implements many algorithms for statistical learning on
sparse matrices - matrix factorizations, matrix completion,
elastic net regressions, factorization machines.
@@ -27,7 +27,7 @@ Description: Implements many algorithms for statistical learning on
(2005, <doi:10.1145/1102351.1102441>)
3) Fast Truncated Singular Value Decomposition (SVD), Soft-Thresholded SVD,
Soft-Impute matrix completion via ALS - paper by Hastie, Mazumder
-et al. (2014, <arXiv:1410.2596>)
+et al. (2014, <doi:10.48550/arXiv.1410.2596>)
4) Linear-Flow matrix factorization, from 'Practical linear models for
large-scale one-class collaborative filtering' by Sedhain, Bui, Kawale et al
(2016, ISBN:978-1-57735-770-4)
@@ -55,7 +55,7 @@ Suggests:
testthat,
covr
StagedInstall: TRUE
-URL: https://github.com/rexyai/rsparse
-BugReports: https://github.com/rexyai/rsparse/issues
-RoxygenNote: 7.2.1
+URL: https://github.com/dselivanov/rsparse
+BugReports: https://github.com/dselivanov/rsparse/issues
+RoxygenNote: 7.3.1
NeedsCompilation: yes
2 changes: 1 addition & 1 deletion R/SoftALS.R
@@ -2,7 +2,7 @@
#' @description Fit SoftImpute/SoftSVD via fast alternating least squares. Based on the
#' paper by Trevor Hastie, Rahul Mazumder, Jason D. Lee, Reza Zadeh,
#' "Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares" -
-#' \url{https://arxiv.org/pdf/1410.2596.pdf}
+#' \url{http://arxiv.org/pdf/1410.2596}
#' @param x sparse matrix. Both CSR \code{dgRMatrix} and CSC \code{dgCMatrix} are supported.
#' CSR matrix is preferred because in this case the algorithm will benefit from multithreaded
#' CSR * dense matrix products (if OpenMP is supported on your platform).
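To make the documented behavior concrete, here is a minimal usage sketch for `soft_impute()` (not part of this commit). It assumes the function accepts `rank`/`lambda`/`n_iter` arguments and returns an SVD-like object with `u`, `d`, `v` components; check `?soft_impute` for the authoritative signature.

```r
# Hypothetical sketch: Soft-Impute on a random sparse matrix.
# Argument names and the shape of the return value are assumptions.
library(rsparse)
library(Matrix)

set.seed(42)
x = rsparsematrix(1000, 300, density = 0.01)  # dgCMatrix; CSR also supported

res = soft_impute(x, rank = 10, lambda = 1, n_iter = 30)

# assuming an SVD-like result (u, d, v), a low-rank completion of x is:
x_hat = res$u %*% diag(res$d) %*% t(res$v)
```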
2 changes: 1 addition & 1 deletion R/model_ScaleNormalize.R
@@ -2,7 +2,7 @@
#' @description scales the input user-item interaction matrix as per eq. (16) of the paper.
#' Using such a rescaled matrix with the [PureSVD] model is equivalent to running PureSVD
#' on the scaled cosine-based inter-item similarity matrix.
-#' @references See \href{https://arxiv.org/pdf/1511.06033.pdf}{EigenRec: Generalizing PureSVD for
+#' @references See \href{http://arxiv.org/pdf/1511.06033}{EigenRec: Generalizing PureSVD for
#' Effective and Efficient Top-N Recommendations} for details.
#' @export
ScaleNormalize = R6::R6Class(
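A hypothetical usage sketch for the class above (not from this commit): pairing `ScaleNormalize` with `PureSVD` as the EigenRec paper suggests. The constructor arguments and method names follow rsparse's usual model API but are assumptions here; see `?ScaleNormalize`.

```r
# Hypothetical sketch: rescale interactions, then fit PureSVD on the result.
# ScaleNormalize$new() defaults and $fit_transform() are assumptions.
library(rsparse)
library(Matrix)

set.seed(42)
x = rsparsematrix(500, 200, density = 0.02)

sn = ScaleNormalize$new()
x_scaled = sn$fit_transform(x)   # rescaled matrix, eq. (16) of EigenRec

svd_model = PureSVD$new(rank = 32)
user_emb = svd_model$fit_transform(x_scaled)
```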
23 changes: 7 additions & 16 deletions README.md
@@ -1,10 +1,9 @@
# rsparse <img src='man/figures/logo.png' align="right" height="128" />
<!-- badges: start -->
-[![R build status](https://github.com/rexyai/rsparse/workflows/R-CMD-check/badge.svg)](https://github.com/rexyai/rsparse/actions)
-[![codecov](https://codecov.io/gh/rexyai/rsparse/branch/master/graph/badge.svg)](https://codecov.io/gh/rexyai/rsparse/branch/master)
+[![R build status](https://github.com/rexyai/rsparse/workflows/R-CMD-check/badge.svg)](https://github.com/dselivanov/rsparse/actions)
+[![codecov](https://codecov.io/gh/rexyai/rsparse/branch/master/graph/badge.svg)](https://app.codecov.io/gh/rexyai/rsparse/branch/master)
[![License](https://eddelbuettel.github.io/badges/GPL2+.svg)](http://www.gnu.org/licenses/gpl-2.0.html)
[![Project Status](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#maturing)
<a href="https://rexy.ai"><img src="https://s3-eu-west-1.amazonaws.com/rexy.ai/images/favicon.ico" height="32" width="32"></a>
<!-- badges: end -->

`rsparse` is an R package for statistical learning primarily on **sparse matrices** - **matrix factorizations, factorization machines, out-of-core regression**. Many of the implemented algorithms are particularly useful for **recommender systems** and **NLP**.
@@ -15,10 +14,10 @@ We've paid some attention to the implementation details - we try to avoid data c

### Classification/Regression

-1. [Follow the proximally-regularized leader](http://proceedings.mlr.press/v15/mcmahan11b/mcmahan11b.pdf), which makes it possible to solve **very large linear/logistic regression** problems with elastic-net penalty. The solver uses stochastic gradient descent with adaptive learning rates (so it can be used for online learning - there is no need to load all data into RAM). See [Ad Click Prediction: a View from the Trenches](https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf) for more examples; a usage sketch follows this list.
+1. [Follow the proximally-regularized leader](http://proceedings.mlr.press/v15/mcmahan11b/mcmahan11b.pdf), which makes it possible to solve **very large linear/logistic regression** problems with elastic-net penalty. The solver uses stochastic gradient descent with adaptive learning rates (so it can be used for online learning - there is no need to load all data into RAM). See [Ad Click Prediction: a View from the Trenches](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41159.pdf) for more examples; a usage sketch follows this list.
- Only logistic regression is implemented at the moment
- The native format for matrices is CSR - `Matrix::RsparseMatrix`. However, the common R `Matrix::CsparseMatrix` (`dgCMatrix`) will be converted automatically.
-1. [Factorization Machines](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf) - a supervised learning algorithm which learns second-order polynomial interactions in a factorized way. We provide a highly optimized, SIMD-accelerated implementation.
+1. [Factorization Machines](https://cseweb.ucsd.edu/classes/fa17/cse291-b/reading/Rendle2010FM.pdf) - a supervised learning algorithm which learns second-order polynomial interactions in a factorized way. We provide a highly optimized, SIMD-accelerated implementation.
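Here is the usage sketch referenced above - a minimal, hypothetical FTRL example (not part of this commit). The constructor arguments are assumptions; consult `?FTRL` for the actual signature.

```r
# Hypothetical sketch: online logistic regression with the FTRL solver.
# Constructor arguments (learning_rate, lambda, l1_ratio) are assumptions.
library(rsparse)
library(Matrix)

set.seed(42)
n = 1000; p = 50
x = rsparsematrix(n, p, density = 0.1)
x = as(x, "RsparseMatrix")            # CSR is the native format
y = sample(c(0, 1), n, replace = TRUE)

ftrl = FTRL$new(learning_rate = 0.1, lambda = 1e-3, l1_ratio = 1)
ftrl$partial_fit(x, y)                # can be called repeatedly on data chunks
preds = ftrl$predict(x)               # predicted probabilities
```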

### Matrix Factorizations

@@ -32,14 +31,14 @@ See details in [Applications of the Conjugate Gradient Method for Implicit Feedb
* <img src="https://raw.githubusercontent.com/rexyai/rsparse/master/docs/img/WRMF.png" width="400">
1. **Linear-Flow** from [Practical Linear Models for Large-Scale One-Class Collaborative Filtering](http://www.bkveton.com/docs/ijcai2016.pdf). The algorithm looks for a factorized low-rank item-item similarity matrix (in some sense it is similar to [SLIM](http://glaros.dtc.umn.edu/gkhome/node/774))
* <img src="https://raw.githubusercontent.com/rexyai/rsparse/master/docs/img/LinearFlow.png" width="300">
-1. Fast **Truncated SVD** and **Truncated Soft-SVD** via Alternating Least Squares as described in [Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares](https://arxiv.org/pdf/1410.2596.pdf). Works for both sparse and dense matrices. Works on [float](https://github.com/wrathematics/float) matrices as well! For certain problems it may be even faster than the [irlba](https://github.com/bwlewis/irlba) package.
+1. Fast **Truncated SVD** and **Truncated Soft-SVD** via Alternating Least Squares as described in [Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares](http://arxiv.org/pdf/1410.2596). Works for both sparse and dense matrices. Works on [float](https://github.com/wrathematics/float) matrices as well! For certain problems it may be even faster than the [irlba](https://github.com/bwlewis/irlba) package.
* <img src="https://raw.githubusercontent.com/rexyai/rsparse/master/docs/img/soft-svd.png" width="600">
-1. **Soft-Impute** via fast Alternating Least Squares as described in [Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares](https://arxiv.org/pdf/1410.2596.pdf).
+1. **Soft-Impute** via fast Alternating Least Squares as described in [Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares](https://arxiv.org/pdf/1410.2596).
* <img src="https://raw.githubusercontent.com/rexyai/rsparse/master/docs/img/soft-impute.png" width="400">
* with a solution in SVD form <img src="https://raw.githubusercontent.com/rexyai/rsparse/master/docs/img/soft-impute-svd-form.png" width="150">
1. **GloVe** as described in [GloVe: Global Vectors for Word Representation](https://nlp.stanford.edu/pubs/glove.pdf).
* This is usually used to train word embeddings, but it is actually also very useful for recommender systems.
-1. Matrix scaling as described in [EigenRec: Generalizing PureSVD for Effective and Efficient Top-N Recommendations](https://arxiv.org/pdf/1511.06033.pdf)
+1. Matrix scaling as described in [EigenRec: Generalizing PureSVD for Effective and Efficient Top-N Recommendations](http://arxiv.org/pdf/1511.06033)
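To make the list above concrete, here is a minimal, hypothetical WRMF sketch (not part of this commit). The argument and method names are assumptions - see `?WRMF` for the actual API.

```r
# Hypothetical sketch: implicit-feedback WRMF on a random user-item matrix.
# rank/feedback arguments and the predict() signature are assumptions.
library(rsparse)
library(Matrix)

set.seed(42)
x = rsparsematrix(500, 200, density = 0.02,
                  rand.x = function(n) sample(1:5, n, replace = TRUE))
x = as(x, "RsparseMatrix")

model = WRMF$new(rank = 16, feedback = "implicit")
user_emb = model$fit_transform(x, n_iter = 5)  # users x rank
item_emb = model$components                    # rank x items
preds = model$predict(x[1:5, ], k = 10)        # top-10 items for first 5 users
```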

*********************

@@ -84,14 +83,6 @@ By default, R for Windows comes with unoptimized BLAS and LAPACK libraries, and
**Note that the syntax in these posts/slides is not up to date, since the package was under active development**

1. [Slides from DataFest Tbilisi (2017-11-16)](https://www.slideshare.net/DmitriySelivanov/matrix-factorizations-for-recommender-systems)
-1. [Introduction to matrix factorization with Weighted-ALS algorithm](http://dsnotes.com/post/2017-05-28-matrix-factorization-for-recommender-systems/) - collaborative filtering for implicit feedback datasets.
-1. [Music recommendations using LastFM-360K dataset](http://dsnotes.com/post/2017-06-28-matrix-factorization-for-recommender-systems-part-2/)
-    * evaluation metrics for ranking
-    * setting up proper cross-validation
-    * possible issues with nested parallelism and thread contention
-    * making recommendations for new users
-    * complimentary item-to-item recommendations
-1. [Benchmark](http://dsnotes.com/post/2017-07-10-bench-wrmf/) against other good implementations

Here is an example of `rsparse::WRMF` on the [lastfm360k](https://www.upf.edu/web/mtg/lastfm360k) dataset in comparison with other good implementations:

2 changes: 1 addition & 1 deletion man/ScaleNormalize.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion man/soft_impute.Rd

Some generated files are not rendered by default.
