Articles

These articles either focus on data.table (bold) or mention or use it, perhaps only briefly, in which case you may need to search the article for "data.table". They are ordered by date, newest first. If you know of an article that may be of interest to others, please add it here (**). You can also search all articles from the R blogosphere since c. 2009 on http://www.r-bloggers.com/. No filter is applied: if an article exists and mentions data.table, positively or negatively, it is included on this page.

Please watch out for benchmarks measured in milliseconds. Comparisons on such small scales often do not hold when scaled up to larger data because, for example, they over-represent call overhead and/or the dataset is so small that it fits in CPU cache. A test repetition count (e.g. ntimes=) of 5 or more is often an indication that the test data size is too small. Please also check that setkey() has been used and that its time is reported separately.

Tutorials, slides and videos are over on the Videos & Slides page.

(**) all pages on this wiki have no write restrictions. You are encouraged to change content in this wiki yourself as you see fit. Changes will go live immediately with no oversight by any project member. If you spot any abuse, please check the edit history to see who made the edit and please inform us.
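
As a rough illustration of the benchmark advice above, the sketch below times setkey() separately from the query and uses data large enough that a single run is meaningful. The table DT, its columns, and the sizes n_rows and n_groups are hypothetical choices made for this example; they are not taken from any article listed here.

```r
library(data.table)

set.seed(42)
n_rows   <- 1e7L  # assumed size: large enough that one run is not dominated by call overhead
n_groups <- 1e5L  # assumed number of distinct key values

# Hypothetical test data: an integer grouping column and a numeric value column.
DT <- data.table(
  id = sample.int(n_groups, n_rows, replace = TRUE),
  x  = runif(n_rows)
)

# Time the one-off sort on its own so its cost is not hidden inside the query time.
t_setkey <- system.time(setkey(DT, id))[["elapsed"]]

# Time the keyed grouped aggregation once; no repetition count is needed at this size.
t_query <- system.time(res <- DT[, .(total = sum(x)), by = id])[["elapsed"]]

cat(sprintf("setkey: %.3f s, grouped sum: %.3f s\n", t_setkey, t_query))
```

When reading a published benchmark, the two numbers to look for are the equivalents of t_setkey and t_query reported separately, on data of a size comparable to what you intend to process.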

Date (link)	Title	Author
2024.11	Syntax conversion: data.table vs. base vs. dplyr	Vincent Arel-Bundock
2024.11	Data wrangling with data.table	Stata2R: Kyle Butts, Nick Huntington-Klein, and Grant McDermott
2024.11	Julia `DataFrames.jl` comparison with `data.table`	authors of `DataFrames.jl` docs
2024.11	data.table.threads	Anirban Chetia
2024.10	Comparing `data.table` reshape to `duckdb` and `polars`	Toby Dylan Hocking
2024.10	Benchmarking rolling window functions in R	Mikkel Roald-Arbøl
2024.09	Mutation testing for `data.table`	Anirban Chetia
2024.08	Collapse reshape benchmark	Toby Dylan Hocking
2024.07	Benchmarking a change in data.table	Toby Dylan Hocking
2024.06	data.table for the Google Summer of Code 2024 (Joshua Wu)	Joshua Wu
2024.02	Column assignment and reference semantics in `data.table`	Toby Dylan Hocking
2024.02	NSF project activities	Anirban Chetia
2024.02	new programming with data.table	John MacKintosh
2024.02	more .I in data.table	John MacKintosh
2024.01	.I in data.table	John MacKintosh
2024.01	Reshape performance comparison	Toby Dylan Hocking
2023.12	Comparing data table to frame for row subset	Toby Dylan Hocking
2023.12	non-equi joins in data.table	John MacKintosh
2023.11	Some pedagogical elements of computer programming for data science: A comparison of three approaches to teaching the R language	David Shilane, Nicole Di Crecchio, Nicole L. Lorenzetti
2023.11	data.table CRAN diffs: Verifying consistency between CRAN and github	Toby Dylan Hocking
2023.10	data.table asymptotic timings	Toby Dylan Hocking
2023.03	A Coding Translation to Increase the Efficiency of Programmatic Data Analyses	David Shilane
2023.02	Pivoting data in R with tidyr and data.table	John MacKintosh
2022.11	dplyr 1.1.0 is coming soon	Davis Vaughan
2022.11	Handling larger than memory data with {arrow} and {duckdb}	David Lucey
2022.11	R Package Release History: Extracting and plotting data from CRAN web site	Toby Dylan Hocking
2022.10	Efficiency comparison of dplyr and tidyr functions vs base R	Manuel Teodoro Tenango
2022.08	modifying columns in datatable with lapply	John MacKintosh
2022.08	Simulating data from a non-linear function by specifying a handful of points	Keith Goldfeld
2022.06	Timing data.table Operations	Thomas Shafer
2022.06	Shuffling Columns With data.table	Thomas Shafer
2022.06	A quirk when using data.table?	Kenneth Tay
2022.05	Comparing performances of CSV to RDS, Parquet, and Feather file formats in R	Tomaž Kaštrun
2022.04	Loading a large, messy csv using data.table fread with cli tools	David Lucey
2022.04	Greatly revised edition of tidyverse skeptic (original 2019.07 below: Ctrl-F "matloff")	Norm Matloff
2022.03	Shiny: Fast Data Loading with fst	Philipp Probst
2021.12	Optimising dplyr	Tom Jemmett
2021.11	Should I Move to a Database?	Roel M. Hogervorst
2021.10	Most Starred and Forked GitHub Repos for Data Science and R	Kenneth Leung
2021.10	fwf without the faff	John MacKintosh
2021.10	Simulating the Squid Game bridge scene in R	John Paul Helveston
2021.09	Calculating hotel occupancy with R	John MacKintosh
2021.08	Exploring Stock Market Listing Mortality since 1986	David Lucey
2021.08	Introducing the fastverse: An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation	Sebastian Krantz
2021.08	Well Well Well My Excel	John MacKintosh
2021.08	Cutting down code in dplyr and data.table	John MacKintosh
2021.08	Code performance in R: Working with large datasets	Mira Céline Klein
2021.07	Time Travel with py datatable 1.0	Gregory Kanevsky
2021.06	DTPlyr – easier data.table for DPLYR users	Gary Hutson
2021.06	Stress testing reshape operations on list columns	Toby Dylan Hocking
2021.06	Wide-to-tall Data Reshaping Using Regular Expressions and the nc Package	Toby Dylan Hocking
2021.05	Update about data reshaping and visualization in R and python	Toby Dylan Hocking
2021.05	Hamburg RUG: A professional trading research system in R	Daniel Brandt
2021.05	The new R pipe	Elio Campitelli
2021.04	10 Tips And Tricks For Data Scientists Vol.6	George Pipis
2021.04	Not data.table vs dplyr... data.table + dplyr!	Matt Dancho
2021.03	Some data.table tips	John MacKintosh
2021.03	Data.Table – everything you need to know to get you started in R	Gary Hutson
2021.02	I wrote one of the fastest DataFrame libraries (hacker news)	Ritchie Vink
2021.02	Joins vs case whens - speed and memory tradeoffs	Thomas Mock
2021.02	The unequalled joy of non-equi joins	David Selby
2021.02	Measuring and Monitoring Arrow's Performance: Some Updated R Benchmarks (response)	Jonathan Keane & Neal Richardson
2021.02	Bigger Data With Ease Using Apache Arrow, (response) (rebuttal)	Neal Richardson
2021.01	Fast and Easy Aggregation of Multi-Type and Survey Data in R	Sebastian Krantz
2021.01	How to create a stock screener	Martin Bel
2020.12	You only need `library(data.table)` / 你只需要`library(data.table)` (in Chinese)	Xianying Tan (@shrektan)
2020.11	Comparing Common Operations in dplyr and data.table	Martin Chan
2020.11	non-equi merge in data.table and epidemiology	Denis Mongin
2020.10	The ultimate R data.table cheat sheet	Sharon Machlis
2020.10	What is R data.table and Why is R data.table? (In Korean, 한국어)	HongDon Lee
2020.10	Solving small problems with data.table	John MacKintosh
2020.10	Python and R – Part 1: Exploring Data with Datatable	David Lucey
2020.10	Decomposition and Smoothing with data.table, reticulate, and spatstat	Tony ElHabr
2020.09	The Fastest Way To Read And Write Files In R	George Pipis
2020.09	The treedata.table Package	April Wright, Cristian Román-Palacios, Josef Uyeda
2020.09	Gotta go fast with "{tidytable}"	Bruno Rodrigues
2020.09	Task 2 - Retail Strategy and Analytics	Shrishti Vaish
2020.08	Solving small data problems with data.table	John MacKintosh
2020.08	Replicating .SD in Python Datatable	Samuel Oranyeli
2020.08	Let's Learn `data.table` (日本語)	Uryu Shinya
2020.08	87th TokyoR Meetup Roundup: {data.table}, Bioconductor, & more!	Ryo Nakagawara
2020.07	5 handy options in R data.table’s fread	Sharon Machlis
2020.07	Even more reshape benchmarks	Grant McDermott
2020.07	RvsPython #2: Pivoting Data From Long to Wide Form	Benjamin Smith
2020.06	A gentle introduction to data.table	@atrebas
2020.06	Reshape benchmarks	Grant McDermott
2020.06	Selecting and Grouping Data with Python Datatable	Samuel Oranyeli
2020.05	dtplyr speed benchmarks	Iyar Lin
2020.05	Creating a data.table from C++	David Zimmermann, Leonardo Silvestri, Dirk Eddelbuettel
2020.04	Data manipulation libraries: Translating between data.table, pandas, dplyr	Toby Dylan Hocking
2020.04	patientcounter	John MacKintosh
2020.04	Fastest data operations with least memory in tidy syntax	Tian-Yuan Huang
2020.04	W is for Write and Read Data – Fast	Sara Locatelli
2020.03	Use data.table the tidy way: An ultimate tutorial of tidyfst	Tian-Yuan Huang
2020.03	R data.table symbols and operators you should know	Sharon Machlis
2020.03	Variable name in functions, it's easy with datatable	Lino Galiana
2020.02	stringsAsFactors	Kurt Hornik
2020.01	Programming with data.table	John MacKintosh
2020.01	Blazing Fast Data Wrangling With R data.table	Thu Vu
2020.01	New Timings for a Grouped In-Place Aggregation Task	John Mount
2020.01	Base R, the tidyverse, and data.table: a comparison of R dialects to wrangle your data	Jason Mercer
2019.12	4 great free tools that can make your R work more efficient, reproducible and robust	Jozef Hajnala
2019.12	Why I don’t use the Tidyverse	Holger K. von Jouanne-Diedrich
2019.11	dtplyr 1.0.0	Hadley Wickham
2019.10	Using ggplot2 Inside data.table	John Lashlee
2019.10	Fast and Readable 'If Else' in R	Tyson Barrett
2019.10	Data Joins: Speed and Efficiency of dplyr and data.table	Tyson Barrett
2019.10	Comparing Efficiency and Speed of `data.table`: Adding variables, filtering rows, and summarizing by group	Tyson Barrett
2019.10	Columnar File Performance Check-in for Python and R: Parquet, Feather, and FST	Wes McKinney
2019.09	Selecting the max value from each group, a case study: data.table	Nathan Eastwood
2019.09	Sentiment analysis at the Fringe, part 1	Megan Stodel
2019.09	{disk.frame} is epic	Bruno Rodrigues
2019.08	A shallow benchmark of R data frame export/import methods	Julien Barnier
2019.08	The R Factor	Owen Jones
2019.08	Hydra Chronicles, Part V: Loose Ends	Brodie Gaslam
2019.08	Everyone’s Favorite Blogpost: CSV Benchmarks	Jacob Quinn
2019.08	No visible binding for global variable	Nathan Eastwood
2019.08	Why Machine Learning is more Practical than Econometrics in the Real World	Adrian Antico
2019.08	What’s next for the popular programming language R?	Dan Kopf
2019.08	Wrangling 4.6M Rows with dtplyr (the NEW data.table backend for dplyr)	Matt Dancho
2019.08	mlr3-0.1.0	Patrick Schratz
2019.07	Hydra Chronicles, Part IV: Reformulation of Statistics	Brodie Gaslam
2019.07	Multiple Columns to Multiple Columns at Once	Recle Etino Vibal
2019.07	Long to Wide and Wide to Long Format Conversion	Giovanni Pavolini
2019.07	fread-benchmarks-rsuite	Alfonso R. Reyes
2019.07	Bayesian Power Analysis with `data.table`, `tidyverse`, and `brms`	Tyson Barrett
2019.07	Making .SD your best friend	José Morales
2019.07	data.table's `cube` function	Giovanni Pavolini
2019.07	How to use .SD in the data.table package	Sharon Machlis
2019.07	Why I Chose to Learn data.table (and such related things)	Tyson Barrett
2019.07	What R’s most popular tools say about the state of data science	Dan Kopf
2019.07	data.table and Text Analysis: Analyzing the Four Gospels	Tyson Barrett
2019.07	Analyzing data with data.table	Giovanni Pavolini
2019.07	Why I love data.table	Elio Campitelli
2019.07	Why I like the Tidyverse	Chris Muir
2019.07	An opinionated view of the Tidyverse "dialect" of the R language, and its promotion by RStudio (the revision on GitHub from around this time was in effect then and widely shared, e.g. on HackerNews; a revision was announced 2022.04)	Norm Matloff
2019.06	Learning Japanese with data.table and ggplot2	Atrebas
2019.06	data.table by a dummy	John MacKintosh
2019.06	My Favorite data.table Feature	John Mount
2019.06	Coke vs. Pepsi? data.table vs. tidy? (Part 2)	Beth Milhollin, Russell Zaretzki, and Audris Mockus
2019.06	The Psychology of Flame Wars	Edwin Thoen
2019.06	data.table is Much Better Than You Have Been Told	John Mount
2019.06	data.table is expressive and powerful	Michael Frasco
2019.06	How data.table's fread can save you a lot of time and memory, and take input from shell commands	Jozef Hajnala
2019.06	Hydra Chronicles, part III: Catastrophic Imprecision	Brodie Gaslam
2019.06	Hydra Chronicles, part II: beating data.table at its own game	Brodie Gaslam
2019.06	An Overview of Python's Datatable package	Parul Pandey
2019.06	For and Against data.table	Aaron Jacobs
2019.05	Three reasons why I use data.table	Megan Stodel
2019.05	Timing Working With a Row or a Column from a data.frame	John Mount
2019.05	Using Data Cubes with R	Kristian Larsen
2019.05	cranlogs 2.1.1 is on CRAN!	R-hub blog
2019.05	R package installation on windows considered harmful	Toby Dylan Hocking
2019.05	Hydra Chronicles, part I: Pixie Dust	Brodie Gaslam
2019.04	Using data.table with magrittr pipes: best of both worlds	Martin Chan
2019.04	What are the Popular R Packages?	John Mount
2019.04	Coke vs. Pepsi? data.table vs. tidy? Examining Consumption Preferences for Data Scientists	Audris Mockus
2019.03	A data.table and dplyr tour	Atrebas
2019.03	Dependencies. Now with badges!	Dirk Eddelbuettel
2019.03	Unit Tests in R	John Mount
2019.03	Creating blazing fast pivot tables from R with data.table - now with subtotals using grouping sets	Jozef Hajnala
2019.02	A strategy for faster group statistics	Brodie Gaslam
2019.02	Verbose data.table and uncovering hidden cedta's data table awareness decisions	Jozef Hajnala
2018.12	Timing Grouped Mean Calculation in R	John Mount
2018.12	How to sort data by one or more columns with base R, dplyr and data.table	Jozef Hajnala
2018.12	Smartly select and mutate data frame columns, using dict	Roman Pahl
2018.11	Statistics Sunday: Reading and Creating a Data Frame with Multiple Text Files	Sara Locatelli
2018.11	Wrangling and Manipulation of Monthly Philippine Consumer Price Index	Recle Vibal
2018.10	Now "fread" from data.table can read "gz" and "bz2" files directly	Pradeep Mavuluri
2018.10	How to perform merges (joins) on two or more data frames with base R, tidyverse and data.table	Jozef Hajnala
2018.10	How to import a directory of csvs at once with base R and data.table. Can you guess which way is the fastest?	Jozef Hajnala
2018.10	Some R Guides: tidyverse and data.table Versions	John Mount
2018.10	Running the Same Task in Python and R	John Mount
2018.10	Limiting dependencies in R package development	Scott Chamberlain
2018.09	R Tip: Give data.table a try	John Mount
2018.08	Timings of a Grouped Rank Filter Task	John Mount
2018.08	R Tip: Consider Radix Sort	John Mount
2018.08	Meta-packages, nails in CRAN’s coffin	John Mount
2018.07	EARL London interviews – Patrik Punco, NOZ Medien	Mango Solutions
2018.07	Speed up your R Work	John Mount
2018.06	Python for data analysis… is it really simple?!?	Ferenc Bodon
2018.06	R and Data – When Should we Use Relational Databases?	Claude Seidman
2018.06	Re-referencing factor levels to estimate standard errors when there is interaction turns out to be a really simple solution	Keith Goldfeld
2018.06	Most Starred R Packages on GitHub	Steven Mortimer
2018.06	Melt and Cast The Shape of Your Data-Frame: Exercises	sindri
2018.06	Sharpening The Knives in The data.table Toolbox: [Exercises] [Solutions]	sindri
2018.06	rqdatatable: rquery Powered by data.table	John Mount
2018.04	An R vlookup? Not so silly idea	Hanjo Oden
2018.04	Benchmarking the six most used manipulations for data.tables in R	Opremic
2018.04	Down the AUC Rabbit Hole and into Open Source: Part 2	Michael Frasco
2018.04	Down the AUC Rabbit Hole and into Open Source: Part 1	Michael Frasco
2018.04	Quick R Tutorial	Frank Erickson
2018.03	pandas vs. data.table – A study of data-frames – Part 2	Tobias Krabel
2018.02	Retail analytics: from hours to seconds using R	Bharani Subramaniam
2018.02	pandas vs. data.table – A study of data-frames	Christian Moreau
2018.02	Julia vs R vs Python: string-sort performance + an unfinished journey to optimizing Julia's performance	ZJ
2018.02	dplyr, (mc)lapply, for-loop and speed	Mike Spencer
2018.02	Speeding up spatial analyses by integrating `sf` and `data.table`: a test case	Lorenzo Busetto
2018.02	Packages for Getting Started with Time Series Analysis in R	Abraham Mathew
2018.02	DataExplorer: Fast Data Exploration With Minimum Code	Boxuan Cui
2018.01	Supercharge your R code with wrapr	John Mount
2018.01	Tidyverse and data.table, sitting side by side… and then base R walks in	Iñaki Úcar
2018.01	Tidyverse and data.table, sitting side by side (Part 1)	Dirk Eddelbuettel
2018.01	Base R can be Fast	John Mount
2018.01	Lightning fast serialization of datasets using the fst package	Mark Klik
2018.01	rquery: Fast Data Manipulation in R	John Mount
2017.12	A tour of the data.table package by creator Matt Dowle	David Smith
2017.12	More Pipes in R	John Mount
2017.12	Team Rtus wins Munich Re Datathon with mlr	Jann Goschenhofer
2017.12	Correlated log-normal chain-ladder model	Markus Gesmann
2017.11	How we built a Shiny App for 700 users	Olga Mierzwa-Sulima
2017.11	An empirical study of group-by strategies in Julia	ZJ
2017.11	Using data.table and Rcpp to scale geo-spatial analysis with sf	Tim Appelhans
2017.11	Creating integer64 and nanotime vectors in C++	Dirk Eddelbuettel
2017.10	The Impressive Growth of R	David Robinson
2017.10	Data.Table by Example – Part 3	atmathew
2017.09	Speed of data manipulations in Julia vs R	ZJ
2017.09	Data.Table by Example – Part 2	atmathew
2017.09	Data.Table by Example – Part 1	atmathew
2017.09	Beyond the basics of data.table: Smooth data exploration	Sindri
2017.09	Strategies to Speed-up R Code	Selva Prabhakaran
2017.08	Is the Hadleyverse the only option?	Billy Fung
2017.08	Basics of data.table: Smooth data exploration	Sindri
2017.08	Polygenic Risks Scores with data.table in R	Sahir Rai Bhatnagar
2017.08	July(ish) Update	John MacKintosh
2017.08	R for System Administration	Dirk Eddelbuettel
2017.07	Compare data.table pipes and magrittr pipes	Guanglai Li
2017.06	data.table tutorial (with 50 examples)	Deepanshu Bhalla
2017.06	The data.table R Package Cheat Sheet	Karlijn Willems
2017.06	Data Manipulation with data.table (part 2)	Biswarup Ghosh
2017.06	R in pRoduction: theRe be dRagons!	Tim Sweetser and Kyle Schmaus
2017.06	Improving Zillow’s Zestimate with 36 Lines of Code	Eduardo Ariño de la Rubia
2017.06	Data Manipulation with data.table (part 1)	Biswarup Ghosh
2017.05	plotly 4.7.0 now on CRAN	Carson Sievert
2017.05	R⁶ — Idiomatic (for the People)	Bob Rudis
2017.05	Reading/writing biggish data, revisited	Karl Broman
2017.05	dplyr in context	John Mount
2017.05	Everyone knows that loops in R are to be avoided but vectorization is not always possible	Keith Goldfeld
2017.04	R code to accompany Real-World Machine Learning (Chapter 6): Exploring NYC Taxi Data	Paul Adamson
2017.04	Fast data loading from files to R	Olga Mierzwa-Sulima
2017.03	Data Manipulation with Python Pandas and R Data.Table	Fisseha Berhane
2017.03	Fast data lookups in R: dplyr vs data.table	Marek Rogala
2017.02	Fitting logistic regression on 100gb dataset on a laptop	Dmitriy Selivanov
2017.02	Large data, feature hashing and online learning	Dmitriy Selivanov
2017.02	Moving largish data from R to H2O - spam detection with Enron emails	Peter Ellis
2017.01	Discover your data (XGBoost vignette)	Tianqi Chen, Tong He, Michaël Benesty, Yuan Tang
2017.01	fst: Fast serialization of R data frames	David Smith
2017.01	fst: Lightning Fast Serialization of Data Frames	Mark Klik
2017.01	R to the Rescue	John MacKintosh
2016.12	Using R to prevent food poisoning in Chicago	David Smith
2016.12	Behind the scenes of CRAN	Matt Dowle
2016.12	nanotime 0.0.1: New package for Nanosecond Resolution Time for R	Dirk Eddelbuettel
2016.12	Does replyr::let work with data.table?	John Mount
2016.12	data.table: Where Have You Been All My Life?	JoAnn Rudd Alvarez
2016.12	Organize your data manipulation in terms of “grouped ordered apply”	John Mount
2016.12	Comparing a MySQL Query with a Data Table in R	Douglas Rice
2016.11	data.table: squeeze the maximum speed when using data in R	Stanislav Chistyakov
2016.10	Data Wrangling: Quick Guide for dplyr, data.table and R build-in data.frame	Vincent Cao
2016.09	This Machine Learning Project on Imbalanced Data Can Add Value to Your Resume	Manish Saraswat
2016.09	Rolling a join	Will Rogers
2016.07	Winning approach of the Facebook V Kaggle competition	Tom Van de Wiele
2016.07	New release of partools package	Norm Matloff
2016.07	Bad Coder, Bad Coder!	Norm Matloff
2016.06	Intro to the data.table package	Steve Pittard
2016.06	Boost Your Data Munging with R	Jan Gorecki
2016.06	Improving Season on Season	James P. Curley
2016.06	Understanding data.table Rolling Joins	Robert Norberg
2016.05	From a (set.)seed grows a mighty dataset	Jonathan Carroll
2016.05	Feather: fast, interoperable data import/export for R	David Smith
2016.05	Best packages for data manipulation in R	Fisseha Berhane
2016.05	My Two favorite Packages for Data Manipulation in R	Fisseha Berhane
2016.05	Use H2O and data.table to build models on large data sets in R	Manish Saraswat
2016.05	The R Data I/O Shootout	Eduardo Ariño de la Rubia
2016.05	Red herring bites	Matt Dowle
2016.05	data.table() vs data.frame() – Learn to work on large data sets in R	Manish Saraswat
2016.04	Feather: it's about metadata	Wes McKinney
2016.04	Fast csv writing for R	Matt Dowle
2016.04	I'll Keep Using R	Michael Ekstrand
2016.04	data.table objects should not be considered data.frame instances in R [retracted]	John Mount
2016.04	Learning R in Seven Simple Steps	Martijn Theuwissen
2016.04	Collapsing lists of data.frames with data.table	Steph Locke
2016.04	Working with databases in R	Fisseha Berhane
2016.03	Data table exercises: keys and subsetting	Han de Vries
2016.03	Performing SQL selects on R data frames	Fisseha Berhane
2016.02	Read from hdfs with R. Brief overview of SparkR	Dmitriy Selivanov
2016.02	Up to code? An algorithm is helping Chicago health officials predict restaurant safety violations (featured on TV at 06:40). [Tweet] [Code]	PBS NewsHour
2016.01	Strategies to Speedup R Code	Selva Prabhakaran
2015.12	Our R package roundup 2015	Christoph Safferling
2015.12	Who’s downloading the forecast package?	Rob J Hyndman
2015.12	Solve common R problems efficiently with data.table	Jan Gorecki
2015.11	Efficient aggregation (and more) using data.table	David Kun
2015.11	Scaling data.table with index	Jan Gorecki
2015.11	H2O World 2015 – Day 2 Highlights	Anmol Rajpurohit, KDnuggets
2015.11	H2O World 2015	Joseph Rickert
2015.11	H2O.ai raises $20m series B to capitalize on rapid open source machine-learning growth	Matt Aslett, 451 Research
2015.10	R and Impala: it's better to KISS than using Java	Gergely Daroczi
2015.10	R: data.table – Finding the maximum row	Mark Needham
2015.09	Querying a 20 million line CSV file – data.table vs data frame	Mark Needham
2015.09	Data ergonomics with data.table, iHub Nairobi, with supporting materials	Henk Harmsen
2015.09	R Stories from the Trenches [Video] [Slides]	Szilard Pafka
2015.09	Advanced Tips and Tricks with data.table	Andrew Brooks
2015.08	data.table cookbook	Steph Locke
2015.07	Overlap joins in R: a speed comparison with packages sqldf and data.table	Zev Ross
2015.06	Data Warehousing with R	Jan Gorecki
2015.06	Auditing data transformation	Jan Gorecki
2015.06	Back from R/Finance in Chicago	Markus Gesmann
2015.05	Fast data munging in R	Alexander Konduforov
2015.05	No THIS Is How You dplyr and data.table!	Jeffrey Horner
2015.05	Comparing data frames, data.table and dplyr with random walks	David Smith
2015.05	Working with "large" datasets, with dplyr and data.table	Arthur Charpentier
2015.04	Comparing the execution time between foverlaps and findOverlaps [data.table vs GenomicRanges]	Katarzyna Wręczycka
2015.04	Open Source Business Intelligence: Then and Now	Steve Miller
2015.04	Mapping Flows in R with data.table and lattice	Oscar Perpiñán Lamigueiro
2015.03	Need for Processing Speed: data.table	OpenAnalytics
2015.03	Getting Data From An Online Source	Robert Norberg
2015.02	A data.table R tutorial by DataCamp: intro to DT[i, j, by]	DataCamp
2015.02	Minimal example for joining data.tables	Markus Gesmann
2015.01	Using the microbenchmark package to compare the execution time of R expressions	Stephen Turner
2015.01	Sessionizing Log Data Using data.table	Randy Zwitch
2015.01	R in Business Intelligence	Jan Gorecki
2014.12	dplyr and a very basic benchmark	Szilard Pafka
2014.12	JOINing data in R using data.table	Ronald Stalder
2014.12	Cheat Sheets for Data Science	Steve Miller
2014.11	Partying R Style with Sqor Sports, R on Azure, and data.table	Joseph Rickert
2014.11	The data.table Cheat Sheet	DataCamp
2014.11	Some R Highlights from H2O World	Joseph Rickert
2014.10	Complete data.table tutorial: data analysis the data.table way	DataCamp
2014.10	data.table University	Steve Miller
2014.10	Visualising the seasonality of Atlantic windstorms	Markus Gesmann
2014.08	Scaling up data frames	Ben Lorica
2014.08	data.table for R	Grant Rettke
2014.08	MongoDB – State of the R	Raffael Vogler
2014.08	VIDEO Matt Dowle's data.table talk from useR! 2014	Eduardo Ariño de la Rubia
2014.08	Pro Grammar and Devel Hoper	Romain Francois
2014.08	Faster CSV Import with R	Phill Clarke
2014.07	10 R Packages to Win Kaggle Competitions	Xavier Conort
2014.07	R – Data.Table Rolling Joins	Ben Gorman
2014.07	Dependencies of popular R packages	Andrie de Vries
2014.07	2014 useR! conference, days 1-2	Karl Broman
2014.06	The joy of joining data.tables	Markus Gesmann
2014.06	Concatenating a list of data frames	Andrew
2014.05	R/Finance 2014	Steve Miller
2014.05	Working with large data sets in R - data.table and dcast	Kamil Bartocha
2014.05	Reading large data tables in R	Fabio Marroni
2014.04	Exploring US healthcare data	Vik Paruchuri
2014.04	data.table vs dplyr in split apply combine style analysis	Brodie G
2014.02	Dueling R and Python Followup	Steve Miller
2014.02	Efficiency of Importing Large CSV Files in R	statcompute
2014.01	Benchmark on baseball data: dplyr (0.1) and data.table (1.8.10) [tweet]	Arun Srinivasan and Matt Dowle
2014.01	R: the good parts	Jose Quesada
2014.01	Two of my favorite data.table features	Brandon Le Beau
2014.01	When I use plyr/dplyr/data.table	Educate-R
2013.12	Review: Kölner R Meeting 13 December 2013	Markus Gesmann
2013.09	A speed comparison of plyr, data.table and dplyr	Jake Russ
2013.08	An R function like “order” from Stata	Ananda Mahto
2013.07	Fig Data: 11 Tips on How to Handle Big Data in R (and 1 Bad Pun)	Ulrich Atz
2013.07	A Bottom-up Start on Big Data Analytics	Steve Miller
2013.06	Simulating Map-Reduce in R for Big Data Analysis Using Flights Data	Jitender Aswani
2013.06	Improve The Efficiency in Joining Data with Index	statcompute
2013.04	FasteR! HigheR! StrongeR! – A Guide to Speeding Up R Code for Busy People	Noam Ross
2013.04	Using data.table for binning	Oscar Perpiñán Lamigueiro
2013.03	RMark: data.table merge vs core merge	Xachriel
2013.02	data.table or data.frame?	DataParadigms
2013.01	Another Benchmark for Joining Two Data Frames	statcompute
2013.01	Efficiency of Extracting Rows from A Data Frame in R	statcompute
2013.01	Efficiency in Joining Two Data Frames	statcompute
2012.12	Surprising Performance of data.table in Data Aggregation	Wensui Liu
2012.11	Data.table rocks! Data manipulation the fast way in R	Markus Gesmann
2012.10	Generate a panel data.table or data.frame to fill with data	Thiemo Fetzer
2012.06	Transforming subsets of data in R with by, ddply and data.table	Markus Gesmann
2012.06	Access data quickly and easily: data.table package	Anna Longari
2012.05	data.table 1.8.1 - Now allows numeric columns and big-number (via bit64) in keys!	Branson Owen
2012.03	R code for Chapter 2 of Non-Life Insurance Pricing with GLM	Allan Engelhardt
2012.02	Elegant & fast data manipulation with data.table	Carl Boettiger
2012.01	Say it in R with "by", "apply" and friends	Markus Gesmann
2011.08	Comparison of ave, ddply and data.table	Paul Hiemstra
2011.04	Data Aggregation in R: plyr, sqldf and data.table	Hayward Godwin
2011.03	Applying functions on groups: sqldf, plyr, doBy, aggregate or data.table ?	altuna
2011.03	Fast(ish) extraction of exon locations from a BED12 file using data.table	altuna
2011.03	data.table: an R package everyone should use	Jason
2011.02	By-Group Processing, the R data.table and the Power of Open Source	Steve Miller