DataBench - a Julia vs R data manipulation benchmark suite

A comparison of data manipulation performance in Julia and R, using synthetic data and the GE Flight Quest data

Set up instructions

# Pkg.add("DataBench")
  1. Change data_path in settings.csv to a path that you can write to
  2. Download the 7z file (https://www.kaggle.com/c/flight/download/InitialTrainingSet_rev1.7z)
  3. Extract it into the folder data_path/InitialTrainingSet_rev1

Synthetic benchmarks

Adapted from data.table's official benchmarks

"Real-life" benchmarks

Uses GE Flight Quest data, the largest tabular dataset on Kaggle at the time of writing

Companion post

Speed of data manipulations in Julia vs R

Similar repos

https://github.com/szilard/benchm-databases