The Data Mining Cup is an international competition hosted by a German analytics company, prudsys AG. In 2015 almost 200 teams from about 150 universities in 48 countries took part in the competition. The best teams were invited to Berlin for the awards ceremony at the prudsys personalization summit (for retail).
Iowa State has a strong record in the DMC. Rick Zhou led ISU to fifth place in the 2013 competition, Cory Lanker led ISU to first place in 2014, and Ian Mouzzon led ISU to 2nd and 3rd place in 2015.
April 6 :: Competition officially starts and data is released
May 18, 6:00 AM (14:00 CEST) :: End of competition and deadline to submit predictions
June 28 - 29 :: prudsys personalization summit and award ceremony in Berlin
If you are not familiar with Git and GitHub, please read through the following links:
If you are used to working from the command line, then Git should be easy for you. If not, there is a great GUI for GitHub that is available for both Mac and Windows: https://desktop.github.com/.
There is also a YouTube channel dedicated to using GitHub: https://www.youtube.com/playlist?list=PL0lo9MOBetEHhfG9vJzVCTiDYcbhAiEqL.
Keep in mind that GitHub is not designed to be a database, and the size of a repository is generally capped at 1 GB. To store datasets, we will use the Git Large File Storage (Git LFS) system, which GitHub supports. With Git LFS we get up to 50 GB of storage, which should be more than enough for the competition. Git LFS is easy to use. You can learn about it here:
- Getting started with Git LFS
- Short video tutorial on Git LFS (and I really do mean short)
- Installing Git LFS
Remember, please do not ever push large data files unless they are tracked by Git LFS.
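As a rough sketch of how this could look in practice, the two shell functions below check for large files and set up LFS tracking. The function names `find_large_files` and `setup_lfs_tracking`, the 50 MiB default threshold, and the `*.csv` pattern are all assumptions made for illustration; adjust them to match how the datasets are actually stored. Only `git lfs install` and `git lfs track` are real Git LFS commands.

```shell
# Hypothetical helper: list files strictly larger than a threshold (in MiB),
# skipping the .git directory. Both arguments are optional.
find_large_files() {
  dir="${1:-.}"       # directory to scan (default: current directory)
  min_mb="${2:-50}"   # size threshold in MiB (assumed default)
  find "$dir" -path "$dir/.git" -prune -o -type f -size "+${min_mb}M" -print
}

# Typical Git LFS setup, wrapped in a function so nothing runs until it is called.
# The "*.csv" pattern is an assumption about the dataset file type.
setup_lfs_tracking() {
  git lfs install          # one-time setup: installs the Git LFS hooks
  git lfs track "*.csv"    # writes a tracking rule to .gitattributes
  git add .gitattributes   # the rule itself must be committed with the data
}
```

Running `find_large_files . 50` before a push lists any file that should probably be tracked by Git LFS first.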
The typical workflow when using Git goes like this:
1️⃣ Pull latest work from the remote repository
From command line:
> git pull
From GitHub app: just press the 'sync 🔁' button
2️⃣ Do some coding...
3️⃣ Commit changes
From command line:
# First add files to the staging area
> git add -u # to stage changes to files that are already being tracked by git, or use
> git add [file] # to stage changes for a specific file only, or use
> git add -A # to stage all changes, including new (untracked) files
# Now commit the staged changes
> git commit -m "updates"
From GitHub app: press the 'commit' button
4️⃣ Push changes back to the remote repository
From command line:
> git push
From GitHub app: press the 'sync 🔁' button
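For those working from the command line, steps 1, 3, and 4 can be chained into one helper once the coding is done. This is only a sketch: the function name `sync_work` is hypothetical, and it assumes every file you changed is already tracked (it uses `git add -u`, so new files still need an explicit `git add`).

```shell
# Hypothetical helper: pull, stage tracked changes, commit, and push in one go.
# Each step runs only if the previous one succeeded, so a failed pull
# (for example, a merge conflict) stops the chain before anything is pushed.
sync_work() {
  msg="${1:-updates}"   # commit message (defaults to "updates")
  git pull &&
  git add -u &&
  git commit -m "$msg" &&
  git push
}
```

Call it as `sync_work "describe your change"`. If there is nothing to commit, `git commit` exits with an error and the push is skipped.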