This repository contains the dataset and code used to analyze the impact of OpenML. The results are reported in the OpenML cells paper. The analysis covers research papers that cite the core OpenML paper, the Python and R connector papers, and the benchmarking suite papers.
- Data:
  - `data/collected_papers.csv`: Contains the originally collected data on 1719 papers from Google Scholar.
  - `data/Final_survey_data.csv`: The cleaned dataset (after filtering papers based on availability, language, and other criteria) with review results.
- Code:
  - `scripts/analysis.py`: Python scripts used to clean the data, run statistical analyses, and generate figures/tables for the paper.
- Documentation:
  - `docs/methodology.md`: Details of the review methodology and questionnaire used for the analysis.
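
As a quick sanity check on the two data files, the sketch below counts the rows in each CSV and reports what share of the collected papers survived filtering. It is only an illustration and is not part of `scripts/analysis.py`: the helper names `count_rows` and `retention_summary` are hypothetical, and it assumes each file is UTF-8 encoded, has a single header row, and holds one paper per row.

```python
import csv
from pathlib import Path


def count_rows(csv_path):
    """Count non-empty data rows in a CSV file, skipping the first (header) row.

    Assumption: the file has exactly one header row and one paper per row.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        return sum(1 for row in reader if row)


def retention_summary(collected_path, cleaned_path):
    """Compare the collected and cleaned files by row count.

    Hypothetical helper: returns the two counts and the fraction of
    collected rows that remain in the cleaned file.
    """
    collected = count_rows(collected_path)
    cleaned = count_rows(cleaned_path)
    share_kept = cleaned / collected if collected else 0.0
    return {"collected": collected, "cleaned": cleaned, "share_kept": share_kept}
```

From the repository root, calling `retention_summary("data/collected_papers.csv", "data/Final_survey_data.csv")` should return the two row counts and the retained fraction, provided the assumptions above hold for these files.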