These materials were originally created by Bradley McDonnell, Na-Rae Han, and Eve Okura Koller for a 2019 workshop; the original materials are here. The repository you're looking at right now contains these materials as lightly adapted by Dan Villarreal. Thanks to the original authors for making these materials available to be reused.
- Date: Thursday, January 3, 2019
- Time: 9:00 AM - 3:30 PM
- Location: Flatiron Room, Sheraton New York Times Square
- Bradley McDonnell, University of Hawai‘i at Mānoa (mcdonn@hawaii.edu)
- Na-Rae Han, University of Pittsburgh (naraehan@pitt.edu)
- Eve Okura Koller, University of Hawai‘i at Mānoa (ekoller@hawaii.edu)
Reproducible research, the concept that data and any code for analysis should be published alongside the research results so that others can validate and/or build upon the claims of the research, has gained real traction in the social sciences in recent years. With the development of several open-source digital tools, conducting research in a reproducible manner is more accessible than ever. While some sub-disciplines of linguistics have advocated for reproducible research and individual linguists have implemented reproducible methodologies in their research, many linguists lack the knowledge and/or practical training to make their research reproducible.
The aim of this workshop, then, is to provide a conceptual foundation for reproducible research in linguistics alongside practical hands-on training in implementing best practices in reproducible research through the use of several open-source digital tools. Topics include (i) versioning with git, and publishing and collaborating on versioned research outputs using web-based platforms such as GitHub, and (ii) ‘notebooks’ or ‘dynamic documents’ that directly link the data and code for analysis to the prose of a research report, using Jupyter Notebooks with Python and RStudio with R. These tools allow for various outputs (e.g., PDF, HTML) that contain both computer code and rich text elements (paragraphs, equations, figures, links, etc.).
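As a rough illustration of what "directly linking data and code to prose" can mean, here is a minimal Python sketch of the kind of cell one might run in a Jupyter Notebook. It is not taken from the workshop materials: the toy token list and the function names `summarize` and `render_report` are hypothetical, chosen only for this example. The point is that the numbers in the sentence are computed from the data rather than typed by hand, so the prose updates whenever the data changes.

```python
from collections import Counter

# Hypothetical toy data standing in for a real corpus (not from the workshop).
tokens = ["the", "cat", "saw", "the", "dog", "and", "the", "bird"]


def summarize(tokens):
    """Compute simple descriptive statistics for a list of tokens."""
    counts = Counter(tokens)
    top_word, top_freq = counts.most_common(1)[0]
    return {
        "n_tokens": len(tokens),   # total number of tokens
        "n_types": len(counts),    # number of distinct word types
        "top_word": top_word,      # most frequent word
        "top_freq": top_freq,      # its frequency
    }


def render_report(stats):
    """Build a Markdown sentence whose numbers come from the computed stats."""
    return (
        f"The sample contains {stats['n_tokens']} tokens and "
        f"{stats['n_types']} types; the most frequent word is "
        f"*{stats['top_word']}* ({stats['top_freq']} occurrences)."
    )


stats = summarize(tokens)
report = render_report(stats)
print(report)
```

In a notebook, the Markdown string could be displayed as rich text next to the code that produced it; R Markdown with `knitr` supports the same idea through inline R expressions.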
By the end of the workshop, participants will be able to:
- Initiate a git repo using the command line
- Use basic `git` commands (e.g., commit, status)
- Create a repo on GitHub
- Work collaboratively on a GitHub repo
- Integrate Markdown and Python in Jupyter Notebooks
- Integrate R code and Markdown for dynamic (`knitr`) documents
This workshop is primarily aimed at early- and mid-career researchers, but will be accessible to both graduate students and senior researchers. Basic knowledge of either R or Python will be very beneficial but is not required. Participants will not be asked to write their own code; exercises will contain the necessary Python and R code.
Participants will be required to bring laptop computers running macOS or Windows to the workshop (mobile systems such as iPads, Android tablets, and Chromebooks are not suitable). Prior to the workshop, the instructors will send out instructions for installing all of the necessary software, which includes git, R, and Python 3.
Time | Topic | Presenters |
---|---|---|
9:00 - 9:15 | Introductions | |
9:15 - 9:30 | Overview of workshop | Brad |
9:30 - 10:45 | Introduction to git | Na-Rae |
10:45 - 11:00 | Break | |
11:00 - 12:30 | Linking git and GitHub | Na-Rae, Brad |
12:30 - 1:30 | Lunch | |
1:30 - 1:45 | RStudio with R demo | Brad |
1:45 - 2:00 | Jupyter Notebooks with Python demo | Na-Rae |
2:00 - 3:15 | Break out groups with RStudio and Jupyter Notebooks | |
3:15 - 3:30 | Concluding remarks | |
This material is based upon work supported by the National Science Foundation under grant SMA-1745249 to the University of Hawai‘i at Mānoa. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.