Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. This intermediate-level class bridges Data 8 and upper-division computer science and statistics courses, as well as methods courses in other fields. In this class, we explore key areas of data science, including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision-making. With a strong emphasis on data-centric computing, quantitative critical thinking, and exploratory data analysis, the class covers key principles and techniques of data science. These include languages for transforming, querying, and analyzing data; algorithms for machine learning methods, including regression, classification, and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.
This organization houses various course materials and websites for Data 100 at UC Berkeley.