Lecture 3

chris wiggins edited this page Feb 10, 2019 · 21 revisions

2019:

Jan 29, 2019:

readings:

  1. Wallach, H. (2014, December). Big data, machine learning, and the social sciences: Fairness, accountability, and transparency. In NeurIPS Workshop on Fairness, Accountability, and Transparency in Machine Learning. Available via https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d . Dr. Wallach ( http://dirichlet.net/about/ ) is a former CS Professor now working in NYC at Microsoft Research. She’s been a leader both in machine learning research and the emerging discipline of computational social science. This piece is an early example of technologists beginning to question data and propose a new research field.

  2. Boyd, Danah, and Kate Crawford. “Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon.” Information, communication & society 15, no. 5 (2012): 662-679. Available via https://www.tandfonline.com/doi/abs/10.1080/1369118X.2012.678878 .

  3. 14 very readable pages from 2014 by Zeynep Tufekci on tech & politics. This will wrap up our “setting the stakes” readings on our data-driven present.

Tufekci, Zeynep. “Engineering the public: Big data, surveillance and computational politics.” First Monday 19, no. 7 (2014).

discussion

for easy reference, the 6 sections of boyd+Crawford:

  1. Big Data changes the definition of knowledge
  2. Claims to objectivity and accuracy are misleading
  3. Bigger data are not always better data
  4. Taken out of context, Big Data loses its meaning
  5. Just because it is accessible does not make it ethical
  6. Limited access to Big Data creates new digital divides

for easy reference, the 6 sections of Tufekci:

  1. rise of big data
  2. shift away from demographics to individualized targeting
  3. opacity and power of computational modeling
  4. use of persuasive behavioral science
  5. digital media enabling dynamic real time experimentation
  6. growth of new power brokers who own the data or social media environments

for easy reference, the 4 sections of Wallach:

  1. Data
  2. Questions
  3. Models
  4. Findings

2018:

Jan 29, 2018

census, statistics, and "computational politics"

  1. 14 very readable pages from 2014 by Zeynep Tufekci on tech & politics. This will wrap up our "setting the stakes" readings on our data-driven present.

Tufekci, Zeynep. "Engineering the public: Big data, surveillance and computational politics." First Monday 19, no. 7 (2014).

https://data-ppf.slack.com/files/U3SJU2P6W/F8Z130KH9/tufekci.pdf

  2. 22 also very readable pages from 2007 by Sarah Igo. This will set the data (about people) in a historical context, looking at how it came to be that we collect and use data about people, for policy as well as marketing and other commercial ends.

Igo, Sarah Elizabeth. The averaged American: Surveys, citizens, and the making of a mass public. Harvard University Press, 2007. (Introduction)

https://data-ppf.slack.com/files/U3SJU2P6W/F8YHB0VRP/igo_the_averaged_american_introduction.pdf

  3. 29 frankly not-as-breezy-to-read pages from 1998 by Alain Desrosières.

This is the point in our class where we step furthest back in time, to a period before "statistics" as a word had anything to do with numbers. The excerpt is Chapter 1 of "The Politics of Large Numbers: A History of Statistical Reasoning" (2002 edition), an excellent and scholarly book on how statistical thinking came to be. We'll try to emulate the context-awareness of this history throughout the class, though most of the readings will be less "scholarly" and more readable.

Desrosières, Alain. The politics of large numbers: A history of statistical reasoning. Harvard University Press, 2002. (Ch 1)

https://data-ppf.slack.com/files/U3SJU2P6W/F8YHB5KAM/desrosieres_the_politics_of_large_numbers_ch01.pdf

2017 discussion:

  • vulcans, martians, and domain expertise
  • election of 2016: marketing and polling
  • A/B testing and causality