About

Like a stock market index, the pandemindex reflects global sentiment about the COVID-19 pandemic.
The pandemindex is calculated from 200 million tweets about the pandemic.
Slide along the pandemindex to see the most frequent words, hashtags, emojis and symptoms used in the tweets of that day: https://twitterpandemicindex.netlify.app/

As examples, the pandemindex follows the ups and downs of several milestone events (CNN timeline):

  • 2020-02-02: first death outside China, in the Philippines
  • 2020-02-07: the world mourns the death of Dr Li Wenliang, whose early warning about the coronavirus was silenced by China
  • 2020-02-29: first death in the US (Washington state)
  • 2020-03-03: Federal Reserve cuts interest rates by 0.5 percentage points
  • 2020-03-11: WHO declares COVID-19 a pandemic
  • 2020-03-13: US announces relief package
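
The per-day view described above (most frequent words, hashtags, emojis and symptoms for a given day) can be pictured with a small counting sketch. This is only an illustration under stated assumptions, not the code behind the site: the function name `top_hashtags_by_day`, the `(date, text)` input shape, and the sample tweets are all invented for the example.

```python
import re
from collections import Counter, defaultdict

# Simple hashtag pattern; an assumption, real tweets may need stricter rules.
HASHTAG_RE = re.compile(r"#\w+")


def top_hashtags_by_day(tweets, n=3):
    """Count hashtags per day.

    tweets: iterable of (date_string, text) pairs (hypothetical input shape).
    Returns {date_string: [(hashtag, count), ...]} with at most n entries per day.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        # Lowercase so that #COVID19 and #covid19 are counted together.
        counts[day].update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return {day: counter.most_common(n) for day, counter in counts.items()}


# Invented sample data, for illustration only.
sample = [
    ("2020-03-11", "WHO declares a pandemic #COVID19 #pandemic"),
    ("2020-03-11", "Stay safe everyone #covid19"),
    ("2020-03-13", "Relief package announced #COVID19 #economy"),
]
result = top_hashtags_by_day(sample)
print(result["2020-03-11"])
```

The same pattern would extend to words, emojis or symptoms by swapping the extraction step.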

How we built it

[Architecture diagram]

What this repo does

  1. The main source of tweets is the Large Scale COVID-19 Twitter Chatter dataset for Open Scientific Research by Banda et al.
  2. Tweets can also be collected via the Twitter Streaming API or Search API and saved to .txt files (text_preprocess/tweetstream_txt.py)
  3. Tweets are annotated with symptoms and places (see repo)
  4. /data_preprocess reads in datasets from covid19.ascend.io (read_ascend.py) or Google BigQuery (read_bigquery.py)
  5. The dynamic LDA topic model is interpreted and visualized (lda/interpret_ldaseq.py)
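
To give a feel for the symptom annotation in step 3, here is a minimal sketch that tags a tweet by keyword matching. It is an assumption about one possible approach, not the method used in the annotation repo: the lexicon `SYMPTOM_LEXICON` and the function `annotate_symptoms` are hypothetical names, and the symptom terms are chosen only for illustration.

```python
import re

# Hypothetical lexicon: canonical symptom -> surface forms to look for.
SYMPTOM_LEXICON = {
    "fever": ["fever", "high temperature"],
    "cough": ["cough", "coughing"],
    "shortness of breath": ["shortness of breath", "can't breathe"],
}


def annotate_symptoms(text):
    """Return the canonical symptoms mentioned in text, in lexicon order."""
    lowered = text.lower()
    found = []
    for symptom, variants in SYMPTOM_LEXICON.items():
        # Word boundaries avoid matching inside longer, unrelated words.
        if any(re.search(r"\b" + re.escape(v) + r"\b", lowered) for v in variants):
            found.append(symptom)
    return found


print(annotate_symptoms("Day 3: fever and coughing all night"))
```

Plain keyword matching misses negation ("no fever") and misspellings, so a real pipeline would likely need more than this.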

Amend config_copy.py

Fill in config_copy.py with your credentials and rename the file to config.py.
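
Before running the scripts, it can help to check that no credential was left blank. The sketch below shows one way to do that. The key names and the placeholder string are hypothetical; the real names are whatever config_copy.py defines.

```python
# Hypothetical placeholder value; config_copy.py may use a different one.
PLACEHOLDER = "YOUR_VALUE_HERE"

# Hypothetical settings, standing in for the values defined in config.py.
settings = {
    "TWITTER_CONSUMER_KEY": "abc123",
    "TWITTER_CONSUMER_SECRET": PLACEHOLDER,
    "BIGQUERY_PROJECT_ID": "",
}


def unfilled_keys(settings, placeholder=PLACEHOLDER):
    """Return the sorted names of settings that are empty or still the placeholder."""
    return sorted(k for k, v in settings.items() if not v or v == placeholder)


missing = unfilled_keys(settings)
if missing:
    print("Fill in these values in config.py:", ", ".join(missing))
```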