
Natural language processing applied to social media, looking for new trends in health related posts


haeste/SocialMediaScanning


Mining social media for citizen insight

Social media sites such as Twitter, Reddit, and Mumsnet can be an important source of information for health research. They provide an archive of the thoughts, feelings, and concerns of large parts of the population on a wide range of topics. This archive can be used to explore citizen sentiment towards a topic, track how it changes over time, and reveal new areas of concern that traditional research methods may miss.

We can scrape data from websites such as Mumsnet, or use the Twitter Academic API to search for tweets relevant to our research question. Natural language processing methods such as Latent Dirichlet Allocation (LDA) can then be used to group the collected tweets or posts into topics. Qualitative methods can be used to interpret the topics found, or applied independently to build an understanding of a small number of posts or tweets.

Steps:

  • Identify sources of interest, e.g. the forum or social media site you wish to search.
  • Formulate a search strategy: search terms and a method for applying them.
  • Collate your posts or tweets.
  • Clean and pre-process your posts or tweets.
  • Fit an LDA topic model.
  • Interpret the topics found.
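
The cleaning and pre-processing step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this repository: the `preprocess` function, the regular expressions, and the very short stop-word list are all assumptions made for the example, and a real project would likely use a fuller stop-word list and add steps such as lemmatisation.

```python
import re

# Tiny illustrative stop-word list (an assumption for this sketch);
# a real analysis would use a much fuller list.
STOP_WORDS = {
    "a", "about", "an", "and", "are", "for", "i", "in", "is",
    "it", "my", "of", "on", "the", "to", "with",
}

URL_RE = re.compile(r"https?://\S+")   # web links
MENTION_RE = re.compile(r"@\w+")       # @username mentions
TOKEN_RE = re.compile(r"[a-z]+")       # runs of lowercase letters


def preprocess(text):
    """Return lowercase word tokens with URLs, mentions, and stop words removed."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    # Drop stop words and single-letter fragments (e.g. the "s" left by "baby's").
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


# Made-up example posts, not real data.
posts = [
    "Worried about my baby's sleep https://example.com @someone",
    "Is the flu vaccine safe for toddlers?",
]
cleaned = [preprocess(p) for p in posts]
```

Each post becomes a list of tokens, which is the input format the topic-modelling step expects.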

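To show what fitting an LDA topic model involves, here is a small sketch that fits LDA by collapsed Gibbs sampling in pure Python. It is not the implementation used in this repository, and in practice an established library such as gensim or scikit-learn would normally be used instead. The function names `fit_lda` and `top_words`, the hyperparameter defaults, and the toy documents are all assumptions made for the example.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []                                            # topic of every token

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -counts[v])[:n]]
        for counts in topic_word
    ]


# Made-up, already-tokenised documents, not real data.
docs = [
    ["baby", "sleep", "night", "sleep"],
    ["vaccine", "flu", "safe"],
    ["sleep", "baby", "nap"],
    ["flu", "vaccine", "toddlers"],
]
vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"Topic {k}: {', '.join(words)}")
```

The top words per topic are the starting point for the interpretation step. Topic numbering is arbitrary, and on a corpus this small the topics will be unstable, so the output is only meant to show the shape of the result.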