tm4ss - Text Mining for Social Scientists and Digital Humanists

This course consists of eight tutorials written in R Markdown. It is described in more detail in the accompanying paper (Wiedemann and Niekler 2017, cited below).

You can use knitr to render the tutorial sheets as HTML notebooks from the R Markdown source code.
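As a rough illustration of that rendering step, the sketch below renders every `.Rmd` file in a directory to HTML. It uses `rmarkdown::render()`, which runs knitr internally and then converts the result with pandoc. This is an assumption about a convenient workflow, not necessarily how the authors build the sheets. The helper name `render_tutorials` and the default `source_dir` and `output_dir` values are hypothetical, so adjust them to the actual repository layout.

```r
# Minimal sketch: render all R Markdown files found in a directory.
# Assumptions (not confirmed by this README): the tutorial sources are
# *.Rmd files in `source_dir`, and the rmarkdown package and pandoc are
# installed. `render_tutorials` is a hypothetical helper name.
render_tutorials <- function(source_dir = ".", output_dir = "docs") {
  rmd_files <- list.files(source_dir, pattern = "\\.Rmd$", full.names = TRUE)

  # rmarkdown::render() executes the code chunks via knitr and returns
  # the path of the generated output file. The output format is taken
  # from each file's YAML header (html_document if none is given).
  vapply(
    rmd_files,
    function(f) rmarkdown::render(f, output_dir = output_dir, quiet = TRUE),
    character(1)
  )
}
```

Calling `render_tutorials()` from the repository root would then write one HTML file per tutorial into the chosen output folder. Rendering can take a while, because each tutorial executes its R code during the build.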

The rendered tutorials are available in the /docs folder.

Tutorials

  1. Data import and web scraping
  2. Text as data
  3. Frequency analysis
  4. Key term extraction
  5. Co-occurrence analysis
  6. Topic models (LDA)
  7. Text classification
  8. Part-of-Speech tagging / Named Entity Recognition


License & Citation

This course was created by Gregor Wiedemann and Andreas Niekler and was released under the GPLv3 license in September 2017. If you use it, or parts of it, for your own teaching or analysis, please cite:

Wiedemann, Gregor; Niekler, Andreas (2017): Hands-on: a five day text mining course for humanists and social scientists in R. Proceedings of the 1st Workshop Teaching NLP for Digital Humanities (Teach4DH), GSCL 2017, Berlin.

BibTeX

@inproceedings{WN17,
  author    = {Gregor Wiedemann and Andreas Niekler},
  title     = {Hands-On: {A} Five Day Text Mining Course for Humanists and Social Scientists in {R}},
  booktitle = {Proceedings of the Workshop on Teaching {NLP} for Digital Humanities
               (Teach4DH) 2017, Berlin, Germany, September 12, 2017.},
  pages     = {57--65},
  year      = {2017},
  crossref  = {DBLP:conf/gldv/2017teach4dh},
  url       = {http://ceur-ws.org/Vol-1918/wiedemann.pdf},
}
