NLP-text-corpora-build

This project uses two corpora:

* part of the COCA corpus (Corpus of Contemporary American English), an English-language corpus
* a coronavirus corpus

The corpus has 11,946,296 words.

I performed these analyses:

1) Word frequency analysis
2) Part-of-speech tagging
3) Chunking and chinking
4) Word feature extraction
5) N-grams
6) Named entity recognition
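As a rough illustration of two of the analyses listed above (word frequency and n-grams), here is a minimal sketch using only the Python standard library. It is not the code from this repository: the function names `tokenize`, `word_frequencies`, and `ngrams`, the regex-based tokenizer, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into tokens of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(tokens, top_n=3):
    """Return the top_n most common tokens with their counts."""
    return Counter(tokens).most_common(top_n)


def ngrams(tokens, n=2):
    """Return every contiguous run of n tokens as a tuple, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, not taken from either corpus.
sample = "The virus spread quickly. The virus mutated."
tokens = tokenize(sample)
print(word_frequencies(tokens, top_n=2))
print(ngrams(tokens, n=2)[:3])
```

The remaining analyses (tagging, chunking, named entity recognition) need a trained model or an NLP library, so they are not sketched here.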

The outputs are in the outputs folder, the code is in the code folder, and the corpora are in the new corpus folder.

The directory structure has these subfolders:

* new corpus - contains all the .txt files of the corpus
* code       - contains all the .py files
* outputs    - contains the outputs of the .py files
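Given the layout above, a script could read the corpus files along these lines. This is a sketch under assumptions, not the repository's code: the helper name `count_corpus_words` is hypothetical, the files are assumed to be UTF-8 text, and words are counted by splitting on whitespace. Because the README does not say how its word count was obtained, this method may not reproduce that figure exactly.

```python
from pathlib import Path


def count_corpus_words(corpus_dir):
    """Count whitespace-separated tokens in each .txt file directly inside corpus_dir."""
    counts = {}
    for path in sorted(Path(corpus_dir).glob("*.txt")):
        # errors="ignore" skips undecodable bytes instead of raising (an assumption
        # about the files, which may not all be clean UTF-8).
        text = path.read_text(encoding="utf-8", errors="ignore")
        counts[path.name] = len(text.split())
    return counts


if __name__ == "__main__":
    # "new corpus" is the folder name given in the directory listing.
    per_file = count_corpus_words("new corpus")
    print(sum(per_file.values()), "words in", len(per_file), "files")
```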

its-me-anvesh-var/NLP-text-corpora-build
