Tweet_normalizer

This program is a normalization system that converts raw English tweets into (partially) normalized tweets suitable for further NLP processing.

Requirements

Python 2.7
Chainer 1.7 (chainer)
NLTK 3.0
context2vec
If the NLTK Brown corpus is not installed, it will be downloaded when the script is run.

To run:

$ sh normalize_tweets.sh -f true

The arguments are:

• -c: path for context2vec model

• -i: path for the input raw tweets

• -o: path for the output file (optional)

• -f: use the fast Damerau-Levenshtein computation (optional, default=true)

$ sh normalize_tweets.sh -c context2vec.ukwac.model.package/context2vec.ukwac.model.params -i CorpusBataclan_en.1M.raw.tx -f true
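For readers unfamiliar with the distance named by the -f option, here is a minimal sketch of the restricted Damerau-Levenshtein distance (also called optimal string alignment). It is only an illustration: the function name `osa_distance` is hypothetical, and this README does not say which variant normalize_tweets.py uses or what the fast mode changes.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts the insertions, deletions, substitutions and adjacent
    transpositions needed to turn `a` into `b`.
    Illustrative only; not taken from normalize_tweets.py.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters, e.g. "teh" -> "the".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

A tweet token such as "teh" is one transposition away from "the", so `osa_distance("teh", "the")` returns 1, where plain Levenshtein distance would return 2. This is why the distance is a common choice for ranking spelling-correction candidates.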
