
Relevant tweets #5

Open
audrism opened this issue Feb 14, 2019 · 2 comments


audrism commented Feb 14, 2019

No description provided.

@audrism audrism added this to the Sprint 1 milestone Feb 14, 2019

abhidya commented Mar 6, 2019

https://github.com/DisasterMasters/TweetAnalysis/blob/master/src/results/Relevance%20Preprocessing.ipynb
The best text preprocessing for Doc2vec turned out to be simply distributed bag of words plus punctuation removal.
I tried combinations of:
distributed memory
distributed bag of words
lowercasing
stop word removal
rare word removal
spelling correction
punctuation removal
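
For anyone following along, here is a rough sketch of how the preprocessing combinations listed above could be toggled and enumerated. It is not the notebook's code: the function names, the stop word list, and the sample tweets are all made up for illustration, it uses only the standard library, and spelling correction is left out because it needs an external library.

```python
import itertools
import string
from collections import Counter

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "to", "and"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(docs, lowercase=False, remove_punct=False,
               remove_stop=False, min_count=1):
    """Tokenize each document on whitespace, applying only the enabled steps."""
    tokenized = []
    for text in docs:
        if lowercase:
            text = text.lower()
        if remove_punct:
            text = text.translate(_PUNCT_TABLE)
        tokens = text.split()
        if remove_stop:
            tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
        tokenized.append(tokens)
    if min_count > 1:
        # Rare word removal: drop tokens seen fewer than min_count times overall.
        counts = Counter(t for tokens in tokenized for t in tokens)
        tokenized = [[t for t in tokens if counts[t] >= min_count]
                     for tokens in tokenized]
    return tokenized


def option_grid():
    """Yield every on/off combination of the boolean steps for each architecture."""
    flags = ("lowercase", "remove_punct", "remove_stop")
    for arch in ("dm", "dbow"):
        for values in itertools.product((False, True), repeat=len(flags)):
            yield arch, dict(zip(flags, values))


# Made-up sample tweets, not taken from any dataset.
tweets = ["Power is out in Miami!!!", "Stay safe, everyone."]
print(preprocess(tweets, remove_punct=True))
```

If the model is trained with gensim (an assumption, the comment does not say), the `dm` argument of `Doc2Vec` picks the architecture: `dm=1` is distributed memory and `dm=0` is distributed bag of words. Note that stripping all punctuation this way also removes the `#` and `@` from hashtags and mentions, which may or may not be what you want for tweets.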


audrism commented Mar 13, 2019

@abhidya which datasets do you train on for relevant/irrelevant tweets for Irma? Also, is the code link above the right one? @nwest13
