
Does Tweet contain emotion? #6

Open
audrism opened this issue Feb 14, 2019 · 3 comments
Comments

@audrism (Contributor)

audrism commented Feb 14, 2019

No description provided.

@mickidymick mickidymick added this to the Sprint 1 milestone Feb 14, 2019
@syd-shelby (Contributor)

Find the best way to filter tweets into emotion/no-emotion categories. Apply sentiment analysis on emotion tweets and see results.

@audrism (Contributor, Author)

audrism commented Feb 21, 2019

The older datasets contain only relevant tweets; the newer tweets are marked 0 when irrelevant. The d2v (doc2vec) model reaches 87% accuracy.

syd-shelby added a commit that referenced this issue Feb 26, 2019
basic d2v emotion classifier. 86% cross val score
@syd-shelby (Contributor)

Here are the things I did today:
- Coded 450 more tweets.
- Added all new articles to Google Drive and started updating their descriptions. I still have a handful more to read through.
- I haven't been able to find any corpus that defines whether or not a text contains emotion, so I tried classifying from the tweets I have hand-coded (only 105 are relevant, so this is a really small test case). Initially I'm getting poor results, but that makes sense: in the training data, only 275/989 tweets are labeled with emotion, and my classifier labels most of the data as not containing emotion. However, in the new data set I marked 62/105 with emotion. (The code is posted on GitHub under src/EmotionDetection.ipynb and is tied to my GitHub issue.)

From here, on the emotion detection task I will experiment with removing stop words and tuning some other parameters to see if I can get the accuracy up. I'm also going to look into some alternative ways to test the classifier. One of the papers I read used tweets containing emojis as subjective examples and tweets from popular newspapers such as the NYT as objective examples. I might try to gather similar tweets so I have a more robust corpus for testing my emotion detector.
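The stop-word removal step itself is straightforward; a stdlib-only sketch is shown below. The stop-word list here is a tiny illustrative subset, not any standard list:

```python
# Tiny illustrative stop-word list; real experiments would use a fuller one
STOP_WORDS = {"the", "a", "an", "is", "and", "to", "of", "in", "i", "am"}

def remove_stop_words(tweet: str) -> list[str]:
    """Tokenize on whitespace and drop stop words (case-insensitive)."""
    return [w for w in tweet.split() if w.lower() not in STOP_WORDS]

print(remove_stop_words("I am so excited about the new release"))
# → ['so', 'excited', 'about', 'new', 'release']
```

Whether this helps emotion detection is an empirical question; emotion-bearing function words (e.g. intensifiers) can be lost if the list is too aggressive.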

syd-shelby added a commit that referenced this issue Mar 14, 2019
syd-shelby added a commit that referenced this issue Mar 26, 2019
This classifier is trained on several data sets that contain text and a true/false value marking whether or not the text contains an emotion. These sets are news headlines (all false), quotes from the TV show Friends pulled from another study (all true), and some of our own labeled data (mixed true and false). This had a cross-validation score of ~91%.

I applied this to some of my own hand-labeled data (111 tweets labeled true, 12 labeled false). At first this was achieving high precision but very low accuracy, which is due to how unbalanced this data is. However, after resampling the training data and retraining the classifier on a ratio more similar to that of my test data, the accuracy rose to 90%.
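The resampling step described above (rebalancing the training set toward the label ratio seen in the test data) can be sketched with the stdlib alone. The function name, target fraction, and counts below are illustrative assumptions, not the notebook's actual code:

```python
import random

def resample_to_ratio(examples, labels, pos_fraction, seed=0):
    """Undersample negatives so roughly `pos_fraction` of examples are positive.

    Keeps all positives and draws just enough negatives to hit the ratio
    (assumes negatives are the class being cut down).
    """
    rng = random.Random(seed)
    pos = [(x, y) for x, y in zip(examples, labels) if y == 1]
    neg = [(x, y) for x, y in zip(examples, labels) if y == 0]
    n_neg = round(len(pos) * (1 - pos_fraction) / pos_fraction)
    sampled = pos + rng.sample(neg, min(n_neg, len(neg)))
    rng.shuffle(sampled)
    xs, ys = zip(*sampled)
    return list(xs), list(ys)

# 275 positives out of 989 training tweets, rebalanced so ~90% of the
# retained training examples are positive (mirroring the 111:12 test ratio)
X = [f"tweet {i}" for i in range(989)]
y = [1] * 275 + [0] * 714
Xb, yb = resample_to_ratio(X, y, pos_fraction=0.9)
print(sum(yb), len(yb))  # → 275 306
```

Undersampling discards data; oversampling the minority class or class-weighting the classifier are common alternatives when the training set is small.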