
Coding #9

Open
audrism opened this issue Feb 21, 2019 · 9 comments

Comments

@audrism
Contributor

audrism commented Feb 21, 2019

Sai did batch 3 (see data channel): coded emotion + sentiment + 8 types of emotion for 180 tweets.

Alexa did two classes, 1-8 and 20-29: 130 tweets the first week.

@saithat

saithat commented Feb 26, 2019

Did some more coding on 2/25: at 220 now.
Did more coding on 2/27: numbers 501-729, emotion + sentiment.
3/1 and 3/4: tweets 1-223 and 501-600 coded for relevance, emotion, sentiment, content; tweets 601-729 coded for relevance, emotion, sentiment.
3/6 and 3/8: as of now, relevance has been coded for tweets 1-746; emotion + sentiment + content for tweets 1-249 and 501-729.
3/11: relevance coded for tweets 1-746; emotion + sentiment + content for tweets 1-320 and 501-729.
3/13: relevance coded for tweets 1-912; emotion + sentiment + content for tweets 1-320 and 501-729. Changed some of the content coding to match minor changes made to the categories.
3/25: relevance coded for all tweets; emotion + sentiment + content for tweets 1-354 and 501-729.
3/27: relevance coded for all tweets; emotion + sentiment for tweets 1-729; content for tweets 1-398 and 501-729.
3/29: relevance coded for all tweets; emotion + sentiment + content for tweets 1-729.

@atipton
Contributor

atipton commented Feb 26, 2019

Coded around 50 tweets for content analysis on 2/25.

@atipton
Contributor

atipton commented Feb 28, 2019

Did some manual coding for content, about 200 tweets on 2/26.

I coded over 200 for relevant or irrelevant, and most (about 80%) were irrelevant. Then I went back and coded only the relevant ones for content.

@syd-shelby
Contributor

Coded 550 tweets, coding for relevance, then emotion, sentiment, and opinion. Only about 20% of these have been relevant.

@audrism
Contributor Author

audrism commented Feb 28, 2019

Please do cross-rater reliability for the coding.

Add comments (e.g., "this cat sucks").

Report percentages.

@audrism
Contributor Author

audrism commented Feb 28, 2019

Alexa will use Manny's relevance model to train on her and Manny's data.

syd-shelby added a commit that referenced this issue Mar 5, 2019
syd-shelby added a commit that referenced this issue Mar 5, 2019
syd-shelby added a commit that referenced this issue Mar 5, 2019
@syd-shelby
Contributor

I took 699 tweets coded by Faiza and myself and compared how closely we coded relevance, whether or not it contained emotion, and the specific emotion.
For ~81.3% of the tweets, we agreed on relevance.
For ~85.7% of the tweets that we agreed had relevance, we agreed on whether or not it had emotion.
For ~44.3% of the tweets that we agreed were relevant and had emotion, we agreed on the specific emotion.
This last number is a bit low, but it makes sense for a few reasons. First of all, it doesn't take into account multiple labels. For example, if I marked a tweet as angry, and she marked it as angry and sad, it wouldn't count. Second, with some of the emotions that are similar in nature, we each tend to use one more often. For example, I was more likely to label an emotion as angry, while she was more likely to label one as disappointed.
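Agreement figures like the ones above can be computed with a short script. The sketch below uses invented toy labels, not the actual coding data; it computes raw percent agreement (what is reported above) plus Cohen's kappa, a common chance-corrected reliability statistic that was not part of the original report:

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which two raters gave the same label."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Expected chance agreement from each rater's marginal label frequencies
    p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy relevance labels (1 = relevant, 0 = irrelevant) for two raters
rater1 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
rater2 = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

print(percent_agreement(rater1, rater2))      # 0.8
print(round(cohens_kappa(rater1, rater2), 3))  # 0.583
```

Kappa is worth reporting alongside the raw percentages because heavily skewed label distributions (e.g., ~80% irrelevant) inflate raw agreement.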

@atipton
Contributor

atipton commented Mar 5, 2019

Finished coding a batch of 1800 tweets for relevance, also did a little bit (~90 tweets) for Xiaojing on content. Friday (03/01) I worked with Manny on editing code specifically to train on these tweets to find relevance, so I will work on testing that tomorrow.

atipton added a commit that referenced this issue Mar 6, 2019
#9   This is pretty specific for the file I was using, which I will upload under sentiment_coding_batch_3.csv. In here it is called tweets.csv.  There was a fair amount of cleanup done for the file. For example, I initially put a 0 when it was irrelevant, then eventually only put a 1 for relevance. I had to go back and insert 0s in the blanks. I did not use any of the categories in the file except for relevance.
atipton added a commit that referenced this issue Mar 6, 2019
#9 these are the tweets used for relevance_training.py
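The cleanup step described in the first commit above (blanks in the relevance column backfilled with 0s) can be sketched with the standard csv module. The column name `relevance` and the miniature in-memory file here are assumptions for illustration, not the real sentiment_coding_batch_3.csv:

```python
import csv
import io

# Hypothetical miniature of the coded file: only relevant tweets were
# marked 1; irrelevant tweets were left blank and need a 0 inserted.
raw = """tweet,relevance
flu shots gave me a fever,1
nice weather today,
got my flu vaccine,1
lunch was great,
"""

rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    if row["relevance"].strip() == "":
        row["relevance"] = "0"  # backfill blanks as irrelevant

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["tweet", "relevance"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

For a real file, the same loop would read from and write back to disk with `open(...)` instead of `io.StringIO`.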
@atipton
Contributor

atipton commented Mar 6, 2019

Right now I am getting an accuracy of about 80% on relevance training, so I am working with Manny on improving accuracy.
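The actual relevance_training.py isn't shown in this thread, so as an illustration only, here is a minimal stand-in: a bag-of-words multinomial Naive Bayes classifier with add-one smoothing, trained to separate relevant from irrelevant tweets. The toy tweets and labels are invented, not project data:

```python
import math
from collections import Counter

class NaiveBayesRelevance:
    """Tiny multinomial Naive Bayes over bag-of-words, add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.priors = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        self.n_docs = len(labels)

    def predict(self, text):
        best, best_lp = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each token
            lp = math.log(self.priors[c] / self.n_docs)
            total = sum(self.word_counts[c].values())
            for w in text.lower().split():
                lp += math.log((self.word_counts[c][w] + 1)
                               / (total + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Invented training data: 1 = relevant (vaccine-related), 0 = irrelevant
texts = [
    "flu shot side effects", "vaccine caused fever",
    "got my flu vaccine today", "great pizza for lunch",
    "watching the game tonight", "traffic was terrible today",
]
labels = [1, 1, 1, 0, 0, 0]

clf = NaiveBayesRelevance()
clf.fit(texts, labels)
print(clf.predict("flu vaccine side effects"))  # 1
```

Reporting accuracy on a held-out split of the hand-coded tweets, rather than the training set, is the usual way to make a figure like the ~80% above trustworthy.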
