Authorship identification estimates the likelihood that a piece of writing was produced by a particular author by examining other writings by that author. It is a classification problem whose difficulty depends on several parameters, such as the kind of feature set used, the size of the training data, the number of authors considered, the number of writings per author, and the type of classification model used. This project classifies the authors of online messages taken from Reddit and the Enron Email Dataset. A 2-way SVM classifier was developed, which achieved an accuracy of 83.5% on the Enron Email Dataset and 74% on the Reddit dataset. The classification parameters were then varied to compare their effect on classification accuracy.
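To illustrate what a feature set for this kind of classifier might look like, here is a minimal sketch of a stylometric feature extractor. The feature choices (average word length, average sentence length, type-token ratio, function-word frequencies), the `FUNCTION_WORDS` list, and the `stylometric_features` name are all assumptions made for illustration; the description above does not say which features the project actually uses.

```python
import re
from collections import Counter

# Hypothetical function-word list; the project's real feature set is not stated.
FUNCTION_WORDS = ("the", "and", "of", "to", "a", "in", "that", "i")


def stylometric_features(text):
    """Return a fixed-length list of simple style features for one message."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)  # avoid division by zero on empty input
    counts = Counter(words)
    features = [
        sum(len(w) for w in words) / n_words,   # average word length
        len(words) / max(len(sentences), 1),    # average sentence length in words
        len(set(words)) / n_words,              # type-token ratio
    ]
    # Relative frequency of each function word, in a fixed order.
    features.extend(counts[w] / n_words for w in FUNCTION_WORDS)
    return features


example = stylometric_features("I think the plan is fine. Send it to the team!")
print(len(example), example[:3])
```

Vectors like these, one per message and labeled with the author, could then be passed to a two-class SVM (for example scikit-learn's `SVC`). Swapping in a different feature set is one of the parameter changes whose effect on accuracy the project compares.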
Rakshitha03/AuthorIdentification