
AryanRajani/sentiment-analysis-using-word2vec

 
 


sentiment-analysis-using-word2vec

Steps:

  1. The dataset is read with the pandas library.
  2. The reviews are extracted from the dataset.
  3. Each review is preprocessed and cleaned: HTML, numbers, and punctuation marks are removed.
  4. The reviews are split into a training set (75%) and a test set (25%).
  5. Word embeddings are learned from the training data with Gensim Word2Vec.
  6. For sentiment analysis, each review in the training and test data is converted into a numeric vector as follows:
     a. The embedding vector of each word in the review is looked up in the Word2Vec model.
     b. The embeddings are summed and divided by the number of words in the review, which gives the review's average vector.
  7. The vectors of all training-set reviews are used to train several machine learning algorithms.
  8. The accuracy of each model is measured on the vectors of the test set.
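
Steps 3, 4, and 6 can be sketched roughly as below. This is an illustration and not the notebook's code. The helper names (`clean_review`, `split_75_25`, `average_vector`), the lowercasing, the fixed shuffle seed, and the choice to skip words that have no embedding are all assumptions. The `embeddings` argument is any word-to-vector mapping; with Gensim, the `wv` attribute of a trained `Word2Vec` model should work in its place.

```python
import html
import random
import re


def clean_review(text):
    """Remove HTML tags, numbers, and punctuation; return lowercase tokens."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop HTML tags such as <br />
    text = html.unescape(text)                # turn entities like &amp; into characters
    text = re.sub(r"[^A-Za-z\s]", " ", text)  # drop digits and punctuation
    return text.lower().split()               # lowercasing is an assumption


def split_75_25(items, seed=0):
    """Shuffle a copy of `items` and split it 75% / 25%."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def average_vector(tokens, embeddings, dim):
    """Average the embeddings of the tokens that have one.

    Assumption: words without an embedding (for example, test-set words
    never seen in training) are skipped, and a review with no known
    words maps to the zero vector.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# With Gensim (not imported here), the mapping could come from something like:
#   model = Word2Vec(sentences=tokenized_train, vector_size=100, min_count=1)
#   embeddings = model.wv
# The hyperparameters above are placeholders, not the repository's settings.
toy_embeddings = {"good": [1.0, 0.0], "film": [0.0, 1.0]}  # made-up 2-d vectors
tokens = clean_review("Good film!<br />10/10")
review_vector = average_vector(tokens, toy_embeddings, dim=2)
```

Applying `average_vector` to every review turns the training and test sets into fixed-length feature rows, which is what the classifiers in step 7 consume.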

Test Accuracies:

  1. Naïve Bayes: 63.26%
  2. Neural Network: 81.42%
  3. Random Forest: 76.69%
  4. KNN: 69.25%
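
To make concrete what a test accuracy measures, here is a minimal sketch of one of the listed model types, k-nearest neighbours, classifying review vectors and being scored on held-out data. The README does not state which library or hyperparameters were used, so the pure-Python implementation, the value `k=3`, the function names, and the toy vectors are all assumptions for illustration; the data is made up and does not reproduce the figures above.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    nearest = sorted(
        range(len(train_vectors)),
        key=lambda i: math.dist(train_vectors[i], query),  # Euclidean distance
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy_percent(predictions, labels):
    """Share of predictions that match the true labels, as a percentage."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Made-up averaged review vectors and sentiment labels.
train_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["pos", "pos", "neg", "neg"]
test_vectors = [[0.7, 0.3], [0.3, 0.7]]
test_labels = ["pos", "neg"]

predictions = [knn_predict(train_vectors, train_labels, v) for v in test_vectors]
toy_accuracy = accuracy_percent(predictions, test_labels)
```

The same `accuracy_percent` calculation applies to any classifier: only the function that produces `predictions` changes.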
