Welcome to this repository and dashboard on using NLP (natural language processing) to summarize YouTube comments from one of our favorite shows! BiggerPockets produces a YouTube channel, a podcast, and other products focused on real estate investing and financing, offering a roadmap to financial freedom through real estate investment. The BiggerPockets YouTube channel (https://www.youtube.com/c/biggerpockets) has been producing informational videos since 2016, with 1.16M subscribers and over 3K videos released so far! As fans of the channel, we used NLP and sentiment analysis to draw insights into the audience's responses to each BiggerPockets episode.
This is a collaborative project by two data scientists: Dr. David Henderson (profile: https://github.com/HD013) and Dr. Yingtong "Amanda" Wu (profile: https://github.com/YingtongAamandaWu).
01_Codes: a folder containing Jupyter notebooks with the Python code for the intermediate steps and exploratory data analyses
This is an interactive figure from "Video_polarity_mean_max_min_plotly.html": hover over the figure to see the mean, max, and min comment polarity for every video on the BiggerPockets YouTube channel. Note: for this analysis we excluded videos shorter than 30 minutes and videos with fewer than 10 comments.
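As a rough illustration of the aggregation behind a figure like this, the sketch below computes the mean, max, and min polarity per video and applies the two exclusion filters. It assumes every comment already has a polarity score between -1 and 1; the record layout, the field names, the threshold constants, and the sample values are all hypothetical and are not taken from the notebooks in 01_Codes.

```python
from statistics import mean

# Assumed thresholds, read from the note above: at least 30 minutes long
# and at least 10 comments.
MIN_DURATION_SEC = 30 * 60
MIN_COMMENTS = 10


def summarize_polarity(videos):
    """Return {video_id: (mean, max, min)} for videos that pass both filters.

    `videos` is an assumed layout: a list of dicts with the keys
    "video_id", "duration_sec", and "polarities" (one score per comment).
    """
    summary = {}
    for video in videos:
        scores = video["polarities"]
        if video["duration_sec"] < MIN_DURATION_SEC or len(scores) < MIN_COMMENTS:
            continue  # excluded from the analysis
        summary[video["video_id"]] = (mean(scores), max(scores), min(scores))
    return summary


# Made-up sample data, for illustration only.
sample = [
    {"video_id": "a", "duration_sec": 2400, "polarities": [0.5, 0.0, -0.5, 1.0] * 3},
    {"video_id": "b", "duration_sec": 600, "polarities": [0.2] * 20},  # too short
    {"video_id": "c", "duration_sec": 3600, "polarities": [0.1] * 5},  # too few comments
]
print(summarize_polarity(sample))
```

Only video "a" survives the filters in this toy input, so the printed dictionary has a single entry.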
Highlight 2 - How long should a BiggerPockets video be to achieve the highest cost-effectiveness (assuming view counts = profits)?
We found that a video length of 260 seconds (4 min 20 sec) has the highest view count per second of video. This seems to be a "sweet spot" for attracting views with minimal production effort, without compromising view counts or content :)
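To make the "view count per second" metric concrete, here is a minimal sketch that divides each video's views by its length and picks the length with the highest ratio. The function names and the sample numbers are made up for illustration; the actual analysis may have used a different procedure (for example, binning or smoothing across many videos).

```python
def views_per_second(videos):
    """Pair each video's length in seconds with its view count per second.

    `videos` is an assumed list of (duration_sec, view_count) pairs.
    """
    return [(dur, views / dur) for dur, views in videos if dur > 0]


def best_length(videos):
    """Return the (duration_sec, views_per_sec) pair with the highest ratio."""
    return max(views_per_second(videos), key=lambda pair: pair[1])


# Made-up numbers, for illustration only (not the channel's real data).
sample = [(120, 6_000), (300, 60_000), (1800, 90_000)]
duration, rate = best_length(sample)
print(f"{duration} s -> {rate:.1f} views per second")
```

Note that a longer video can have more total views and still score lower on this metric, which is why the ratio favors shorter videos.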
As a case study, we produced a wordcloud image based on 124 YouTube comments from the recent video "New Rental Property Mortgages with 3% Interest Rates, 5% Down" (link: https://www.youtube.com/watch?v=IVK5vQg1UvY). Most of the comments are positive, as you can see in the figure below: most comments have polarity values above zero. In the wordcloud image above, "Thank" and "Great" are the two main keywords that appear consistently in the comments, suggesting that the audience is very thankful for the information shared about low-interest-rate mortgages in 2023!
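The two observations in this case study (dominant keywords and mostly positive polarity) can be approximated with a short standard-library sketch. It is a stand-in for a wordcloud, not the code used for the figures: the stop-word list, the function names, the sample comments, and the polarity scores are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "for", "this", "and", "so", "to", "is", "you", "was"}


def top_keywords(comments, n=3):
    """Count lowercase word tokens across comments, skipping stop words."""
    words = []
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())
        words.extend(w for w in tokens if w not in STOP_WORDS)
    return Counter(words).most_common(n)


def share_positive(polarities):
    """Fraction of comments whose polarity score is above zero."""
    return sum(p > 0 for p in polarities) / len(polarities)


# Made-up comments and scores, for illustration only.
comments = ["Thank you for this great video", "Great info, thank you!", "Thank you so much"]
polarities = [0.8, 0.6, 0.2, -0.1]
print(top_keywords(comments, n=2))
print(share_positive(polarities))
```

A wordcloud sizes each word by a frequency count like the one returned by `top_keywords`, so the largest words in the image correspond to the top entries here.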