
Heeran-cloud/Movie_EDA


EDA

Analyzing audience and sales data of Korean movies

This project studies variables such as audience numbers and total sales. The goal is to identify the variables that affect a movie's success.

Prerequisites:

  • Jupyter Notebook
  • Python3
  • Anaconda

Getting Started

Packages to install
  • matplotlib (pyplot for plotting, font_manager for the Korean font)
  • seaborn
  • pandas
  • warnings (part of the Python standard library, no installation needed)
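
The Korean font setup can be sketched as below. This is a minimal sketch, not the notebook's actual code: `use_font` is a hypothetical helper, and DejaVu Sans (bundled with matplotlib) only stands in for a Korean font file, whose path depends on your system.

```python
import matplotlib
from matplotlib import font_manager


def use_font(font_path):
    """Register a font file and make it matplotlib's default font family."""
    # Make the font file known to matplotlib (available since matplotlib 3.2).
    font_manager.fontManager.addfont(font_path)
    # Read the family name stored inside the font file.
    name = font_manager.FontProperties(fname=font_path).get_name()
    matplotlib.rcParams["font.family"] = name
    # Many Korean fonts lack the Unicode minus glyph, so use the ASCII hyphen.
    matplotlib.rcParams["axes.unicode_minus"] = False
    return name


# Assumption: replace demo_path with the path to a Korean font on your machine.
demo_path = font_manager.findfont("DejaVu Sans")
font_name = use_font(demo_path)
print(font_name)
```

Setting the font once through `rcParams` applies it to every later plot, including seaborn plots, so labels with Korean movie titles render without missing glyphs.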
Dataset
  • 536 rows × 11 columns
  • genre
  • release date (month, year, season)
  • total screen number
  • total audience
  • total sales
  • point (review score)
  • rate (age rating, e.g. 'Adult')
  • actors

Data Exploration:

I. Data Cleansing

  • Korean movies released from 2013 to 2020 were used.
  • Movies rated 'Adult' were removed.
  • 'Total audience' was converted to thousands.
  • Actors with the same name were removed from the list.
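
The first three cleansing steps can be sketched in pandas as follows. The column names (`release_year`, `rate`, `total_audience`) and the toy rows are assumptions made for illustration; the real dataset may use different names.

```python
import pandas as pd

# Toy rows standing in for the real dataset; all names here are hypothetical.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "release_year": [2012, 2015, 2019, 2020],
    "rate": ["12", "Adult", "15", "All"],
    "total_audience": [1_200_000, 500_000, 7_500_000, 300_000],
})


def clean(df):
    """Apply the year filter, the 'Adult' filter, and the unit conversion."""
    # Keep movies released from 2013 to 2020 (both ends included).
    out = df[df["release_year"].between(2013, 2020)]
    # Drop movies rated 'Adult'.
    out = out[out["rate"] != "Adult"].copy()
    # Express the audience in thousands.
    out["total_audience"] = out["total_audience"] / 1000
    return out.reset_index(drop=True)


cleaned = clean(movies)
print(cleaned)
```

Copying the filtered frame before modifying `total_audience` avoids pandas' chained-assignment warning and leaves the raw data untouched.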

II. Data Visualization

  1. image
  • We expected the number of movies released to drop in 2020 because of the pandemic.
  • Although 2020 had not yet ended at the time of analysis and more movies could still be released, the count is already far below the previous year's.
  2. image
  • We expected that more movies released in a year would mean more screens and a larger audience.
  • Instead, sales and audience rise and fall together, while the number of screens and the number of movies released do not follow the same pattern.

  3. Visualization of sales, audience, number of screens, and point for each of the top 10 movies, to view the movement of the variables more closely.
  • In this closer view, a larger audience does not always mean more sales, as with '7번 방의 선물' (Miracle in Cell No. 7). In contrast, sales and the number of screens rise and fall in a similar way.
  4. image
  • Actors are ranked by the number of movies they starred in during the period mentioned above. The Top 10 actors have 183 credits in total.
  • The Top 10 actors appeared in 34.14% of all the movies during the period.
  • The Top 10 actors appeared most often in the Drama and Crime genres, with more than 30 credits in each.
  5. image
  • Movies featuring a Top 10 actor outperformed other movies in audience size and number of screens.
  • However, this does not mean they earned better points than other movies.
  6. image
  • According to KOBIS, a movie with an audience above 7,000K counts as a box-office hit, which is rare for most actors. All 10 actors achieved such a success more than once during 2013-2020.
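
The Top 10 actor figures can be sketched as below. `top_actor_share` and the toy cast lists are hypothetical. The README does not state how 34.14% was calculated, but dividing the 183 credits by the 536 rows of the dataset reproduces it, so the sketch assumes that ratio.

```python
from collections import Counter


def top_actor_share(casts, n=10):
    """casts holds one list of actor names per movie."""
    # Count how many movies each actor is credited in.
    counts = Counter(actor for cast in casts for actor in cast)
    top = counts.most_common(n)
    credits = sum(count for _, count in top)
    # Credits per movie, as a percentage. A movie with two top actors
    # contributes two credits, so this is an upper bound on the movie share.
    share = 100 * credits / len(casts)
    return top, credits, share


# Toy data: 8 movies with invented actor names.
casts = [["Kim", "Cho"], ["Kim"], ["Park"], ["Lee", "Han"],
         ["Lee"], ["Yoon"], ["Jung"], ["Kim", "Lee"]]
top, credits, share = top_actor_share(casts, n=2)
print(top, credits, share)

# The README's own numbers under the same assumption.
readme_share = round(100 * 183 / 536, 2)
print(readme_share)
```

Because the ratio counts credits rather than distinct movies, the share of distinct movies featuring at least one Top 10 actor may be somewhat lower than 34.14%.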

Built with

Acknowledgements
