Source Code
import pandas as pd
import matplotlib.pyplot as plt
## Read Dataset and display first rows
data = pd.read_csv('video_games.csv')
data.head()
## Histogram Analysis
### For this analysis, we will show a histogram for the Review Score
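# Count how many games received each Review Score
data_scores = data.groupby(['Metrics.Review Score']).size().reset_index().rename(columns={0: 'count'})
# Now plot: x holds the distinct scores, y the number of games with each score
x = data_scores['Metrics.Review Score']
y = data_scores['count']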
plt.figure()
plt.hist(x, weights = y, bins=30)
plt.xlabel('Review Score (%)')
plt.xticks(rotation=45)
plt.ylabel('Count')
plt.title('Number of games and their Review Scores')
plt.show()
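The graph above shows the distribution of review scores for all the games in the dataset. The mean score appears to lie between 70% and 80%, although a histogram only shows where scores cluster, so the mean should be computed directly to confirm it.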
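A reading taken from a histogram can be checked numerically. The following is a minimal sketch, not part of the original notebook: the helper `summarize_scores` and the sample values are hypothetical, and with the real dataset one would pass `data['Metrics.Review Score']` instead.

```python
import pandas as pd


def summarize_scores(scores: pd.Series, low: float = 70, high: float = 80) -> dict:
    """Return the mean, the median, and the share of scores in [low, high)."""
    scores = scores.dropna()
    # Boolean mask of scores inside the band; its mean is the share of games in it
    in_band = scores.between(low, high, inclusive="left")
    return {
        "mean": scores.mean(),
        "median": scores.median(),
        "share_in_band": in_band.mean(),
    }


# Made-up scores for illustration only (an assumption, not taken from the dataset)
example_scores = pd.Series([55, 68, 72, 75, 78, 81, 90])
summary = summarize_scores(example_scores)
print(summary)
```

Comparing the mean with the median also hints at skew: if the two differ noticeably, the peak of the histogram is a poor proxy for the mean.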
## Secondary Analysis
### For this analysis, we will display a bar plot of the mean Review Score for each genre
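# Average only the Review Score column; calling .mean() on the whole frame fails on text columns in pandas 2.x
data_genres = data.groupby('Metadata.Genres')['Metrics.Review Score'].mean().reset_index()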
# Now plot
x = data_genres['Metadata.Genres']
y = data_genres['Metrics.Review Score']
plt.figure(figsize=(25,10))
plt.bar(x, y, align='center')
plt.xlabel('Genre')
plt.xticks(x, rotation=90)
plt.ylabel('Review Score (%)')
plt.title('Genres and their Review Scores')
plt.show()
From the above graph, we can see that the video game genre with the highest mean Review Score is **Adventure.Educational.Strategy**.
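The top genre can also be read from a sorted table rather than from the plot. The sketch below is an illustration under stated assumptions: `rank_genres` and its `min_games` threshold are hypothetical additions, and the toy DataFrame is invented. Only the column names `Metadata.Genres` and `Metrics.Review Score` come from the notebook. Reporting the number of games alongside the mean matters because a genre combination represented by very few games can top the ranking by chance.

```python
import pandas as pd


def rank_genres(df, genre_col="Metadata.Genres",
                score_col="Metrics.Review Score", min_games=1):
    """Rank genres by mean score, keeping only genres with at least min_games games."""
    stats = (
        df.groupby(genre_col)[score_col]
        .agg(mean_score="mean", n_games="count")  # named aggregation
        .reset_index()
    )
    stats = stats[stats["n_games"] >= min_games]
    return stats.sort_values("mean_score", ascending=False).reset_index(drop=True)


# Invented toy data for illustration only
toy = pd.DataFrame({
    "Metadata.Genres": ["Action", "Action", "Action", "Sports", "Sports", "Puzzle"],
    "Metrics.Review Score": [70, 80, 75, 60, 70, 95],
})

ranked_all = rank_genres(toy)                # every genre, including single-game ones
ranked_min2 = rank_genres(toy, min_games=2)  # ignore genres with fewer than 2 games
print(ranked_all)
print(ranked_min2)
```

In the toy data the single-game genre leads the unfiltered ranking but disappears once a minimum of two games is required, which is the effect worth checking for on the real dataset.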
## Stakeholders
In this project we worked with a dataset about video games. It contains information on games released for different consoles and the studios that developed them, as well as sales numbers, review scores, and other metrics.
One stakeholder interested in these analyses is a game design company. It could analyze the number of units sold for each genre to decide which type of game to design, favoring genres that have historically sold well.
Another stakeholder may be a data analyst who wants to estimate the number of units sold for a video game from its Review Score. Such an analyst could design a regression model that predicts a game's sales from the score it receives on its release day.
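As a rough illustration of the regression idea, the sketch below fits a straight line with `numpy.polyfit`. It is a starting point under stated assumptions, not the analyst's actual model: the sales column name `Metrics.Sales` is an assumption (the notebook never names it), the helpers `fit_sales_model` and `predict_sales` are hypothetical, and the toy numbers are invented.

```python
import numpy as np
import pandas as pd


def fit_sales_model(df, score_col="Metrics.Review Score", sales_col="Metrics.Sales"):
    """Fit sales = slope * score + intercept by ordinary least squares."""
    # 'Metrics.Sales' is an assumed column name; adjust it to the real dataset
    clean = df[[score_col, sales_col]].dropna()
    x = clean[score_col].to_numpy(dtype=float)
    y = clean[sales_col].to_numpy(dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def predict_sales(score, slope, intercept):
    """Predict sales for a given review score using the fitted line."""
    return slope * score + intercept


# Invented toy data lying on the line sales = 0.02 * score - 0.5
toy = pd.DataFrame({
    "Metrics.Review Score": [50, 60, 70, 80, 90],
    "Metrics.Sales": [0.5, 0.7, 0.9, 1.1, 1.3],
})

slope, intercept = fit_sales_model(toy)
predicted = predict_sales(75, slope, intercept)
print(slope, intercept, predicted)
```

A single-feature linear fit is only a baseline. On real data the relationship between score and sales is likely to be noisy, so the fit should be evaluated on held-out games before any prediction is relied on.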