# Python Program to Implement Data Visualization Assignment Solution

## Instructions

Objective

If you need help with a Python assignment on data visualization, this solution walks through one. Python offers libraries such as Matplotlib, Seaborn, and Plotly for creating visual representations of data. These libraries provide tools to generate graphs, charts, and plots that convey insights from your data effectively.

## Requirements and Specifications Source Code
### MANOVA example dataset

Reference: https://www.statsmodels.org/dev/generated/statsmodels.multivariate.manova.MANOVA.html

Suppose we have a dataset of various plant varieties (`plant_var`) and their associated phenotypic measurements for plant height (`height`) and canopy volume (`canopy_vol`). We want to see whether plant height and canopy volume are associated with plant variety using MANOVA.

### Load dataset

```python
import pandas as pd

df = pd.read_csv("https://reneshbedre.github.io/assets/posts/ancova/manova_data.csv")
df.head(5)
```

### Summary statistics and visualization of dataset

Get summary statistics based on each dependent variable:

```python
# Mean, count, and standard deviation of each dependent variable per plant variety
df.groupby("plant_var")[["height", "canopy_vol"]].agg(["mean", "count", "std"])
```

### Visualize dataset

```python
import seaborn as sns
import matplotlib.pyplot as plt

fig, axs = plt.subplots(ncols=2)
# axs is an array of two Axes, so each boxplot gets its own element
sns.boxplot(data=df, x="plant_var", y="height", hue="plant_var", ax=axs[0])
sns.boxplot(data=df, x="plant_var", y="canopy_vol", hue="plant_var", ax=axs[1])
plt.show()
```

### Perform one-way MANOVA

```python
from statsmodels.multivariate.manova import MANOVA

fit = MANOVA.from_formula("height + canopy_vol ~ plant_var", data=df)
print(fit.mv_test())
```

### Make a Conclusion

The Pillai’s Trace test statistic is statistically significant [Pillai’s Trace = 1.03, F(6, 72) = 12.90, p < 0.001], indicating that plant variety has a statistically significant association with the combined plant height and canopy volume.

## Your Task 1

Suppose we have gathered the following data on female athletes in three sports. The measurements we have made are the athletes' heights and vertical jumps, both in inches. The data are listed as (height, jump) as follows:

- Basketball Players: (66, 27), (65, 29), (68, 26), (64, 29), (67, 29)
- Track Athletes: (63, 23), (61, 26), (62, 23), (60, 26)
- Softball Players: (62, 23), (65, 21), (63, 21), (62, 23), (63.5, 22), (66, 21.5)

Use `statsmodels.multivariate.manova` in Python to conduct the MANOVA F-test using Wilks' Lambda to test for a difference in (height, jump) mean vectors across the three sports. Make sure you include clear command lines and relevant output/results with hypotheses, test result(s), and conclusion(s)/interpretation(s).

```python
# YOUR CODE here
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Define a list with the data: [sport, height, jump]
data_lst = [
    ['Basketball Players', 66, 27],
    ['Basketball Players', 65, 29],
    ['Basketball Players', 68, 26],
    ['Basketball Players', 64, 29],
    ['Basketball Players', 67, 29],
    ['Track Athletes', 63, 23],
    ['Track Athletes', 61, 26],
    ['Track Athletes', 62, 23],
    ['Track Athletes', 60, 26],
    ['Softball Players', 62, 23],
    ['Softball Players', 65, 21],
    ['Softball Players', 63, 21],
    ['Softball Players', 62, 23],
    ['Softball Players', 63.5, 22],
    ['Softball Players', 66, 21.5],
]

# Define column names
columns = ['Type', 'Height', 'Jump']

# Construct the dataframe and check the data
data = pd.DataFrame(data=data_lst, columns=columns)
data.head()

# Conduct the MANOVA F-test
fit = MANOVA.from_formula('Height + Jump ~ Type', data=data)
print(fit.mv_test())
```

The null hypothesis is that the (height, jump) mean vectors are equal across the three sports; the alternative is that at least one sport has a different mean vector. From Wilks' lambda, the p-value is < 0.05, so we reject the null hypothesis: the mean (height, jump) vector differs across sports, meaning that Height and Jump are related to the Type of athlete.

## Your Task 2 (bonus and optional)

For the above problem, try to use a non-built-in function in Python to calculate the F score and check it against the built-in function output above.

```python
# YOUR CODE HERE
def F_score(prec, recall):
    # Harmonic mean of precision and recall (the classification F1 score)
    return 2 * (prec * recall) / (prec + recall)
```

Note that `F_score` as written computes the F1 score used to evaluate classifiers. It is not the F statistic that `mv_test()` reports for Wilks' lambda, so its result cannot be checked against the MANOVA output.
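For Task 2, the F statistic behind Wilks' lambda can be reproduced without statsmodels. The following is a minimal sketch, not the original author's method. It assumes exactly two response variables (height and jump), in which case Wilks' lambda Λ = det(W) / det(T) converts to an exact F statistic, F = ((1 − √Λ) / √Λ) · ((n − g − 1) / (g − 1)) with 2(g − 1) and 2(n − g − 1) degrees of freedom, where n is the number of athletes and g the number of sports. The helper names `sscp`, `det2`, `wilks_lambda_f`, and the `athletes` dictionary are hypothetical and introduced only for illustration.

```python
def sscp(points):
    """Return (sxx, sxy, syy): sums of squares and cross-products about the mean."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxx, sxy, syy


def det2(s):
    """Determinant of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = s
    return sxx * syy - sxy ** 2


def wilks_lambda_f(groups):
    """Wilks' lambda and its exact F statistic for two response variables.

    groups maps a group label to a list of (x, y) observations.
    Returns (lambda, F, numerator df, denominator df).
    """
    all_points = [pt for pts in groups.values() for pt in pts]
    n, g = len(all_points), len(groups)
    total = sscp(all_points)  # T: total SSCP matrix
    # W: within-group SSCP matrix, the element-wise sum of each group's SSCP
    within = tuple(sum(parts) for parts in zip(*(sscp(pts) for pts in groups.values())))
    lam = det2(within) / det2(total)
    root = lam ** 0.5
    f_value = (1 - root) / root * (n - g - 1) / (g - 1)
    return lam, f_value, 2 * (g - 1), 2 * (n - g - 1)


# (height, jump) pairs from Task 1
athletes = {
    "Basketball Players": [(66, 27), (65, 29), (68, 26), (64, 29), (67, 29)],
    "Track Athletes": [(63, 23), (61, 26), (62, 23), (60, 26)],
    "Softball Players": [(62, 23), (65, 21), (63, 21), (62, 23), (63.5, 22), (66, 21.5)],
}

lam, f_value, df1, df2 = wilks_lambda_f(athletes)
print(f"Wilks' lambda = {lam:.4f}, F({df1}, {df2}) = {f_value:.4f}")
```

The printed values can be compared with the `Value`, `Num DF`, `Den DF`, and `F Value` columns of the Wilks' lambda row in the `fit.mv_test()` output for the `Type` term. Obtaining the p-value additionally requires the F distribution's survival function, for example from SciPy.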