# **APCO 1P93: Applied Programming (for Data Science)**
### Winter, 2022
### Instructor: Yifeng (Ethan) Li
### Department of Computer Science, Brock University
### Email:
### TA: Tristan Navikevicius:
---
## **Assignment as Final Exam**
## **Due Date: 10:00pm, Tuesday, April 26th, 2022**
### **Plagiarism = Severe Consequence**
### **Place your work in a zipped folder named _Firstname_Lastname_StudentNumber_ for your submission.**
## **Question 1** (30 points)
In this question, your job is to define a class named `Hydro` to process and visualize a small data set from Alectra Utilities. For your convenience, the structure of this class is given below. You will need to do the following tasks.
* Define a method named `read_data` within the `Hydro` class to load the data from the given text file named `hydro_28-Mar-2022.csv` using the `loadtxt` function in `numpy`. The first row of the data should be assigned to `self.header` and the rest should be assigned to `self.data`. **(3 marks)**
* Define a method named `sort_data` within the `Hydro` class to sort the rows in `self.data` according to the first column (Reading Date) in ascending order. You need to use the function `numpy.argsort`. Note that the sorted data should still be stored in `self.data`. **(3 marks)**
* Define a method named `add_temperature` within the `Hydro` class to load the climate data from the given text file named `climate.txt` and add a new column to `self.data` holding the Daily Mean Temperature of the month in which each reading was taken. For example, for `'2021-04-16'` (an element in the first column of `self.data`), `'7.4'` (Daily Mean Temperature for `'Apr'` in `climate.txt`) should be the corresponding value in the new column. Note that you should also add a new string element `'Daily Mean Temperature'` to `self.header`. **(5 marks)**
* Define a method named `save_data` within the `Hydro` class to concatenate `self.header` and `self.data` and save the result to a csv file. Note that you should use the `numpy.savetxt` function. **(2 marks)**
* Define a method named `draw_plots` within the `Hydro` class to draw two subplots. **(12 marks)**
    * The first subplot is a mixture of bars (for `Average KWH/Day` in `self.data`) and a line/curve (for `Daily Mean Temperature` in `self.data`) with a shared/twin x-axis but different y-axes. The left y-axis is for `Average KWH/Day` and the right y-axis is for `Daily Mean Temperature`.
    * The second subplot is a line plot to visualize the `Current Reading` column in `self.data`.
    * The instructor's plot, as a pdf file, is provided with this assignment. You will have to reproduce it as precisely as possible. You may find the Matplotlib documentation helpful, particularly the `set_rotation`, `set_fontsize`, and `set_color` methods of a text object.
    * The figure should be saved to a pdf file.
* After defining the class and methods described above, create an instance of the `Hydro` class. Call the following methods of this instance in sequential order: `read_data`, `sort_data`, `add_temperature`, `draw_plots`, `save_data`. **(5 marks)**
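The sorting and temperature-lookup tasks above can be tried on a toy array first. The following is only a minimal sketch under stated assumptions: the rows and the `monthly_mean` dictionary are made-up placeholders (only the April value `7.4` comes from the task description), and the real files may differ in layout.

```python
import numpy as np

# Toy stand-in for self.data: string rows of (Reading Date, Average KWH/Day).
# These values are invented for illustration and are not from the Alectra file.
data = np.array([
    ["2021-06-15", "21.5"],
    ["2021-04-16", "18.2"],
    ["2021-05-14", "19.0"],
])

# argsort returns the row order that sorts the first column. ISO dates
# (YYYY-MM-DD) sort chronologically even when compared as strings.
order = np.argsort(data[:, 0])
data = data[order]

# Hypothetical monthly means keyed by month number. Only 7.4 for April is
# taken from the assignment text; the other two values are placeholders.
monthly_mean = {4: "7.4", 5: "13.0", 6: "18.5"}

# Look up the temperature for the month part of every Reading Date
temps = [monthly_mean[int(date.split("-")[1])] for date in data[:, 0]]

# Append the looked-up values as a new last column
data = np.column_stack([data, temps])

print(data)
```

Indexing with the result of `argsort` reorders whole rows at once, so every field stays attached to its Reading Date.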
import numpy as np
import matplotlib.pyplot as plt

# define your Hydro class here
class Hydro:
    def __init__(self):
        self.header = None # will be 1d array of str type, length 8 or 9 (after adding a new field for temperature)
        self.data = None # will be 2d array of str type, shape (23,8) or (23,9)

    def read_data(self, filename = './hydro_28-Mar-2022.csv'):
        """
        Read the provided hydro data as string data type.
        Assign the header info (first row of the text file) to self.header.
        Assign the rows after the first row of the text file to self.data.
        INPUTS:
        filename: string, file name for the given data set.
        """
        # (3 marks)
        # Load every field as a string; the file is comma-separated.
        # comments=None keeps any '#' characters as ordinary text.
        raw = np.loadtxt(filename, dtype=str, delimiter=',', comments=None)
        # The first row contains the header
        self.header = raw[0]
        # The remaining rows contain the data
        self.data = raw[1:]
        print('Data from {0} has been successfully loaded.'.format(filename))
        print('The data has header:\n{0}'.format(self.header))
        print('There are {0} rows (excluding the header) in the data set.'.format(self.data.shape[0]))

    def sort_data(self):
        """
        Sort the rows of self.data based on the first field "Reading Date" in ascending order.
        https://numpy.org/doc/stable/reference/generated/numpy.argsort.html
        """
        # (3 marks)
        # argsort gives the row order that sorts the Reading Date column;
        # dates in YYYY-MM-DD form sort chronologically as strings
        order = np.argsort(self.data[:, 0])
        self.data = self.data[order]
        print('The data has been sorted in ascending order.')

    def add_temperature(self, filename='./climate.txt'):
        """
        Load the climate data and add the Daily Mean Temperature for corresponding months as a new column to self.data.
        A new string element 'Daily Mean Temperature' is also added to self.header.
        """
        # Month abbreviations in calendar order, used to map a month name to its number
        months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
        # (5 marks)
        with open(filename, 'r') as f:
            # Read lines
            lines = f.readlines()
        # Skip the header (first line) and the last line
        lines = lines[1:-1]
        # Add the new field to the header and a new column to the data (only once)
        if 'Daily Mean Temperature' not in self.header:
            self.header = np.hstack([self.header, 'Daily Mean Temperature'])
            # Placeholder column of '0.0' strings, so the array stays of str type
            column = np.zeros((self.data.shape[0], 1)).astype(str)
            self.data = np.append(self.data, column, axis=1)
        # For each climate line, take the month (first field) and the temperature (second field)
        for line in lines:
            row = line.strip().split('\t')
            month = row[0]
            # Get month number (1-12)
            month_n = months.index(month) + 1
            # Get temp (kept as a string, like the rest of self.data)
            temp = row[1]
            # Fill in the temperature for every reading taken in this month
            for i in range(len(self.data)):
                # Take the date (YYYY-MM-DD) and get the month number
                date = self.data[i][0]
                month_number = int(date.split('-')[1])
                if month_number == month_n:
                    self.data[i, -1] = temp

    def save_data(self, filename = './hydro_temp.csv'):
        """
        Save the data to the given filename. The first line in the file contains the header,
        while the remaining lines contain the data, with fields separated by commas.
        """
        # (2 marks)
        # Stack the header on top of the data and write everything as text
        table = np.vstack([self.header, self.data])
        np.savetxt(filename, table, fmt='%s', delimiter=',')
        print('Data saved to a text file.')

    def draw_plots(self):
        fig, (ax1, ax2) = plt.subplots(2,1)
        fig.set_size_inches((12,6))
        # draw bar subplot (4 marks)
        # First, get the average KWH/day as floats
        avg_kwh = self.data[:,7].astype('float')
        # Get daily mean temp
        daily_temp = self.data[:,-1].astype('float')
        # Get dates
        dates = self.data[:,0]
        # Now plot
        ax1.bar(dates, avg_kwh, color = 'green')
        ax1.set_ylabel(self.header[7])
        ax1.tick_params(axis='x', rotation=45)
        ax1.set_title('Daily Average Electricity Usage')
        ax1.set_xlabel('Date')
        # Display values at the top of each bar
        for i, v in enumerate(avg_kwh):
            ax1.text(i-0.5, v+0.1, str(v))
        # add a line for temperature to ax1, (3 marks)
        ax11 = ax1.twinx() # instantiate a second axes that shares the same x-axis
        ax11.plot(range(len(dates)), daily_temp, color = 'orange', marker = 'D')
        ax11.set_ylabel(self.header[-1])
        # draw line subplot (4 marks)
        # Get consumption values
        consumption = self.data[:,6].astype('float')
        # Plot
        ax2.plot(range(len(consumption)), consumption, color = 'red', marker = 'o')
        # Set the tick positions first, then label them with the dates
        ax2.set_xticks(range(len(dates)))
        ax2.set_xticklabels(dates)
        ax2.tick_params(axis='x', rotation=45)
        ax2.set_title('Electricity Consumption')
        ax2.set_ylabel(self.header[6])
        ax2.set_xlabel('Date')
        # Fit titles and rotated labels inside the figure before saving
        fig.tight_layout()
        # save the drawn figure to a pdf file (1 mark)
        plt.savefig('fig.pdf')
        plt.show()

# create instance h1 of class Hydro and call its methods
# (5 marks)
h1 = Hydro()
h1.read_data()
h1.sort_data()
h1.add_temperature()
h1.draw_plots()
h1.save_data('output.csv')
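As a quick check of the save and load steps in Question 1, a header row and string data can be written with `numpy.savetxt` and read back with `numpy.loadtxt`. This is only a sketch: the two-column table and the file name `toy.csv` are invented for illustration and do not come from the provided data set.

```python
import os
import tempfile

import numpy as np

# Made-up header and rows standing in for self.header and self.data
header = np.array(["Reading Date", "Average KWH/Day"])
data = np.array([["2021-04-16", "18.2"], ["2021-05-14", "19.0"]])

# Stack the header on top of the data so both go into one file
table = np.vstack([header, data])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.csv")  # hypothetical file name
    # fmt="%s" writes each string field as-is, separated by commas
    np.savetxt(path, table, fmt="%s", delimiter=",")
    # dtype=str keeps every field as text when reading the file back
    loaded = np.loadtxt(path, dtype=str, delimiter=",")

print(loaded[0])   # header row
print(loaded[1:])  # data rows
```

Reading everything as strings keeps the round trip lossless; columns are converted to floats only where a plot needs numbers.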
## **Question 2** (Bonus: 6 points)
Q 2.1: Define a function named `min_iterative` using iteration to find the minimal value from a given list of numbers. You must use a `for` or `while` loop in your implementation. You should also use a `try-except` statement to capture and process the `TypeError` caused by non-int and non-float elements in the input list. After defining this function, test it using the lists `[5, -1, 4, -9, 3, 4, 3, 7, -9, 10]` and `[5, -1, 4, -9, 3, 4, 3, 7, 'abc', 10]`, respectively. **(3 bonus marks)**
# answer Q 2.1 here
def min_iterative(lst):
    # Start with the first element as the current minimum
    min_val = lst[0]
    for i in range(1, len(lst)):
        try:
            if lst[i] < min_val:
                min_val = lst[i]
        except TypeError:
            # Elements that cannot be compared with numbers (e.g. 'abc') are skipped
            pass
    return min_val

print(min_iterative([5, -1, 4, -9, 3, 4, 3, 7, -9, 10]))
print(min_iterative([5, -1, 4, -9, 3, 4, 3, 7, 'abc', 10]))
Q 2.2: Define a function named `min_recursion` using recursion to find the minimal value from a given list of numbers. You must consider the base case and the recursive case for making progress. A `try-except` statement does not need to be used in this function. After defining this function, test it using the list `[5, -1, 4, -9, 3, 4, 3, 7, -9, 10]`. **(3 bonus marks)**
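# answer Q 2.2 here
def min_recursion(lst):
    # Trace each call so the shrinking input is visible
    print(f"Running min_recursion([{', '.join(map(str, lst))}])...")
    # Base case: a single element is its own minimum
    if len(lst) == 1:
        return lst[0]
    # Recursive case: compare the first element with the minimum of the rest
    else:
        return min(lst[0], min_recursion(lst[1:]))

print(min_recursion([5, -1, 4, -9, 3, 4, 3, 7, -9, 10]))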
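The recursion above copies the list with `lst[1:]` on every call. One possible alternative, shown here only as a sketch, passes a start index instead. The name `min_recursion_indexed` and its `start` parameter are hypothetical and are not part of the assignment.

```python
def min_recursion_indexed(lst, start=0):
    """Return the smallest element of lst[start:] without copying the list."""
    if not lst:
        raise ValueError("cannot take the minimum of an empty list")
    # Base case: only one element remains
    if start == len(lst) - 1:
        return lst[start]
    # Recursive case: compare the current element with the minimum of the rest
    rest_min = min_recursion_indexed(lst, start + 1)
    return lst[start] if lst[start] < rest_min else rest_min


result = min_recursion_indexed([5, -1, 4, -9, 3, 4, 3, 7, -9, 10])
print(result)  # prints -9
```

Both versions make one recursive call per element, so the recursion depth grows with the list length; the index only avoids building a new list at each step.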