# MATLAB Program for Data Analysis: Assignment Solution

## Instructions

Objective
Write a program to perform data analysis in MATLAB.

## Requirements and Specifications

Source Code

```matlab
clc, clear all, close all

% Column vectors that accumulate the data read from the monthly files,
% plus the subsets for January, its first week and its first day
all_load = [];
january_load = [];
january_firstweek_load = [];
january_firstday_load = [];
all_price = [];
january_price = [];
january_firstweek_price = [];
january_firstday_price = [];
all_times = [];

% Create a cell array with all files
files = {"1Jan.csv", "2Feb.csv", "3March.csv", "4April.csv", "5May.csv", ...
         "6June.csv", "7July.csv", "8August.csv", "9September.csv", ...
         "10October.csv", "11November.csv", "12December.csv"};

%% a) Monthly Profile
monthly_demand = [];
monthly_price = [];
monthly_labels = [];
daily_demand = [];
daily_price = [];
daily_labels = [];
hourly_demand = [];
hourly_price = [];
hourly_labels = [];

for i = 1:length(files)
    file = fullfile("data", files{i});
    T = readtable(file);
    all_load = [all_load; T.TOTALDEMAND];
    all_times = [all_times; T.SETTLEMENTDATE];
    price = T.RRP;

    % Clean wrong values: clip prices above 2x the mean to 2x the mean,
    % then clip prices below -2x the (updated) mean to -2x the mean
    upper = 2*mean(price);
    price(price > upper) = upper;
    lower = -2*mean(price);
    price(price < lower) = lower;
    all_price = [all_price; price];

    % January load and price
    if strcmp(files{i}, "1Jan.csv")
        january_load = T.TOTALDEMAND;
        % Use the cleaned prices, consistent with the other subsets
        january_price = price;
        % Data for first week
        index = find(week(T.SETTLEMENTDATE, 'weekofmonth') == 1);
        january_firstweek_load = T.TOTALDEMAND(index);
        january_firstweek_price = price(index);
        % First day
        index = find(day(T.SETTLEMENTDATE) == 1);
        january_firstday_load = T.TOTALDEMAND(index);
        january_firstday_price = price(index);
    end

    % Monthly
    monthly_demand = [monthly_demand; T.TOTALDEMAND];
    monthly_price = [monthly_price; price];
    monthly_labels = [monthly_labels; month(T.SETTLEMENTDATE, 'name')];

    % Daily: group the samples by day of the week (1 = Sunday, ..., 7 = Saturday)
    [~, days] = findgroups(weekday(T.SETTLEMENTDATE));
    for j = 1:length(days)
        index = find(weekday(T.SETTLEMENTDATE) == days(j));
        daily_demand = [daily_demand; T.TOTALDEMAND(index)];
        daily_price = [daily_price; price(index)];
        daily_labels = [daily_labels; weekday(T.SETTLEMENTDATE(index))];
    end

    % Hourly: group the samples by time of day
    [~, times] = findgroups(timeofday(T.SETTLEMENTDATE));
    for j = 1:length(times)
        index = find(timeofday(T.SETTLEMENTDATE) == times(j));
        hourly_demand = [hourly_demand; T.TOTALDEMAND(index)];
        hourly_price = [hourly_price; price(index)];
        % Labels are stored as numeric hours (0, 0.5, ...), a grouping
        % type that boxplot accepts
        hourly_labels = [hourly_labels; hours(timeofday(T.SETTLEMENTDATE(index)))];
    end
end

% Demand profiles
figure
subplot(3,1,1)
boxplot(hourly_demand, hourly_labels);
grid on
ylabel('Real Power (MW)');
xlabel('Hours of a Day')
subplot(3,1,2)
boxplot(daily_demand, daily_labels);
xticklabels({'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'})
grid on
ylabel('Real Power (MW)');
xlabel('Days of a Week')
subplot(3,1,3)
boxplot(monthly_demand, monthly_labels);
grid on
ylabel('Real Power (MW)');
xlabel('Months of a Year')

% Price profiles
figure
subplot(3,1,1)
boxplot(hourly_price, hourly_labels);
grid on
ylabel('Price');
xlabel('Hours of a Day')
subplot(3,1,2)
boxplot(daily_price, daily_labels);
xticklabels({'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'})
grid on
ylabel('Price');
xlabel('Days of a Week')
subplot(3,1,3)
boxplot(monthly_price, monthly_labels);
grid on
ylabel('Price');
xlabel('Months of a Year')

%% Section 3: Data Analysis
%% Part a) Correlation between load and electricity price
corr_a = corr(all_load, all_price);
fprintf("The correlation coefficient between Demand and Price for year 2010 is: %.4f\n", corr_a);

%% Part b) Correlation for January
corr_b = corr(january_load, january_price);
fprintf("The correlation coefficient between Demand and Price for January is: %.4f\n", corr_b);

%% Part c) Correlation for first week of January
corr_c = corr(january_firstweek_load, january_firstweek_price);
fprintf("The correlation coefficient between Demand and Price for the first Week of January is: %.4f\n", corr_c);

%% Part d) Correlation for first day of January
corr_d = corr(january_firstday_load, january_firstday_price);
fprintf("The correlation coefficient between Demand and Price for the first Day of January is: %.4f\n", corr_d);

%% Part e) Scatter plot of electricity price and load
figure
scatter(all_load, all_price), grid on
xlabel('Load')
ylabel('Price')
title('Price vs. Demand')

%% Part f) Scatter for January
figure
scatter(january_load, january_price), grid on
xlabel('Load')
ylabel('Price')
title('Price vs. Demand for January')

%% Part g) Scatter for first week of January
figure
scatter(january_firstweek_load, january_firstweek_price), grid on
xlabel('Load')
ylabel('Price')
title('Price vs. Demand for first week of January')

%% Part h) Scatter for first day of January
figure
scatter(january_firstday_load, january_firstday_price), grid on
xlabel('Load')
ylabel('Price')
title('Price vs. Demand for first day of January')

%% Part i) Descriptive statistics (left without semicolons so they display)
mean_load = mean(all_load)
mean_price = mean(all_price)
std_load = std(all_load)
std_price = std(all_price)
skewness_load = skewness(all_load)
skewness_price = skewness(all_price)

%% Part j) Distribution fitting
% Normalize the data to the range [0, 1]
all_load_norm = (all_load - min(all_load))/(max(all_load) - min(all_load));
all_price_norm = (all_price - min(all_price))/(max(all_price) - min(all_price));
% Note: the Beta fit expects values strictly inside (0, 1), so the exact
% 0 and 1 produced by min-max scaling may need to be nudged inward
pd_normal_load = fitdist(all_load_norm, 'Normal');
% pd_weibull_load = fitdist(all_load_norm, 'Weibull');
pd_beta_load = fitdist(all_load_norm, 'Beta');
pd_normal_price = fitdist(all_price_norm, 'Normal');
% pd_weibull_price = fitdist(all_price_norm, 'Weibull');
pd_beta_price = fitdist(all_price_norm, 'Beta');

%% Section 4:
% Parts a), b), c), d) and e): shift 5%, 10%, 15% and 20% of the load
for perc = 0.05:0.05:0.2
    new_load = all_load;
    % We see that the hours of highest prices are between 6:30 and 18:30,
    % and the hours of lowest prices are between 18:30 and 6:30
    % Get index of data for times between 6:30 and 18:30
    index1 = find(timeofday(all_times) >= duration([6,30,0]) & ...
                  timeofday(all_times) <= duration([18,30,0]));
    % Get index of data for times between 18:30 and 6:30
    index2 = find(timeofday(all_times) >= duration([18,30,0]));
    index2 = [index2; find(timeofday(all_times) <= duration([6,30,0]))];
    % Take the given percentage of the load from the first range of time
    % (named shifted_load to avoid shadowing the built-in load function)
    shifted_load = perc*all_load(index1);
    % Remove it from the first range of time
    new_load(index1) = new_load(index1)*(1-perc);
    % Add it to the second range of time. This element-wise addition
    % assumes index1 and index2 have the same number of elements
    % (e.g. complete days of half-hourly samples)
    new_load(index2) = new_load(index2) + shifted_load;
    % Calculate the total amount of money for this new pattern of load
    new_cost = sum(new_load.*all_price);
    % Calculate the total amount of money for the original pattern
    old_cost = sum(all_load.*all_price);
    % Check for the amount saved
    amount_saved = old_cost - new_cost;
    fprintf("The amount of money saved when %.1f%% of load is shifted is: $%.2f\n", perc*100, amount_saved);
end

%% Section 5:
%% Part a)
% The highest prices of the whole period occur around 15:00-17:00,
% because these are the rush hours.
%% Part b)
% The time with the lowest price is 12:30. Although this is a semi-crowded
% hour, the low price may be due to some type of excess supply or some
% type of failure (blackouts in some areas) that caused an excess supply.
% Generally, the lowest prices should be located in the early morning.
%% Part c)
% Shifting 20% of the load returns the highest amount saved ($7211353.49)

%% Section 7:
%% a)
% * New England REZ Transmission Link
% * Sydney Ring (Reinforcing Sydney, Newcastle and Wollongong Supply)
% * HumeLink
%% b)
% * The principal barrier is time, since these projects have an
%   estimated finish date
% * Securing social license for VRE, Storage and Transmission
% * Completing actions in AEMO's Engineering Framework
%% c)
% Each of these projects belongs to an expansion plan with stipulated dates
% and projects organized independently.
%% d)
% Among the main strategies are engineering designs, cost estimates and
% research related to communities of interest.
%
% Identification of barriers to community acceptance and estimates of the
% costs associated with overcoming them.
%% e)
% Among the pros of the projects is the increase in the robustness and
% reliability of the electrical system. In addition, the projects to be
% implemented include a zero-emissions plan based on renewable energies.
% Another main advantage is guaranteeing electric service to all people
% through interconnection with other regions, so that supply is always
% available.
% Among the main disadvantages are the difficulty of each project and the
% time required for its execution. In addition to having to comply with the
% assigned times, they are projects with high monetary costs. They also
% require various regulatory permits that vary between jurisdictions but
% are necessary to achieve interconnections.
```
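The load-shifting step in Section 4 of the script can be checked by hand on a small synthetic case. The sketch below is an illustration under stated assumptions, not the assignment's method: the one-day demand and price values are made up, the peak window is taken as 6:30 up to (but not including) 18:30 so that no interval counts as both peak and off-peak, and the shifted load is spread evenly over the off-peak intervals rather than added element by element. The names `peak`, `demand`, `new_demand` and `shifted_total` are hypothetical.

```matlab
% Synthetic one-day example (assumed values, not taken from the 2010 data):
% 48 half-hourly intervals, flat demand, higher price during the day.
t = minutes(0:30:1410)';                       % time of day of each interval
demand = 1000*ones(48, 1);                     % illustrative demand in MW
price = 20*ones(48, 1);                        % illustrative off-peak price

% Peak window: 6:30 <= t < 18:30 (24 of the 48 intervals)
peak = t >= duration(6,30,0) & t < duration(18,30,0);
price(peak) = 50;                              % illustrative peak price

perc = 0.10;                                   % fraction of peak load to shift

% Remove the fraction from every peak interval
new_demand = demand;
shifted_total = sum(perc*demand(peak));
new_demand(peak) = demand(peak)*(1 - perc);

% Spread the removed load evenly over the off-peak intervals, so the
% total demand over the day is unchanged
new_demand(~peak) = demand(~peak) + shifted_total/nnz(~peak);

% Cost figures in the same units as the script (sum of load times price)
old_cost = sum(demand.*price);
new_cost = sum(new_demand.*price);
amount_saved = old_cost - new_cost;
fprintf("Amount saved: %.2f\n", amount_saved);
```

With these numbers each peak interval drops from 1000 to 900 and each off-peak interval rises from 1000 to 1100, so the cost figure falls from 1680000 to 1608000, a saving of 72000. Spreading the shifted load evenly avoids any requirement that the peak and off-peak index sets have the same length.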