Program To Implement Big Data Processing Assignment Solution.

Instructions

Objective

Write a program to implement big data processing in Python.

Requirements and Specifications

In this question, we will use all we have learned about Python to do a project about big data processing. We observe earthquakes and topography on the Earth’s surface. The goal of this project is to determine the correlation between these two, or to check if large earthquakes preferentially occur in regions with high or low topography.

There are 2 data files in this folder:

“Large-Eq.csv”: an earthquake catalog containing 25,000 events with a magnitude greater than 5.0 since 2007. The data structure is the same as in the other earthquake catalog files we used in this class.

“topo.dat”: topography data. The three columns are longitude, latitude, and elevation, respectively.

You will eventually make a plot showing the relationship between the number of earthquakes and the elevation, e.g., a plot in which the horizontal axis is the elevation and the vertical axis is the number of earthquakes. I provide an example figure in the example.pdf file. However, you can make any plot that you want, and the details of the figure are up to you. The key is that people should easily find at what elevation earthquakes happen the most by reading your plot. If you want to make histograms, you will need to study them by yourself using online materials.
Your code should run without problems.
Your code should be readable to me. Please include as many comments as you think are necessary.
Group discussion is encouraged. However, do not copy others’ code.
Make your code run as fast as possible. When you turn in your code, provide information on how long it takes to run on your computer.
Grades will be based on both the correctness and the readability of the code.
Please try to make your code run as fast and as precisely as possible. Ideally, the code takes a few seconds. It should not take more than 10 minutes.
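
As a rough illustration of the plot and timing requirements above, the sketch below counts events per elevation bin with `np.histogram` and measures the run time with `time.perf_counter`. The function name `count_by_elevation`, the `bin_width` parameter, and the sample elevations are assumptions made for this example; they are not part of the assignment files. The resulting counts could then be drawn with, for example, `plt.bar`.

```python
import time
import numpy as np

def count_by_elevation(elevations, bin_width=500.0):
    """Count events per elevation bin; returns (counts, bin_edges).

    Hypothetical helper for illustration; bin_width is in the same
    units as the elevations (assumed to be metres here).
    """
    elevations = np.asarray(elevations, dtype=float)
    # Align the bin edges to multiples of bin_width so they are easy to read on a plot.
    lo = np.floor(elevations.min() / bin_width) * bin_width
    hi = np.ceil(elevations.max() / bin_width) * bin_width
    n_bins = max(1, int(round((hi - lo) / bin_width)))
    edges = lo + bin_width * np.arange(n_bins + 1)
    counts, edges = np.histogram(elevations, bins=edges)
    return counts, edges

start = time.perf_counter()
# Made-up elevations, for illustration only (not taken from topo.dat)
sample = [-4200.0, -3900.0, -150.0, 20.0, 480.0, 1200.0]
counts, edges = count_by_elevation(sample, bin_width=1000.0)
elapsed = time.perf_counter() - start

print(counts)   # events per bin
print(edges)    # bin edges, from -5000 to 2000 in steps of 1000
print(f"run time: {elapsed:.4f} s")
```

Reporting the elapsed time this way gives a number that can be submitted with the code, as the instructions ask.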

Important hints:

In the earthquake catalog, the longitude ranges from -180 to 180 degrees. Negative values are in the west and positive values are in the east. In the topography data, however, the longitude ranges from 0 to 360 degrees, which means 0-180 degrees is the eastern hemisphere and 180-360 degrees is the western hemisphere. In both cases, 0 degrees longitude is the Prime Meridian, an imaginary reference line, and you move eastward as the longitude increases.
The (longitude, latitude) points in the earthquake catalog often do not match the grid points in the topography data. Therefore, you need to be creative when finding the elevation for an earthquake from the topography file, e.g., by choosing the closest grid point in topo.dat or by interpolating.
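
The two hints above can be sketched as follows: convert catalog longitudes to the 0-360 convention, then pick the closest grid point for each earthquake. This is only one possible approach, and it assumes the topography points form a regular grid, which the assignment text does not state. The helper names (`to_0_360`, `nearest_index`, `lookup_elevation`) and the tiny grid are made up for illustration.

```python
import numpy as np

def to_0_360(lon):
    """Map longitudes from the -180..180 convention to 0..360."""
    return np.mod(np.asarray(lon, dtype=float), 360.0)

def nearest_index(axis, values):
    """Index of the nearest entry of a sorted 1-D axis for each value."""
    values = np.asarray(values, dtype=float)
    idx = np.clip(np.searchsorted(axis, values), 1, len(axis) - 1)
    left, right = axis[idx - 1], axis[idx]
    # Step back by one where the left neighbour is closer than the right one.
    return idx - ((values - left) < (right - values)).astype(int)

def lookup_elevation(topo, eq_lon, eq_lat):
    """Nearest-grid-point elevation for each earthquake.

    topo is an (N, 3) array of longitude (0..360), latitude, elevation.
    Assumption: the points lie on a regular longitude/latitude grid.
    Wrap-around at 0/360 degrees is not handled in this sketch.
    """
    lons = np.unique(topo[:, 0])
    lats = np.unique(topo[:, 1])
    grid = np.full((lats.size, lons.size), np.nan)
    grid[np.searchsorted(lats, topo[:, 1]),
         np.searchsorted(lons, topo[:, 0])] = topo[:, 2]
    i = nearest_index(lats, eq_lat)
    j = nearest_index(lons, to_0_360(eq_lon))
    return grid[i, j]

# Tiny made-up grid: the "elevation" is simply lon + lat so results are easy to check.
grid_lons = [0, 90, 180, 270]
grid_lats = [-45, 0, 45]
topo = np.array([(lo, la, lo + la) for la in grid_lats for lo in grid_lons],
                dtype=float)

# Two made-up earthquakes given in the catalog's -180..180 convention.
elev = lookup_elevation(topo, eq_lon=[-80, 100], eq_lat=[40, -10])
print(elev)
```

Because the lookup is vectorized, all 25,000 events can be processed in one call rather than in a Python loop, which helps with the run-time requirement. Interpolating between neighbouring grid points would be a more precise alternative to the nearest-point choice.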

Source Code

# Assignment 8 (A small project)

Due date: April 29, 2022, 11:59pm

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt