+1 (315) 557-6473 

SQL Program to Implement Data Queries Assignment Solution.


Instructions

Objective
Write a program to implement data queries in SQL.

Requirements and Specifications

Assessment Review
Please use the csv data set along with the below data table dictionaries to answer the below question
Table descriptions:
User Table
ColumnDescription
product_account_id

user identifier specific and unique to each

user Primary and unique key in user table

registration_datedate that the user signed up for FanDuel
first_deposit_date

date the user made their first deposit on the

FanDuel site/app

first_paid_entry_date

date the user made their first paid entry on the

FanDuel site/app

first_deposit_amount

dollar amount of the users first deposit

Relates to first_deposit_date

registration_paid_media_partner_namemedia partner that is associated with driving the user's registration. All partners are paid media partners aside from "ORGANIC"
first_deposit_paid_media_partner_namemedia partner that is associated with driving the user's first deposit. All partners are paid media partners aside from "ORGANIC"
first_paid_entry_paid_media_partner_namemedia partner that is associated with driving the user's first paid entry. All partners are paid media partners aside from "ORGANIC"
registration_statethe state the users was in when they signed up for FanDuel. Relates to registration_date
Entry table
ColumnDescription
product_account_iduser identifier specific and unique to each user. Is not unique on the entry table
entry_weekthe week a user entered a contest. Week is noted as the first day of the week
entry_sportthe sport associated with the user's entry
contest_typethe type of contest associated with the user's entry
entry_sourcethe platform associated with the user's entry
weekly_entry_feethe summed total of an users weekly entries
weely_total_entry_winningsthe summed total of an users weekly winnings represented as a gross total. To find net winnings you would do "weekly_entry_fee" - "weekly_total_entry_winnings"
total_entriesthe total count of entries a user made in a week
avg_contest_sizethe average size of the contest the users entered into based on the combination of "entry_sport", "contest_type", and "entry_source" in a week
Data Querying
  • Given the above data sets show what month and year (Eg. March 2019, April 2018, etc.) has the largest number of registrations? Please perform this using SQL and provide both the output and your query to produce your results.
    1. select count(registration_date) as countofregistrationdate ,registration_date
    2. from user_data
    3. group by registration_date
    4. order by countofregistrationdate DESC
    5. limit 1
    1. Given the above data sets show the top 10 ad partners by total users per First Paid Entry Credit (column “first_paid_entry_paid_media_partner_name”). Please perform this using SQL and provide both the output and your query to produce your results. Be sure to not include users who have not made a first paid entry.
    2. Like the above show the top 10 Ad Partners, grouped by First Paid Entry Credit (column “first_paid_entry_paid_media_partner_name”) and ranked by total entry fees for users who first paid entry data was in September. In addition to total entry fees, include the total count of entries and the average entry fee amount per user per ad partner. Please perform this using SQL and provide both the output and your query to produce your results.
    3. For users who registered in 2019, which month, based on a user’s first_paid_entry_date, saw the largest number of unique users make an NBA entry at any point in time? Please perform this using SQL and provide both the output and your query to produce your results.
    4. What were the top 5 sports in both 2018 and 2019 based on total entries for users that made their first paid entry within one week of registering? Include the average first_deposits_amount, and average time between registration and first paid entry per sport and year as well. Please write this as a single query using the “rank()” function. Please perform this using SQL and provide both the output and your query to produce your results. Note, we want to know the top 5 sports for each year not overall in 2018 and 2019 combined.

Source Code

-- question 1

select count(*) as 'Count', DATE_FORMAT(registration_date, '%M %Y') as 'Date'

from user

group by DATE_FORMAT(registration_date, '%M %Y')

order by count(*) DESC

LIMIT 1

-- question 2

select count(*) 'Count', first_paid_entry_paid_media_partner_name

FROM user GROUP BY first_paid_entry_paid_media_partner_name

ORDER BY count(*) DESC

LIMIT 10;

-- question 3

select B.total_entries 'Total Entries', B.weekly_entry_fee 'Total Entry Fees',

(B.weekly_entry_fee/B.total_entries) 'Average Entry Fee', MONTH(B.entry_week) 'Month',

A.first_paid_entry_paid_media_partner_name 'Partner Name', A.product_account_id 'User ID'

FROM user A

INNER JOIN entry B

ON A.product_account_id = B.product_account_id AND MONTH(B.entry_week) = 9

LIMIT 10;

-- question 4

SELECT YEAR(A.registration_date) 'Registered at Year', MONTH(A.first_paid_entry_date) 'Paid at Month',

B.entry_sport 'Entry Sport', COUNT(DISTINCT A.product_account_id) 'Number of users'

FROM user A

INNER JOIN entry B

ON B.entry_sport = 'NBA'

WHERE YEAR(A.registration_date) = 2019

GROUP BY MONTH(A.first_paid_entry_date)

ORDER BY COUNT(A.product_account_id) DESC

LIMIT 1

-- question 5

SELECT YEAR(A.entry_week) 'Year', A.total_entries 'Total Entries', A.entry_sport 'Sport',

AVG(DATEDIFF(B.first_paid_entry_date, B.registration_date)) 'Average time between registration and first paid entry',

AVG(B.first_deposit_amount) 'Average first deposit amount',

RANK() OVER (PARTITION BY A.entry_sport ORDER BY A.total_entries) 'Rank'

FROM entry A

INNER JOIN user B

ON A.product_account_id = B.product_account_id

WHERE DATEDIFF(B.first_paid_entry_date, B.registration_date) <= 7

AND A.entry_sport != ''

AND (YEAR(A.entry_week) = 2018 OR YEAR(A.entry_week) = 2019)

GROUP BY YEAR(A.entry_week), A.entry_sport

ORDER BY YEAR(A.entry_week), 'Rank' ASC