A search engine company hires you to maintain a database that records the searched words per customer and the advertisements paid per advertisers.
You must store all sentences searched by customers, the timestamp of the query, and the customer identifier (e.g., using browser cookies). These sentences consist of different words (called keywords). You should also determine the language the customer used to search each sentence based on the keywords used. Note that a given keyword may appear in one or more languages.
Advertisers will also share their pages to be displayed once customers type a keyword related to their page. The price paid by advertisers is proportional to the probability to return their page first when queried by customers.
(Q1): Draw an ER diagram, and provide the database schema. Underline the primary keys.
(Q2): Write the CREATE TABLE statements including all the constraints.
(Q3): List the top 20 most common keywords searched per customer, in decreasing frequency order.
(Q4): Returns all languages that have never been used in any searched query.
(Q5): Write a SQL query to show bilingual customers. Return the customer identifier, his name ordered by decreasing the number of languages that he potentially speaks. We assume that a customer speaks a language if he searched at least once a sentence from the language.
create table Customer ( CookieID varchar(100) not null, Name varchar(50), primary key(CookieID) ); create table Search ( Timestamp varchar(15) not null, CookieID varchar(50) not null, Sentence varchar(100) not null, primary key(Timestamp, CookieID), foreign key(CookieID) references Customer(CookieID) ); create table Keyword ( Word varchar(50) not null, Language varchar(50) not null, primary key(Word, Language) ); create table SearchKeyword ( Timestamp varchar(15) not null, CookieID varchar(50) not null, Word varchar(50) not null, Language varchar(50) not null, primary key (Timestamp, CookieID, Word, Language), foreign key (Timestamp, CookieID) references Search(Timestamp, CookieID), foreign key (Word, Language) references Keyword(Word, Language) ); create table PagesHit ( Advertiser varchar(50) not null, Url varchar(255) not null, Timestamp varchar(15) not null, CookieID varchar(50) not null, Word varchar(50) not null, Language varchar(50) not null, primary key(Advertiser, Url, Timestamp, CookieID, Word, Language), foreign key (Timestamp, CookieID, Word, Language) references SearchKeyword(Timestamp, CookieID, Word, Language) );
select top 20 Customer.CookieID, SearchKeyword.Word, count(SearchKeyword.word) as SearchCount from Customer, Search, SearchKeyword where Customer.CookieID = Search.CookieID and Search.Timestamp = SearchKeyword.Timetamp and Search.CookieID = SearchKeyword.CookieID group by Customer.CookieID, SearchKeyword.Word order by count(SearchKeyword.word) desc;
select Keyword.Language from Keyword where Keyword.Language not in ( select SearchKeyword.Language from SearchKeyword );
select Customer.CookieID, Customer.Name, count(DistinctLanguages.Language) as LanguageCount from Customer, (select distinct SearchKeyword.CookieID, SearchKeyword.Language from SearchKeyword) as DistinctLanguages where Customer.CookieID = DistinctLanguages.CookieID group by Customer.CookieID, Customer.Name having count(DistinctLanguages.Language) >= 2 order by count(DistinctLanguages.Language) desc;