全部版块 我的主页
论坛 数据科学与人工智能 数据分析与数据科学 SAS专版
1467 2
2014-08-26

Sample Task


We have 2 MySql tables that have cost, revenue, and 1 SASdataset that has web stats data


tbl_cost:

financial_date (i.e., 2013-06-01)

Site (i.e., suggest.com, note there could be many sitesin this dataset)

Source (where the visitor came from i.e., Google notethere could be many sources in this dataset)

Adgroupid (a lower level of source, i.e., Travel, notethere could be many adgroupid’s in this dataset)

cost (amount of cost, $150. – can be any number from 0to infinity)

cost_clicks (amount of clicks, 100, – can be any numberfrom 0 to infinity)

1 row per financial_date/site/source/adgroupid

tbl_revenue:

financial_date (i.e., 2013-06-01)

Site (i.e., suggest.com, note there could be many sitesin this dataset)

Source (where the visitor came from i.e., Google notethere could be many sources in this dataset)

Adgroupid (a lower level of source, i.e., Travel, notethere could be many adgroupid’s in this dataset)

revenue (amount of net revenue, $300, – can be anynumber from 0 to infinity)

revenue_clicks (amount of clicks, 50, – can be anynumber from 0 to infinity)

impressions (ad impressions, 100, – can be any numberfrom 0 to infinity)

multiple rows per financial_date/site/source/adgroupid

web_stats

financial_date (i.e., 2013-06-01)

Site (i.e., suggest.com, note there could be many sitesin this dataset)

Source (where the visitor came from i.e., Google notethere could be many sources in this dataset)

Adgroupid (a lower level of source, i.e., Travel, notethere could be many adgroupid’s in this dataset)

visits (similar to cost_clicks, 100, – can be anynumber from 0 to infinity)

pageviews (pages looked at, 500, – can be any numberfrom 0 to infinity)

multiple rows per financial_date/site/source/adgroupid

The datasets are entire days worth of data, so we wouldhave a separate dataset for 6/1 vs. 6/2 and they would look like this;

web_stats_20130601 , web_stats_20130602

tbl_cost & tbl_revenue are located on server'financials' assume the database is the same as the server name

web_stats is located at '/sas/datasets/stats/'


All that being said, I would like to get a program thataccepts a date (i.e., 6/1/2013), queries the MySql tables (i.e., wherefinancial_date = '2013-06-01') and opens the correct web_stats dataset(i.e.web_stats_20130601), and joins all 3 together, summarizes it at the followinglevels below and in the end appends back to a MySql database table calledfinancial_date_sum on 'financials' server, assume database is the same('financials').

Summary levels (4 levels);

financial_date

Site, financial_date

Site, source, financial_date

Site, source, adgroupid, financial_date


Also create the following variables:

ppv = pageviews/visits

profit = revenue - cost

conv_rate = revenue_clicks/cost_clicks

Also calculate day over day(dod) and week over week(wow)growth rates for ppv, profit, and conv_rate.

I would like to be able to put in a time frame (last 10days), process each day, delete day from 'financial_date_sum' (if it exists inthere already), and append to 'financial_date_sum'


Lastly, 'financial_date_sum' MySql table should have oneadditional column on it called summary_by and it should be a character variablethat indicates what level the summary is at (i.e, Site, source, adgroupid,financial_date)


The 'financial_date_sum' MySql table should have thefollowing columns then;

financial_date

site

source

adgroupid

summary_by

cost

cost_clicks

revenue

revenue_clicks

impressions

visits

pageviews

ppv

dod_ppv_growth_rate

wow_ppv_growth_rate

profit

dod_profit_growth_rate

wow_profit_growth_rate

conv_rate

dod_conv_rate_growth_rate

wow_conv_rate_growth_rate


二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

全部回复
2014-8-26 09:43:29
英文看着太费劲,另外也没有数据样本。
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

2014-8-26 10:07:33
tbl_cost:
financial_date (i.e., 2013-06-01)
Site (i.e., suggest.com, note there could be many sitesin this dataset)
Source (where the visitor came from i.e., Google notethere could be many sources in this dataset)
Adgroupid (a lower level of source, i.e., Travel, notethere could be many adgroupid’s in this dataset)
cost (amount of cost, $150. – can be any number from 0to infinity)
cost_clicks (amount of clicks, 100, – can be any numberfrom 0 to infinity)
1 row per financial_date/site/source/adgroupid
一个一个看
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

相关推荐
栏目导航
热门文章
推荐文章

说点什么

分享

扫码加好友,拉您进群
各岗位、行业、专业交流群