Could an expert please help me figure out how to code this problem? (Level 2)

2014-05-13

Level 1
Scenario:
There is a file called “Big_Data” and a file called “Needed_Data”. The “Needed_Data” file contains a list of fields that need to be pulled from the “Big_Data” file. When the code runs, it should create a file called “Output_Data” that contains all the needed fields plus the primary key of the “Big_Data” file.
Input files
File one:
• File Name: Big_Data
• File Type: SAS dataset
• Records: 10 million records.
• Variables: 5 thousand variables per record.
• Primary key: Account_number

File two:
• File Name: Needed_Data.
• File Type: SAS dataset.
• Records: 1 to X.
• Variables: 1
• Varname:
o Keep_list: Contains the name of a single variable that would be on the Big_Data file.

Example data:
  Keep_list
  Apples
  Oranges
  Grapes

Processing requirement:
Output file “Output_Data” should contain all the fields that were requested in the “Needed_Data” file plus the primary key.

Output and Usage requirement:
None.

Error handling requirement:
None.

Suggestion:
For now, assume the “Needed_Data” file will always contain variables that are on the “Big_Data” file.
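
One possible way to approach Level 1 is to load the requested names into a macro variable and pass it to a KEEP= dataset option. This is only a minimal sketch: it assumes both datasets sit in the WORK library and that “Needed_Data” has at least one row, and the macro variable name keep_vars is made up for illustration.

```sas
/* Collect the requested variable names into one space-separated list. */
/* keep_vars is a hypothetical name; both datasets are assumed to be   */
/* in the WORK library.                                                */
proc sql noprint;
    select distinct Keep_list
        into :keep_vars separated by ' '
        from Needed_Data;
quit;

/* KEEP= as a dataset option on SET limits which columns are read in, */
/* which matters with 5 thousand variables on Big_Data.               */
data Output_Data;
    set Big_Data (keep=Account_number &keep_vars);
run;
```

A macro variable value is limited to 65,534 characters, so a very long request list might need a different mechanism, such as generating the KEEP= list into an included file.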

Level 2

All requirements identical to Level 1 except for the following changes.

Input files
File two:
• File Name: Needed_Data.
• File Type: SAS dataset.
• Records: 1 to X.
• Variables: 3
• Varname:
o Keep_list: Name of a single variable that is on the “Big_Data” file.
o Where_list: The expected value of the variable in the keep list.
o Rename_list: The name the variable should have in the “Output_Data” file.

Example data:
  Keep_list   Where_list   Rename_list
  Apples      Red          Ambrosia
  Oranges     Orange
  Grapes      Green        Seedless

Processing requirement:
Output file “Output_Data” should contain all the fields that were requested in the “Needed_Data” file plus the primary key. The output fields should be renamed where a new name was requested.

Example Output_Data:
  Account_number
  Ambrosia
  Oranges
  Seedless
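
The renaming step might be sketched as follows. The macro variable names keep_vars and rename_opt are hypothetical, both datasets are assumed to be in WORK, Keep_list and Rename_list are assumed to be character variables, and every requested variable is assumed to exist on “Big_Data”. Where_list is not applied here, because the requirements do not say how the expected values should be combined into a filter.

```sas
/* Sketch only: keep_vars and rename_opt are hypothetical names. */
%let rename_opt = ;

proc sql noprint;
    select Keep_list
        into :keep_vars separated by ' '
        from Needed_Data;
quit;

/* Build "rename=(old=new ...)" from the rows that ask for a new name. */
/* If no row asks for one, rename_opt stays empty.                     */
data _null_;
    length pairs $32767;
    retain pairs;
    set Needed_Data end=last;
    if not missing(Rename_list) then
        pairs = catx(' ', pairs, catx('=', Keep_list, Rename_list));
    if last and not missing(pairs) then
        call symputx('rename_opt', cats('rename=(', pairs, ')'));
run;

/* KEEP= is applied before RENAME=, so KEEP= uses the original names. */
data Output_Data;
    set Big_Data (keep=Account_number &keep_vars &rename_opt);
run;
```

With the example data, rename_opt would hold the pairs Apples=Ambrosia and Grapes=Seedless, and Oranges would pass through under its original name.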

Output and Usage requirement:
None.

Error handling requirement:
Do not expect every field requested in the “Needed_Data” file to be on the “Big_Data” file. If a field is missing, it should not appear in the “Output_Data” file, and a note should be added to the log indicating that the data field was not available. The code should then continue with the remaining fields.

Suggestion:
Do not assume the “Needed_Data” file will always contain variables that are on the “Big_Data” file.
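
One way to handle the missing-field requirement is to compare the requested names against DICTIONARY.COLUMNS before building the KEEP= list. The sketch below assumes both datasets are in the WORK library; the table names found_vars and missing_vars and the macro variable keep_vars are hypothetical. It covers only the existence check and the log note.

```sas
proc sql noprint;
    /* Requested names that exist on Big_Data (case-insensitive match). */
    create table found_vars as
        select Keep_list, Rename_list
        from Needed_Data
        where upcase(strip(Keep_list)) in
            (select upcase(name) from dictionary.columns
             where libname = 'WORK' and memname = 'BIG_DATA');

    /* Requested names that are not on Big_Data. */
    create table missing_vars as
        select Keep_list
        from Needed_Data
        where upcase(strip(Keep_list)) not in
            (select upcase(name) from dictionary.columns
             where libname = 'WORK' and memname = 'BIG_DATA');
quit;

/* Write one note per unavailable field to the log. */
data _null_;
    set missing_vars;
    put 'NOTE: Requested field ' Keep_list 'is not available on Big_Data and was skipped.';
run;

/* Continue with the fields that were found. keep_vars stays empty */
/* if nothing was found, leaving only the primary key.             */
%let keep_vars = ;
proc sql noprint;
    select Keep_list
        into :keep_vars separated by ' '
        from found_vars;
quit;

data Output_Data;
    set Big_Data (keep=Account_number &keep_vars);
run;
```

Because found_vars also carries Rename_list, a RENAME= list could be built from it in the same pass, so that renames are only attempted for variables that actually exist.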
