2020-03-20
C3 Data and information
LO (learning objectives)
- Explain how data and its sources are an asset to organizations, governments, and the lives of citizens
- Explain the distinction between data, information, knowledge, and wisdom
- Explain why data quality is important
- Define and operationalize key data-quality attributes
- Define attributes of datasets, such as missing values, outliers, and probability distributions


Data growth

Data summarization

Data quality

Production view of data quality
Intrinsic data quality: accuracy, objectivity, believability, reputation
Accessibility data quality: accessibility, access security
Contextual data quality: relevancy, value-added, timeliness, completeness, amount of data
Representational data quality: interpretability, ease of understanding, concise representation, consistent representation

Data quality in six dimensions
Accuracy / completeness / timeliness / validity / integrity / consistency
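
To make "operationalize" concrete, here is a minimal sketch of how two of the six dimensions (completeness and validity) could be turned into numbers. The records, field names, and the email rule are all made up for illustration; real validity rules depend on the dataset.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "not-an-email"},
    {"id": 4, "email": "bob@example.com"},
]

def completeness(rows, field):
    """Share of rows in which `field` is present (not None)."""
    return sum(r[field] is not None for r in rows) / len(rows)

def validity(rows, field, is_valid):
    """Share of non-missing values of `field` that pass the rule `is_valid`."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(is_valid(v) for v in values) / len(values) if values else 0.0

def looks_like_email(value):
    # Deliberately crude rule, for illustration only.
    return "@" in value and "." in value.split("@")[-1]

email_completeness = completeness(records, "email")
email_validity = validity(records, "email", looks_like_email)
print(email_completeness, email_validity)
```

Completeness counts what is present, while validity only judges the values that are present, so the two scores can move independently.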

Consumption view of data quality

Data characteristics

Data types

Variables
- Binary
- Nominal
- Ordinal
- Interval
- Ratio

Cardinality

Data distributions

The dangers of assuming normally distributed data
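
One way to see the danger, as a small sketch on invented numbers: with a right-skewed sample, the mean and standard deviation that a normal model relies on describe the data poorly. The income values below are hypothetical.

```python
import statistics

# Hypothetical right-skewed sample (e.g. incomes in thousands).
incomes = [30, 32, 35, 38, 40, 45, 500]

mean = statistics.mean(incomes)
median = statistics.median(incomes)
stdev = statistics.stdev(incomes)

# A normal model would place roughly 95% of values within mean ± 2 stdev.
lower, upper = mean - 2 * stdev, mean + 2 * stdev

# How many observations sit below the mean?
below_mean = sum(x < mean for x in incomes)
print(mean, median, lower, upper, below_mean)
```

Here the mean lies far above the median, almost every observation is below the mean, and the "typical" normal range extends to negative incomes, none of which matches the sample.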

Outliers
An outlier is an observation that is distinctly different from the other observations.
- Procedural error
- Extraordinary event
- Extraordinary observation
- Unique combination of variables
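
A minimal sketch of one common screening rule, Tukey's IQR fences, for flagging candidate outliers in a single variable. The notes do not name a specific method, so this choice, the multiplier `k=1.5`, and the sample readings are assumptions. A flagged value still needs to be judged against the four causes listed above before it is removed.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95]
flagged = iqr_outliers(readings)
print(flagged)
```

Quartiles are used instead of the mean and standard deviation because a single extreme value shifts them far less.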

Missing data
- Missing completely at random (MCAR)
- Missing at random (MAR)
- Missing not at random (MNAR)
Missing value analysis and imputation: common skills
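
As a rough sketch of those two skills: count the missing values per column, then fill them in. Mean imputation is used here only because it is the simplest option; it is generally defensible only when values are MCAR and it understates variance, so treat it as an assumption and not a recommendation. The rows and column names are hypothetical.

```python
import statistics

# Hypothetical survey rows; None marks a missing value.
rows = [
    {"age": 34, "income": 52.0},
    {"age": None, "income": 48.0},
    {"age": 29, "income": None},
    {"age": 41, "income": 61.0},
]

def missing_counts(data):
    """Count missing (None) entries per column."""
    return {col: sum(r[col] is None for r in data) for col in data[0]}

def impute_mean(data, column):
    """Return a copy of `data` with missing `column` values set to the observed mean."""
    observed = [r[column] for r in data if r[column] is not None]
    fill = statistics.mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in data]

counts = missing_counts(rows)
filled = impute_mean(rows, "income")
print(counts, filled[2]["income"])
```

The function returns a new list and leaves the original rows untouched, which keeps the raw data available for comparing imputation strategies.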


