2015-03-21

Authors:

Michael Hahsler, Bettina Grün, Kurt Hornik

Title:

arules - A Computational Environment for Mining Association Rules and Frequent Item Sets

Reference:

Vol. 14, Issue 15, Sep 2005
Submitted 2005-04-15, Accepted 2005-09-29

Type:

Article

Abstract:

Mining frequent itemsets and association rules is a popular and well researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package also includes interfaces to two fast mining algorithms, the popular C implementations of Apriori and Eclat by Christian Borgelt. These algorithms can be used to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets and association rules.


All replies
2015-3-21 08:08:15

Example 1: Analyzing and preparing a transaction data set


In this example, we show how a data set can be analyzed and manipulated before associations are mined. This is important for finding problems in the data set which could make the mined associations useless or at least inferior to associations mined on a properly prepared data set.
For the example, we look at the Epub transaction data contained in package arules. This data set contains downloads of documents from the Electronic Publication platform of the Vienna University of Economics and Business available via http://epub.wu-wien.ac.at from January 2003 to December 2008.
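A minimal sketch of this kind of inspection is shown below. It is not the original listing from the post; it assumes the arules package is installed and that the transaction info of Epub has a TimeStamp column. The size threshold of 20 is an arbitrary value chosen for illustration.

```r
library(arules)

# Load the Epub transaction data shipped with arules
data("Epub")

# Overview: number of transactions and items, most frequent items,
# and the distribution of transaction sizes
summary(Epub)

# Number of items (downloaded documents) in each transaction
sizes <- size(Epub)
head(table(sizes))

# Count transactions per year (assumes a TimeStamp column in transactionInfo)
year <- format(transactionInfo(Epub)[["TimeStamp"]], "%Y")
table(year)

# Look at the first few transactions
inspect(Epub[1:5])

# Unusually large transactions may point to problems in the data;
# the cutoff of 20 items is an assumption made for this sketch
large <- Epub[sizes > 20]
length(large)
```

Very large transactions are worth a closer look before mining, since a single automated download session can dominate the resulting associations.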

2015-3-21 08:10:11

Example 2: Preparing and mining a questionnaire data set


As a second example, we prepare and mine questionnaire data. We use the Adult data set from the UCI machine learning repository (Asuncion and Newman 2007) provided by package arules. This data set is similar to the marketing data set used by Hastie et al. (2001) in their chapter about association rule mining. The data originates from the U.S. Census Bureau database and contains 48,842 instances with 14 attributes such as age, work class, and education. In the original applications of the data, the attributes were used to predict the income level of individuals. We added the attribute income with levels small and large, representing an income of ≤ USD 50,000 and > USD 50,000, respectively. This data is included in arules as the data set AdultUCI.
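The preparation and mining steps could look roughly like the sketch below. It is not the original listing from the post. The cut points, the category labels, the helper discretizeCapital, and the support, confidence and lift thresholds are all assumptions chosen for illustration.

```r
library(arules)
data("AdultUCI")

# Drop two attributes that are not useful as items (a weight and a
# numeric duplicate of education)
AdultUCI[["fnlwgt"]] <- NULL
AdultUCI[["education-num"]] <- NULL

# Map the numeric attributes to ordinal categories; all cut points
# and labels here are illustrative assumptions
AdultUCI[["age"]] <- ordered(
  cut(AdultUCI[["age"]], c(15, 25, 45, 65, 100)),
  labels = c("Young", "Middle-aged", "Senior", "Old"))
AdultUCI[["hours-per-week"]] <- ordered(
  cut(AdultUCI[["hours-per-week"]], c(0, 25, 40, 60, 168)),
  labels = c("Part-time", "Full-time", "Over-time", "Workaholic"))

# Hypothetical helper: none / below / above the median of positive values
discretizeCapital <- function(x) {
  ordered(cut(x, c(-Inf, 0, median(x[x > 0]), Inf)),
          labels = c("None", "Low", "High"))
}
AdultUCI[["capital-gain"]] <- discretizeCapital(AdultUCI[["capital-gain"]])
AdultUCI[["capital-loss"]] <- discretizeCapital(AdultUCI[["capital-loss"]])

# Coerce the all-categorical data frame to transactions
Adult <- as(AdultUCI, "transactions")

# Mine rules with Apriori
rules <- apriori(Adult,
                 parameter = list(support = 0.01, confidence = 0.6),
                 control = list(verbose = FALSE))

# Keep rules that predict a small income and have a lift above 1.2
rulesIncomeSmall <- subset(rules, subset = rhs %in% "income=small" & lift > 1.2)
inspect(head(sort(rulesIncomeSmall, by = "confidence"), n = 3))
```

Discretizing every numeric attribute before the coercion keeps the item labels under explicit control, for example `age=Young` or `income=small`.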
2015-3-21 08:13:52
Notice: this author has been banned or deleted; the content is hidden automatically.
2015-3-21 08:17:30

Example 3: Extending arules with a new interest measure
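The listing for this example did not survive in the post. As a hedged illustration of the idea, the sketch below adds all-confidence as an extra quality column on frequent itemsets. The choice of measure, the Adult transactions, and the support threshold of 0.05 are assumptions made here. All-confidence of an itemset is taken to be its support divided by the largest support of any single item it contains.

```r
library(arules)
data("Adult")

# Mine frequent itemsets with Eclat (support threshold is an assumption)
fsets <- eclat(Adult,
               parameter = list(support = 0.05),
               control = list(verbose = FALSE))

# Support of every single item, in the same column order as the item matrix
itemSupp <- itemFrequency(Adult)

# For each itemset, find the largest single-item support among its members;
# decode = FALSE yields integer column indices instead of item labels
maxItemSupp <- vapply(
  LIST(items(fsets), decode = FALSE),
  function(idx) max(itemSupp[idx]),
  numeric(1))

# Store the new measure next to the existing quality measures
quality(fsets)$allConfidence <- quality(fsets)$support / maxItemSupp

inspect(head(sort(fsets, by = "allConfidence"), n = 5))
```

Because the measure is stored in the quality data frame, the usual `sort` and `subset` calls can use it like any built-in measure.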

2015-3-21 08:36:31

Example 4: Sampling


In this example, we show how sampling can be used in arules. We again use the Adult data set.
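A minimal sketch of sampling from the Adult transactions follows. It is not the original listing from the post. The sample size of 1000, the seed, and the support threshold are assumptions chosen for illustration.

```r
library(arules)
data("Adult")

# Fix the seed so the sketch is reproducible (arbitrary value)
set.seed(1234)

# Draw 1000 transactions with replacement
AdultSample <- sample(Adult, 1000, replace = TRUE)

# Compare item frequencies in the sample with those in the full data set
freqAll <- itemFrequency(Adult)
freqSample <- itemFrequency(AdultSample)
head(sort(abs(freqAll - freqSample), decreasing = TRUE))

# Mine frequent itemsets on the sample and on the full data for comparison
itemsetsSample <- eclat(AdultSample,
                        parameter = list(support = 0.05),
                        control = list(verbose = FALSE))
itemsetsAll <- eclat(Adult,
                     parameter = list(support = 0.05),
                     control = list(verbose = FALSE))

# Sets found in only one of the two runs indicate the sampling error
length(itemsetsSample)
length(itemsetsAll)
```

Items whose support lies close to the threshold are the most likely to differ between the sample and the full data set.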
