Data Science with R Association Rules
Association analysis defined data mining at its roots in 1989 and throughout the 1990s. It is
still one of the preeminent techniques for modelling big data and so remains a core tool in the
data scientist’s toolbox.
As an unsupervised learning technique it has delivered considerable benefit in areas ranging from traditional shopping basket analysis to the analysis of who bought what other books or who watched what other videos, and in domains including health care and telecommunications. In a data mining project we might often begin with association analysis to identify issues with our data and then to build multiple local models. The analysis aims to identify patterns that are linked by some commonality (such as a common person).
In this chapter we review association analysis and discover new insights into our data by
building association rule models.
The required packages for this chapter include:
- library(arules) # Association rules.
- library(dplyr) # Data munging: tbl_df(), %>%.
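To make the idea of patterns linked by a common person concrete, the following is a minimal sketch rather than a definitive recipe. The purchase data is made up for illustration, the names purchases, baskets, trans and rules are hypothetical, and the support and confidence thresholds are arbitrary choices for this toy example. It assumes the arules package is installed.

```r
# Minimal sketch with made-up data; all object names are illustrative.
library(arules)

# Hypothetical purchase log: one row per (person, item) pair.
purchases <- data.frame(
  person = c("p1", "p1", "p2", "p2", "p3", "p3", "p3", "p4", "p5", "p5"),
  item   = c("bread", "milk", "bread", "butter", "bread", "milk", "butter",
             "milk", "bread", "milk"),
  stringsAsFactors = FALSE
)

# Group items by the common person to form one basket per person.
baskets <- split(purchases$item, purchases$person)

# Convert the list of baskets into the transactions class used by arules.
trans <- as(baskets, "transactions")

# Mine rules; the thresholds are arbitrary assumptions for this toy data.
# minlen = 2 excludes rules with an empty left-hand side.
rules <- apriori(
  trans,
  parameter = list(support = 0.3, confidence = 0.7, minlen = 2),
  control   = list(verbose = FALSE)
)

# List the rules found, strongest confidence first.
inspect(sort(rules, by = "confidence"))
```

Each rule printed by inspect() has the form {lhs} => {rhs} together with its support and confidence. Lowering the thresholds will generally produce more rules, so it is worth trying a few values.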
 
 
As we work through this chapter, new R commands will be introduced. Be sure to review each command’s documentation and understand what it does. You can ask for help using the ? command, as in:
?read.csv
We can obtain documentation on a particular package using the help= option of library():
library(help=rattle)
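The same two forms of help can be applied to the packages used in this chapter. The sketch below assumes arules is installed and uses its apriori() function purely as an example topic. The vignette() call is an additional suggestion beyond the commands shown above.

```r
library(arules)

# Documentation for a single function, here apriori() from arules.
?apriori

# Overview of a package: its description and index of help topics.
library(help = arules)

# Longer guides (vignettes) shipped with the package, if any.
vignette(package = "arules")
```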
This chapter is intended to be hands on. To learn effectively, you are encouraged to have R running (e.g., RStudio) and to run all the commands as they appear here. Check that you get the same output and that you understand it. Try some variations. Explore