A User's Guide to Business Analytics
Ayanendranath Basu, Srabashi Basu
A User's Guide to Business Analytics provides a comprehensive discussion of statistical methods useful to the business analyst. Methods are developed from a fairly basic level to accommodate readers who have limited training in the theory of statistics. Numerous case studies and numerical illustrations using the R software package are provided, both for motivated beginners who want a head start in analytics and for experts on the job who can use the text as a reference book.
The book comprises 12 chapters. The first chapter focuses on business analytics, along with its emergence and application, and sets up a context for the whole book. The next three chapters introduce R and provide a comprehensive discussion of descriptive analytics, including numerical data summarization and visual analytics. Chapters five through seven discuss set theory, definitions and counting rules, probability, random variables, and probability distributions, with a number of business scenario examples. These chapters lay the foundation for predictive analytics and model building.
Chapter eight deals with statistical inference and discusses the most common testing procedures. Chapters nine through twelve deal entirely with predictive analytics. The chapter on regression is quite extensive, dealing with model development and model complexity from a user’s perspective. A short chapter on tree-based methods puts forth the main application areas succinctly. The chapter on data mining is a good introduction to the most common machine learning algorithms. The last chapter highlights the role of different time series models in analytics. In all the chapters, the authors showcase a number of examples and case studies and provide guidelines to users in the analytics field.
Features
• Presents a comprehensive discussion on commonly used statistical methods
• Includes case studies from various business applications and discusses issues faced by users
• Offers an interdisciplinary review of concepts
• Uses R throughout, which is free and widely accepted
• Provides R code and explains and interprets the output
Table of Contents
What Is Analytics?
The Emergence and Application of Analytics
Similarities with and Dissimilarities from Classical Statistical Analysis
Theory versus Computational Power
Fact versus Knowledge: Report versus Prediction
Actionable Insight
Suggested Further Reading
Introducing R—An Analytics Software
Basic System of R
Reading, Writing, and Extracting Data in R
Statistics in R
Graphics in R
Further Notes about R
Suggested Further Reading
Reporting Data
What Is Data?
Types of Data
Data Collection and Presentation
Reporting Current Status
Measures of Association for Categorical Variables
Suggested Further Reading
Statistical Graphics and Visual Analytics
Univariate and Bivariate Visualization
Multivariate Visualization
Mapping Techniques
Scopes and Challenges of Visualization
Suggested Further Reading
Probability
Basic Set Theory
The Classical Definition of Probability
Counting Rules
Axiomatic Definition of Probability
Conditional Probability and Independence
The Bayes Theorem
Comprehensive Example
Appendix
Suggested Further Reading
Random Variables and Probability Distributions
Discrete and Continuous Random Variables
Some Special Discrete Distributions
Distribution Functions
Bivariate and Multivariate Distributions
Expectation
Appendix
Suggested Further Reading
Continuous Random Variables
The PDF and the CDF
Special Continuous Distributions
Expectation
The Normal Distribution
Continuous Bivariate Distributions
Independence
The Bivariate Normal Distribution
Sampling Distributions
The Central Limit Theorem
Sampling Distributions Arising from the Normal
Random Samples from Two Independent Normal Distributions
Normal Q-Q Plots
Summary
Appendix
Suggested Further Reading
Statistical Inference
Inference about a Single Mean
Single Population Mean with Unknown Variance
Two Sample t-test: Independent Samples
Two Sample t-test: Dependent (Paired) Samples
Analysis of Variance
Chi-Square Tests
Inference about Proportions
Appendix
Suggested Further Reading
Regression for Predictive Model Building
Simple Linear Regression
Multiple Linear Regression
ANOVA for Multiple Linear Regression
Hypotheses of Interest in Multiple Linear Regression
Interaction
Regression Diagnostics
Regression Model Building
Other Regression Techniques
Logistic Regression
Interpreting Logistic Regression Model
Interpretation and Inference for Logistic Regression
Goodness of Fit for the Logistic Regression Model
Hosmer-Lemeshow Statistics
Classification Table and ROC Curve
Suggested Further Reading
Decision Trees
Algorithm for Tree-Based Methods
Impurity Measures
Pruning a Tree
Aggregation Method: Bagging
Random Forest
Variable Importance
Decision Tree and Interaction among Predictors
Suggested Further Reading
Data Mining and Multivariate Methods
Dimension Reduction Technique: Principal Component Analysis
Factor Analysis
Classification Problem
Discriminant Analysis
Clustering Problem
Suggested Further Reading
Modeling Time Series Data for Forecasting
Characteristics and Components of Time Series Data
Time Series Decomposition
Autoregression Models
Forecasting Time Series Data
Other Time Series
Suggested Further Reading