Statistical Methods in Bioinformatics
An Introduction
Authors: Warren J. Ewens, Gregory Grant
Advances in computers and biotechnology have had a profound impact on biomedical research. As a result, complex data sets can now be generated to address extremely challenging biological questions. Advances in the statistical methods needed to analyze such data are following closely behind the advances in data generation. The statistical methods required by bioinformatics present many new and difficult problems for the research community.
This book provides an introduction to some of these new methods. The main biological topics treated include sequence analysis, BLAST, microarray analysis, gene finding, and the analysis of evolutionary processes. The main statistical techniques covered include hypothesis testing and estimation, Poisson processes, Markov models and hidden Markov models, and multiple testing methods.
The second edition features new chapters on microarray analysis and on statistical inference (including a discussion of ANOVA), as well as discussions of the statistical theory of motifs and of methods based on the hypergeometric distribution. Much material has been clarified and reorganized.
The book is written to appeal to biologists and computer scientists who wish to know more about the statistical methods of the field, as well as to trained statisticians who wish to become involved with bioinformatics. The earlier chapters introduce the concepts of probability and statistics at an elementary level, but with an emphasis on material that is relevant to later chapters and often not covered in standard introductory texts. Later chapters should be immediately accessible to the trained statistician. Introductory courses in calculus and linear algebra provide sufficient mathematical background. The basic biological concepts that are used are explained, or can be understood from the context, and standard mathematical concepts are summarized in an appendix. Problems at the end of each chapter allow the reader to develop aspects of the theory outlined in the main text.
Table of contents
Front Matter (pp. i-xx)
Probability Theory (i): One Random Variable (pp. 1-61)
Probability Theory (ii): Many Random Variables (pp. 62-110)
Statistics (i): An Introduction to Statistical Inference (pp. 111-154)
Stochastic Processes (i): Poisson Processes and Markov Chains (pp. 155-173)
The Analysis of One DNA Sequence (pp. 174-219)
The Analysis of Multiple DNA or Protein Sequences (pp. 220-256)
Stochastic Processes (ii): Random Walks (pp. 257-274)
Statistics (ii): Classical Estimation Theory (pp. 275-303)
Statistics (iii): Classical Hypothesis Testing Theory (pp. 304-344)
BLAST (pp. 345-383)
Stochastic Processes (iii): Markov Chains (pp. 384-408)
Hidden Markov Models (pp. 409-429)
Gene Expression, Microarrays, and Multiple Testing (pp. 430-474)
Evolutionary Models (pp. 475-496)
Phylogenetic Tree Estimation (pp. 497-536)
Back Matter (pp. 537-597)