Statistical data analysis divides the methods for analyzing data into two categories: exploratory methods and confirmatory methods. Exploratory methods are used to discover what the data seem to be saying, using simple arithmetic and easy-to-draw pictures to summarize the data. Confirmatory methods use ideas from probability theory in an attempt to answer specific questions. Probability is important in decision making because it provides a mechanism for measuring, expressing, and analyzing the uncertainties associated with future events. The majority of the topics addressed in this course fall under this heading.
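The contrast between the two categories can be illustrated with a small sketch. The measurements and the hypothesized mean of 5.0 below are made up for illustration, and the one-sample t test is only one possible confirmatory method. The exploratory part summarizes the data with simple arithmetic; the confirmatory part asks one specific question using probability theory.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev

# Hypothetical measurements, invented for this illustration.
data = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2, 5.5, 5.9, 5.3, 5.7]

# Exploratory step: summarize what the data seem to be saying.
summary = {
    "n": len(data),
    "mean": mean(data),
    "median": median(data),
    "stdev": stdev(data),            # sample standard deviation
    "quartiles": quantiles(data, n=4),
    "min": min(data),
    "max": max(data),
}
for name, value in summary.items():
    print(f"{name}: {value}")

# Confirmatory step: is the population mean different from 5.0?
# (5.0 is an assumed reference value, not taken from any source.)
MU_0 = 5.0
standard_error = stdev(data) / sqrt(len(data))
t_stat = (mean(data) - MU_0) / standard_error

# Two-sided 5% critical value of Student's t with 9 degrees of freedom.
T_CRITICAL_DF9 = 2.262
reject_null = abs(t_stat) > T_CRITICAL_DF9
print(f"t statistic: {t_stat:.3f}, reject H0 at 5%: {reject_null}")
```

The critical value is hard-coded to keep the sketch free of third-party dependencies; a statistics library would normally supply the p-value directly.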


Studying a problem through the use of statistical data analysis usually involves four basic steps.

What distinguishes data mining from conventional statistical data analysis is that data mining is usually done for the purpose of "secondary analysis" aimed at finding unsuspected relationships unrelated to the purposes for which the data were originally collected.

The concept of influence concerns the impact that changes in the data have on the conclusions and inferences drawn in various fields of study, including statistical data analysis. It is studied by means of perturbation analysis. For example, the influence function of an estimate is the change in the estimate caused by an infinitesimal change in a single observation, divided by the amount of that change. It acts as a sensitivity analysis of the estimate.
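The perturbation idea can be approximated numerically. The sketch below is a finite-difference stand-in for the infinitesimal change described above: it nudges one observation by a small amount and divides the resulting change in the estimate by that amount. The function name, the step size, and the data are illustrative assumptions.

```python
from statistics import mean, median


def perturbation_sensitivity(estimator, data, index, eps=1e-6):
    """Approximate the change in an estimate per unit change in one observation.

    A small finite step `eps` stands in for the infinitesimal change.
    """
    perturbed = list(data)
    perturbed[index] += eps
    return (estimator(perturbed) - estimator(data)) / eps


# Hypothetical observations, already sorted so the median is data[2].
data = [2.0, 3.0, 5.0, 8.0, 13.0]

mean_sensitivity = [perturbation_sensitivity(mean, data, i) for i in range(len(data))]
median_sensitivity = [perturbation_sensitivity(median, data, i) for i in range(len(data))]

print("mean:  ", [round(s, 4) for s in mean_sensitivity])
print("median:", [round(s, 4) for s in median_sensitivity])
```

For the mean, every observation carries the same sensitivity, 1/n. For the median of an odd-sized sample, only the middle observation affects the estimate under a small perturbation, which is one way to see why the median is less sensitive to changes in extreme values.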

For undergraduate and graduate-level courses that combine introductory statistics with data analysis or decision modeling.

Designing ways to collect data is an important job in statistical data analysis. Two important aspects of a statistical study are:

There are a few fundamental statistical tests, such as the test for randomness, the test for homogeneity of populations, the test for detecting outlier(s), and the test for normality. For all these necessary tests there are powerful procedures in the statistical data analysis literature. Moreover, since the authors limit their presentation to tests of the mean, they can invoke the central limit theorem (CLT) for, say, any sample of size over 30.
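The appeal to the CLT for samples of size over 30 can be checked with a small simulation. The sketch below is an illustration under assumed settings: the exponential population, the sample size of 40, the number of replications, and the seed are all arbitrary choices. It draws repeated samples from a strongly skewed population and compares the behavior of the sample means with what the normal approximation predicts.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40     # "over 30", as in the rule of thumb
NUM_SAMPLES = 5000   # number of simulated samples
POP_MEAN = 1.0       # mean of an exponential population with rate 1
POP_SD = 1.0         # standard deviation of the same population

# Each entry is the mean of one sample drawn from the skewed population.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

# The CLT predicts sample means centered at POP_MEAN with this spread.
theoretical_se = POP_SD / sqrt(SAMPLE_SIZE)
mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)

# Share of sample means within 1.96 standard errors of the population mean;
# the normal approximation predicts roughly 95%.
coverage = sum(
    abs(m - POP_MEAN) <= 1.96 * theoretical_se for m in sample_means
) / NUM_SAMPLES

print(f"mean of sample means: {mean_of_means:.4f}")
print(f"sd of sample means:   {sd_of_means:.4f} (theory: {theoretical_se:.4f})")
print(f"share within 1.96 SE: {coverage:.3f}")
```

The simulation only illustrates the approximation for this one population; heavier-tailed or more skewed populations may need larger samples before the normal approximation is adequate.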