MULTIDIMENSIONAL DATA

So far, we have considered the organization and representation of data in one dimension, but in applications we often observe multidimensional data. Of course, we may list summary measures for each single variable, but this would miss an important point: the relationship between different variables, as in issues concerning independence, correlation, etc. Here we want to…
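
As a rough illustration of why per-variable summaries miss the relationship between variables, the sketch below computes a Pearson correlation coefficient by hand. The data and the helper name `pearson_r` are made up for the example and do not come from the original text.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # moves with x
y_down = [10, 8, 6, 4, 2]  # moves against x

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

Note that `y_up` and `y_down` contain the same values, so they share every one-dimensional summary; only the joint view with `x` tells them apart.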

Quartiles and boxplots

Among the many percentiles, a particular role is played by the quartiles, denoted by Q1, Q2, and Q3, corresponding to 25%, 50%, and 75%, respectively. Clearly, Q2 is simply the median. A look at these values and the mean tells a lot about the underlying distribution. Indeed, the interquartile range has been proposed as a measure of dispersion, and an alternative measure…
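
A minimal sketch of the quartiles and the interquartile range, using Python's standard `statistics` module. The sample is hypothetical, and the `"inclusive"` interpolation method is only one of several conventions for computing percentiles, so other tools may return slightly different values.

```python
import statistics

# Hypothetical sample (not from the original text)
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

# With n=4, quantiles() returns the three cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: width of the middle half of the data
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Under this convention Q2 coincides with `statistics.median(data)`, matching the remark that Q2 is simply the median.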

Dispersion measures

Location measures do not tell us anything about the dispersion of data. Two distributions may share the same mean, median, and mode, and yet be quite different. Figure 4.8, repeated, illustrates the importance of dispersion in discerning the difference between distributions sharing location measures. One possible way to characterize dispersion is by measuring the range X(n) − X(1), i.e.,…
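
To make the point concrete, here is a small sketch with two made-up samples that share mean, median, and mode but differ in range. The helper `sample_range` is a hypothetical name that computes the largest observation minus the smallest, X(n) − X(1).

```python
import statistics

def sample_range(xs):
    """Largest minus smallest observation, X(n) - X(1)."""
    ordered = sorted(xs)
    return ordered[-1] - ordered[0]

# Two hypothetical samples with identical location measures
narrow = [4, 5, 5, 5, 6]
wide = [1, 5, 5, 5, 9]

for name, xs in (("narrow", narrow), ("wide", wide)):
    print(
        name,
        statistics.mean(xs),
        statistics.median(xs),
        statistics.mode(xs),
        sample_range(xs),
    )
```

The range depends only on the two extreme observations, which makes it easy to compute but sensitive to outliers.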

Location measures: mean, median, and mode

We are all familiar with the idea of taking averages. Indeed, the most natural location measure is the mean. DEFINITION 4.5 (Mean for a sample and a population) The mean for a population of size n is defined as $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$. The mean for a sample of size n is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. The two definitions above may seem somewhat puzzling,…
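
The three location measures can be sketched on a small made-up sample as follows. The mean is computed directly as the sum of the observations divided by n and then compared with the standard library's result. The data are purely illustrative.

```python
import statistics

# Hypothetical sample (not from the original text)
sample = [2, 3, 4, 7, 7, 7, 12]
n = len(sample)

# Mean: sum of the observations divided by the sample size
mean_by_hand = sum(sample) / n

# Median: middle value of the sorted data; mode: most frequent value
median = statistics.median(sample)
mode = statistics.mode(sample)

print(mean_by_hand, statistics.mean(sample), median, mode)
```

Here the single large observation pulls the mean away from the median and mode, which is one reason to look at more than one location measure.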

ORGANIZING AND REPRESENTING RAW DATA

We have introduced the basic concepts of frequencies and histograms in Section 1.2.1. Here we treat the same concepts in a slightly more systematic way, illustrating a few potential difficulties that may occur even with these very simple ideas. Imagine a car insurance agent who has collected the weekly number of accidents that occurred during the last…
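
A rough sketch of how frequencies and relative frequencies might be tabulated for such data. The weekly accident counts below are invented for illustration and are not the figures used in the original post.

```python
from collections import Counter

# Hypothetical weekly accident counts (one entry per week)
weekly_accidents = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]

# Frequency: how many weeks showed each number of accidents
frequencies = Counter(weekly_accidents)

# Relative frequency: frequency divided by the number of observations
n_weeks = len(weekly_accidents)
relative = {value: count / n_weeks for value, count in frequencies.items()}

for value in sorted(frequencies):
    print(value, frequencies[value], relative[value])
```

Plotting the frequencies against the observed values gives the histogram for this discrete variable.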

WHAT IS STATISTICS?

A rather general answer to this question is that statistics is a group of methods to collect, analyze, present, and interpret data (and possibly to make decisions). We often consider statistics as a branch of mathematics, but this is the result of a more recent tendency. From a historical perspective, the term “statistics” stems from…

Introduction

Some fundamental concepts of descriptive statistics, like frequencies, relative frequencies, and histograms, have been introduced informally. Here we want to illustrate and expand those concepts in a slightly more systematic way. Our treatment will be rather brief since, within this framework, descriptive statistics is essentially a tool for building some intuition. We introduce basic…