Covariance matrices

Given a random vector X with expected value μ, the covariance matrix can be expressed as Σ = E[(X − μ)(X − μ)T]. Note that inside the expectation we are multiplying a column vector (p × 1) and a row vector (1 × p), which does result in a square p × p matrix. It may also be worth noting that there is a slight inconsistency of notation, since we denote…
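The sample analogue of Σ = E[(X − μ)(X − μ)T] can be sketched in a few lines of NumPy; here the data are simulated for illustration, and we average the outer products of the centered observations:

```python
import numpy as np

rng = np.random.default_rng(0)
# Sample of n observations of a p-dimensional random vector (rows = observations).
X = rng.normal(size=(1000, 3))

mu = X.mean(axis=0)                      # estimate of the expected value vector
centered = X - mu                        # (X - mu) for each observation
# Average of the outer products (X - mu)(X - mu)^T over the sample:
Sigma = centered.T @ centered / (len(X) - 1)
```

Each term centered[i][:, None] @ centered[i][None, :] is a p × p matrix, so their average is p × p as well, matching the dimensional argument above.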

Matrix algebra and multivariate analysis

In this section we discuss a few more concepts that are useful in multivariate analysis. Unfortunately, when moving to multivariate statistics, we run out of notation. As usual, capital letters will refer to random quantities, with boldface reserved for random vectors such as X and Z; elements of these vectors will be denoted by Xi and Zi, and scalar random variables will…

Correspondence analysis

Correspondence analysis is a graphical technique for representing the information included in a two-way contingency table containing frequency counts. For example, Table 15.2 lists the number of times an attribute (crispy, sugar-free, good with coffee, etc.) is used by consumers to describe a snack (cookies, candies, muffins, etc.). The method deals with two categorical or discrete quantitative variables…
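The core computation can be sketched with NumPy: form the correspondence matrix, take standardized residuals against the independence model, and extract a two-dimensional map via SVD. The counts below are made up for illustration, not those of Table 15.2:

```python
import numpy as np

# Hypothetical attribute-by-snack frequency table (rows: attributes, cols: snacks).
N = np.array([[20.0,  5.0,  8.0],
              [ 3.0, 18.0,  6.0],
              [10.0,  7.0, 15.0]])

P = N / N.sum()                  # correspondence matrix of relative frequencies
r = P.sum(axis=1)                # row masses
c = P.sum(axis=0)                # column masses
# Standardized residuals against independence: (p_ij - r_i c_j) / sqrt(r_i c_j)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S)     # squared singular values are the principal inertias
row_coords = (U[:, :2] * sv[:2]) / np.sqrt(r)[:, None]   # 2-D row coordinates
col_coords = (Vt.T[:, :2] * sv[:2]) / np.sqrt(c)[:, None]
```

Plotting row_coords and col_coords on the same axes gives the usual joint map of attributes and snacks.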

Multidimensional scaling

Multidimensional scaling is a family of procedures that aim at producing a low-dimensional representation of object similarity/dissimilarity. Consider n brands and a dissimilarity matrix, whose entry dij measures the distance between brands i and j, as perceived by consumers. This matrix is a direct input to multidimensional scaling, whereas other methods aim at computing distances. Then, we want to find a representation…
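One concrete member of this family is classical (metric) MDS, sketched below on a made-up 4-brand dissimilarity matrix: double-center the squared distances and take the leading eigenvectors as coordinates:

```python
import numpy as np

# Hypothetical dissimilarity matrix for 4 brands (symmetric, zero diagonal).
D = np.array([[0.0, 2.0, 5.0, 6.0],
              [2.0, 0.0, 4.0, 5.0],
              [5.0, 4.0, 0.0, 2.0],
              [6.0, 5.0, 2.0, 0.0]])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances

eigval, eigvec = np.linalg.eigh(B)       # eigenvalues in ascending order
idx = np.argsort(eigval)[::-1][:2]       # keep the two largest
coords = eigvec[:, idx] * np.sqrt(np.maximum(eigval[idx], 0.0))
# `coords` is a 2-D map whose inter-point distances approximate D.
```

When the dissimilarities are not exactly Euclidean, the map only approximates D, which is why nonmetric variants of MDS also exist.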

Structural equation models with latent variables

Consider the relationship between the following variables: The assumption that these variables are somehow related makes sense, but unfortunately they are not directly observable; they are latent variables. Nevertheless, imagine that we wish to build a model expressing the dependence between latent variables. For instance, we may consider the structural equation where ζ and ξ are latent variables, ν is an error term,…
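A minimal simulation, assuming a structural equation of the form ζ = γξ + ν with a made-up coefficient γ = 0.8, illustrates why latent variables need dedicated estimation methods: naively regressing a noisy indicator of ζ on a noisy indicator of ξ attenuates the structural coefficient:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50000
gamma = 0.8                              # hypothetical structural coefficient
xi = rng.normal(size=n)                  # exogenous latent variable
nu = 0.6 * rng.normal(size=n)            # structural error term
zeta = gamma * xi + nu                   # endogenous latent variable

# The latent variables are observed only through noisy indicators:
x = xi + 0.5 * rng.normal(size=n)
y = zeta + 0.5 * rng.normal(size=n)

# The naive slope of y on x is gamma * Var(xi) / (Var(xi) + Var(noise)),
# i.e., 0.8 / 1.25 = 0.64 here, rather than gamma itself.
slope = np.cov(x, y)[0, 1] / np.var(x)
```

This attenuation by measurement error is one motivation for structural equation models that treat the latent variables explicitly.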

Discriminant analysis

Consider a firm that, on the basis of a set of variables measuring customer attributes, wishes to discriminate between purchasers and nonpurchasers of a product or service. Concretely, the firm has collected a sample of consumers and, given their attributes and observed behavior, wants to find a way to classify them. Two-group discriminant analysis…
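A two-group rule of the Fisher type can be sketched directly: estimate the pooled within-group covariance, compute a discriminant direction, and classify by which side of the midpoint a consumer's score falls on. The two simulated groups stand in for purchasers and nonpurchasers:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical customer attributes: group 1 (purchasers), group 2 (nonpurchasers).
X1 = rng.normal(loc=[2.0, 2.0], size=(50, 2))
X2 = rng.normal(loc=[0.0, 0.0], size=(50, 2))

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Pooled within-group covariance matrix.
S_w = (np.cov(X1, rowvar=False) * (len(X1) - 1)
       + np.cov(X2, rowvar=False) * (len(X2) - 1)) / (len(X1) + len(X2) - 2)

w = np.linalg.solve(S_w, m1 - m2)        # Fisher's discriminant direction
cutoff = w @ (m1 + m2) / 2               # midpoint rule on the discriminant score

def classify(x):
    """Assign a new consumer to group 1 or 2 by the sign of the score."""
    return 1 if w @ x - cutoff > 0 else 2
```

The midpoint cutoff implicitly assumes equal group sizes and misclassification costs; both can be adjusted by shifting the cutoff.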

Canonical correlation

Consider two sets of variables that are collected in vectors X and Y, respectively, and imagine that we would like to study the relationship between the two sets. One way of doing so is by forming two linear combinations, Z = aTX and W = bTY, in such a way that the correlation ρZ,W is maximized. This is what is accomplished by canonical correlation, or canonical analysis. Essentially,…
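One standard way to compute the weights a and b is to whiten each block and take an SVD of the whitened cross-covariance; the singular values are then the canonical correlations. A sketch on simulated data, where the two sets share one latent driver:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
# Hypothetical data: one shared latent driver links the X and Y sets.
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.5 * rng.normal(size=n), rng.normal(size=n)])
Y = np.column_stack([latent + 0.5 * rng.normal(size=n), rng.normal(size=n)])

Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
Sxx = Xc.T @ Xc / (n - 1)
Syy = Yc.T @ Yc / (n - 1)
Sxy = Xc.T @ Yc / (n - 1)

# Whiten each block; the singular values of the whitened cross-covariance
# are the canonical correlations.
Lx = np.linalg.cholesky(Sxx)
Ly = np.linalg.cholesky(Syy)
M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
U, rho, Vt = np.linalg.svd(M)

a = np.linalg.solve(Lx.T, U[:, 0])       # weights for Z = a^T X
b = np.linalg.solve(Ly.T, Vt[0])         # weights for W = b^T Y
```

By construction Z and W have unit variance, and their correlation equals the leading singular value rho[0].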

Cluster analysis

The aim of cluster analysis is categorization, i.e., the creation of groups of objects according to their similarities. The idea is hinted at in Fig. 15.3. There are other methods, such as discriminant analysis, essentially aimed at separating groups of observations. However, they differ in the underlying approach, and some can only deal with metric data.…
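For metric data, a bare-bones k-means loop illustrates the grouping idea: assign each object to its nearest centroid, then move each centroid to the mean of its assigned objects. The two simulated groups below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical data: two well-separated groups of objects in two dimensions.
X = np.vstack([rng.normal(loc=[0.0, 0.0], size=(30, 2)),
               rng.normal(loc=[5.0, 5.0], size=(30, 2))])

k = 2
centroids = X[rng.choice(len(X), size=k, replace=False)]
for _ in range(20):
    # Distance from every object to every centroid.
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
    centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                          else centroids[j] for j in range(k)])
```

Hierarchical methods take a different route, merging or splitting clusters based on a dissimilarity matrix rather than iterating on centroids.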

Factor analysis

Factor analysis is another interdependence technique, which shares some theoretical background with PCA, as we show in Section 17.3. Factor analysis can be used for data reduction, too, but it should not be confused with PCA, as in factor analysis we are looking for hidden factors that may explain common sources of variance between variables. Formally,…
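A small simulation, assuming a one-factor model with made-up loadings, shows what "hidden factors explaining common variance" means: if Xi = λiF + εi with a single factor F, the correlation between any two observed variables is approximately λiλj:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000
# Hypothetical one-factor model: each observed variable loads on a single
# hidden factor F plus a specific error term (loadings chosen for illustration).
lam = np.array([0.9, 0.7, 0.5])
F = rng.normal(size=n)
eps = rng.normal(size=(n, 3)) * np.sqrt(1.0 - lam ** 2)  # unit total variance
X = F[:, None] * lam + eps

R = np.corrcoef(X, rowvar=False)
# Off-diagonal correlations are approximately lam_i * lam_j, i.e., the common
# variance shared through F; the specific variances 1 - lam_i^2 are not shared.
```

This is the structural difference from PCA: factor analysis models only the shared variance, whereas principal components absorb total variance.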

Principal component analysis

Principal component analysis (PCA) is a data reduction method. Technically, we take a vector X of random variables and transform it into another vector Z by a linear transformation represented by a square matrix A. In more detail, we have Z = ATX, so that each component Zi is a linear combination of the original variables. These equations should not be confused with regression equations. The transformed Zi variables are not observed and used in…
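In the usual construction, the columns of A are the eigenvectors of the covariance matrix of X, and the transformation Z = AT(X − μ) is applied to the centered data. A sketch on simulated correlated data:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical correlated data: most variance lies along one direction.
X = rng.normal(size=(1000, 2)) @ np.array([[3.0, 0.0], [1.0, 1.0]])

mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)
eigval, A = np.linalg.eigh(Sigma)        # columns of A are eigenvectors
order = np.argsort(eigval)[::-1]         # sort components by variance explained
eigval, A = eigval[order], A[:, order]

Z = (X - mu) @ A                         # principal components Z = A^T (X - mu)
# The components are uncorrelated, and their variances are the eigenvalues,
# which is what makes dropping the trailing components a data reduction.
```

Keeping only the first column of Z retains the largest share of total variance, eigval[0] / eigval.sum().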