Nonhierarchical clustering: k-means

The best-known method in the class of nonhierarchical clustering algorithms is the k-means approach. In the k-means method, unlike in the hierarchical ones, observations can be moved from one cluster to another in order to minimize an overall measure of clustering quality; hence, the method is iterative in nature. The starting point is the selection of k seeds,… Continue reading Nonhierarchical clustering: k-means
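
A minimal sketch of this iteration (the helper k_means and the synthetic data below are our own illustrations, not part of the text): each pass assigns observations to the nearest centroid and then moves each centroid to the mean of its cluster, stopping when no observation changes cluster.

```python
# Minimal NumPy sketch of the k-means iteration: pick k seeds, assign each
# observation to the nearest centroid, recompute centroids, repeat until the
# assignments stabilize. (k_means and the sample data are illustrative.)
import numpy as np

def k_means(X, k, n_iter=100, seed=None):
    rng = np.random.default_rng(seed)
    # Select k observations at random as the initial seeds.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assign each observation to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no observation moved: a local optimum has been reached
        labels = new_labels
        # Move each centroid to the mean of the observations assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centroids = k_means(X, k=2, seed=42)
print(centroids)  # two centroids, near (0, 0) and (5, 5)
```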

Hierarchical methods

A first class of approaches to cluster formation is based on the sequential construction of a tree (dendrogram) that leads us to form clusters; Fig. 17.5 shows a simple dendrogram. The leaves of the tree correspond to objects; branches of the tree correspond to sequential groupings of observations and clusters. Since a tree suggests a natural hierarchy,… Continue reading Hierarchical methods
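
One way to experiment with such trees is SciPy's hierarchical clustering routines; here is a small sketch (the sample data and the choice of average linkage are illustrative assumptions, not prescribed by the text).

```python
# Agglomerative clustering with SciPy: build the linkage tree bottom-up from
# individual observations, then cut it to obtain a flat partition.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.4, (20, 2)), rng.normal(3, 0.4, (20, 2))])

# Each row of Z records one merge: the two clusters joined, the distance at
# which they merge, and the size of the resulting cluster.
Z = linkage(X, method="average")

# Cutting the tree to obtain a flat partition into two clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)

# scipy.cluster.hierarchy.dendrogram(Z) would draw the tree itself.
```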

Measuring distance

Given two observations x, y ∈ ℝ^p, we may define various distance measures between them, such as the Euclidean distance d₂(x, y) = √(Σᵢ (xᵢ − yᵢ)²) or the Manhattan (city-block) distance d₁(x, y) = Σᵢ |xᵢ − yᵢ|. Other distances may be defined to account for categorical variables; we may also assign weights to variables to express their relative importance. These distances measure the dissimilarity between two single observations, but when we aggregate several observations into clusters, we need some… Continue reading Measuring distance
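
A quick sketch of these computations with NumPy (the vectors and the weights below are illustrative, not from the text):

```python
# Euclidean and Manhattan distances between two observations; weights can
# rescale variables to reflect their relative importance.
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 0.0, 3.5])

euclidean = np.sqrt(np.sum((x - y) ** 2))   # d2(x, y)
manhattan = np.sum(np.abs(x - y))           # d1(x, y)

w = np.array([0.5, 0.3, 0.2])               # hypothetical variable weights
weighted = np.sqrt(np.sum(w * (x - y) ** 2))

print(euclidean, manhattan, weighted)
```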

FACTOR ANALYSIS

The rationale behind factor analysis may be best understood by a small numerical example. Example 17.2 Consider observations in ℝ^5 and the correlation matrix of their component variables X1, X2, …, X5. Does this correlation matrix suggest some structure? We see that X1 and X2 seem to be strongly correlated with each other, whereas they seem weakly correlated with X3, X4, and X5. The latter variables, on the contrary, seem… Continue reading FACTOR ANALYSIS
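
To see how such a block pattern can arise, here is a hedged sketch with synthetic data (the loadings and noise level are our own assumptions, not the numbers of Example 17.2): two latent factors, one driving X1 and X2 and one driving X3, X4, and X5, produce a correlation matrix with two clearly visible blocks.

```python
# Synthetic data with two latent factors: f1 drives X1 and X2, f2 drives
# X3, X4, X5. The resulting correlation matrix shows the two-block structure
# factor analysis tries to uncover.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
f1 = rng.normal(size=n)                   # latent factor behind X1, X2
f2 = rng.normal(size=n)                   # latent factor behind X3, X4, X5
noise = 0.4 * rng.normal(size=(n, 5))

X = np.column_stack([f1, f1, f2, f2, f2]) + noise
print(np.round(np.corrcoef(X, rowvar=False), 2))
```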

Applications of PCA

Principal component analysis can be applied in a marketing setting when questionnaires are administered to potential customers asking for a quantitative evaluation along many dimensions. Many such questions are, or are perceived as, redundant. Identifying the few dominant principal components may help in assessing which product features, or combinations thereof, are most important. They can also tell… Continue reading Applications of PCA

A small numerical example

Principal component analysis in practice is carried out on sampled data, but it may be instructive to consider an example where both the probabilistic and the statistical sides are dealt with. Consider first a random variable with bivariate normal distribution, X ∼ N(0, Σ), where Σ = [1, ρ; ρ, 1] and ρ > 0. Essentially, X1 and X2 are standard normal variables with positive correlation ρ. To find the eigenvalues of Σ,… Continue reading A small numerical example
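
The eigenstructure can be checked numerically; a small sketch follows (the value ρ = 0.6 is an arbitrary choice): the eigenvalues of Σ are 1 − ρ and 1 + ρ, with eigenvectors along (1, −1)/√2 and (1, 1)/√2.

```python
# Numerical check of the eigenstructure of Sigma = [[1, rho], [rho, 1]].
import numpy as np

rho = 0.6
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma)  # ascending order for symmetric input
print(eigvals)   # [0.4, 1.6] = [1 - rho, 1 + rho]
print(eigvecs)   # columns proportional to (1, -1) and (1, 1), up to sign
```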

Another view of PCA

Another view is obtained by interpreting the first principal component in terms of orthogonal projection. Consider a unit vector u ∈ ℝ^p, and imagine projecting the observed vector X on u. This yields a vector parallel to u, of length uᵀX. Since u has unit length, the projection of observation X(k) on u is (uᵀX(k)) u. We are projecting p-dimensional observations on just one axis, and of course we would like to… Continue reading Another view of PCA
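
A sketch of this idea (the sample data and the grid of directions are illustrative): among all unit vectors u in the plane, the direction maximizing the sample variance of the projections uᵀX(k) approximates the first principal component.

```python
# Scan unit vectors u in the plane and measure the sample variance of the
# projected data; the maximizer approximates the first principal component.
import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=1000)
X -= X.mean(axis=0)                       # center the data

def projected_variance(u, X):
    u = u / np.linalg.norm(u)             # enforce unit length
    return np.var(X @ u)                  # variance of the scalar projections

angles = np.linspace(0, np.pi, 181)
variances = [projected_variance(np.array([np.cos(a), np.sin(a)]), X)
             for a in angles]
best = angles[int(np.argmax(variances))]
print(np.degrees(best))                   # close to 45 degrees for this data
```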

A geometric view of PCA

The linear data transformation, including centering, can be written as Z = Aᵀ(X − X̄), where A is a p × p matrix whose columns define the new axes. We assume that data have already been centered, in order to ease notation. Hence Z = AᵀX. The Zi variables are called principal components: Z1 is the first principal component. We recall that the matrix A rotating axes is orthogonal: AᵀA = AAᵀ = I. Now, let us consider the sample covariance matrix of X, i.e., SX. Since we assume centered… Continue reading A geometric view of PCA
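
A short numerical check (the covariance matrix used to generate the data is an arbitrary choice): taking A as the matrix of eigenvectors of SX, the rotated variables Z = AᵀX are uncorrelated, i.e., SZ is diagonal.

```python
# With A holding the orthonormal eigenvectors of S_X, the rotated variables
# Z = A^T X have a diagonal covariance matrix (the eigenvalues of S_X).
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0], [[2.0, 1.2], [1.2, 1.0]], size=500)
X -= X.mean(axis=0)                       # center the data, as in the text

S_X = np.cov(X, rowvar=False)             # sample covariance matrix
eigvals, A = np.linalg.eigh(S_X)          # columns of A: orthonormal eigenvectors

Z = X @ A                                 # row k holds A^T X(k)
print(np.round(np.cov(Z, rowvar=False), 3))  # ~ diag(eigvals): S_Z is diagonal
```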

THE NEED FOR DATA REDUCTION

Consider a sample of observations X(k), k = 1,…, n. Each observation X(k) consists of a vector of p elements (X1(k), …, Xp(k)). If p = 2, visualizing observations is easy, but this is certainly no piece of cake for large values of p. Hence, we need some way to reduce data dimensionality, by mapping observations in ℝ^p to observations in a lower-dimensional space ℝ^q, where q is possibly much smaller than p. Reducing data… Continue reading THE NEED FOR DATA REDUCTION