Correlation analysis: testing significance and potential dangers

To estimate the correlation coefficient ρ_XY between X and Y, we may just plug the sample covariance S_XY and the sample standard deviations S_X and S_Y into its definition, resulting in the sample coefficient of correlation, or sample correlation for short:

r_XY = S_XY / (S_X S_Y)

The factors n − 1 in S_XY, S_X, and S_Y cancel each other, and it can be proved that −1 ≤ r_XY ≤ +1, just like its probabilistic counterpart ρ_XY. Once again, we stress that the…
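
The ratio above can be sketched in a few lines of standard-library Python; the function name `sample_correlation` is ours, not from the text. Note that the n − 1 factors cancel, as remarked above, so dividing each piece by n − 1 changes nothing.

```python
import math

def sample_correlation(x, y):
    """Sample correlation r_XY = S_XY / (S_X * S_Y)."""
    n = len(x)
    assert n == len(y) and n >= 2
    mx = sum(x) / n
    my = sum(y) / n
    # The 1/(n-1) factors in S_XY, S_X, and S_Y cancel in the ratio,
    # but we keep them to mirror the definitions in the text.
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / (n - 1))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / (n - 1))
    return sxy / (sx * sy)

# Perfectly linearly related data yield r_XY = +1 (up to rounding).
print(sample_correlation([1, 2, 3, 4], [2, 4, 6, 8]))
```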

Estimating covariance and related issues

Just as we have defined sample variance, we may define the sample covariance S_XY between random variables X and Y:

S_XY = (1 / (n − 1)) Σᵢ (Xᵢ − X̄)(Yᵢ − Ȳ)

where n is the size of the sample, i.e., the number of observed pairs (Xᵢ, Yᵢ). Sample covariance can also be rewritten as follows:

S_XY = (1 / (n − 1)) (Σᵢ XᵢYᵢ − n X̄ Ȳ)

To see this, we note that expanding the product gives Σᵢ (Xᵢ − X̄)(Yᵢ − Ȳ) = Σᵢ XᵢYᵢ − n X̄ Ȳ, since the cross terms collapse. This rewriting mirrors the relationship σ_XY = E[XY] − μ_X μ_Y from probability theory. It is important to realize that our…
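
A quick numerical check of the two expressions above (function names are ours, for illustration only):

```python
def cov_definition(x, y):
    """S_XY from the definition: average of cross-deviation products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

def cov_shortcut(x, y):
    """S_XY from the rewriting: (sum of X_i*Y_i  -  n * Xbar * Ybar) / (n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum(xi * yi for xi, yi in zip(x, y)) - n * mx * my) / (n - 1)

x, y = [1.0, 2.0, 4.0, 7.0], [3.0, 1.0, 5.0, 9.0]
print(cov_definition(x, y), cov_shortcut(x, y))  # the two agree up to rounding
```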

Estimating and testing variance

It is easy to prove that the sample variance S² is an unbiased estimator of the variance σ², but if we want a confidence interval for the variance, we need distributional results on S², which depend on the underlying population. For a normal population we may take advantage of Theorem 9.4. In particular, we recall that the sample variance is related to the…
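
As a sketch of the standard normal-population interval (n − 1)S²/χ²_upper ≤ σ² ≤ (n − 1)S²/χ²_lower: the chi-square quantiles below are hardcoded table values for 9 degrees of freedom at 95% confidence, since the Python standard library has no chi-square inverse CDF; the function name is ours.

```python
def variance_ci_95(sample):
    """95% CI for sigma^2 under normality:
    ((n-1)*S^2 / chi2_upper, (n-1)*S^2 / chi2_lower).
    This sketch hardcodes the quantiles for n = 10 (df = 9):
    chi^2_{0.975,9} = 2.700 and chi^2_{0.025,9} = 19.023 (standard tables)."""
    n = len(sample)
    assert n == 10, "quantiles below are only valid for df = 9"
    m = sum(sample) / n
    s2 = sum((v - m) ** 2 for v in sample) / (n - 1)  # unbiased sample variance
    chi2_lo, chi2_hi = 2.700, 19.023
    return s2, ((n - 1) * s2 / chi2_hi, (n - 1) * s2 / chi2_lo)

s2, (lo, hi) = variance_ci_95([10.2, 9.8, 10.1, 9.9, 10.4, 9.6, 10.0, 10.3, 9.7, 10.0])
print(s2, lo, hi)
```

Note that, unlike the symmetric intervals for the mean, this interval is not centered on S², because the chi-square distribution is skewed.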

Testing hypotheses about the difference in the mean of two populations

Sometimes, we have to run a test concerning two (or more) populations. For instance, we could wonder whether two markets for a given product are really different in terms of expected demand. Alternatively, after the re-engineering of a business process, we could wonder whether the new performance measures are significantly different from the old ones.…
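
A minimal sketch of such a test, assuming large samples so that the normal approximation applies (the textbook example may use a t statistic instead; the function name is ours):

```python
from statistics import NormalDist

def two_sample_z(x, y):
    """Test H0: mu_X = mu_Y with the large-sample statistic
    Z = (Xbar - Ybar) / sqrt(S_X^2/n + S_Y^2/m); returns (z, two-tail p-value)."""
    n, m = len(x), len(y)
    mx, my = sum(x) / n, sum(y) / m
    s2x = sum((v - mx) ** 2 for v in x) / (n - 1)
    s2y = sum((v - my) ** 2 for v in y) / (m - 1)
    z = (mx - my) / ((s2x / n + s2y / m) ** 0.5)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Two samples whose means clearly differ: the test rejects H0 decisively.
z, p = two_sample_z([10, 11, 12, 13, 14], [20, 21, 22, 23, 24])
print(z, p)
```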

Testing with p-values

In the manufacturing example of the previous section we found such a large value for the test statistic that we are quite confident that the null hypothesis should be rejected, whatever significance level we choose. In other cases, finding a suitable value of α can be tricky. Recall that the larger the value of α, the easier it…
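
The p-value sidesteps the choice of α: it is the smallest significance level at which H0 would be rejected given the observed statistic. A sketch for an (approximately) standard normal test statistic:

```python
from statistics import NormalDist

def p_value_two_tail(z):
    """Two-tail p-value for an (approximately) standard normal test statistic:
    the smallest alpha at which H0 would be rejected, given the observed z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# z = 1.96 sits right at the boundary of the two-tail test with alpha = 0.05.
print(round(p_value_two_tail(1.96), 4))
```

We then reject H0 at level α whenever the p-value falls below α, with no need to fix α before seeing the data.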

One-tail tests

When the null hypothesis is of the form H0 : μ = μ0, we consider a two-tail rejection region. In many problems, null hypotheses of the form H0 : μ ≥ μ0 or H0 : μ ≤ μ0 are more appropriate. As one could expect, this leads to a rejection region consisting of one tail. Before illustrating the technicalities involved, it is useful to consider a practical example. Example 9.14 A firm…
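
A sketch of the one-tail decision rule, assuming a known standard deviation σ so that the z statistic applies (the function name and its parameters are ours): for H0 : μ ≤ μ0 we reject when Z = (X̄ − μ0)/(σ/√n) exceeds z_{1−α}, and symmetrically for H0 : μ ≥ μ0.

```python
from statistics import NormalDist

def one_tail_reject(xbar, mu0, sigma, n, alpha=0.05, tail="right"):
    """One-tail z-test with known sigma.
    tail='right' tests H0: mu <= mu0 (reject for large Z);
    tail='left'  tests H0: mu >= mu0 (reject for small Z)."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # e.g. 1.645 for alpha = 0.05
    return z > z_crit if tail == "right" else z < -z_crit

# xbar = 105, mu0 = 100, sigma = 10, n = 25 gives Z = 2.5 > 1.645: reject.
print(one_tail_reject(105, 100, 10, 25))
```

Note that the whole significance level α is placed in a single tail, so the one-tail critical value (1.645 at α = 0.05) is smaller than the two-tail one (1.96).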

HYPOTHESIS TESTING

The need for testing a hypothesis about an unknown parameter arises from many problems related to inferential statistics. There are general and powerful ways to build appropriate procedures for testing hypotheses, which we outline in Section 9.10. Since they do require some level of mathematical sophistication, we offer here an elementary treatment that is strongly linked…

Setting the sample size

From a qualitative perspective, the form of the confidence interval (9.10) suggests the following observations: The last statement is quite relevant, and is related to an important issue. So far, we have considered a given sample and we have built a confidence interval. However, sometimes we have to go the other way around: Given a…
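
Going the other way around means solving the half-width condition z_{1−α/2} · σ/√n ≤ H for n. A minimal sketch, assuming σ is known (the function name is ours):

```python
import math
from statistics import NormalDist

def sample_size_for_halfwidth(sigma, halfwidth, confidence=0.95):
    """Smallest n such that the confidence-interval half-width
    z_{1-alpha/2} * sigma / sqrt(n) does not exceed the target halfwidth
    (sigma assumed known; round up, since n must be an integer)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. 1.96 for 95%
    return math.ceil((z * sigma / halfwidth) ** 2)

# sigma = 10, target half-width 2, 95% confidence: n = ceil((1.96*10/2)^2) = 97.
print(sample_size_for_halfwidth(10, 2))
```

Since n enters through √n, halving the target half-width roughly quadruples the required sample size.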