| | H_{0} true | H_{0} false |
|---|---|---|
| Reject H_{0} | Type I error | Correct |
| Fail to reject H_{0} | Correct | Type II error |

- Type I error is the probability of rejecting the null hypothesis when it is really true
- The probability of making a type I error is denoted as α
- Type II error is the probability of failing to reject a null hypothesis that is really false
- The probability of making a type II error is denoted as β

In this chapter, you'll often see these outcomes represented with distributions

To make these representations clear, let's first consider the situation where H_{0} is, in fact, true:

Now assume that H_{0} is false (i.e., that some "treatment" has an effect on our dependent variable, shifting the mean to the right)

Thus, power can be defined as follows:

Assuming some manipulation affects the dependent variable, **power** is the probability of correctly rejecting the (false) null hypothesis, that is, 1 - β

As such, the power of an experiment depends on three (or four) factors:

As alpha is moved to the left (for example, if one used an alpha of 0.10 instead of 0.05), beta would decrease and power would increase, but the probability of making a type I error would also increase
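A minimal Python sketch of this trade-off, assuming normal sampling distributions and a two-tailed test. The `shift` argument (the distance between the H_{0} and H_{1} means in standard-error units) and its value of 2.0 are illustrative assumptions, not values from these notes:

```python
from statistics import NormalDist

def approx_power(shift, alpha=0.05):
    """Two-tailed power under a normal approximation.

    shift is the (hypothetical) distance between the H0 and H1 means,
    expressed in standard errors.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Area of the H1 distribution that falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

for alpha in (0.01, 0.05, 0.10):
    power = approx_power(shift=2.0, alpha=alpha)
    print(f"alpha={alpha:.2f}  power={power:.3f}  beta={1 - power:.3f}")
```

With the shift held fixed, a larger alpha buys power only by accepting a higher type I error rate.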


The further that H_{1} is shifted away from H_{0}, the greater the power

The smaller the standard error of the mean (i.e., the less the two distributions overlap), the greater the power. As suggested by the CLT, the standard error of the mean is a function of the **population variance** and the **sample size (N)**

Most power calculations use a term called effect size
which is actually a measure of the degree to which the H_{0} and
H_{1} distributions overlap

As such, effect size is sensitive to both the difference
between the means under H_{0} and H_{1}, and the standard
deviation of the parent populations

Specifically:

$$d = \frac{\mu_{1} - \mu_{0}}{\sigma}$$

In English then, d is the number of standard deviations separating the mean of H_{0} and the mean of H_{1}

Note: N has not been incorporated in the above formula. You'll see why shortly
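As a small illustration, d can be computed directly from the two means and the population standard deviation. The function name `effect_size_d` and the numbers below are hypothetical, chosen only to show the arithmetic:

```python
def effect_size_d(mean_h1, mean_h0, sigma):
    """Number of standard deviations separating the H1 and H0 means."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (mean_h1 - mean_h0) / sigma

# Hypothetical values: H0 mean of 100, H1 mean of 105, population SD of 15
d = effect_size_d(mean_h1=105, mean_h0=100, sigma=15)
print(f"d = {d:.2f}")
```

The function takes no sample size argument, which mirrors the point that N is deliberately left out of d.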

As d forms the basis of all calculations of power, the first step in these calculations is to estimate d

Since we do not typically know how big the effect will be a priori, we must make an educated guess on the basis of:

- Prior research
- An assessment of the size of effect that would be important
- Rule of thumb:
  - small effect d=.20
  - medium effect d=.50
  - large effect d=.80

The calculation of d took into account 1) the difference between the means of H_{0} and H_{1}, and 2) the standard deviation of the parent populations

However, it did not take into account the third variable that affects the overlap of the two distributions: N

This was done purposefully so that we have one term that represents the relevant variables we, as experimenters, can do nothing about (d) and another representing the variable we can do something about: N

The statistic we use to recombine these factors is called delta (δ) and is computed as follows:

$$\delta = d\,[f(N)]$$

where the specific $f(N)$ differs depending on the type of t-test you are computing the power for

In the context of a one sample t-test, the $f(N)$ alluded to above is simply $\sqrt{N}$

Thus, when calculating the power associated with a one sample t, you must go through the following steps:

1) Estimate d, or calculate it using:

$$d = \frac{\mu_{1} - \mu_{0}}{\sigma}$$

2) Calculate δ using:

$$\delta = d\sqrt{N}$$

3) Go to the power table, and find the power associated with the calculated δ given the level of α you plan to use (or used) for the t-test

Say I find a new stats textbook and, after looking at it, I think it will raise the average mark of the class by about 8 points. From previous classes, I am able to estimate the population standard deviation as 15. If I now test out the new text by using it with 20 new students, what is my power to reject the null hypothesis (that the new students' marks are the same as the old students' marks)?
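The three steps for this example can be sketched in Python. Two assumptions are made here that the notes do not state: a two-tailed test at α = .05, and a normal approximation standing in for the printed power table (so the result should be close to, but not necessarily identical to, the tabled value). The helper name `power_from_delta` is made up for this sketch:

```python
from math import sqrt
from statistics import NormalDist

def power_from_delta(delta, alpha=0.05):
    """Approximate two-tailed power for a given delta (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Area of the H1 distribution beyond either critical value
    return (1 - z.cdf(z_crit - delta)) + z.cdf(-z_crit - delta)

# Step 1: effect size from the expected 8-point gain and SD of 15
d = 8 / 15
# Step 2: one-sample delta = d * sqrt(N), with N = 20 students
delta = d * sqrt(20)
# Step 3: power (approximation in place of the table lookup)
power = power_from_delta(delta)

print(f"d = {d:.3f}, delta = {delta:.2f}, power = {power:.2f}")
```

Under these assumptions the power comes out at roughly two-thirds, so about one such study in three would fail to detect a true 8-point improvement.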

How many new students would I have to test to bring my power up to .90?
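The same relationship can be run backwards: pick the δ that yields the target power, then solve δ = d√N for N. The sketch below again assumes a two-tailed α of .05 and a normal approximation rather than the printed table; `required_n_one_sample` is a hypothetical helper name. A table that rounds δ may give an answer that differs by a subject or so.

```python
from math import ceil
from statistics import NormalDist

def required_n_one_sample(d, target_power, alpha=0.05):
    """Return (delta needed, smallest N) for a one-sample test."""
    z = NormalDist()
    # delta that places target_power of the H1 distribution beyond the
    # upper critical value (the negligible lower tail is ignored)
    delta_needed = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(target_power)
    # Invert delta = d * sqrt(N) and round up to a whole subject
    n = ceil((delta_needed / d) ** 2)
    return delta_needed, n

delta_needed, n = required_n_one_sample(d=8 / 15, target_power=0.90)
print(f"delta needed = {delta_needed:.2f}, N = {n}")
```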

Note: Don't worry about the bit on "noncentrality parameters" in the book

When an independent t-test is used, the power calculations use the same computation for calculating d, but the calculations of δ are different because of a different $f(N)$

When sample sizes are equal, you do the following:

1) Estimate d, or calculate it using:

$$d = \frac{\mu_{1} - \mu_{0}}{\sigma}$$

2) Calculate δ using:

$$\delta = d\sqrt{\frac{N}{2}}$$

where N is the number of subjects in one of the samples

3) Go to the power table, and find the power associated with the calculated δ given the level of α you plan to use (or used) for the t-test

Assume I am going to run two groups of 18 subjects through a non-smoking study. One group will receive the treatment of interest, the other will not. I expect the treatment to have a medium effect, but I have nothing to go on other than that. Assuming there really is a medium effect, what is my power to detect it?

How many subjects would I need to run to increase my power to 0.80?
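Both questions about the non-smoking study can be sketched in Python using δ = d√(N/2), where N is the size of one group. As assumptions not stated in the notes, the test is taken to be two-tailed at α = .05, and a normal approximation replaces the power table; the function names are made up for this sketch:

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate two-tailed power for equal-sized independent groups."""
    delta = d * sqrt(n_per_group / 2)
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return (1 - Z.cdf(z_crit - delta)) + Z.cdf(-z_crit - delta)

def required_n_per_group(d, target_power, alpha=0.05):
    """Smallest per-group N reaching target_power (lower tail ignored)."""
    delta_needed = Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(target_power)
    # Invert delta = d * sqrt(N / 2)
    return ceil(2 * (delta_needed / d) ** 2)

# Medium effect (d = .50) with 18 subjects in each group
power_18 = power_two_sample(d=0.50, n_per_group=18)
n_needed = required_n_per_group(d=0.50, target_power=0.80)

print(f"power with 18 per group = {power_18:.2f}")
print(f"subjects per group for power .80 = {n_needed}")
```

Note that the second answer is a per-group figure, so the total number of subjects is twice that.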

Power calculations for independent samples t-tests become slightly more complicated when Ns are unequal.

The proper way to deal with the situation is to do everything the same as above except to use the harmonic mean of the two Ns (N_{1} & N_{2}) in place of N

The harmonic mean of two Ns is denoted $\bar{N}_{h}$ and computed as follows:

$$\bar{N}_{h} = \frac{2 N_{1} N_{2}}{N_{1} + N_{2}}$$

So, as a final example, reconsider the power of my smoking study if I had run 24 subjects in my stop smoking group, but only 12 in my control group.
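A sketch of this final example in Python, keeping the medium effect (d = .50) from the earlier version of the study. The two-tailed α of .05 and the normal approximation in place of the power table are assumptions of the sketch, and the function names are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def harmonic_n(n1, n2):
    """Harmonic mean of two sample sizes: 2 * N1 * N2 / (N1 + N2)."""
    return 2 * n1 * n2 / (n1 + n2)

def power_two_sample(d, n, alpha=0.05):
    """Approximate two-tailed power using delta = d * sqrt(n / 2)."""
    z = NormalDist()
    delta = d * sqrt(n / 2)
    z_crit = z.inv_cdf(1 - alpha / 2)
    return (1 - z.cdf(z_crit - delta)) + z.cdf(-z_crit - delta)

n_h = harmonic_n(24, 12)                     # effective per-group N
power_unequal = power_two_sample(0.50, n_h)  # 24 vs 12 subjects
power_equal = power_two_sample(0.50, 18)     # 18 vs 18 subjects

print(f"harmonic mean N = {n_h:.0f}")
print(f"power (24/12) = {power_unequal:.2f}, power (18/18) = {power_equal:.2f}")
```

Both designs use 36 subjects in total, yet the unequal split behaves like two groups of 16, so it gives up some power relative to the balanced 18/18 design.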