Assessment 3 Context
You will review the theory, logic, and application of t-tests. The t-test is a basic inferential statistic often reported in psychological research. You will discover that t-tests, as well as analysis of variance (ANOVA), compare group means on some quantitative outcome variable.
Recall that null hypothesis tests are of two types: (1) differences between group means and (2) association between variables. In both cases there is a null hypothesis and an alternative hypothesis. In the group means test, the null hypothesis is that the two groups have equal means, and the alternative hypothesis is that the two groups do not have equal means. In the association between variables type of test, the null hypothesis is that the correlation coefficient between the two variables is zero, and the alternative hypothesis is that the correlation coefficient is not zero.
Notice in each case that the hypotheses are mutually exclusive. If the null is false, the alternative must be true. The purpose of null hypothesis statistical tests is generally to show that the observed data would be unlikely if the null were true (the p value is less than .05), unlikely enough that the researcher can legitimately reject the null. The reason this is done is to support the claim that the alternative hypothesis is true.
In this context you will be studying the details of the first type of test: the test of difference between group means. In variations on this model, the two groups can actually be the same people under different conditions, or one of the groups may be assigned a fixed theoretical value. The main idea is that two mean values are being compared. The two groups each have an average score or mean on some variable. The null hypothesis is that the difference between the means is zero. The alternative hypothesis is that the difference between the means is not zero. Notice that if the null is false, the alternative must be true. It is first instructive to consider some of the details of groups, means, and the differences between them.
Null Hypothesis Significance Test
The most common forms of the Null Hypothesis Significance Test (NHST) are three types of t tests, and the test of significance of a correlation. The NHST also extends to more complex tests, such as ANOVA, which will be discussed separately. Below, the null hypothesis and the alternative hypothesis are given for each of the following tests. It would be a valuable use of your time to commit the information below to memory. Once this is done, then when we refer to the tests later, you will have some structure to make sense of the more detailed explanations.
1. One-sample t test: The question in this test is whether a single sample group mean is significantly different from some stated or fixed theoretical value – the fixed value is called a parameter.
· Null Hypothesis: The difference between the sample group mean and the fixed value is zero in the population.
· Alternative hypothesis: The difference between the sample group mean and the fixed value is NOT zero in the population.
2. Dependent samples t test (also known as correlated groups t or repeated measures t): The question in this test is whether two scores for each participant differ significantly. It is actually a special case of the one-sample test, where each person’s score is the difference between his or her two original scores (difference scores). If there is no significant difference in the population, then the mean population difference score is zero (the fixed value).
· Null Hypothesis: The mean difference between the two scores for each participant is zero in the population.
· Alternative hypothesis: The mean difference between the two scores for each participant is NOT zero in the population.
3. Independent samples t test (two independent groups): The question in this test is whether or not two group means are from the same population, or from populations with different means.
· Null Hypothesis: The difference between the two groups’ means is zero in the population, or the two groups are from the same population.
· Alternative hypothesis: The difference between the two groups’ means is NOT zero in the population, or the two groups are from different populations.
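The claim that the dependent samples t test is a special case of the one-sample test can be checked with a short sketch. The Python below is only an illustration and is not part of the SPSS workflow; the function names and the pretest and posttest scores are hypothetical.

```python
import math
import statistics


def one_sample_t(scores, fixed_value):
    """Return (t, df) for a one-sample t test against a fixed value."""
    n = len(scores)
    # Standard error of the mean: sample SD divided by the square root of n
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    t = (statistics.mean(scores) - fixed_value) / standard_error
    return t, n - 1


def dependent_samples_t(first_scores, second_scores):
    """Paired t test: a one-sample t test on difference scores against zero."""
    differences = [b - a for a, b in zip(first_scores, second_scores)]
    return one_sample_t(differences, 0)


# Hypothetical pretest and posttest scores for five participants
pretest = [10, 12, 9, 14, 11]
posttest = [12, 15, 10, 15, 14]

t_paired, df_paired = dependent_samples_t(pretest, posttest)
print(f"t({df_paired}) = {t_paired:.2f}")  # t(4) = 4.47
```

The fixed value of zero in the call to one_sample_t is the null hypothesis value for the mean difference score.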
Logic of the t-Test
Imagine that a school psychologist compares the mean IQ scores of Class A versus Class B. The mean IQ for Class A is 102.0 and the mean IQ for Class B is 105.0. Is there a significant difference in mean IQ between Class A and Class B?
To answer this question, the school psychologist conducts an independent samples t-test. The independent samples t-test compares two group means in a between-subjects (between-S) design. In this between-S design, participants in two independent groups are measured only once on some outcome variable. By contrast, a paired samples t-test compares group means in a within-subjects (within-S) design for one group. Each participant is measured twice on some outcome variable, such as a pretest-posttest design. For example, a school psychologist could measure self-esteem for a class of students prior to taking a public speaking course (pretest) and then measure self-esteem again after completing the public speaking course (posttest). The paired samples t-test determines if there is a significant difference in mean scores from the pretest to the posttest.
Focus on the logic and application of the independent samples t-test. There are two variables in an independent samples t-test: the predictor variable (X) and the outcome variable (Y). The predictor variable must be dichotomous, meaning that it can only have two values (for example, male = 1; female = 2). Notice that this is a nominal-level variable. The outcome variable must be at the interval level or above (ratio). Group membership is mutually exclusive. In nonexperimental designs, group membership is based on some naturally occurring characteristic of a group (for example, gender). In experimental designs, participants are randomly assigned to one of two group conditions (for example, treatment group = 1; control group = 2). In contrast to the dichotomous (nominal) predictor variable, the outcome variable must be quantitative to calculate a group mean (for example, mean IQ score, mean heart rate score).
Assumptions of the t-Test
All inferential statistics, including the independent samples t-test, operate under assumptions that should be checked before calculating the t-test in SPSS. Violations of assumptions can lead to erroneous inferences regarding a null hypothesis. The first assumption is independence of observations. For predictor variable X in an independent samples t-test, participants are assigned to one and only one “condition” or “level,” such as a treatment group or control group. This assumption is not statistical in nature; it is controlled by proper research procedures that maintain independence of observations.
The second assumption is that outcome variable Y is quantitative and normally distributed. This assumption is checked by a visual inspection of the Y histogram and calculation of skewness and kurtosis values. A researcher may also conduct a Shapiro-Wilk test in SPSS to check whether a distribution is significantly different from normal. The null hypothesis of the Shapiro-Wilk test is that the distribution is normal. If the Shapiro-Wilk test is significant, then the normality assumption is violated. In other words, a researcher wants the Shapiro-Wilk test to not be significant at p < .05.
The third assumption is referred to as the homogeneity of variance assumption. Ideally, the amount of variance in Y scores is approximately equal for group 1 and group 2. This assumption is checked in SPSS with the Levene test. The null hypothesis of the Levene test is that group variances are equal. If the Levene test is significant, then the homogeneity assumption is violated. In other words, a researcher wants the Levene test to not be significant at p < .05.
SPSS output for the t-test provides two versions of the t-test: “Equal variances assumed” and “Equal variances not assumed.” If the Levene test is not significant, researchers report the “Equal variances assumed” version of the t-test. If the Levene test is significant, researchers report the more conservative “Equal variances not assumed” calculation of the t-test in the second row of the output table.
Hypothesis Testing for a t-Test
The null hypothesis for a t-test predicts no significant difference in population means, or H0: µ1 = µ2. A directional alternative hypothesis for a t-test is that the population means differ in a specific direction, such as H1: µ1 > µ2 or H1: µ1 < µ2. A non-directional alternative hypothesis simply predicts that the population means differ, but it does not stipulate which population mean is significantly greater (H1: µ1 ≠ µ2). For t-tests, the standard alpha level for rejecting the null hypothesis is set to .05. SPSS output for a t-test showing a p value of less than .05 indicates that the null hypothesis should be rejected; there is a significant difference in population means. A p value greater than .05 indicates that the null hypothesis should not be rejected; there is not a significant difference in population means.
Effect Size for a t-Test
Two commonly reported estimates of effect size for the independent samples t-test are eta squared (η²) and Cohen’s d. Eta squared is analogous to r². It estimates the proportion of variance in Y that is attributable to group differences in X. Eta squared ranges from 0 to 1.0, and it is interpreted similarly to r² in terms of “small,” “medium,” and “large” effect sizes. Eta squared is calculated as a function of an obtained t value and the study degrees of freedom.
Cohen’s d is an alternate effect size representing the number of standard deviations that separate the two group means in the sample. A small Cohen’s d (< .20) indicates a high degree of overlap between the two groups’ score distributions. A large Cohen’s d (> .80) indicates a low degree of overlap.
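As a rough illustration of Cohen’s d, the sketch below divides the difference in group means by a pooled standard deviation. Using the pooled standard deviation is an assumption here (it is one common convention, and the text above does not say which standardizer to use), and the function name and scores are hypothetical.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Difference in group means expressed in pooled standard deviation units."""
    n1, n2 = len(group1), len(group2)
    # Assumption: standardize by the pooled SD (weighted by each group's df)
    pooled_variance = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    mean_difference = statistics.mean(group1) - statistics.mean(group2)
    return mean_difference / math.sqrt(pooled_variance)


# Hypothetical scores for two small groups
treatment_scores = [7, 9, 8, 10, 11]
control_scores = [4, 6, 5, 7, 8]

d = cohens_d(treatment_scores, control_scores)
print(f"Cohen's d = {d:.2f}")  # Cohen's d = 1.90
```

A value this far above .80 would be read as a large effect, with little overlap between the two groups’ scores.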
Testing Assumptions: The Shapiro-Wilk Test and the Levene Test
Recall that two assumptions of the t-test are that:
1. Outcome variable Y is normally distributed.
2. The variance of Y scores is approximately equal across groups (homogeneity assumption).
The Shapiro-Wilk Test
In addition to a visual inspection of histograms and skewness and kurtosis values, SPSS provides a formal statistical test of normality referred to as the Shapiro-Wilk test. A perfect normal distribution will have a Shapiro-Wilk value of 1.0. Values less than 1.0 indicate an increasing departure from a perfect normal shape. The null hypothesis of the Shapiro-Wilk test is that the distribution is normal. When the Shapiro-Wilk test indicates a p value less than .05, the normality assumption is violated.
To obtain the Shapiro-Wilk test, in SPSS select “Analyze…Descriptive Statistics…Explore.” Place the outcome variable Y in the “Dependent List” box and select the “Plots” option. Select the “Normality plots with tests” option. Press “Continue” and then “OK.” SPSS provides the Shapiro-Wilk test output for interpretation. A significant Shapiro-Wilk test (p < .05) suggests that the distribution is not normal and interpretations may be affected. However, the t-test is fairly robust to violations of this assumption when sample sizes are sufficiently large (that is, > 100).
The Levene Test
The homogeneity of variance assumption is tested with the Levene test. The Levene test is automatically generated in SPSS when an independent samples t-test is conducted. The null hypothesis for the Levene test is that group variances are equal. A significant Levene test (p < .05) indicates that the homogeneity of variance assumption is violated. In this case, report the “Equal variances not assumed” row of the t-test output. This version of the t-test uses a more conservative adjusted degrees of freedom (df) that compensates for the homogeneity violation. The adjusted df can often result in a decimal number (for example, df = 13.4), which is commonly rounded to a whole number in reporting (for example, df = 13). If the Levene test is not significant (that is, homogeneity is assumed), report the “Equal variances assumed” row of the t-test output.
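To see why the adjusted df is often a decimal, the sketch below computes the “Equal variances not assumed” t value with the Welch-Satterthwaite approximation, which is the standard adjustment for this row of output. The function name and the two sets of scores are hypothetical, chosen so that the group variances are very unequal.

```python
import math
import statistics


def unequal_variances_t(group1, group2):
    """Return (t, adjusted df) without assuming equal group variances."""
    n1, n2 = len(group1), len(group2)
    # Each group's contribution to the squared standard error of the difference
    se1_squared = statistics.variance(group1) / n1
    se2_squared = statistics.variance(group2) / n2
    t = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(
        se1_squared + se2_squared
    )
    # Welch-Satterthwaite approximation for the degrees of freedom
    adjusted_df = (se1_squared + se2_squared) ** 2 / (
        se1_squared ** 2 / (n1 - 1) + se2_squared ** 2 / (n2 - 1)
    )
    return t, adjusted_df


# Hypothetical scores: group_a has a small variance, group_b a large one
group_a = [10, 11, 9, 10, 10]
group_b = [4, 12, 8, 16, 0, 8]

t_welch, df_welch = unequal_variances_t(group_a, group_b)
print(f"t({df_welch:.1f}) = {t_welch:.2f}")  # t(5.2) = 0.86
```

With these scores the unadjusted df would be n1 + n2 - 2 = 9, so the adjustment costs several degrees of freedom, which is what makes this version of the test more conservative.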
Proper Reporting of the Independent Samples t-Test
Reporting a t-test in proper APA style requires the following elements: the statistical notation for an independent samples t-test (t), the degrees of freedom, the t value, the probability value, and the effect size. To provide context, provide the means and standard deviations for each group. For example, imagine an industrial/organizational psychologist randomly assigns 9 employees to a treatment group (for example, team-bonding exercises) and 9 employees to a control group (for example, no exercises) and then subsequently measures their rates of organizational citizenship behavior (OCB) over a period of six months. The results show:
The mean OCB scores differed significantly across groups, t(16) = -2.58, p = .02 (two-tailed). Mean OCB for the control group (M = 67.8, SD = 8.2) was about 10 OCB points lower than mean OCB for the treatment group (M = 77.9, SD = 8.1). The effect size, as indexed by η², was .30; this is a very large effect.
t, Degrees of Freedom, and t Value
The statistical notation for an independent samples t-test is t, and following it is the degrees of freedom for this statistical test. The degrees of freedom for t is n1 + n2 – 2, where n1 equals the number of participants in group 1 and n2 equals the number of participants in group 2. In the example above, N = 18 (n1 = 9; n2 = 9). The t value is a ratio of the difference in group means divided by the standard error of the difference in sample means. The t value can be either positive or negative.
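The ratio described above can be sketched in a few lines. This is an illustrative calculation for the equal-variances case, in which the standard error of the difference is built from a pooled variance; the function name and the scores are hypothetical and are not the OCB data.

```python
import math
import statistics


def independent_samples_t(group1, group2):
    """Return (t, df) for an independent samples t test, equal variances assumed."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its own df
    pooled_variance = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / df
    # Standard error of the difference in sample means
    se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_difference
    return t, df


# Hypothetical scores for a control group and a treatment group
control = [4, 6, 5, 7, 8]
treatment = [7, 9, 8, 10, 11]

t_value, df_value = independent_samples_t(control, treatment)
print(f"t({df_value}) = {t_value:.2f}")  # t(8) = -3.00
```

The sign of t depends only on which group is entered first: here the control mean is lower, so t is negative.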
A researcher estimates the probability value based on a table of critical values of t for rejecting the null hypothesis. In the example above, with 16 degrees of freedom and alpha level set to .05 (two-tailed), the table indicates a critical value of +/- 2.12 to reject the null hypothesis. The obtained t value above is -2.58, which exceeds the critical value in absolute value (2.58 > 2.12), so the null hypothesis is rejected. SPSS determined the exact p value to be .02. This p value is less than .05, which indicates that the null hypothesis should be rejected for the alternative hypothesis (that is, the two groups are significantly different in mean OCB).
A common index of effect size for the independent samples t-test is eta squared (η²). SPSS does not provide this output for the independent samples t-test, but it is easily calculated by hand with the following formula: t² ÷ (t² + df). In the example above, the calculation is (-2.58)² ÷ [(-2.58)² + 16] = 6.66 ÷ (6.66 + 16) = 6.66 ÷ 22.66 = .29, close to the .30 reported above. Note that the cutoffs of .20 and .80 apply to Cohen’s d, not to eta squared. By Cohen’s commonly cited benchmarks for eta squared (roughly .01 small, .06 medium, and .14 large), a value of .29 is a large effect.
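The table lookup and the exact p value can be cross-checked numerically. The sketch below is an assumption-laden illustration, not how SPSS computes p: it integrates the t distribution’s density with Simpson’s rule using only the Python standard library, and the function names are hypothetical.

```python
import math


def t_density(x, df):
    """Probability density of the t distribution with df degrees of freedom."""
    coefficient = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p value; steps must be an even number."""
    upper = abs(t)
    width = upper / steps
    # Simpson's rule for the area under the density between 0 and |t|
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * width, df)
    area_zero_to_t = total * width / 3
    # By symmetry, the two tails together hold 1 minus twice that area
    return 1 - 2 * area_zero_to_t


p_obtained = two_tailed_p(-2.58, 16)   # obtained t from the example
p_critical = two_tailed_p(2.12, 16)    # tabled critical value at alpha = .05
print(f"p for t = -2.58: {p_obtained:.3f}")
print(f"p for t = 2.12: {p_critical:.3f}")
```

The first value should come out near .02 and the second near .05, matching the reported p value and the tabled critical value respectively.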
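The hand calculation of eta squared is easy to reproduce. The short Python sketch below applies the same formula to the t value and df from the example; the function name is hypothetical.

```python
def eta_squared(t, df):
    """Eta squared from an obtained t value and its degrees of freedom."""
    return t ** 2 / (t ** 2 + df)


effect_size = eta_squared(-2.58, 16)
print(round(effect_size, 2))  # 0.29
```

Because t is squared, the sign of t does not affect the result.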
Lane, D. M. (2013). HyperStat online statistics textbook. Retrieved from http://davidmlane.com/hyperstat/index.html
Warner, R. M. (2013). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage Publications.