## Chi-Square & T-Test

The Chi-Square statistic can be computed by clicking on [Statistics => Summarize => Crosstabs...]. This procedure is your first introduction to coding data in the Data Editor. To this point, data have been entered in a column format, that is, one variable per column. However, that method is not sufficient in a number of situations, including the calculation of Chi-Square, independent t-tests, and any factorial ANOVA design with between-subjects factors. There are many other cases, but they will not be covered in this tutorial. Essentially, the data have to be entered in a specific format that makes the analysis possible. That format typically reflects the design of the study, as the examples will demonstrate.

In your text, the following data appear in section 6.????. Please read the text for a description of the study. Essentially, the table below includes the observed data, with the expected data in parentheses.

| Fault | Guilty        | Not Guilty   | Total |
|-------|---------------|--------------|-------|
| Low   | 153 (127.559) | 24 (49.441)  | 177   |
| High  | 105 (130.441) | 76 (50.559)  | 181   |
| Total | 258           | 100          | 358   |

In the hopes of minimizing the load time for the remaining pages, I will use the built-in table facility of HTML to simulate the SPSS Data Editor. This reduces the number of images/screen captures to be loaded.

For the Chi-Square statistic, the table of data can be coded by indexing the column and row of each observation. For example, the count for Guilty with Low fault is 153; that cell can be indexed as coming from row=1 and column=1. Similarly, Not Guilty with High fault is coded as row=2 and column=2. For each cell, four in this instance, there is a unique code for its location in the table. These can be entered as follows,

| Row | Column | Count |
|-----|--------|-------|
| 1   | 1      | 153   |
| 1   | 2      | 24    |
| 2   | 1      | 105   |
| 2   | 2      | 76    |

• So, 2 rows * 2 columns equals 4 unique cells.  That should be clear.
• For each of the rows, there are 2 corresponding columns; this is reflected in the Count column.  The Count column records the number of times each unique combination of Row and Column occurs.

The above presents the data in an unambiguous manner.  Once entered, the analysis is a matter of selecting the desired menu items, and perhaps selecting additional options for that statistic.  [Don't forget to use the labelling facilities, as mentioned earlier, to meaningfully identify the columns/variables.  The labels that are chosen will appear in the output window.]

To perform the analysis,

• The first step is to inform SPSS that the COUNT variable represents the frequency for each unique coding of ROW and COLUMN, by invoking the WEIGHT command. To do this, click on [Data => Weight Cases]. In the resultant dialog box, enable the Weight cases by option, then move the COUNT variable into the Frequency Variable box. If this step is forgotten, each cell of the table will have a count of 1.

• Now that the COUNT variable has been set as the weighting variable, select [Statistics => Summarize => Crosstabs...] to launch the controlling dialog box.

• At the bottom of the dialog box are three buttons, the most important being [Statistics...]. You must click on the [Statistics...] button and then select the Chi-square option, otherwise the statistic will not be calculated. Exploring this dialog box makes it clear that SPSS can calculate a number of other statistics in conjunction with Chi-square. For example, one can select various measures of association (e.g., the contingency coefficient, phi, and Cramér's V), among others.

• Move the ROW variable into the Row(s): box, and the COLUMN variable into the Column(s): box, then click [OK] to perform the analysis. A subset of the output looks like the following,

Although simple, the calculation of the Chi-square statistic is very particular about all the required steps being followed. More generally, as we enter hypothesis testing, you should be careful and make use of the manuals for the programme and textbooks for statistics.
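For readers who want to check the result outside SPSS, the same Pearson chi-square can be sketched in Python with SciPy (this is an illustration under the assumption that SciPy is available, not part of the SPSS procedure). Note that `chi2_contingency` applies Yates' continuity correction to 2x2 tables by default, so `correction=False` is needed to match the plain Pearson statistic. Cramér's V, one of the association measures available under the [Statistics...] button, is also easy to derive:

```python
import numpy as np
from scipy import stats

# Observed counts from the Fault x Verdict table
# (rows: Low/High fault; columns: Guilty/Not Guilty)
observed = np.array([[153, 24],
                     [105, 76]])

# correction=False gives the uncorrected Pearson chi-square,
# matching the hand calculation in the text
chi2, p, dof, expected = stats.chi2_contingency(observed, correction=False)

print(round(chi2, 2), dof)    # chi-square statistic with 1 df
print(np.round(expected, 3))  # matches the parenthesized expected values

# Cramér's V from the chi-square statistic; for a 2x2 table this
# equals phi
n = observed.sum()
cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))
```

The expected frequencies returned by SciPy reproduce the parenthesized values in the table above, which is a useful sanity check on the data entry.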

## T-tests

By now, you should know that there are two forms of the t-test: one for dependent observations and one for independent observations. To inform SPSS, or any stats package for that matter, of the type of design, it is necessary to have two different ways of laying out the data. For the dependent design, the two variables in question must be entered in two columns. For independent t-tests, the observations for the two groups must be uniquely coded with a Group variable. Like the calculation of the Chi-square statistic, these calculations will reinforce the practice of thinking about, and laying out, the data in the correct format.
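The two layouts can be sketched in a few lines of Python (the scores here are made-up illustrations, not Howell's data), which also shows how the group-coded layout can be split back into separate groups when needed:

```python
# Dependent design: two measures per subject, entered as two
# parallel columns of equal length
mnths_6  = [124, 94, 115]   # hypothetical scores at 6 months
mnths_24 = [114, 88, 102]   # hypothetical scores at 24 months

# Independent design: one score column plus a group code for
# every observation (group sizes need not be equal)
group = [1, 1, 1, 2, 2]
score = [96, 127, 119, 114, 88]

# The group codes let us recover each group's scores
exp = [s for g, s in zip(group, score) if g == 1]
con = [s for g, s in zip(group, score) if g == 2]
```

The key point is the same one SPSS enforces: the dependent layout pairs observations by row, while the independent layout tags each observation with its group.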

## Dependent T-Test

To calculate this statistic, select [Statistics => Compare Means => Paired-Samples T Test...] after entering the data. For this analysis, we'll use the data from Table 7.3 in Howell.

• Enter the data into a new datafile. Your data should look a bit like the following. That is, the two variables should occupy separate columns...

| Mnths_6 | Mnths_24 |
|---------|----------|
| 124     | 114      |
| 94      | 88       |
| 115     | 102      |
| ...     | ...      |
| 123     | 132      |

Note that the variable names start with a letter and are fewer than 8 characters long. This is a bit constraining; however, one can use the variable label option to give the variable a longer, more descriptive name, which will then be reproduced in the output window.

• To calculate the t statistic, click on [Statistics => Compare Means => Paired-Samples T Test...], then select the two variables of interest. To select both, hold down the [Shift] key while clicking with the mouse; the selection box requires that variables be selected two at a time. Once the two variables have been selected, move them to the Paired Variables: list. This procedure can be repeated for each pair of variables to be analyzed. In this case, select MNTHS_6 and MNTHS_24 together, move them to the Paired Variables: list, and click the [OK] button.

The critical result for the current analysis will appear in the output window as follows,

As you can see, an exact t-value is provided along with an exact p-value, and this p-value is greater than the criterion of 0.025 for a two-tailed assessment. Closer examination shows that several other statistics are presented in the output window.

Quite simply, such calculations require very little effort!
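The paired calculation can also be sketched with SciPy's `ttest_rel` (a sketch only; the scores below are hypothetical stand-ins, since the Table 7.3 values are not fully reproduced here):

```python
from scipy import stats

# Hypothetical paired scores for ten subjects at two time points
mnths_6  = [124, 94, 115, 110, 116, 139, 116, 110, 129, 120]
mnths_24 = [114, 88, 102, 105, 112, 131, 110, 108, 120, 115]

# Paired-samples t-test: operates on the within-subject differences
t, p = stats.ttest_rel(mnths_6, mnths_24)
print(round(t, 3), round(p, 4))
```

As in SPSS, the two measures are supplied as paired columns of equal length; the test is computed on the difference scores.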

## Independent T-tests

When calculating an independent t-test, the only difference involves the way the data are laid out in the datasheet. The datasheet must include both the raw data and a group code for each observation. For this example, the data from Table 7.5 will be used. As an added bonus, the numbers of observations are unequal in this example.

Take a look at the following table to get a feel for how to code the data.

| Group | Exp_Con |
|-------|---------|
| 1     | 96      |
| 1     | 127     |
| 1     | 127     |
| 1     | 119     |
| 1     | 109     |
| 1     | 143     |
| 1     | ...     |
| 1     | ...     |
| 1     | 106     |
| 1     | 109     |
| 2     | 114     |
| 2     | 88      |
| 2     | 104     |
| 2     | 104     |
| 2     | 91      |
| 2     | 96      |
| 2     | ...     |
| 2     | ...     |
| 2     | 114     |
| 2     | 132     |

From the above, you can see that the "Group" variable codes for the two groups. A value of 1 was used to code "LBW-Experimental", while a value of 2 was used to code "LBW-Control". If you're confused, please study the table above.

To generate the t-statistic,

• Click on [Statistics => Compare Means => Independent-Samples T Test] to launch the appropriate dialog box.

• Select "exp_con" - the dependent variable - and move it to the Test Variable(s): box.

• Select "group" - the grouping variable - and move it to the Grouping Variable: box.

• The final step requires that the groups be defined. That is, one must specify that Group 1 - the experimental group in this case - is coded as 1, and Group 2 - the control group in this case - is coded as 2. To do this, click on the [Define Groups...] button and enter the two codes, then click on the [Continue] button to return to the controlling dialog box.

• Run the analysis by clicking on the [OK] button.

The output for the current analysis extracted from the output window looks like the following.

The p-value of .004 is well below the cutoff of 0.025, which suggests that the means are significantly different. In addition, a Levene's test is performed to ensure that the correct results are used. In this case the variances are equal; however, the calculations for unequal variances are also presented, along with some other statistics not discussed here.
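The same logic can be sketched in Python with SciPy (hypothetical stand-in scores, since Table 7.5 is not fully reproduced here); note how Levene's test is consulted first, mirroring the choice between the equal- and unequal-variance rows of the SPSS output:

```python
from scipy import stats

# Hypothetical group-coded data with unequal group sizes
experimental = [96, 127, 127, 119, 109, 143, 106, 109]  # group == 1
control      = [114, 88, 104, 104, 91, 96, 114]         # group == 2

# Levene's test checks the equal-variance assumption
w, p_levene = stats.levene(experimental, control)

# equal_var selects which version of the t-test to report,
# just as Levene's test tells you which row of the SPSS output to read
t, p = stats.ttest_ind(experimental, control,
                       equal_var=p_levene > 0.05)
print(round(t, 3), round(p, 4))
```

Supplying the two groups as separate lists plays the role of the Define Groups step: it tells the routine which observations belong to which group.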

In the next section we will briefly demonstrate the calculation of correlations and regression, as discussed in Chapter 9 of Howell. In truth, you should be able to work through many statistics with your current knowledge base and the help files, including correlations and regressions. Most statistics can be calculated with a few clicks of the mouse.