Crosstabulation tables (contingency tables) display the
relationship between two or more categorical (nominal or ordinal) variables. The
size of the table is determined by the number of distinct values for each
variable, with each cell in the table representing a unique combination of
values. Numerous statistical tests are available to determine whether there is a
relationship between the variables in a table. This tutorial uses the file demo.sav.
- A Simple Crosstabulation
- Counts vs. Percentages
- Significance Testing for Crosstabulations
- Adding a Layer Variable
1) What factors affect the products that people buy? The most obvious is probably how much money people have to spend. In this example, we'll examine the relationship between income level and PDA (personal digital assistant) ownership.
1a) From the menus choose:
Note: This feature requires the Statistics Base option.
1b) Select Income category in thousands (inccat) as the row variable.
1c) Select Owns PDA (ownpda) as the column variable.
1d) Click OK to run the procedure.
2) The cells of the table show the count or number of cases for
each joint combination of values. For example, 455 people in the income range
$25,000–$49,000 own PDAs.
3) None of the numbers in this table, however, stands out in a way
that indicates an obvious relationship between the variables.
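For readers who want to see what the count table is doing, the following is a minimal sketch in Python using only the standard library. The records are invented for illustration and are not taken from demo.sav, and the function name crosstab is hypothetical. It counts the cases for each joint combination of row and column values, which is what each cell of the table reports.

```python
from collections import Counter

# Hypothetical records: (income category, owns PDA). These values are
# invented for illustration and are NOT taken from demo.sav.
records = [
    ("Under $25", "No"), ("Under $25", "No"), ("Under $25", "Yes"),
    ("$25 - $49", "No"), ("$25 - $49", "Yes"), ("$25 - $49", "Yes"),
    ("$50 - $74", "Yes"),
]

def crosstab(pairs):
    """Count cases for each joint (row value, column value) combination."""
    return Counter(pairs)

counts = crosstab(records)
rows = sorted({r for r, _ in records})
cols = sorted({c for _, c in records})

# One line per row category, one count per column category.
# A combination that never occurs is reported as 0.
print("columns:", cols)
for r in rows:
    print(r, [counts[(r, c)] for c in cols])
```

The table size follows from the data: one row per distinct income category and one column per distinct ownership value.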
1) It is often difficult to analyze a crosstabulation by looking only at the
raw counts in each cell.
2) The fact that there are more than twice as many PDA owners
in the $25,000–$49,000 income category as in the under $25,000 category may
not mean much (or anything), since there are also more than twice as many people
in that income category.
2a) Open the Crosstabs dialog box again. (The two variables should still be selected.)
2b) You can use the Dialog Recall button on the toolbar to quickly return to recently used procedures.
2c) Click Cells.
2d) Click (check) Row in the Percentages group.
2e) Click Continue and then click OK in the main dialog box to run the procedure.
3) A clearer picture now starts to emerge. The percentage of people
who own PDAs rises as the income category rises.
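The row-percentage calculation can be sketched in a few lines of Python. The counts below are hypothetical (not the demo.sav results), and row_percentages is an illustrative name. Each cell is divided by its row total, so categories of different sizes can be compared directly.

```python
def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for row_label, row_counts in table.items():
        total = sum(row_counts.values())
        result[row_label] = {
            col: (100.0 * n / total if total else 0.0)
            for col, n in row_counts.items()
        }
    return result

# Hypothetical counts, invented for illustration (NOT from demo.sav).
table = {
    "Under $25": {"No": 90, "Yes": 10},
    "$25 - $49": {"No": 160, "Yes": 40},
}

pct = row_percentages(table)
for label, row in pct.items():
    print(label, {col: round(p, 1) for col, p in row.items()})
```

In this made-up table the second category has four times as many owners but also twice as many people, so the raw counts overstate the difference that the row percentages show.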
1) The purpose of a crosstabulation is to show the relationship (or lack thereof)
between two variables.
2) Although there appears to be some relationship between the
two variables, is there any reason to believe that the differences in PDA
ownership between different income categories are anything more than random
variation?
3) A number of tests are available to determine if the relationship
between two crosstabulated variables is significant. One of the more common
tests is chi-square. One of the advantages of
chi-square is that it is appropriate for almost any kind of data.
3a) Open the Crosstabs dialog box again.
3b) Click Statistics.
4) Pearson chi-square tests the
hypothesis that the row and column variables are independent. The actual value
of the statistic isn't very informative.
5) The significance value (Asymp. Sig.) has the information we're looking for. The lower the significance
value, the less likely it is that the two variables are independent (unrelated).
6) In this case, the significance value is so low that it is
displayed as .000 (that is, it rounds to zero at three decimal places), which
suggests that the two variables are, indeed, related.
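To make the test concrete, here is a rough sketch of the Pearson chi-square computation in Python, using only the standard library. The counts are hypothetical, the function names are illustrative, and the upper-tail probability is computed with a textbook series for the incomplete gamma function rather than with whatever routine the statistics package uses internally. Expected counts come from the row and column totals under the independence hypothesis, and the statistic sums the squared deviations of observed from expected, each divided by the expected count.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a table of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of
    freedom, via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-16:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Hypothetical 2 x 2 counts (rows: income category, columns: No / Yes).
# Invented for illustration; NOT the demo.sav results.
observed = [[90, 10],
            [160, 40]]

stat, df = pearson_chi_square(observed)
p_value = chi2_sf(stat, df)
print("chi-square =", round(stat, 3), "df =", df, "p =", round(p_value, 4))
```

The p value plays the role of the significance value in the output table: a small value means the observed counts would be unusual if the two variables were independent.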
=====================================
1) You can add a layer variable to create a three-way table in which categories of the row and column variables are further subdivided by categories of the layer variable.
2) This variable is sometimes referred to as the control variable because it may reveal how the relationship between the row and column variables changes when you "control" for the effects of the third variable.
2a) Open the Crosstabs dialog box again.
2b) Click Cells.
2c) Uncheck Row Percents.
2d) Click Continue.
2e) Select Level of Education (ed) as the layer variable.
2f) Click OK to run the procedure.
3) If you look at the crosstabulation table, it might appear that the
only thing we have accomplished is to make the table larger and harder to
interpret.
4) But if you look at the table of chi-square statistics, you can
easily see that in all but one of the education categories, the apparent
relationship between income and PDA ownership disappears (typically, a
significance value less than 0.05 is considered "significant").
5) This suggests that the apparent relationship between income
and PDA ownership is merely an artifact of the underlying relationship between
education level and PDA ownership.
6) Since income tends to rise as education rises, apparent
relationships between income and other variables may actually be the result of
differences in education.
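The effect of a layer variable can be illustrated with a small Python sketch. The counts are constructed by hand so that ownership depends only on education, which makes the pattern easy to see. They are hypothetical and much tidier than real survey data, and the category labels and helper names are invented for this example. The chi-square statistic is computed once for the pooled table and once within each education layer.

```python
from collections import defaultdict

def chi_square_stat(table):
    """Pearson chi-square statistic for a table of counts
    (assumes no row or column total is zero)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )

# Hypothetical cases: (education, income, owns PDA, number of cases).
# Constructed so that ownership depends only on education.
cases = [
    ("No degree", "Lower", "No", 72), ("No degree", "Lower", "Yes", 8),
    ("No degree", "Higher", "No", 18), ("No degree", "Higher", "Yes", 2),
    ("Degree", "Lower", "No", 10), ("Degree", "Lower", "Yes", 10),
    ("Degree", "Higher", "No", 40), ("Degree", "Higher", "Yes", 40),
]
INCOME = ["Lower", "Higher"]
OWNS = ["No", "Yes"]

def build_table(rows):
    """Build an income-by-ownership count table from (income, owns, n) rows."""
    counts = defaultdict(int)
    for income, owns, n in rows:
        counts[(income, owns)] += n
    return [[counts[(i, o)] for o in OWNS] for i in INCOME]

# Two-way table that ignores education.
pooled = build_table([(i, o, n) for _, i, o, n in cases])

# One two-way table per education category (the layers).
by_layer = {
    layer: build_table([(i, o, n) for ed, i, o, n in cases if ed == layer])
    for layer in sorted({ed for ed, _, _, _ in cases})
}

CRITICAL_05_DF1 = 3.841  # chi-square critical value for df = 1 at the 0.05 level

pooled_stat = chi_square_stat(pooled)
layer_stats = {layer: chi_square_stat(t) for layer, t in by_layer.items()}
print("pooled:", round(pooled_stat, 3), pooled_stat > CRITICAL_05_DF1)
for layer, s in layer_stats.items():
    print(layer + ":", round(s, 3), s > CRITICAL_05_DF1)
```

In this constructed example the pooled table shows a significant association between income and ownership, while within each education layer the ownership rate is identical across income categories, so the association vanishes once education is controlled for.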