Several categorical variables in the data file demo.sav are, in fact, derived from scale variables in that data file. For example, the variable inccat is simply income grouped into four categories.
This categorical variable uses the integer values 1–4 to represent the following income categories (in thousands): less than $25, $25–$49, $50–$74, and $75 or higher.
To create the categorical variable inccat:
► From the menus in the Data Editor window choose:
Since Visual Binning relies on actual values in the data file to help you make good binning choices, it needs to read the data file first. Since this can take some time if your data file contains a large number of cases, this initial dialog box also allows you to limit the number of cases to read ("scan").
This is not necessary for our sample data file. Even though it contains more than 6,000 cases, it does not take long to scan that number of cases.
► Drag and drop Household income in thousands [income] from the Variables list into the Variables to Bin list, and then click Continue.
► In the main Visual Binning dialog box, select Household income in thousands [income] in the Scanned Variable List.
A histogram displays the distribution of the selected variable
(which in this case is highly skewed).
► Enter inccat2 for the new binned variable name and Income category [in thousands] for the variable label.
► Click Make Cutpoints.
► Select Equal Width Intervals.
► Enter 25 for the first cutpoint location, 3 for the number of cutpoints, and 25 for the width.
The number of binned categories is one greater than the number of
cutpoints. So in this example, the new binned variable will have four
categories, with the first three categories each containing ranges of 25
(thousand) and the last one containing all values above the highest cutpoint
value of 75 (thousand).
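The arithmetic behind Equal Width Intervals can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not how the dialog box is implemented; the function name `equal_width_cutpoints` and its parameter names are assumptions chosen for this example.

```python
def equal_width_cutpoints(first, count, width):
    """Return `count` cutpoints, starting at `first` and spaced `width` apart."""
    return [first + i * width for i in range(count)]


# Same settings as in the step above: first cutpoint 25, 3 cutpoints, width 25.
cutpoints = equal_width_cutpoints(first=25, count=3, width=25)

# There is always one more binned category than there are cutpoints,
# because the last category collects everything above the highest cutpoint.
n_categories = len(cutpoints) + 1

print(cutpoints)     # [25, 50, 75]
print(n_categories)  # 4
```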
► Click Apply.
The values now displayed in the grid represent the defined
cutpoints, which are the upper endpoints of each category. Vertical lines in the
histogram also indicate the locations of the cutpoints.
By default, these cutpoint values are included in the
corresponding categories. For example, the first value of 25 would include all
values less than or equal to 25.
But in this example, we want categories that correspond to
less than 25, 25–49, 50–74, and 75 or higher.
► In the Upper Endpoints group, select Excluded (<).
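The difference between the two Upper Endpoints settings only matters for values that fall exactly on a cutpoint. The following Python sketch mimics both behaviors with the standard `bisect` module; the function `bin_value` and its `endpoint_included` flag are hypothetical names used for illustration, and the code is not taken from the software itself.

```python
from bisect import bisect_left, bisect_right

CUTPOINTS = [25, 50, 75]  # upper endpoints of the first three categories


def bin_value(value, cutpoints, endpoint_included=True):
    """Return the 1-based category number for `value`.

    endpoint_included=True  mimics Included (<=): a value equal to a
    cutpoint stays in the category below it.
    endpoint_included=False mimics Excluded (<): a value equal to a
    cutpoint moves up to the next category.
    """
    if endpoint_included:
        return bisect_left(cutpoints, value) + 1
    return bisect_right(cutpoints, value) + 1


# An income of exactly 25 (thousand) lands in a different category
# depending on the setting; values between cutpoints are unaffected.
print(bin_value(25, CUTPOINTS, endpoint_included=True))   # 1
print(bin_value(25, CUTPOINTS, endpoint_included=False))  # 2
print(bin_value(30, CUTPOINTS, endpoint_included=True))   # 2
print(bin_value(30, CUTPOINTS, endpoint_included=False))  # 2
```

With Excluded (<), the categories line up with less than 25, 25–49, 50–74, and 75 or higher, which is the grouping wanted in this example.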
► Then click Make Labels.
This automatically generates descriptive value labels for each
category. Since the actual values assigned to the new binned variable are simply
sequential integers starting with 1, the value labels can be very useful.
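To see why the labels help, the sketch below pairs each sequential integer with a descriptive label. It assumes whole-number income values and excluded upper endpoints, and the label wording follows the category descriptions in this tutorial rather than the exact text the software generates; `make_labels` is a hypothetical helper.

```python
def make_labels(cutpoints):
    """Map category numbers (1, 2, ...) to labels built from the cutpoints.

    Assumes integer cutpoints with excluded upper endpoints, so the
    category below a cutpoint ends at cutpoint - 1.
    """
    labels = [f"less than {cutpoints[0]}"]
    # Each middle category runs from one cutpoint up to just below the next.
    for lower, upper in zip(cutpoints, cutpoints[1:]):
        labels.append(f"{lower}-{upper - 1}")
    labels.append(f"{cutpoints[-1]} or higher")
    return dict(enumerate(labels, start=1))


value_labels = make_labels([25, 50, 75])
for code, label in value_labels.items():
    print(code, label)
# 1 less than 25
# 2 25-49
# 3 50-74
# 4 75 or higher
```

Without such a mapping, a stored value of 3 would give no hint that it stands for incomes of 50–74 (thousand).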
You can also manually enter or change cutpoints and labels
in the grid, change cutpoint locations by dragging and dropping the cutpoint
lines in the histogram, and delete cutpoints by dragging cutpoint lines off of
the histogram.
► Click OK to create the new, binned variable.
The new variable is displayed in the Data Editor. Since the
variable is added to the end of the file, it is displayed in the far right
column in Data View and in the last row in Variable View.