The Descriptives procedure is useful for obtaining summary comparisons of approximately normally distributed scale variables and for easily identifying unusual cases across those variables by computing z scores.
Using Descriptives to Study Quantitative Data
=============================================
A telecommunications company maintains a customer database that includes, among other things, information on how much each customer spent on long distance, toll-free, equipment rental, calling card, and wireless services in the previous month.
This information is collected in telco.sav. See the topic Sample Files for more information. Use Descriptives to study customer spending to determine which services are most profitable.
Running the Analysis
====================
► To run a Descriptives analysis, from the menus choose:
These selections generate the following command syntax:
DESCRIPTIVES
VARIABLES=longmon tollmon equipmon cardmon wiremon
/STATISTICS=MEAN STDDEV MIN MAX .
• The procedure analyzes the variables longmon, tollmon, equipmon, cardmon, and wiremon.
• The STATISTICS subcommand requests the mean, standard deviation, minimum, and maximum.
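For readers who want to see what these four statistics amount to outside SPSS, the following is a minimal Python sketch. The values in `longmon` are hypothetical and are not taken from telco.sav; the sketch also assumes that the reported standard deviation is the sample standard deviation (n - 1 denominator).

```python
from statistics import mean, stdev

# Hypothetical long-distance charges for five customers (not from telco.sav).
longmon = [3.0, 5.0, 8.0, 10.0, 14.0]

# The four statistics requested by /STATISTICS=MEAN STDDEV MIN MAX.
summary = {
    "mean": mean(longmon),
    "stddev": stdev(longmon),  # sample standard deviation (n - 1 denominator)
    "min": min(longmon),
    "max": max(longmon),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```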
► To recode 0's as missing values, from the menus choose:
► Select Long distance last month, Toll free last month, Equipment last month, Calling card last month, and Wireless last month as numeric variables.
► Type 0 as the Old Value.
► Select System-missing as the New Value.
These selections generate the following command syntax:
RECODE
longmon tollmon equipmon cardmon wiremon (0=SYSMIS) .
EXECUTE .
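The effect of this recode on the mean can be illustrated with a short Python sketch. The `wiremon` values are hypothetical, and `None` stands in for the system-missing value; the point is only that a mean computed over customers who have the service can be much higher than a mean that counts non-subscribers as spending zero.

```python
from statistics import mean

# Hypothetical wireless charges; 0 means the customer does not have the service.
wiremon = [0.0, 0.0, 35.0, 0.0, 42.5, 0.0, 28.0, 0.0]

# Recode 0 as missing (None), in the spirit of RECODE ... (0=SYSMIS).
recoded = [x if x != 0 else None for x in wiremon]

# Missing values are left out of the calculation, so only subscribers remain.
valid = [x for x in recoded if x is not None]

mean_all = mean(wiremon)   # averaged over every customer, zeros included
mean_valid = mean(valid)   # averaged over customers who have the service

print(f"all customers: {mean_all:.2f}, subscribers only: {mean_valid:.2f}")
```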
► Click Options in the Descriptives dialog box.
► Deselect Minimum and Maximum.
► Select Skewness and Kurtosis.
► Click Continue.
► Click OK in the Descriptives dialog box.
These selections generate the following command syntax:
DESCRIPTIVES
VARIABLES=longmon tollmon equipmon cardmon wiremon
/STATISTICS=MEAN STDDEV SKEWNESS KURTOSIS .
• The STATISTICS subcommand now requests the skewness and kurtosis instead of the minimum and maximum.
Descriptive Statistics
======================
When the analysis is conditional upon the customer's actually having the
service, the results are dramatically different.
Wireless and equipment rental services bring in far more revenue per customer
than other services.
Moreover, while wireless service remains a highly variable
prospect, equipment rental has one of the lowest
standard deviations.
This hasn't solved the problem of who purchases these
services, but it does point you in the direction of which services deserve
greater marketing.
Finding Unusual Cases
=====================
You can find customers who spend much more or much less than other customers on each service by studying the standardized values (or z scores) of the variables.
However, a requirement for using z scores is that each variable's distribution is not markedly non-normal. The skewness and kurtosis values reported in the statistics table are all quite large, showing that the distributions of these variables are definitely not normal.
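To make the two measures concrete, here is a minimal Python sketch of the bias-adjusted sample skewness and excess kurtosis. It assumes these are the formulas behind the reported values, which is common for statistical packages but is not confirmed by this text; the function names and the data are hypothetical.

```python
from math import sqrt

def sample_skewness(xs):
    """Bias-adjusted sample skewness; needs at least 3 values."""
    n = len(xs)
    m = sum(xs) / n
    s = sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in xs)

def sample_excess_kurtosis(xs):
    """Bias-adjusted sample excess kurtosis; needs at least 4 values."""
    n = len(xs)
    m = sum(xs) / n
    s = sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    z4 = sum(((x - m) / s) ** 4 for x in xs)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return lead * z4 - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]   # evenly spaced, no skew
spend = [1.0, 1.0, 2.0, 2.0, 50.0]      # hypothetical, one very large value

print(sample_skewness(symmetric), sample_excess_kurtosis(symmetric))
print(sample_skewness(spend), sample_excess_kurtosis(spend))
```

Both measures are near zero for a normal distribution, so a single very large value, as in `spend`, pushes them well above zero.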
One possible remedy, because the variables all take positive values, is to study the z scores of the log-transformed variables. The log-transformed variables have already been computed and entered into the data file; you can use the Descriptives procedure to compute the z scores.
Running the Analysis
====================
► To obtain z scores for the log-transformed variables, recall the Descriptives dialog box.
► Deselect Long distance last month through Wireless last month as analysis variables.
► Select Log-long distance through Log-wireless as analysis variables.
► Select Save standardized values as variables.
► Click OK.
These selections generate the following command syntax:
DESCRIPTIVES
VARIABLES=loglong logtoll logequi logcard logwire
/STATISTICS=MEAN STDDEV SKEWNESS KURTOSIS
/SAVE .
• The SAVE subcommand specifies that z scores for each of the variables on the VARIABLES subcommand should be saved to the active dataset.
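As an illustration of what a saved z score represents, the following Python sketch log-transforms a variable and standardizes it. It rests on two assumptions not stated in this text: that the log transform is the natural log, and that standardization uses the sample standard deviation. The `tollmon` values are hypothetical.

```python
from math import log
from statistics import mean, stdev

# Hypothetical positive toll-free charges, with one unusually large value.
tollmon = [12.0, 15.5, 20.0, 18.25, 173.0]

# Log-transform (natural log assumed), then standardize: z = (x - mean) / sd.
logtoll = [log(x) for x in tollmon]
m, s = mean(logtoll), stdev(logtoll)
zlogtoll = [(v - m) / s for v in logtoll]

for raw, z in zip(tollmon, zlogtoll):
    print(f"{raw:8.2f} -> z = {z:6.2f}")
```

By construction the z scores have mean 0 and standard deviation 1, so a value far from 0 marks a case that is unusual relative to the others.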
Descriptive Statistics Table
============================
With the exception of Log-toll free, the skewness and kurtosis are considerably smaller for the log-transformed variables.
The log-transformed toll-free service may continue to have a
large skewness and kurtosis because a customer spent an unusually large amount
last month. Check boxplots to verify this.
Boxplots of Z Scores
====================
► To visually scan the z scores and find unusual values, from the menus choose:
► Select Summaries of separate variables.
► Select Zscore: Log-long distance through Zscore: Log-wireless as the variables the boxes represent.
► Click Options.
► Click Continue.
► Click OK in the Define Simple Boxplot dialog box.
EXAMINE VARIABLES=Zloglong Zlogtoll Zlogequi Zlogcard Zlogwire
/COMPARE VARIABLE
/PLOT=BOXPLOT
/STATISTICS=NONE
/NOTOTAL
/MISSING=PAIRWISE.
Boxplots of the z scores show that customer 567 spent much more than the average customer on toll-free service last month. This should account for the larger skewness and kurtosis observed in Log-toll free.
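The idea behind reading outliers off a boxplot can be sketched in Python using the conventional rule that flags values more than 1.5 times the interquartile range beyond the quartiles. This is an assumption for illustration: the exact rule and quartile definition used by the boxplot procedure may differ, and `flag_outliers` and the data are hypothetical.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return indexes of values beyond k * IQR from the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical z scores for seven customers; the last one is far from the rest.
z_scores = [-0.8, -0.3, 0.0, 0.2, 0.5, 0.9, 4.6]

print(flag_outliers(z_scores))
```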
Summary
=======
You have determined that equipment rental and wireless services
have a high return per customer, although wireless has greater variability. You
still need to determine whether these services can be effectively marketed to
your customer base in order to fully assess their profitability.
You have also found that one customer, compared to other
customers, spent an unusually large amount on toll-free services last month.
This should be investigated to determine whether this spending was a one-time
event or will be ongoing.
Related Procedures
==================
The Descriptives procedure is a useful tool for summarizing and standardizing scale variables.
• You can alternatively use the Frequencies procedure to summarize scale variables. Frequencies also provides statistics for summarizing categorical variables.
• The Means procedure provides descriptive statistics and an ANOVA table for studying relationships between scale and categorical variables.
• The Summarize procedure provides descriptive statistics and case summaries for studying relationships between scale and categorical variables.
• The OLAP Cubes procedure provides descriptive statistics for studying relationships between scale and categorical variables.
• The Correlations procedure provides summaries describing the relationship between two scale variables.
Recommended Readings
====================
See the following texts for more information on summarizing data:
Hays, W. L. 1981. Statistics, 3rd ed. New York: Holt, Rinehart, and Winston.
Norusis, M. 2004. SPSS 13.0 Guide to Data Analysis. Upper Saddle River, N.J.: Prentice Hall, Inc.
Norusis, M. 2004. SPSS 13.0 Statistical Procedures Companion. Upper Saddle River, N.J.: Prentice Hall, Inc.