Using SPSS to Understand Research and Data Analysis.

Chapter 6

The Frequencies Procedure:
Summarizing Data With Descriptive Statistics

6.1 Introduction

One of the primary reasons for doing research is to be able to make accurate statements regarding the behavior or characteristics of a large number of people. In most cases it is impossible to actually collect data from every member of a target group (referred to as a population), so researchers typically collect data from a smaller subset of the population (called a sample) and attempt to generalize from the sample to the larger population. For example, 228 participants were selected from the several thousand employees of EZ Manufacturing for the leadership study.

But even when working with relatively small samples, it is very difficult for the human mind to comprehend large numbers of individual facts. Few researchers could remember all the individual performance scores of the 228 EZ Manufacturing employees, and it would be difficult to draw meaningful conclusions from the raw data alone. For this reason, researchers often begin data analysis by summarizing characteristics of the participants in the sample.

One way to do this is to generate frequency tables of variables. Recall from Chapter 3 that a frequency distribution summarizes data by listing the number of participants who received each possible value of a variable. Thus, a frequency table of how many employees scored a 1, 2, 3, etc. on the perform variable will be much easier to understand and interpret than a simple listing of all 228 individual performance scores. This process is sometimes called number crunching, because large volumes of data are crunched into more manageable, meaningful units. This makes it easier to draw conclusions and identify trends in the data (e.g., do most of the employees score toward the high or low end of the performance scale?).
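To make the idea of a frequency table concrete outside of SPSS, the following is a minimal Python sketch. The ten scores are made-up illustrative values, not data from ezdata.sav, and the helper name frequency_table is hypothetical. It simply counts how many times each value occurs and what percentage of the sample that count represents.

```python
from collections import Counter

# Hypothetical performance ratings for ten employees
# (illustrative values only, not taken from ezdata.sav).
scores = [3, 4, 2, 4, 5, 3, 4, 1, 4, 3]


def frequency_table(values):
    """Return (value, count, percent) rows, sorted by value."""
    counts = Counter(values)  # maps each distinct value to its count
    n = len(values)
    return [(v, counts[v], 100 * counts[v] / n) for v in sorted(counts)]


table = frequency_table(scores)

# Print a small table: one row per distinct score.
print(f"{'Value':>5} {'Count':>5} {'Percent':>7}")
for value, count, percent in table:
    print(f"{value:>5} {count:>5} {percent:>7.1f}")
```

Even with only ten scores, the table shows at a glance that most of the ratings fall at 3 or 4, which is harder to see in the raw list.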

Beyond constructing simple frequency distributions, researchers typically crunch data even further, down to a single statistic that is typical of the entire set of scores in some way. These numerical indices are called descriptive statistics, because they provide a single number, or index, that best summarizes all of the scores on some dimension. The most commonly used descriptive statistics are measures of central tendency and variability.

The mean, median, and mode are common measures of central tendency (so called because these indices tend toward the middle, or center, of the distribution). The standard deviation, variance, and range are examples of indices of variability (so called because they describe how much the scores vary around the middle of the distribution). These summary statistics go a long way toward helping researchers understand the data, and they make it possible to describe the findings in a brief, precise manner.
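The six indices named above can be computed by hand or by any statistics package. As a rough sketch, here is how they might be obtained with Python's standard statistics module. The scores are again made-up illustrative values rather than data from ezdata.sav, and the helper name describe is hypothetical. The variance and standard deviation here use the sample formulas, which divide by n - 1.

```python
import statistics

# Hypothetical performance ratings for ten employees
# (illustrative values only, not taken from ezdata.sav).
scores = [3, 4, 2, 4, 5, 3, 4, 1, 4, 3]


def describe(values):
    """Return common measures of central tendency and variability."""
    return {
        # Central tendency: where the middle of the distribution lies.
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Variability: how much the scores spread out around the middle.
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # square root of the variance
        "range": max(values) - min(values),
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name:>8}: {value:.3f}")
```

For these ten scores the mean is 3.3, the median is 3.5, and the mode is 4, so all three measures of central tendency point to the upper-middle part of the 1 to 5 scale, while the range of 4 shows that the full scale was used.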

Recall that we used the Frequencies procedure in Chapter 3 as an introduction to SPSS. In this chapter we will use this procedure to generate frequency tables of two variables in the ezdata.sav file. We will also demonstrate how this procedure can be used to generate and interpret descriptive statistics for these variables. Open your ezdata.sav file and follow along with the example. You will be asked to do the same analyses on different variables in the file as an exercise at the end of the chapter.