How to Use Excel for Data Entry in SPSS

Excel is a very popular tool for entering and manipulating data. This document shows you how to enter data that you can easily open in statistics packages such as R, SAS or SPSS. Excel has some statistical analysis capabilities but they often provide incorrect answers and I do not recommend using them. For a comprehensive list of these limitations,

Basic Rules of Data Structure

  • All your data should be in a single spreadsheet of a single file.
  • Enter variable names in the first row of the spreadsheet.
  • Consider the length of your variable names. If you know for sure what software you will use, follow its rules for how many characters names can contain. When in doubt, use variable names that are no longer than 8 characters, beginning with a letter. Those short names can be used by any software.
  • Variable names should not contain spaces, but may use the underscore character.
  • No other text rows such as titles should be in the spreadsheet.
  • No blank rows should appear in the data.
  • Always include an ID variable on your original data collection form and in the spreadsheet to help you find the case again if you need to correct errors. You may need to sort the data later, so the row number in Excel would then apply to a different subject or sampling unit. Position the ID variable in the left-most column. If you plan to use only R for your analysis, do not name the ID variable in the top row. This will tell R to put the variable into the rownames attribute automatically.
  • If you have multiple groups, put them in the same spreadsheet along with a variable that indicates group membership (see Gender example below).
  • Avoid using alphabetic characters for values. For example to enter political party, enter 1 instead of Democrat, 2 instead of Republican and 3 instead of Other.
  • If your group has only two levels, coding them 0 and 1 makes some analyses much easier to do.
  • For missing values, leave the cell blank. Although SPSS and SAS use a period to represent a missing value, if you actually type a period in Excel, some software (like R) will read the column as character data so you will not be able to, for example, calculate the mean of a column.
  • You can enter dates with slashes (6/13/2003) and times with colons (12:15 AM).
  • For text analysis, you can enter up to 32K of text, about 8 pages, in a single cell. However, if you cut and paste it from elsewhere, remove carriage returns first, as they will cause the text to jump to a new cell.
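
The naming rules above can be checked automatically before you hand a file to a statistics package. The following is a minimal Python sketch, not part of the original guide: the function name check_variable_names and the sample header row are hypothetical, and the pattern encodes only the conservative advice given here (start with a letter, use letters, digits or underscores, no more than 8 characters).

```python
import re

# Conservative rule from the list above: a letter first, then letters,
# digits, or underscores, for a total of at most 8 characters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")


def check_variable_names(names):
    """Return the names that break the conservative naming rules."""
    return [name for name in names if not NAME_PATTERN.fullmatch(name)]


# Hypothetical header row, as it might be typed into row 1 of a spreadsheet.
header = ["ID", "Gender", "Salary", "Date of Birth", "2ndVisit", "income_2003"]
bad = check_variable_names(header)
print(bad)  # names that should be shortened or renamed
```

If you already know which package you will use, loosen the pattern to match that package's own limits.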

A data structure that’s easy to analyze:

ID  Gender  Salary
1   0       32000
2   1       23000
3   0       37000
4   1       54000
5   1       48500


Here is the same data, but in a form that is not easy to analyze:

Data for Female Subjects
ID  Salary
1   32000
3   37000

Data for Male Subjects
ID  Salary
2   23000
4   54000
5   48500
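
If your data already sit in separate per-group tables, they can be merged into the single-table form with a group indicator. Here is a minimal Python sketch of that idea, using the values from the example above. The function name combine_groups is hypothetical, and the coding (0 = female, 1 = male) is inferred from the first table.

```python
# Values copied from the two separate tables above: (ID, Salary) pairs.
female_rows = [(1, 32000), (3, 37000)]
male_rows = [(2, 23000), (4, 54000), (5, 48500)]


def combine_groups(groups):
    """Merge per-group (ID, Salary) rows into one table with a group code."""
    combined = []
    for code, rows in groups.items():
        for subject_id, salary in rows:
            combined.append({"ID": subject_id, "Gender": code, "Salary": salary})
    # Sort by ID so the result matches the order of the easy-to-analyze table.
    return sorted(combined, key=lambda row: row["ID"])


# Coding assumed from the first table: 0 = female, 1 = male.
data = combine_groups({0: female_rows, 1: male_rows})
for row in data:
    print(row["ID"], row["Gender"], row["Salary"])
```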

Data Entry Tips

  • Save your data frequently, make backup copies, and store them in separate buildings. Don’t risk losing all your hard work in a fire or theft! Get a free account at http://drive.google.com, http://dropbox.com, or http://skydrive.live.com and save copies there.
  • Avoid using Excel to sort your data. It’s too easy to sort one column independent of the others, which essentially destroys your data! Statistics packages can sort data and they understand the importance of keeping all the values in each row locked together.
  • If you need to enter a pattern of consecutive values such as an ID number with values such as 1,2,3 or 1001,1002,1003, enter the first two, select them and drag the box in the lower right corner as far as you wish. Excel will see the pattern of the first two entries and extend it as far as you drag your selection. This works for days of the week and dates too. You can create your own lists in Options>Lists, if you use a certain pattern often.
  • To help prevent typos, you can set minimum and maximum values, or create a list of valid values. Select a column or set of similar columns, then choose Validation from the Data menu. To set minimum and maximum values, choose Allow: Whole Numbers or Decimals and then fill in the values in the Minimum and Maximum boxes. To create a list of valid values, choose Allow: List and then fill in the numeric or character values separated by commas in the Source box.
  • The gold standard for data accuracy is the dual entry method. With this method you actually enter all the data twice. Only this method can catch errors that are within the normal range of values, but still wrong. Excel can show you where the values differ. Enter the data first in Sheet1. Then enter it again using the exact same layout in Sheet2. Finally, go to Sheet3 in cell A1 and enter this formula:
    =IF(sheet1!A1=sheet2!A1,1,0)
    This means that if the value in Sheet1 cell A1 is equal to the value in Sheet2 cell A1, then Sheet3 A1 will display a 1 to indicate a match and 0 to indicate bad data. To extend this formula to all the cells, select cell A1 in Sheet3 and drag the box in the lower right corner until the selection covers all the space you used for your data in Sheet1. Then check to see where the zeros are in Sheet3. Those will be your typos. You then check to see which entry was wrong, Sheet1 or Sheet2. Make corrections until Sheet3 is full of ones, indicating no errors. When you read the data into a statistics package, you will only need to read the data in Sheet1.
  • When looking for data errors, it can be very helpful to display only a subset of values. To do this, select all the columns you wish to scan for errors. Choose Filter from the Data menu and then choose Autofilter. A downward-pointing triangle will appear at the top of each column selected. Clicking it displays a list of the values contained in that column and the words (All) and (Custom). If you have entered values that are supposed to be, for example, between 1 and 5 and you see 6 on this list, choosing it will show you only those rows in which you made that error. Then you can fix them and choose (All) from the drop-down menu. The (Custom) selection will allow you to use simple logic to find, for example, all rows with values greater than 5. When you are finished, choose Autofilter from the Data->Filter menu.
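
The dual entry check described above does not depend on Excel. As a rough illustration, here is a minimal Python sketch of the same cell-by-cell comparison. The function name compare_entries and the sample values are hypothetical, and the sketch assumes both entries use exactly the same layout, as the method requires.

```python
def compare_entries(first, second):
    """Return 1-based (row, column) positions where the two entries differ."""
    mismatches = []
    for r, (row1, row2) in enumerate(zip(first, second), start=1):
        for c, (value1, value2) in enumerate(zip(row1, row2), start=1):
            if value1 != value2:
                mismatches.append((r, c))
    return mismatches


# Hypothetical data typed twice; the second pass contains one typo.
entry1 = [[1, 0, 32000], [2, 1, 23000], [3, 0, 37000]]
entry2 = [[1, 0, 32000], [2, 1, 28000], [3, 0, 37000]]
print(compare_entries(entry1, entry2))  # positions to check against the paper form
```

An empty list plays the same role as a Sheet3 full of ones: the two entries agree everywhere.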

Steps for Reading Excel Data Into R

There are several ways to read an Excel file into R. Perhaps the easiest method uses the following commands:
# Do this once to install:
install.packages("xlsReadWrite")
library("xlsReadWrite")
xls.getshlib()

# Do this every time you want to read an Excel file:
library("xlsReadWrite")
mydata <- read.xls("mydata.xls")
mydata

Steps for Reading Excel Data Into SPSS

  1. In SPSS, choose File: Open: Data.
  2. Change the “Files of type” box to “Excel (*.xls)”.
  3. Select the spreadsheet name as you would in Excel
  4. When the Opening Excel Data Source box appears, check the box for Read variable names from the first row of data, then click OK.
  5. When the data appears in the SPSS data editor spreadsheet, choose File: Save as and leave the Save as type box set to SPSS (*.sav).
  6. Enter the name of the file without the .sav extension and then click Save to save the file in SPSS format.
  7. Next time, open the .sav version; you won’t need to convert the file again.
  8. If you create variable or value labels in the SPSS file and then need to read your data from Excel again you can copy them into the new file. First, make sure you use the same variable names. Next, after opening the file in SPSS, use Copy Data Properties from the Data menu. Simply name the SPSS file that has properties (such as labels) that you want to copy, check off the things you want to copy and click OK.

Steps for Reading Excel Data Into SAS

The process of importing data into SAS is quick, but saving the data permanently as a SAS file is complex. Therefore, we recommend that you import the data each time you need it. If you are an advanced SAS programmer familiar with SAS data libraries, it will probably be obvious how to avoid this repetition.

  1. In SAS, choose File: Import Data. The Import Wizard will appear.
  2. Make sure that the Standard data source box is checked and that the Select a data source from the list below is set to the version of Microsoft Excel that you used to create the file. Then click Next.
  3. In the Select File box, browse to find the file and click Next.
  4. In the Choose the SAS destination box, leave the Library box set to WORK and enter TEMP as the Member name. Then click Finish.
  5. If you click Next instead of Finish in the step above, SAS will say, The Import Wizard can create a file containing PROC IMPORT statements that can be used in SAS programs to import this data again. If you want these statements to be generated, enter the filename where they should be saved. SAS programmers will appreciate this feature but we recommend beginners avoid this step by clicking on Finish.
  6. The data can now be used by any SAS program. For example, submitting:
    PROC MEANS; RUN;
    should calculate means and other basic statistics using your data.

Credit: http://r4stats.com/articles/excel-data-entry/

How to Create a Chart or Graph using Excel [Solved]

Step 1 – Launch Excel. If Excel is already open on your workstation, open a new Excel workbook. There are three ways to do that:
1. Go to the Standard toolbar. Click on the New Workbook button.
2. Go to the File menu. Select New.
3. Use a keyboard combination: on a Macintosh use Command + N, and on a Windows computer use Ctrl + N.

Step 2 – Enter the data to be graphed. For the purpose of this lesson you will use data from a Favorite Fruit Survey. Enter it as you see below:

Step 3 – Highlight data to be graphed. Do not include the row with heading titles, only the names of fruit and the numbers. If your worksheet looks like the one above, put your cursor in cell A2, click and hold the mouse button down, and drag to cell B7. Highlighted data should look like the image below:


Note: Cell A2 is selected; the selection color extends around the cell.

Step 4 – Select the Chart Wizard. That is done by going to the Insert menu and selecting Chart. You can also click on the Chart Wizard button on the Standard toolbar.

Step 5 – From the Chart Wizard box that opens select Chart type. For this activity, I selected pie.

After you have selected the Chart type, click and hold your mouse pointer down on the Press and Hold… button to see what your data looks like in the chart type you selected. If you do not like the look, select another chart type. After you have selected the chart type you will have two options:

  • Select Next and let Chart Wizard show you a series of options to make changes to your chart.
  • Select Finish and Chart Wizard puts your completed chart on the spreadsheet. You can see the finished product below.

The second step taken by Chart Wizard is to verify the range of data being used for this chart. The Data range displayed below is read “all cells from A2 to B7.”

Notice where the cursor is located in the dialog box above. It is pointing to the small box at the end of the line where the Data range is displayed. If the data range should be changed, click on the box the cursor is pointing to.

The dialog box shrinks allowing you to see your entire spreadsheet. You can edit the data range in this small window. When you are finished, click the same box at the end to restore the window.

Select Next to go to the dialog box below. This box allows you to add a title to the chart, make changes on the legend, or make changes on the data labels.

Select Next to move to the final dialog box which allows you to see the chart as a new sheet or place it on one of the sheets in your workbook.

Whether you let the Chart Wizard finish your chart after the first dialog box or work through each of the four steps, your chart will look something like the one below.

Credit: http://www.internet4classrooms.com/excel_create_chart.htm

How to Create Histograms in SPSS [Solved]

A histogram is a bar graph for quantitative data, in which the heights of the bars represent frequency or relative frequency, and there are no gaps between the bars. Creating a histogram involves “binning” the data. For more information on this, see the text.
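
To make “binning” concrete, here is a minimal Python sketch that counts how many values fall into each bin, given a bin width and an anchor (the left edge of the first bin). The function name bin_counts and the scores are hypothetical, and this is only an illustration of the idea, not the rule SPSS uses to choose its bins.

```python
import math


def bin_counts(values, bin_width, anchor):
    """Count values per bin [anchor + k*width, anchor + (k+1)*width)."""
    counts = {}
    for value in values:
        k = math.floor((value - anchor) / bin_width)  # index of the bin
        left_edge = anchor + k * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical test scores.
scores = [52, 55, 61, 64, 67, 68, 71, 73, 75, 79, 82, 88]
freq = bin_counts(scores, bin_width=10, anchor=50)
for left_edge, count in freq.items():
    # Frequency is the bar height; relative frequency is count / total.
    print(left_edge, count, round(count / len(scores), 2))
```

Changing bin_width or anchor changes the shape of the histogram, which is why it is worth trying more than one setting.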

To create a histogram in SPSS:

1) Open a data file (in .sav or .por format) or type your data in.

2) In either the Data Editor window or the Output Viewer window, click on the “Graphs” menu and then on “Histogram…”

3) Select the variable of interest. Click on the little arrow next to “Variable.” This will move the variable into the “Variable” box.

4) If you would like to add a title to your plot, click the “Titles…” button. A new dialog box appears. Enter any titles or subtitles you want to use, and click “Continue…”

5) Click “OK.”  The histogram appears in the Output Viewer. Notice that the vertical axis is labeled “Frequency.” SPSS has chosen bin widths according to some criteria the programmers wrote into the program. If you would like to change the label on the vertical axis, or change the bin width or the anchor of the first bin, use the Chart Editor.

NOTE: Histograms are for quantitative data only. If you have qualitative data, consider using a Pareto chart or a bar graph instead.

Credit: http://emp.byui.edu/brownd/SPSS/Instr_Graphs/histogram_SPSS.htm

How to Interpret Data in SPSS for a Paired T-test [Solved]

Look at the Paired Samples Statistics Box

Take a look at this box. You can see each variable name in the leftmost column. If you have given your variables meaningful names, you should know exactly which conditions these variable names represent. You can find out the number of participants, mean and standard deviation for each condition by reading across each of the two condition rows.

Example

In the Paired Samples Statistics Box, the mean for the caffeine condition (CAFDTA) is 5.40. The mean for the no caffeine condition (NOCAFDTA) is 9.40. The standard deviation for the caffeine condition is 1.14 and for the no caffeine condition, also 1.14. The number of participants in each condition (N) is 5.

Paired Samples Test Box

This is the next box you will look at. It contains info about the paired samples t-test that you conducted. You will be most interested in the value that is in the final column of this table. Take a look at the Sig. (2-tailed) value.

Sig (2-Tailed) value

This value will tell you if the two condition Means are statistically different. This value is often referred to as the p value. In this example, the Sig (2-Tailed) value is 0.005.

If the Sig (2-Tailed) value is greater than .05…

You can conclude that there is no statistically significant difference between your two conditions. You can conclude that the differences between condition Means are likely due to chance and not likely due to the IV manipulation.

If the Sig (2-Tailed) value is less than or equal to .05…

You can conclude that there is a statistically significant difference between your two conditions. You can conclude that the differences between condition Means are not likely due to chance and are probably due to the IV manipulation.

Our Example

The Sig. (2-Tailed) value in our example is 0.005. This value is less than .05. Because of this, we can conclude that there is a statistically significant difference between the mean hours of sleep for the caffeine and no caffeine conditions. Since our Paired Samples Statistics box revealed that the Mean number of hours slept for the no caffeine condition was greater than the Mean for the caffeine condition, we can conclude that participants in the no caffeine condition were able to sleep significantly more hours than participants in the caffeine condition.
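
To see where the test statistic behind the Sig. (2-Tailed) value comes from, here is a minimal Python sketch of the paired-samples t calculation. The function name paired_t and the scores are hypothetical; they are not the data behind the SPSS output discussed above. Because the standard library has no t distribution, the sketch compares t against the two-tailed critical value for 4 degrees of freedom at the .05 level (2.776, taken from a standard t table) instead of computing an exact p value.

```python
import math
import statistics


def paired_t(cond_a, cond_b):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    diffs = [b - a for a, b in zip(cond_a, cond_b)]  # one difference per subject
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return mean_diff / std_error, n - 1


# Hypothetical scores for five participants measured under two conditions.
condition_1 = [10, 12, 9, 11, 13]
condition_2 = [12, 15, 10, 14, 14]
t, df = paired_t(condition_1, condition_2)

CRITICAL_T_DF4 = 2.776  # two-tailed, alpha = .05, df = 4 (standard t table)
print(round(t, 3), df, abs(t) > CRITICAL_T_DF4)
```

When the absolute value of t exceeds the critical value, the corresponding Sig. (2-Tailed) value is below .05.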

Credit: http://statistics-help-for-students.com/How_do_I_interpret_data_in_SPSS_for_a_paired_samples_T_test.htm#.U6oaarFiL3w

Generate MS-Excel Graphs and Charts from SPSS Output Files [Solved]

Description of Procedure

Raw text files can be used as input to the SPSS software to perform a statistical analysis. Using the data import facility in the “File – Read text data” menu of the SPSS data file screen, one can import text data delimited by any character (space, tab, etc.) into SPSS, and save the data file as a .sav file.

Screen shot – SPSS data file imported from text file:

The next step is to find the appropriate command from one of the “Data”, “Transform”, “Analyze” or “Variables” menus, to suit the analytical need of the user, and run it on the data set. The results are shown as an output file on the screen, in the SPSS viewer. These results can be saved, if necessary, as a .spo file.

Screen shot – SPSS output file:

The output tables can be highlighted (they are then known as pivot tables) and double clicked to set them in the editable mode. Then one can select all the contents of the table(s) by using the “Ctrl-A” command, and copy them using the “Edit-Copy” or “Ctrl-C” commands.

Screen shot – MS Excel worksheet created from SPSS output file:

Screen shot – generation of graph from MS Excel worksheet:

These results can be copied and pasted on to an MS-Excel template in a tabular form. To create a graph/chart on Excel, first the required data is selected on the spreadsheet. Then the “Chart” option is selected from the “Insert” Menu, and the type of chart required is chosen (column, pie, etc). The data range is set to be the one selected on the spreadsheet (it appears automatically that way). A title for the chart can be given at this point, and so can the legends for the X and Y axes, representing what the X and Y axes stand for. The graph can now be saved as an object in the original data sheet or as a new graph.

With these simple steps for combining the two applications, we can take advantage of the strengths of both software packages, getting more sophisticated statistical analysis and better chart and graph generation than either application offers on its own.

Credit: http://tltgroup.wordpress.com/low-threshold-applications/20-generating-ms-excel-graphs-and-charts-from-spss-output-files/

How to Use SPSS for Descriptive Statistics [Solved]

This tutorial will show you how to use SPSS version 12.0 to perform exploratory data analysis and descriptive statistics. You will use SPSS to create histograms, frequency distributions, stem and leaf plots, Tukey box plots, calculate the standard measures of central tendency (mean, median, and mode), calculate the standard measures of dispersion (range, semi-interquartile range, and standard deviation / variance), and calculate measures of kurtosis and skewness. This tutorial assumes that you have:

  • Downloaded the standard class data set (click on the link and save the data file)
  • Started SPSS (click on Start | Programs | SPSS for Windows | SPSS 12.0 for Windows)
  • Loaded the standard data set

 The Frequency Command

The frequencies command can be used to determine quartiles, percentiles, measures of central tendency (mean, median, and mode), measures of dispersion (range, standard deviation, variance, minimum and maximum), measures of kurtosis and skewness, and create histograms. The command is found at Analyze | Descriptive Statistics | Frequencies (this is shorthand for clicking on the Analyze menu item at the top of the window, and then clicking on Descriptive Statistics from the drop down menu, and Frequencies from the pop up menu.):

The frequencies dialog box will appear:

Select the variable(s) that you want to analyze by clicking on it in the left hand pane of the frequencies dialog box. Then click on the arrow button to move the variable into the Variables pane:

Be sure to select “Display frequency tables” if you want a frequency distribution. Specify which statistics you want to perform by clicking on the Statistics button. The Statistics dialog box will appear:

From the statistics dialog box, click on the desired statistics that you want to perform. To calculate a given percentile, click in the box to the left of percentile(s). Type in the desired percentile and click on the Add button. When you have selected all the desired statistics (e.g. mean, median, mode, standard deviation, variance, range, etc.), click on the Continue button.

Specify which chart you want to display by clicking on the Chart button. The chart dialog box will appear:

Click on the desired chart (usually Histogram) and click on the Continue button.
Click on OK in the frequencies dialog box. The SPSS Output Viewer will appear.

In the SPSS Output Viewer, you will see the requested statistics and chart. This is what the Statistics output looks like. It lists the requested measures of central tendency, measures of dispersion, measures of skewness and kurtosis, and the quartiles and percentiles.

The output has two columns. The left column names the statistic and the right column gives the value of the statistic. For example, the mean of this data is 1.26 (since your data set may be different, you may get a different value.)
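The skewness measure is 0 for a symmetric distribution. Positive values indicate a distribution skewed to the right (a longer right tail), while negative values indicate a distribution skewed to the left.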
The kurtosis measure is 0 for a normal distribution. Positive values imply a leptokurtic distribution, while negative values imply a platykurtic distribution.
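
To illustrate what these two measures capture, here is a minimal Python sketch that computes skewness and excess kurtosis from the central moments of the data. The function name moment_skew_kurtosis and the sample values are hypothetical. These are the simple moment-based definitions; as far as I know, SPSS reports small-sample-adjusted versions, so its values may differ somewhat from the ones computed here.

```python
def moment_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments.

    Assumes at least two distinct values, so the variance is not zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical data: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]
print(moment_skew_kurtosis(symmetric))
print(moment_skew_kurtosis(right_tailed))
```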

If you scroll down, you will see the frequency distributions.

If you scroll down, you will see the histogram (or whatever chart you requested.)

The Descriptives Command

The descriptives command can be used to determine measures of central tendency (mean), measures of dispersion (range, standard deviation, variance, minimum and maximum), and measures of kurtosis and skewness. The command is found at Analyze | Descriptive Statistics | Descriptives (this is shorthand for clicking on the Analyze menu item at the top of the window, and then clicking on Descriptive Statistics from the drop down menu, and Descriptives from the pop up menu.):

The descriptives dialog box will appear:

Select the variable(s) that you want to analyze by clicking on it in the left hand pane of the descriptives dialog box. Then click on the arrow button to move the variable into the Variables pane:

Specify which statistics you want to perform by clicking on the Options button. The Options dialog box will appear:

Select the statistics that you want by clicking on them (e.g. mean, standard deviation, variance, range, minimum, etc.). Then click on the Continue button. Click on the OK button in the Descriptives dialog box. The SPSS Output Viewer will appear with your results in it. The following is an example of the output:

The output gives the values of the requested statistics.
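
For comparison, the same kinds of statistics can be computed outside SPSS. Here is a minimal Python sketch using the standard statistics module. The function name describe and the data are hypothetical, and the standard deviation and variance use the sample (n - 1) formulas.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "variance": statistics.variance(values),  # sample variance
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]
stats = describe(data)
for name, value in stats.items():
    print(name, value)
```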

The Explore Command

The explore command can be used to determine measures of central tendency (mean and median), measures of dispersion (range, interquartile range, standard deviation, variance, minimum and maximum), measures of kurtosis and skewness, and prepare histograms, stem and leaf plots, and Tukey box plots. The command is found at Analyze | Descriptive Statistics | Explore:

The explore dialog box will appear:

Select the variable(s) that you want to analyze by clicking on it in the left hand pane of the explore dialog box. Then click on the top arrow button to move the variable into the Dependent List:

Specify which plots you want to prepare by clicking on the Plots button. The Plots dialog box will appear:

Select the plots that you want by clicking on them (e.g. Stem-and-leaf and histogram). Then click on the Continue button. Click on the OK button in the Explore dialog box. The SPSS Output Viewer will appear with your results in it. The following is an example of the output for the descriptive statistics:

The output gives the values of the requested statistics. If you scroll down, you will see the requested plots:

 

Credit: http://academic.udayton.edu/gregelvers/psy216/spss/descript1.htm

How to Apply Boxplot in SPSS [Solved]

A boxplot is another useful visualization for viewing how the data are distributed. A boxplot contains several statistical measures that we will explore after creating the visualization.

Note: This example uses Employee data.

 From the menus choose:

 On the Basic tab, select gender and salary. (Use Ctrl+Click to select multiple fields/variables.)

 Select Boxplot.

Basic tab selections, boxplot

 Click OK.

Boxplot

Let’s explore the different parts of the boxplot:

• The dark line in the middle of the boxes is the median of salary. Half of the cases/rows have a value greater than the median, and half have a value lower. Like the mean, the median is a measure of central tendency. Unlike the mean, it is less influenced by cases/rows with extreme values. In this example, the median is lower than the mean (compare to Example: Bar Chart with a Summary Statistic ). The difference between the mean and median indicates that there are a few cases/rows with extreme values that are elevating the mean. That is, there are a few employees who earn large salaries.

• The bottom of the box indicates the 25th percentile. Twenty-five percent of cases/rows have values below the 25th percentile. The top of the box represents the 75th percentile. Twenty-five percent of cases/rows have values above the 75th percentile. This means that 50% of the cases/rows lie within the box. The box is much shorter for females than for males. This is one clue that salary varies less for females than for males. The top and bottom of the box are often called hinges.

• The T-bars that extend from the boxes are called inner fences or whiskers. These extend to 1.5 times the height of the box or, if no case/row has a value in that range, to the minimum or maximum values. If the data are distributed normally, approximately 95% of the data are expected to lie between the inner fences. In this example, the inner fences extend less for females compared to males, another indication that salary varies less for females than for males.

• The points are outliers. These are defined as values that do not fall in the inner fences. Outliers are extreme values. The asterisks or stars are extreme outliers. These represent cases/rows that have values more than three times the height of the boxes. There are several outliers for both females and males. Remember that the mean is greater than the median. The greater mean is caused by these outliers.
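
The parts of the boxplot described above can be sketched numerically. The following minimal Python example computes the median, the box edges, the inner fences and the two kinds of outliers. The function name boxplot_summary and the values are hypothetical, not the Employee data. It uses the inclusive quartile method of statistics.quantiles, which is an assumption; the hinges SPSS draws may be computed slightly differently.

```python
import statistics


def boxplot_summary(values):
    """Return the box, whisker ends, and outliers for a list of numbers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    box_height = q3 - q1
    lower_fence = q1 - 1.5 * box_height
    upper_fence = q3 + 1.5 * box_height
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    # Extreme outliers lie more than 3 box heights beyond the box edges.
    extremes = [x for x in data
                if x < q1 - 3 * box_height or x > q3 + 3 * box_height]
    outliers = [x for x in data
                if (x < lower_fence or x > upper_fence) and x not in extremes]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whiskers": (min(inside), max(inside)),  # most extreme non-outliers
        "outliers": outliers,
        "extremes": extremes,
    }


# Hypothetical salaries in thousands.
salaries = [21, 23, 25, 26, 27, 29, 31, 42, 60]
summary = boxplot_summary(salaries)
print(summary)
```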

Credit: http://pic.dhe.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fgraphboard_creating_examples_boxplot.htm

How to Apply Dependent T-Test using SPSS

Introduction

The dependent t-test (called the paired-samples t-test in SPSS) compares the means between two related groups on the same continuous, dependent variable. For example, you could use a dependent t-test to understand whether there was a difference in smokers’ daily cigarette consumption before and after a 6 week hypnotherapy programme (i.e., your dependent variable would be “daily cigarette consumption”, and your two related groups would be the cigarette consumption values “before” and “after” the hypnotherapy programme). If your dependent variable is dichotomous, you should instead use McNemar’s test.

This “quick start” guide shows you how to carry out a dependent t-test using SPSS, as well as interpret and report the results from this test. However, before we introduce you to this procedure, you need to understand the different assumptions that your data must meet in order for a dependent t-test to give you a valid result. We discuss these assumptions next.

Assumptions

When you choose to analyse your data using a dependent t-test, part of the process involves checking to make sure that the data you want to analyse can actually be analysed using a dependent t-test. You need to do this because it is only appropriate to use a dependent t-test if your data “passes” four assumptions that are required for a dependent t-test to give you a valid result. In practice, checking for these four assumptions just adds a little bit more time to your analysis, requiring you to click a few more buttons in SPSS when performing your analysis, as well as think a little bit more about your data, but it is not a difficult task.

Before we introduce you to these four assumptions, do not be surprised if, when analysing your own data using SPSS, one or more of these assumptions is violated (i.e., is not met). This is not uncommon when working with real-world data rather than textbook examples, which often only show you how to carry out a dependent t-test when everything goes well! However, don’t worry. Even when your data fails certain assumptions, there is often a solution to overcome this. First, let’s take a look at these four assumptions:

  • Assumption #1: Your dependent variable should be measured on a continuous scale (i.e., it is measured at the interval or ratio level). Examples of variables that meet this criterion include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg), and so forth. You can learn more about continuous variables in our article: Types of Variable.
  • Assumption #2: Your independent variable should consist of two categorical, “related groups” or “matched pairs”. “Related groups” indicates that the same subjects are present in both groups. The reason that it is possible to have the same subjects in each group is because each subject has been measured on two occasions on the same dependent variable. For example, you might have measured 10 individuals’ performance in a spelling test (the dependent variable) before and after they underwent a new form of computerised teaching method to improve spelling. You would like to know if the computer training improved their spelling performance. The first related group consists of the subjects at the beginning of (prior to) the computerised spelling training and the second related group consists of the same subjects, but now at the end of the computerised training. The dependent t-test can also be used to compare different subjects, but this does not happen very often. Nonetheless, to learn more about the different study designs that can be analysed using a dependent t-test, see our enhanced dependent t-test guide.
  • Assumption #3: There should be no significant outliers in the differences between the two related groups. Outliers are simply single data points within your data that do not follow the usual pattern (e.g., in a study of 100 students’ IQ scores, where the mean score was 108 with only a small variation between students, one student had a score of 156, which is very unusual, and may even put her in the top 1% of IQ scores globally). The problem with outliers is that they can have a negative effect on the dependent t-test, reducing the validity of your results. In addition, they can affect the statistical significance of the test. Fortunately, when using SPSS to run a dependent t-test on your data, you can easily detect possible outliers. In our enhanced dependent t-test guide, we (a) show you how to use SPSS to compute the difference scores, (b) show you how to detect outliers using SPSS, and (c) discuss some of the options you have in order to deal with outliers.
  • Assumption #4: The distribution of the differences in the dependent variable between the two related groups should be approximately normally distributed. We talk about the dependent t-test only requiring approximately normal data because it is quite “robust” to violations of normality, meaning that the assumption can be a little violated and still provide valid results. You can test for normality using the Shapiro-Wilk test of normality, which is easily tested for using SPSS. In addition to showing you how to do this in our enhanced dependent t-test guide, we also explain what you can do if your data fails this assumption (i.e., if it fails it more than a little bit).

You can check assumptions #3 and #4 using SPSS. Before doing this, you should make sure that your data meets assumptions #1 and #2, although you don’t need SPSS to do this. When moving on to assumptions #3 and #4, we suggest testing them in this order because, if a violation of an assumption is not correctable, you will no longer be able to use a dependent t-test (although you may be able to run another statistical test on your data instead). Just remember that if you do not run the statistical tests on these assumptions correctly, the results you get when running a dependent t-test might not be valid. This is why we dedicate a number of sections of our enhanced dependent t-test guide to help you get this right. You can find out about our enhanced content as a whole here, or more specifically, learn how we help with testing assumptions here.
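As a rough illustration of assumption #3, the sketch below computes difference scores and flags values lying more than 1.5 interquartile ranges beyond the quartiles, which is the usual boxplot rule. It is not the SPSS procedure itself: the function names and the data (in centimetres) are hypothetical, and Python's default quartile method may differ slightly from the one SPSS uses for its boxplots.

```python
from statistics import quantiles

def difference_scores(first, second):
    """Return first - second for each paired subject."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of observations")
    return [a - b for a, b in zip(first, second)]

def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' method (an assumption)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical jump distances in centimetres, not the data used in this guide.
before = [240, 255, 231, 262, 248, 250, 237, 258]
after = [244, 258, 236, 265, 250, 255, 240, 310]

diffs = difference_scores(after, before)
print(flag_outliers(diffs))  # the 52 cm difference is flagged
```

A flagged value is only a candidate outlier; whether to keep, correct or remove it is a judgement about the data, not something the rule decides.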

In the section, Test Procedure in SPSS, we illustrate the SPSS procedure required to perform a dependent t-test assuming that no assumptions have been violated. First, we set out the example we use to explain the dependent t-test procedure in SPSS.

Example

A group of Sports Science students (n = 20) are selected from the population to investigate whether a 12-week plyometric-training programme improves their standing long jump performance. In order to test whether this training improves performance, the students are tested for their long jump performance before they undertake a plyometric-training programme and then again at the end of the programme (i.e., the dependent variable is “standing long jump performance”, and the two related groups are the standing long jump values “before” and “after” the 12-week plyometric-training programme).

Test Procedure in SPSS

The six steps below show you how to analyse your data using a dependent t-test in SPSS when the four assumptions in the previous section, Assumptions, have not been violated. At the end of these six steps, we show you how to interpret the results from this test. If you are looking for help to make sure your data meets assumptions #3 and #4, which are required when using a dependent t-test, and can be tested using SPSS, you can learn more in our enhanced guides here. We also show you how to correctly enter your data into SPSS in order to run a dependent t-test, as well as explaining how to deal with missing values (e.g., if a participant completed a pre-test but failed to turn up to the post-test). However, in this “quick start” guide, we focus on the six steps required to run the dependent t-test procedure using SPSS.

  • Click Analyze > Compare Means > Paired-Samples T Test… on the top menu, as shown below:
    The Dependent T Test Menu
    Published with written permission from SPSS, IBM Corporation.
  • You will be presented with the Paired-Samples T Test dialogue box, as shown below:
    The Paired Sample T Test Dialogue Box
  • Transfer the variables JUMP1 and JUMP2 into the Paired Variables: box. There are two ways to do this: (a) click on both variables whilst holding down the shift key (which highlights them) and then press the right arrow button; or (b) drag-and-drop each variable separately into the boxes. If you are using older versions of SPSS, you will need to transfer the variables using the former method. You will end up with a screen similar to the one shown below:
    The Paired Sample T Test Dialogue Box

    Note:
    The down arrow button shifts the pair of variables you have highlighted down one level.
    The up arrow button shifts the pair of variables you have highlighted up one level.
    The double arrow button swaps the order of the variables within a variable pair.

  • If you need to change the confidence level limits or exclude cases, click on the Options button. You will be presented with the Paired-Samples T Test: Options dialogue box, as shown below:
    Options Dialog Box in the Paired Sample T Test
  • Click the Continue button. You will be returned to the Paired-Samples T Test dialogue box.

  • Click the OK button.

Output of the Dependent T-Test in SPSS

SPSS generates three tables in the Output Viewer under the title “T-Test”, but you only need to look at two of them: the Paired Samples Statistics table and the Paired Samples Test table. In addition, you will need to interpret the boxplots that you created to check for outliers and the output from the Shapiro-Wilk test for normality, which you used to determine whether the distribution of the differences in the dependent variable between the two related groups was approximately normally distributed. This is explained in our enhanced guide. However, in this “quick start” guide, we focus on the two main tables you need to understand if your data has met all the necessary assumptions:

Paired Sample Statistics Table

The first table, titled Paired Samples Statistics, is where SPSS has generated descriptive statistics for your variables. You could use the results here to describe the characteristics of the first and second jumps in your write-up.

Output for the Paired T Test

Paired Samples Test Table

The Paired Samples Test table is where the results of the dependent t-test are presented. A lot of information is presented here and it is important to remember that this information refers to the differences between the two jumps (the subtitle reads “Paired Differences”). As such, the columns of the table labelled “Mean”, “Std. Deviation”, “Std. Error Mean” and “95% Confidence Interval of the Difference” refer to the mean difference between the two jumps and the standard deviation, standard error and 95% confidence interval of this mean difference, respectively. The last three columns express the results of the dependent t-test, namely the t-value (“t”), the degrees of freedom (“df”) and the significance level (“Sig. (2-tailed)”).

Output for the Paired T Test

You are essentially conducting a one-sample t-test on the differences between the two related groups, testing whether the mean difference is zero.
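To make that concrete, here is a minimal sketch that computes the t-value and degrees of freedom directly from the difference scores. The function name and the data (in centimetres) are hypothetical. The differences are taken as first variable minus second, which matches the negative t-value reported for an improvement in this guide. The Python standard library has no t-distribution, so the sketch stops short of the p-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for a dependent t-test computed from difference scores."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1

# Hypothetical jump distances in centimetres, not the data used in this guide.
jump1 = [240, 255, 231, 262, 248]
jump2 = [244, 258, 236, 265, 250]

t, df = paired_t(jump1, jump2)
print(f"t({df}) = {t:.3f}")
```

The mean, standard deviation and standard error computed here correspond to the “Mean”, “Std. Deviation” and “Std. Error Mean” columns under “Paired Differences”.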

Reporting the Output of the Dependent T-Test

You might report the statistics in the following format: t(degrees of freedom) = t-value, p = significance level. In our case this would be: t(19) = -4.773, p < 0.0005. Due to the means of the two jumps and the direction of the t-value, we can conclude that there was a statistically significant improvement in jump distance following the plyometric-training programme from 2.48 ± 0.16 m to 2.52 ± 0.16 m (p < 0.0005); an improvement of 0.03 ± 0.03 m.
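The reporting format above can be sketched as a small helper. The function name, the threshold parameter and the exact p-value passed in are assumptions for illustration; the guide only states that p < 0.0005.

```python
def report_t(df, t, p, threshold=0.0005):
    """Format a result as 't(degrees of freedom) = t-value, p = significance level'."""
    # Very small p-values are reported as an inequality rather than as 0.000.
    p_text = f"p < {threshold}" if p < threshold else f"p = {p:.3f}"
    return f"t({df}) = {t:.3f}, {p_text}"

# 0.0001 is a hypothetical p-value below the threshold.
print(report_t(19, -4.773, 0.0001))  # t(19) = -4.773, p < 0.0005
```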

Note: SPSS can output the results to many decimal places, but you should understand your measurement scale in order to know whether it is appropriate to report your results with such precision.

In our enhanced dependent t-test guide, we show you how to write up the results from your assumptions tests and dependent t-test procedure if you need to report this in a dissertation/thesis, assignment or research report. We do this using the Harvard and APA styles. It is also worth noting that in addition to reporting the results from your assumptions and dependent t-test, you are increasingly expected to report effect sizes. Whilst there are many different ways you can do this, we show you how to calculate effect sizes from your SPSS results in our enhanced dependent t-test guide. Effect sizes are important because whilst the dependent t-test tells you whether differences between group means are “real” (i.e., different in the population), it does not tell you the “size” of the difference. Providing the effect size in your results helps to overcome this limitation. You can learn more about our enhanced content here.
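One common effect size for paired data is the mean difference divided by the standard deviation of the differences, which works out to t divided by the square root of n. The sketch below applies it to the t-value and sample size from this example. Treat it as one possible choice; it is not necessarily the measure described in the enhanced guide, and the function name is hypothetical.

```python
from math import sqrt

def cohens_d_from_t(t, n):
    """Standardised mean difference for paired data: d = t / sqrt(n)."""
    return t / sqrt(n)

# t-value and sample size taken from the example in this guide.
d = cohens_d_from_t(-4.773, 20)
print(round(d, 2))  # -1.07
```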

Credit: https://statistics.laerd.com/spss-tutorials/dependent-t-test-using-spss-statistics.php

How to Create Histograms in SPSS

A histogram is a bar graph for quantitative data, in which the heights of the bars represent frequency or relative frequency, and there are no gaps between the bars. Creating a histogram involves “binning” the data. For more information on this, see the text.
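As a rough illustration of binning, the sketch below counts how many values fall into each bin, given a bin width and an anchor (the left edge of the first bin). The function name and the data are hypothetical, and this is not the rule SPSS uses to pick its bins.

```python
from collections import Counter
from math import floor

def bin_counts(values, width, anchor):
    """Count values per bin [anchor + k*width, anchor + (k+1)*width)."""
    counts = Counter(floor((v - anchor) / width) for v in values)
    # Map each bin's left edge to its frequency; empty bins are omitted.
    return {anchor + k * width: counts[k] for k in sorted(counts)}

# Hypothetical test scores.
scores = [52, 57, 61, 63, 64, 68, 70, 71, 75, 79, 83, 95]
print(bin_counts(scores, width=10, anchor=50))
# {50: 2, 60: 4, 70: 4, 80: 1, 90: 1}
```

Each count is the height of one bar. Changing the width or the anchor changes the shape of the histogram, which is why the Chart Editor lets you adjust both.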

To create a histogram in SPSS:

1) Open a data file (in .sav or .por format) or type your data in.

2) In either the Data Editor window or the Output Viewer window, click on the “Graphs” menu and then on “Histogram…”

3) Select the variable of interest. Click on the little arrow next to “Variable.” This will move the variable into the “Variable” box.

4) If you would like to add a title to your plot, click the “Titles…” button. A new dialog box appears. Enter any titles or subtitles you want to use, and click “Continue…”

5) Click “OK.” The histogram appears in the Output Viewer. Notice that the vertical axis is labeled “Frequency.” SPSS chooses the bin widths automatically, using criteria built into the program. If you would like to change the label on the vertical axis, or change the bin width or the anchor of the first bin, use the Chart Editor.

NOTE: Histograms are for quantitative data only. If you have qualitative data, consider using a Pareto chart or a bar graph instead.

Credit: http://emp.byui.edu/brownd/SPSS/Instr_Graphs/histogram_SPSS.htm