SPSS Syntax & Macros - Glenn Thompson

SPSS Syntax & Macros
Glenn L. Thompson, University of Ottawa

I am offering the following original SPSS syntax, macro, and Production Facility files for download. The files are organized by task, and most come with a brief instruction manual. The code has not been tested under all conditions or with all versions of SPSS, so you bear the responsibility of making sure your analysis is being performed correctly (e.g., by confirming that it produces predictable results).

To report any problems with the code or to request a special modification, please contact the author ([email protected]). Please note that copying and pasting code from a word processor into an SPSS syntax window can lead to errors, so either type the programs directly into SPSS or use a downloaded .sps file.

Finally, if you have an interesting application for which you require an SPSS program, I might be willing to take a shot at it (contact me at the above e-mail).

Regression Coefficient Analysis (RCA)

Outlier Screening for RT data

These are two macros that I wrote to perform Regression Coefficient Analysis, also called Random Coefficient Analysis (RCA). The code is useful for researchers who are familiar with basic statistical concepts but who have not yet mastered Hierarchical Linear Modelling (HLM) techniques. The method is described in detail by Lorch & Myers (1990) and Myers & Broyles (2000).

A detailed explanation of the technique is available in Thompson (2007).
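As a rough illustration of the general idea behind RCA, the Python sketch below fits a separate regression for each participant and then tests whether the mean of the resulting coefficients differs from zero with a one-sample t-test. It is not the SPSS macro offered here: it handles a single predictor only, and the function names `ols_slope` and `rca_t_test` and the data layout are assumptions made for the example.

```python
import math
from statistics import mean, stdev


def ols_slope(x, y):
    """Least-squares slope of y on x for one participant (x must vary)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def rca_t_test(data):
    """Test the per-participant slopes against zero.

    data: dict mapping participant id -> (x values, y values).
    Returns (mean slope, t statistic, degrees of freedom).
    Needs at least two participants.
    """
    # Step 1: one regression per participant.
    slopes = [ols_slope(x, y) for x, y in data.values()]
    # Step 2: one-sample t-test on the slopes.
    n = len(slopes)
    se = stdev(slopes) / math.sqrt(n)
    return mean(slopes), mean(slopes) / se, n - 1


# Hypothetical data: three participants, RT measured at four predictor levels.
example = {
    "p1": ([1, 2, 3, 4], [510, 520, 530, 540]),
    "p2": ([1, 2, 3, 4], [600, 620, 640, 660]),
    "p3": ([1, 2, 3, 4], [450, 465, 480, 495]),
}
mean_slope, t_value, df = rca_t_test(example)
```

Because the test is run on one coefficient per participant, variability between participants ends up in the error term rather than being averaged away.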

Important
Please note that the article and code should not be construed as an endorsement of RCA over multi-level analysis. I see RCA more as an intermediate step between aggregation methods and multi-level analysis, and a useful pedagogical step at that. In psycholinguistics, change comes slowly and baby steps may be the order of the day.

Download Requirements: SPSS 11 or later

A zipped package containing the materials related to Thompson (2007), including the penultimate version of the paper, is available for download here. The macros can be found within the file Thompson.sps.

The final version of Thompson (in press) and the materials can be obtained at the TQMP website when it appears (ETA: fall 2007).

References
Lorch, R. & Myers, J. (1990). Regression analyses of repeated measures
      data in cognitive research. Journal of Experimental Psychology:
      Learning, Memory, and Cognition, 16, 149-157.
Myers, L. & Broyles, S. (2000). Regression coefficient analysis for
      correlated binomial outcomes. Journal of Applied Statistics, 27,
      217-234.
Thompson, G. L. (2007). Eliminating Aggregation Bias in Experimental
      Research: Random Coefficient Analysis as an Alternative to
      Performing a "by-subjects" and/or "by-items" ANOVA. Tutorials in
      Quantitative Methods for Psychology.

Outlier Screening for RT data

Below is an implementation (Thompson, 2006) of the non-recursive shifting z-score criterion procedure described by Van Selst & Jolicoeur (1994). Additional 'modules' that perform database management tasks (e.g., computing means, restructuring the data file) are also available.

The program is intended for the outlier screening of reaction time data. It assumes that each participant has generated multiple responses within each experimental condition.
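To make the screening logic concrete, here is a minimal Python sketch of a single-pass (non-recursive) screen applied to one participant-by-condition cell. It is not the SPSS program offered here. It assumes that the cell mean and standard deviation are computed from all observations in the cell, and that the criterion is supplied by the caller as a function of cell size. The names `screen_cell` and `criterion_for_n` are hypothetical, and the example deliberately uses a fixed 2.5 SD placeholder rather than reproducing the sample-size-specific values tabulated by Van Selst & Jolicoeur (1994).

```python
from statistics import mean, stdev


def screen_cell(rts, criterion_for_n):
    """Screen one cell of reaction times in a single pass.

    rts: RTs from one participant in one condition.
    criterion_for_n: function mapping cell size n to a z-score criterion
        (for the shifting procedure, a lookup of the published values).
    Returns (kept, removed).
    """
    n = len(rts)
    if n < 2:
        return list(rts), []  # no SD can be computed for a single RT
    m, sd = mean(rts), stdev(rts)
    cutoff = criterion_for_n(n) * sd
    kept = [rt for rt in rts if abs(rt - m) <= cutoff]
    removed = [rt for rt in rts if abs(rt - m) > cutoff]
    return kept, removed


# Hypothetical cell with one very slow response.
cell = [500, 510, 490, 505, 495, 500, 510, 490, 505, 1500]
# Placeholder criterion: always 2.5 SD, regardless of cell size.
kept, removed = screen_cell(cell, lambda n: 2.5)
```

Passing the criterion in as a function keeps the screening step separate from the table of corrected values, so either the 2.5 SD or the 3 SD variant can be plugged in.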

To use the program, simply download and read the instructions. To execute the program you must download both .zip files. Zip files are compressed files that can be opened with free software (e.g., WinZip).

Important
Please note that the program has been updated & corrected since the publication of Thompson (2006). The version available here should be used instead of the code reported in the appendix of that article.

Download Requirements: SPSS 11.5 or later with Production Facility (comes bundled with SPSS)

An implementation of Van Selst & Jolicoeur (1994): Instructions, the Production Facility file, and the Syntax files.

A modified version of the Van Selst & Jolicoeur (1994) procedure. The program is identical except that it applies a 3 SD criterion rather than a 2.5 SD criterion. WARNING: I did not obtain the correction values by running simulations. Rather, I expressed the corrected z-score values provided by Van Selst & Jolicoeur (1994, Table 4) as proportions of the 2.5 SD criterion (e.g., 2.45/2.50 = .98) and then used these proportions to compute corrected z-scores for the 3 SD criterion (e.g., 3 x .98 = 2.94). When I get some 'free' time, I will run some simulations to get more accurate values. Instructions, Production Facility file, Syntax Files.
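The proportional rescaling described in the warning above can be written out in a few lines of Python. The function name `scale_criterion` and its parameters are assumptions made for this illustration. Only the worked value from the text (2.45 under the 2.5 SD criterion) is used, and no other table entries are reproduced.

```python
def scale_criterion(corrected_z, base=2.5, target=3.0):
    """Rescale a corrected z-score from the base criterion to the target one.

    The corrected value is expressed as a proportion of the base criterion,
    and that proportion is then applied to the target criterion.
    """
    proportion = corrected_z / base  # e.g. 2.45 / 2.50 = 0.98
    return target * proportion       # e.g. 3 x 0.98 = 2.94


scaled = scale_criterion(2.45)
```

A value that already equals the base criterion maps exactly onto the target, so `scale_criterion(2.5)` returns 3.0.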

References
Thompson, G. L. (2006). An SPSS implementation of the non-recursive
      outlier deletion procedure with shifting z-score criterion (Van Selst &
      Jolicoeur, 1994). Behavior Research Methods, 38, 344-352.
Van Selst, M., & Jolicoeur, P. (1994). A solution to the effect of sample
     size on outlier elimination. The Quarterly Journal of Experimental
     Psychology, 47A, 631-650.

Outlier screening/Recoding - Ungrouped continuous variable(s)

A macro, actually a combination of two macros, that does two things to a variable or a list of variables (zipped Syntax File):

(1) It screens for outliers using a user-supplied Z-score criterion

(2) It recodes outliers to less extreme values in the following manner for each tail of the distribution:

    i) It takes the least extreme outlier, and recodes it to the value of the next most extreme case, plus a value equal to a user-supplied percentage of the variable's standard deviation (note: the standard deviation is calculated after excluding outliers).
    ii) It then takes the next most extreme value and recodes it to the value of the outlier that was just recoded, plus the same user-supplied percentage of the standard deviation, and so on for each remaining outlier.
    iii) It outputs a histogram of all variables that are screened before and after the recoding, to allow for visual inspection.

If a variable has no outliers, or has outliers in only one tail of the distribution, SPSS will report an error related to the AGGREGATE command. This error is safe to ignore.
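The recoding steps described above can be sketched in Python as follows. This is one reading of the procedure, not a translation of the SPSS macro. It assumes that z-scores are computed from the mean and standard deviation of all cases, that the "next most extreme case" is the most extreme non-outlier in the same tail, and that at least two non-outlying cases remain. The function name `recode_outliers` and its parameters are hypothetical, and the histograms are omitted.

```python
from statistics import mean, stdev


def recode_outliers(values, z_crit=3.0, pct=0.10):
    """Pull outliers in toward the distribution, preserving their rank order.

    z_crit: z-score criterion used to flag outliers.
    pct: proportion of the outlier-free SD added at each recoding step.
    """
    m, sd = mean(values), stdev(values)
    if sd == 0:
        return list(values)  # no variability, so nothing can be flagged
    is_out = [abs(v - m) / sd > z_crit for v in values]
    clean = [v for v, out in zip(values, is_out) if not out]
    step = pct * stdev(clean)  # SD computed after excluding outliers
    result = list(values)
    for sign in (1, -1):  # upper tail first, then lower tail
        # Outliers in this tail, least extreme first.
        tail = sorted(
            (i for i, out in enumerate(is_out)
             if out and sign * (values[i] - m) > 0),
            key=lambda i: sign * values[i],
        )
        anchor = max(clean) if sign > 0 else min(clean)
        for i in tail:
            anchor = anchor + sign * step  # one step past the previous value
            result[i] = anchor
    return result


# Hypothetical variable: 20 ordinary cases plus two high outliers.
scores = list(range(10, 30)) + [70, 90]
recoded = recode_outliers(scores, z_crit=2.0, pct=0.5)
```

In this example the two outliers are moved to one and two steps above the largest ordinary case (29), so they stay at the top of the distribution and keep their relative order.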

Advantages
1. It preserves the rank order of outliers
2. The recode is based on the scale & variability of the variable in question, rather than an arbitrary value like 1.
Limitations
1. It should only be used to recode plausible or 'valid' outliers, that is, outliers that are likely to be drawn from the target distribution but that are unduly influential in determining the value of parameter estimates. Recoding a coding error, for example, could still distort your analysis, although less so than keeping the extreme case as it is.
