UPDATE NOTES TO STUDENTS
1/24/00 TOMORROW for our session starting at 5:45 PM we will review NOT9905, and go back over some of the basics of simple regression, to be sure we have all those pieces down and in place. 
Memos for distribution: Memo0422 including homework 

This is a course in statistics for MA and beginning PhD level students in criminal justice. The course has two interlocking general goals. The first: students can run simple and multiple regression analyses, intelligently interpret the output, and know how to "check out" various problems that may be inherent in the data or analysis. The second: students can read and understand criminal justice research articles using multivariate statistical analyses based on simple or multiple regression, or related techniques. Their expertise is such that they can easily grasp the details of the results.
The bulk of the quantitative methods used in these articles relies on
the general linear model (GLM). GLMs include simple correlation,
simple regression, partial correlation, ANOVA, ANCOVA, and multiple
regression. Variations on the GLM include logit, probit, tobit, loglinear,
principal components analysis and factor analysis, discriminant
functions, and so on.
We will be devoting the bulk of the semester to learning about the basic ideas behind such techniques.
I want you to understand the reasoning behind statistical analyses.
Yes, there may be some formulas to memorize, but not a lot (fewer than
60). I also want you to be able to think intelligently about output
produced by statistical analyses.
In teaching this course I stress two general themes; themes you may find useful regardless of where you go after completing your work in this program.
It is always important to find out how the data are arranged.
There is no substitute for binocular inspection: eyeballing the data.
Researchers starting analyses with a new set of data may be tempted
to "skip" preliminary exploratory analyses, and go right to
univariate or multivariate statistical tests. I strongly urge against
this for a number of reasons that should become clear over the course
of the semester.
Weird cases can make a world of difference in your results
and interpretation.
Statistical data processing has become more interactive and iterative
over the last decade. It is easier to look at your results and redo
them without a case or two that may be strongly influencing your
results. Editors and reviewers expect that we will spend more time
doing exactly this. Therefore in this course we spend considerable
time on regression "diagnostics" (indicators telling us if
a case may be having an unusually strong impact on the results) and
redoing analyses in light of those diagnostics.
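To make the idea of a diagnostic concrete, here is a minimal sketch in Python rather than SPSS. It is not the course's procedure: the data are made up, the diagnostic shown (Cook's distance) is only one of several influence measures, and the 4/n cutoff is a common rule of thumb, not a fixed standard. The sketch fits a simple regression, flags influential cases, and refits without them.

```python
import numpy as np

def design_matrix(x):
    """Add a column of ones so the fit includes an intercept."""
    return np.column_stack([np.ones_like(x), x])

def fit_line(x, y):
    """Least-squares fit of y on x; returns [intercept, slope]."""
    coef, *_ = np.linalg.lstsq(design_matrix(x), y, rcond=None)
    return coef

def cooks_distance(x, y):
    """Cook's distance for each case in a simple regression of y on x."""
    X = design_matrix(x)
    n, p = X.shape
    resid = y - X @ fit_line(x, y)
    # Leverage: diagonal of the hat matrix X (X'X)^-1 X'
    hat = np.einsum("ij,ij->i", X @ np.linalg.inv(X.T @ X), X)
    mse = resid @ resid / (n - p)
    return (resid ** 2 / (p * mse)) * hat / (1.0 - hat) ** 2

# Hypothetical data: nine well-behaved cases and one "weird" case at the end.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 15.0])

d = cooks_distance(x, y)
flagged = np.flatnonzero(d > 4.0 / len(x))  # rule-of-thumb cutoff (assumption)

keep = np.ones(len(x), dtype=bool)
keep[flagged] = False

slope_all = fit_line(x, y)[1]
slope_trimmed = fit_line(x[keep], y[keep])[1]
print("flagged cases:", flagged)
print("slope with all cases:", round(slope_all, 3))
print("slope without flagged cases:", round(slope_trimmed, 3))
```

Comparing the two slopes shows how much a single case can move the results. Whether to drop, keep, or separately report a flagged case is a judgment call that the diagnostic alone cannot make.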
The course may help you become literate or more literate in microcomputer-based statistical computing. In past years (back in the Cenozoic era), students relied solely on hand calculators. They spent too much time trying to get the numbers to come out right, and less time thinking about what the numbers meant. They also found that their hand calculator skills were not in great demand outside of the classroom. I hope that the microcomputer experience you gain with this course will serve you well on the job or in a position as a research assistant. This is the fifth year that I have taught this course on a microcomputer base.
You have three software options.
FREE. If you don't want to spend any money, you can use SPSS FOR WINDOWS VERSION 6.x available in the 5th floor lab, and also available in the GH 107 lab. It also is available in the main Anderson lab.
CHEAP. If you need SPSS software to run at home because you cannot spend your time in the lab here on campus, and you want to buy the very cheapest version available, get SPSS FOR WINDOWS STUDENT VERSION. This will run you somewhere between $60 and $90. For information on how to buy it, either check the bookstore or contact PRENTICE HALL.
Be aware that this program is limited.
* It cannot write command (also called syntax) files. I think being able to write command files is an extremely important capability.
* It can only work with data files that include fewer than 50 variables and fewer than 1500 cases.
* There may be other limitations as well, but I think these are the most important.
COSTLY. If you want to spend more money, you have two options:
SPSS GRAD PACK. This will cost you about $200. It is a pretty full-featured program, although again, there are limitations. If you are going on in CJ BUT WILL NOT BE TAKING CJ 605, this is probably a good all-around program.
SPSS GRAD PACK FOR BUSINESS. This is as above, but it DOES include time series, which you PROBABLY will need for CJ 605. But it lacks some of the other multivariate techniques you would like to have.
You can get more details on each of these by contacting SPSS directly at
The Bookstore refused to stock copies of either of the latter packages, so you will need to deal directly with SPSS.
Software only works with the requisite hardware. Be sure you have enough clock cycles and disk space to run these things. You probably at least want to have a 486DX2 if you're a patient person. Disk space requirements go up from 4 MB of disk space for the STUDENT VERSION. You want to have at least 16 MB of RAM.
SOFTWARE VERSIONS PROBLEM
The SPSS version in the lab is SPSS for WINDOWS 6.x. When you buy new versions of SPSS those will be either 8, 8.5 or 9. Some commands differ from version to version. Most importantly, SPSS writes output files differently, and creates charts in different ways. Beware.
When I give you example command files, and directions on how to
handle output, those will probably be in version 6.x format, to guide
those in the lab.
Of course, any command files and data files written with the 6.x
command set can be read by versions 8 and up. I am not so sure about
the reverse.
Students in past years have typically reported that the time involved in this course is anywhere from 25% to 100% greater than what is required in other graduate courses. The extra effort is required because you are learning two different things: how to run computer programs, and how to think about results. Try to plan ahead and allocate more time to this course, particularly if you do not consider yourself "computer literate."
Further, given the volume of material covered every week it is essential that you be here for every class. If you absolutely must miss a class, please let me know in advance.
There are three working assumptions behind how I have set up this course.
My first assumption is that you have had an undergraduate course in
social statistics and remember some of it, and/or are willing to
spend significant time reviewing that material.
By basic statistics I refer to the following concepts:
frequency distributions
measures of central tendency
measures of dispersion
the normal curve
areas under the normal curve
the logic of hypothesis testing
probability theory
t test
To help you get back up to speed on these basics:
* We will spend a little time early in the semester reviewing some of
these concepts,
* You will probably need to spend some time outside of class
refamiliarizing yourself with some of these materials.
* During the first two or three weeks I will be willing to hold
tutorial sessions with students in groups of four or more if several
of you feel you have big gaps that would not be
easily remedied by serious, intensive self-directed study. Or if the
computer stuff is driving you totally bonkers. You
should network with each other and let me know if there is interest.
If there is sizable interest (at least 3 people) we can schedule a
specific session and let everyone know about it.
* If you feel you need more in-depth therapy, ask Dr. Avakame if you
can sit in on his statistics course throughout the semester. You may
want to consider doing that before going ahead with
this course.
My second assumption is that you are somewhat familiar with
microcomputers. I assume that you know how to turn them on, how to
insert floppies, how to find your way around hard disks, how to use
basic WINDOWS and DOS commands such as copy, delete, dir, and so on.
If you do not have some basic computer literacy, please prevail upon
one of your more knowledgeable friends to help you out as soon as possible.
Every time I teach this course, at least one student loses
mission-critical files. The more you know about Windows and DOS commands, the
less likely you are to lose important information. I recommend
WINDOWS FOR DUMMIES. You also could get and read DOS FOR DUMMIES
or Van Wolverton's RUNNING MS-DOS. The more you know about
Windows and MS-DOS the less likely you are to lose files.
It also is true that every time I have taught this course at least
one student has had a floppy fail or has suffered from a virus. Keep
all mission-critical files (including data, command files, listing
files, and papers) on at least a couple of disks.
My third assumption is that you have access to a Windows 3.1-capable,
IBM-compatible computer for running the software OR you are willing
to spend time in the Gladfelter lab getting your stuff run out. If
you are not sure what your hardware can do, see me. If you are
having trouble getting access to the appropriate type of
microcomputer, please let me know immediately.
We will be working with two different datasets throughout the course of the semester. One is an ecological dataset using information from 50 states. The second is questions from a recent national survey on gun ownership, conducted by Phil Cook and Jens Ludwig for the Police Foundation. I use the two different datasets so we can get used to thinking about theory at different levels, and so we can see some of the differences between individual and ecological data.
You will have a homework assignment almost every week. To complete most assignments you will run one or more statistical analyses on a dataset, interpret the results, and write them up. For some weeks the homework assignment may involve reading an article and writing the findings up in your own words.
You should bring to class two copies of your homework
assignment, or an original and a copy. You will hand in your
assignment due for that week at the beginning of class. That way you
can keep a second copy and make notes on it as the class discussion unfolds.
During class I will be presenting conceptual material, reviewing
readings, answering questions, and reviewing homeworks. You should be
prepared to answer questions on any aspects of the readings and/or
homework assignment for that week. In short, when you come to class,
be prepared to talk about a few or many aspects of the work you have completed.
We are scheduling a one-hour lab, in addition to the regular 2.5-hour course, as approved in 1995 by the department's Graduate Committee. Students in past years have strongly recommended a lab on a SEPARATE night to help them better absorb the vast mountain of material covered in this course.
In that lab we will run through procedures needed to complete the
homework assignment for the following week, and may explore
additional issues. The lab is scheduled on a separate night to avoid
total meltdown. The lab will be held in the 5th floor lab.
We will need to talk about when to schedule this. The only times I am
currently available would be Tuesday afternoon or early evening, or
some time early on Friday. Obviously having the lab on Tuesday
creates problems when you have only a week to complete a
homework assignment.
Your grade in this class will be based on the following:
70%. Average grade on handed-in homework assignments. I will drop your worst grade from the average. Each assignment will either (a) ask you to run a problem and interpret the results or (b) read an article and describe detailed results in your own words. Toward the end of the semester I may announce that a limited number of homework assignments can be redone.
20%. Final examination, to be held at the end of the semester. This will probably be an in-class, no-notes exam. The exam will take place THURSDAY MAY 6 AT THE USUAL TIME.
10%. In-class participation. The participation may take several forms: answering questions, completing in-class groupwork or in-class individual assignments.
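As an illustration of how these weights combine, here is a minimal sketch in Python. It assumes every component is scored on a 0 to 100 scale, which the syllabus does not state, and it ignores late penalties and redone assignments. The function name and the example scores are hypothetical.

```python
def course_grade(homework, final_exam, participation):
    """Weighted course grade: 70% homework (worst score dropped),
    20% final exam, 10% participation. All scores assumed to be 0-100."""
    if len(homework) < 2:
        raise ValueError("need at least two homework scores to drop the worst one")
    kept = sorted(homework)[1:]          # drop the single lowest homework score
    homework_avg = sum(kept) / len(kept)
    return 0.70 * homework_avg + 0.20 * final_exam + 0.10 * participation

# Hypothetical student: the 50 is dropped, so the homework average is 85.
example = course_grade([90, 80, 50], final_exam=85, participation=100)
print(round(example, 1))
```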
GRADING POLICIES
GUIDELINES ON AVOIDING ACADEMIC MISCONDUCT
We will discuss in class the nature of academic misconduct, including plagiarism. You are responsible for understanding the different varieties of academic misconduct. If I encounter solid evidence of academic misconduct I will discuss the matter with you, and then deliver the consequence I deem appropriate. Possible consequences include: failure on the assignment in question (i.e., a 0); assigning a failing grade for the course; or attempting to have you expelled from Temple University. Should you wish to contest a decision I make on academic misconduct, I will inform you of the procedures to follow. The department and the college have fully specified grievance procedures.
There will be no makeups for a missed final exam unless
* you notify me before the missed exam
* and you have a reason for missing the exam that I find valid (e.g., car accident) (I no longer accept excuses like your friend's grandmother dying.)
* and I have something in writing, for my records, verifying the nature of the problem.
Assignments are due on the date indicated. I reserve the right to lower the grade for assignments that are handed in late. The amount the grade is lowered increases the longer the delay in handing the assignment in. Depending on the assignment, the grade may be lowered 1% to 10% a day.
If you have an excuse for a late assignment I will take this into account only if you notify me beforehand about the problem and I find your excuse for the delay to be a valid one and I have something in writing. Again, a friend's grandfather's death may be questionable.
You have the right to submit any assignment for regrading. If you wish to submit an assignment for regrading proceed as follows:
Prepare a written statement explaining why the assignment should be regraded. This applies to written assignments, essay exams, and multiple choice exam questions where you think there was more than one correct answer.
On a cover sheet print your name, SSN, name of the assignment or test, date of the assignment or test, and the date you submitted the assignment for regrading.
Staple the cover sheet to your written rationale and the original assignment.
I will review your request for regrading. I will consult with other faculty if I deem that appropriate. As a result of your request for regrading the grade on your original assignment may stay the same, or it may go up, or it may go down.
You should type each written assignment, double spaced. You also should proof your written work carefully. Misspelled words and flagrantly poor grammar will reduce your grade. On your papers I usually take off one point for every misspelled word and one point for every flagrant grammatical error. Needless to say, this can add up after a while. I urge you to:
* always run the spell checker
* always run a grammar checker
* proofread carefully; if possible, get someone else to proofread for you as well.
Many students find that their writing improves if they consult some books on writing like Strunk & White's The Elements of Style or Provost's 100 Ways to Improve Your Writing. You can find copies of these in the bookstore under my undergraduate course CJ 160.
I strongly urge you to carefully proofread and to
spell check and to grammar check every paper.
CLASSROOM EXPECTATIONS
* Please arrive on time for class. If you have something special, and you know you cannot make it to class on time, please let me know.
* If you must leave class early, please let me know.
* If you must miss class, please let me know beforehand (see above).
* Do not bring food or drinks into the lab. The lab staff can throw you out. Pepsi and floppies do not mix well.
Hamilton, L. C. (1992) Regression with graphics: A second course in applied statistics. Monterey: Wadsworth.
We will use this as our main text on regression. I chose this book
because 1) students have been unhappy with every other book on
regression I chose; 2) it makes extensive use of graphical displays
for understanding data, an approach used extensively in this course;
3) he deals "up front" with non-normal data and how to
handle it, and this is an important issue in criminal justice
research; and 4) although they will be covered only lightly in this
course, the volume contains information on important recent
developments in the general linear model (e.g., bootstrapping,
structural equation modeling) that you may need to know about in the
future. In short, I think it will hold up well as a general reference
book on the topic. Unfortunately, Hamilton's examples all come from
environmental science, which some students find less than enthralling.
If you feel that you need another text in this area, here are
some that I have used in the past and are basically pretty good.
Students, of course, have differed with me in their assessments.
Taylor, R. B. (1999). Various notes on statistics and regression.
You will link to my website and print these out, or save them to your
own hard disk. I will do all I can to ensure that each file is
downloadable. If you encounter any problems whatsoever let me know
ASAP. I used to put all this in a student copy pack but the copy
center charges a ton.
These refer mostly to conceptual material we are covering throughout
the course. I will tell you which set of notes we are covering which
week. You want to read these notes thoroughly and carefully before
coming to class for the week they are assigned.
To get to these notes go to:
There are a couple of related topics that we are going to try and get
to at the end of the semester. One deals with regression for ordinal
or nominal outcomes. The text for this topic is:
Aldrich, J., and Nelson, F. D. (1984). Linear probability, logit,
and probit models. Newbury Park: Sage.
Another topic addresses multiple dependent variables, or data
reduction for independent variables. This topic is factor analysis.
The text we hope to read is:
Kline, P. (1994). An Easy Guide to Factor Analysis. London: Routledge.
The above two texts "should" be in the bookstore. Of
course, there is always AMAZON.COM. The texts below are NOT in the bookstore.
RECOMMENDED TEXT: Porkess, R. (1991). The Harper Collins
Dictionary of Statistics. New York: Harper.
Porkess is a useful guide to some basics, for when you want to review
variance or skewness or the normal distribution.
SOME OTHER TEXTS YOU MIGHT FIND HELPFUL:
Darlington, R. (1992) Regression and linear models. New York:
McGraw Hill.
Advantages: this book also makes extensive use of graphics. He also
introduces logit and probit transforms, which may be important for
those of you who want to do loglinear modeling. Disadvantages: all its
examples come from psychology; example runs come either from SYSTAT
or SAS, neither of which we are using; students in the past have said
they feel Darlington is "talking down" to them.
Cohen, J., and Cohen, P. (1983) Applied multiple
regression/correlation analysis for the behavioral sciences
(Second edition) Hillsdale, NJ: Erlbaum.
This book uses a set theory approach to explaining multiple
regression that many students seemed to like. Be warned, however, the
Cohens use a notational system that takes some getting used to.
Blalock, H. M. (1979) Social statistics (Revised second
edition) New York: McGraw Hill.
This is an excellent and widely revered volume. But it is also
closely written; it requires careful reading.
Every week, before lab, I will ATTEMPT to distribute a "lab guide."
SEQUENCE OF TOPICS, ASSIGNMENTS AND READINGS
(subject to possible revision at a later date depending upon a host of factors)
Week 1 (1/18)
Read: NOT9901B; Hamilton, pages 1-23; Taylor (1993) Research methods in criminal justice. New York: McGraw Hill, Chapter 10 "Sampling," pp. 183-192. TEXT IS ON RESERVE IN PALEY FOR CJ 160
Class topic: Review syllabi. Descriptive data displays for single variables: histogram, box and whisker, stem and leaf, P-P, Q-Q
Lab: Generate and interpret univariate graphical displays; use explore; deciding if a variable is normal

Week 2 (1/25)
Read: NOT9902; GRST8706
Class topic: The logic of hypothesis testing; z test; Student t-test; T-test
Lab: Carry out independent and dependent t-tests; interpret results

Week 3 (2/1)
Read: NOT9904; NOT9905; Hamilton, pp. 29-42, 51-53, 289-294
Class topic: Understanding covariance
Lab: Transforms and functional relationships; looking carefully at scatterplots

Week 4 (2/8)
Read: NOT9906
Class topic: Simple regression and zero-order correlation: B, A, r
Lab: Looking at scatterplots and regression lines

Week 5 (2/15)
Read: Hamilton, pp. 42-49
Class topic: Hypotheses we test in simple regression: B and r

Week 6 (2/22)
Read: NOT9907; Hamilton, pp. 124-133
Class topic: Residuals in regression, error, and assumptions
Note: WE MAY NEED TO RESCHEDULE THIS CLASS DUE TO AN UNAVOIDABLE CONFLICT

Week 7 (3/1)
Read: NOT9908; Hamilton, pp. 65-72
Class topic: Residual diagnostics
Note: WE MAY NEED TO RESCHEDULE THIS CLASS DUE TO AN UNAVOIDABLE CONFLICT

Week 8 (3/15)
Class topic: Catch-up

Week 9 (3/22)
Read: NOT9909; Hamilton, pp. 77-82
Class topic: Partial correlation and multiple regression

Week 10 (3/29)
Read: NOT9910; NOT9910B; NOT9910C; Hamilton, pp. 109-133
Class topic: Going back to assumptions: a checklist approach; measures of influence, leverage, and general deviance

Week 11 (4/5)
Read: NOT9911; NOT9914; Hamilton, pp. 53-59, 84-88
Class topic: Dummy variables; interaction terms; test of R-squared increment

Week 12 (4/12)
Read: NOT9915
Class topic: Path analysis

Week 13 (4/19)
Read: Hamilton, Ch. 8; Aldrich and Nelson
Class topic: Logit and probit

Week 14 (4/26)
Read: Hamilton, Ch. 9; Kline
Class topic: Factor/principal components analysis