for "Breaking Away From Broken Windows" (Westview, 2001)



1982 Sample

Neighborhoods served as the primary sampling unit. We randomly sampled 66 neighborhoods from the 277 neighborhoods in Baltimore City, after excluding the downtown, a dozen public housing communities, 39 unorganized areas that were generally small in size, and a half dozen neighborhoods composed extensively or exclusively of garden apartment complexes. For details on neighborhood definition procedures see (%%672; 390%%).

Within each neighborhood we randomly selected census blocks, then random sides within each. A block side, in essence half of a streetblock, was "accepted" if it met our eligibility criteria (see Taylor and Covington 1993 for more details). If it did not, we randomly selected another block side. We continued sampling blocks and block sides until we had obtained eight half streetblocks per neighborhood. If, in the course of the interviewing, we failed to obtain the desired 25 completed interviews per neighborhood after contacting all sampled households on the eight sampled blocks, we randomly sampled additional block sides using the same procedures. We drew an additional 35 blocks for this reason. In total we sampled 562 streetblock sides.

To select households we merged listed households with phones into one list per neighborhood, and interval-sampled households from the list. We set a quota of a minimum of two interviews per block. In the original study we were not treating street blocks as strata.

Eligible respondents were household heads or spouses of heads. When there was more than one eligible respondent we used a random Kish selection procedure after sorting the eligibles by age (%%531%%).
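The selection rule can be sketched as follows; this is a simplification, since a full Kish procedure assigns households pre-printed selection tables rather than drawing on the spot, and the household data shown are hypothetical.

```python
import random

def select_respondent(eligibles, rng=random):
    """Pick one eligible respondent (a household head or spouse of head).

    Simplified stand-in for the Kish procedure described in the text:
    eligibles are sorted by age, then one is drawn at random. A full
    Kish implementation uses pre-assigned selection tables instead of
    drawing on the spot.
    """
    ordered = sorted(eligibles, key=lambda p: p["age"], reverse=True)
    return rng.choice(ordered)

# Hypothetical household with two eligible respondents:
household = [{"name": "head", "age": 52}, {"name": "spouse", "age": 49}]
picked = select_respondent(household, rng=random.Random(0))
```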

The initial contact attempts were completed by phone. Overall, although the completion rates by phone varied considerably by neighborhood, about 87.6% of all the interviews were completed by phone. We confirmed that there were few differences between those contacted by phone vs. in the field (Taylor and Covington 1993: 380). The response rate was 73%.

The resulting sample was 66% female and 37% African-American; median 1981 household income was between $20,000 and $25,000; median education was 12th grade.


1981 Physical Assessment Procedures

Separate from the selection of streetblocks for interviewing purposes, we randomly selected 20% of all streetblocks in each sampled neighborhood for physical assessment. Trained teams of raters assessed on-site social and physical conditions on both sides of each selected streetblock. The features assessed included graffiti, abandoned houses, and other incivilities. For details on the original data collection procedures see Taylor, Shumaker, and Gottfredson (1985). Streetblocks could be selected for assessment even if they contained no occupied residential addresses.

1994 Sample Selection

The Baltimore Department of Planning changed some neighborhood boundaries and names in its 1992 neighborhood statistics. Given our goal of comparing early and later responses, however, we opted to leave each neighborhood's boundary unchanged, so as to increase the comparability between the current and previous data collection efforts.

SRS vs. stratified sampling. Because of financial constraints the current study was limited to no more than thirty neighborhoods. We opted for stratified sampling over SRS in selecting the 30 from the 66. By stratifying, we hoped to maximize variation in crime changes from the early 1980s to the early 1990s. This stratification had to be completed before we had finished the programming that allowed us to generate crime counts per neighborhood. In the analyses reported, we use neighborhood crime rates computed after crimes had been allocated appropriately to neighborhoods.

Crime count data. Using crime counts for Baltimore Police Department Crime Reporting Areas (CRAs) for 1980 through 1992, we summed yearly counts for all CRAs subsumed within or partially contained by our 66 sampled neighborhoods. For each crime we averaged 1981 and 1982 crime counts to get an early-80s average, and 1991 and 1992 counts to get an early-90s average.

Approximating crime change. We constructed a violent crime change index and a property crime change index in the following fashion. For each crime we computed the ratio:


early-90s crime count in neighborhood / early-80s crime count in neighborhood


We then ranked the neighborhoods from lowest to highest change on each crime.


To compute a violent crime rank change, we averaged the ranks for shifts in aggravated assault, rape, robbery, and homicide. For a property crime rank change, we averaged the ranks for shifts in larceny, burglary, and motor vehicle theft. (1)
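The ratio-and-rank computation described above can be sketched as follows; the neighborhood counts and the per-crime ranks in the example are hypothetical.

```python
def change_ratio(early80s, early90s):
    """Ratio of early-90s to early-80s average crime counts (the text's formula)."""
    return early90s / early80s

def rank_lowest_to_highest(values):
    """1-based ranks, lowest change = rank 1. Ties are broken by list order;
    a production version would average tied ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

# Hypothetical counts for three neighborhoods on one crime (say, robbery):
early80s = [40.0, 25.0, 10.0]   # mean of 1981 and 1982 counts
early90s = [44.0, 20.0, 30.0]   # mean of 1991 and 1992 counts
ratios = [change_ratio(a, b) for a, b in zip(early80s, early90s)]
ranks = rank_lowest_to_highest(ratios)   # per-neighborhood robbery rank

# The violent crime change index averages such ranks over four crimes; here,
# hypothetical ranks for assault, rape, robbery, and homicide in one neighborhood:
violent_index = sum([2, 1, 3, 2]) / 4
```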

Changes in violent vs. property crime counts. We trichotomized each of the crime change indexes and cross-tabulated the results. About half the neighborhoods (n=29) fell on the diagonal, shifting by roughly equivalent amounts on both crime change indexes. A small number of neighborhoods (n=8) were in the top third on one crime change index and in the bottom third on the other.
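The trichotomize-and-cross-tabulate step might look like this in outline; the six-neighborhood ranks are hypothetical, and the even three-way split assumes the number of neighborhoods divides by three, as with the 66 studied here.

```python
def trichotomize(ranks):
    """Map 1-based change ranks into thirds: 0=bottom, 1=middle, 2=top.
    Assumes len(ranks) divides evenly by three."""
    n = len(ranks)
    cut = n // 3
    order = sorted(range(n), key=lambda i: ranks[i])
    thirds = [0] * n
    for pos, i in enumerate(order):
        thirds[i] = min(pos // cut, 2)
    return thirds

# Hypothetical change ranks for six neighborhoods on each index:
violent = trichotomize([1, 2, 3, 4, 5, 6])
property_ = trichotomize([2, 1, 6, 3, 4, 5])

# Neighborhoods on the diagonal fall in the same third on both indexes:
on_diagonal = sum(v == p for v, p in zip(violent, property_))
```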

We sampled neighborhoods as follows. First, each had to contain at least three streetblocks with 12 residential addresses with listed phones and a completed 1981 on-block assessment. In addition, we sought to sample twice as many neighborhoods (5 per cell) from each of the corners of the 3 x 3 stratification as from each other cell in the stratification. In the high-low and low-high crime change cells we had fewer eligible neighborhoods than needed. If in a cell we ran out of eligible neighborhoods, we substituted from another cell in the stratification plan. One third of the time we chose the high-high crime change cell, one third of the time we chose the low-low crime change cell, and one third of the time we randomly sampled another cell with unsampled neighborhoods remaining. We reasoned that in drawing replacement neighborhoods we wanted to maintain to some degree the contrasts between the high-high crime change and the low-low crime change neighborhoods.
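The replacement rule in the last paragraph can be sketched as follows; the cell labels and remaining counts are hypothetical, and the sketch does not handle the case where the chosen corner cell is itself empty.

```python
import random

def draw_replacement_cell(remaining, rng):
    """Pick the stratification cell to substitute from when a target cell
    runs out of eligible neighborhoods, following the rule in the text:
    one third of the time high-high, one third low-low, one third a random
    other cell with unsampled neighborhoods remaining."""
    u = rng.random()
    if u < 1 / 3:
        return "high-high"
    if u < 2 / 3:
        return "low-low"
    others = [cell for cell, n in remaining.items()
              if n > 0 and cell not in ("high-high", "low-low")]
    return rng.choice(others)

# Hypothetical counts of unsampled eligible neighborhoods per cell:
remaining = {"high-high": 2, "low-low": 3, "mid-mid": 4, "high-mid": 1}
cell = draw_replacement_cell(remaining, random.Random(7))
```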

The distribution of sampled neighborhoods across the crime change stratification variables was as follows:

[3 x 3 table: columns = rank for change in violent crime (low, medium, high); rows = rank for change in property crime (low, medium, high); cell entries give the number of sampled neighborhoods per cell. Cell counts not recoverable in this copy.]
Comparing sampled and non-sampled neighborhoods. We compared the 30 selected neighborhoods to the 36 non-selected neighborhoods on 1990 percent African-American households and 1990 percent owner-occupied households. Both tests of the difference in proportions proved nonsignificant; on these two parameters, sampled and non-sampled neighborhoods are not distinguishable. On the crime change parameters, not unexpectedly, our sampled neighborhoods scored higher than the non-sampled neighborhoods, since we oversampled high crime change neighborhoods. They also had a larger standard error, which was what we hoped to achieve with the stratification plan.

Block selection procedure. We wanted to keep our block selection criteria as similar as possible to the original selection procedure for surveys in the 1982 study. At the same time we wanted to obtain a substantial number of interviews on each block, and to limit the blocks to those where physical assessments were recorded in 1981, so that later block-level analyses would be possible. Block selection criteria were: a 1981 assessment available, telephone numbers listed in the reverse phone directory, telephone listings not dominated by large apartment buildings (more than six listings per address with different last names), and at least 12 households with phones. In contrast to the 1982 sample, we took addresses from both sides of the streetblock rather than just one, except in the case of neighborhood boundary blocks.

The reasons for moving to both sides of the block rather than one were as follows. (1) As an operational matter, if we restricted ourselves to one side of a hundred block, rather than both sides, we would have excluded a much larger number of blocks from our block selection procedure. The number of blocks that would have dropped out would have varied by neighborhood race and income. For example, in the mid-northwestern section of the city are several predominantly African-American neighborhoods with many blocks dominated by large frame houses. If we had required 12 eligible households simply on one side of the street, we would have excluded a high proportion of blocks in these neighborhoods. (2) By sampling from both sides of the block we make our survey unit correspond to the physical assessment unit, enabling us to more precisely understand the effects of incivilities, and changes in incivilities, on residents, at the block level. (3) In the actual interviewing, it increased chances of meeting our minimum quota on a block.

The only potential disadvantage accruing from selecting both sides of the street vs. one would occur if residents on one side were consistently different, across neighborhoods, from residents on another side. Although possible, we do not think this plausible.

Block changes. As on-site raters traveled to individual blocks, we discovered a small number of blocks (< 5) that did not fit our sampling selection criteria, even though they appeared to from the maps and Stewart's reverse telephone directory. These blocks were dropped and replaced with other blocks in the same neighborhood meeting the original criteria.

Drawing sampled households. As in 1982, telephone listings for the three selected blocks in each neighborhood were merged into a single listing. Duplicate phones, nonresidential phones, and large apartment buildings (more than six phones at an address with different last names) were eliminated. SRS was used to draw two replicate samples, so that on small blocks residents would not be overwhelmed all at once with pre-approach letters and initial contact attempts. We used SRS rather than the systematic sampling used in 1982 because it avoided some "mechanistic" outcomes of sampling in neighborhoods with small blocks (e.g., taking every other house). We over-sampled because it was not possible, in several of the neighborhoods, to "open up" additional blocks for interviewing, since we did not have 1981 physical assessment information for additional blocks meeting our criteria.
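The two-replicate draw might be sketched as follows; the listing format is invented, and the even split into two replicates is an assumption, since the text does not give replicate sizes.

```python
import random

def draw_replicates(eligible_phones, n_total, rng):
    """Draw one simple random sample of n_total listings, then split it
    into two replicate release groups so that residents of a small block
    are not hit with all pre-approach letters at once (sketch of the
    two-replicate design described in the text)."""
    drawn = rng.sample(eligible_phones, n_total)
    half = n_total // 2          # assumed even split between replicates
    return drawn[:half], drawn[half:]

# Hypothetical merged listing for one neighborhood's selected blocks:
listings = ["555-01" + str(i).zfill(2) for i in range(60)]
rep1, rep2 = draw_replicates(listings, 20, random.Random(3))
```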

Block quotas. Although we were not treating streetblocks as strata within neighborhoods, we hoped to conduct block level analyses. In order to make such analyses more viable we set minimum and maximum block quotas. We instructed interviewers to obtain at least four completed interviews per block, and no more than 16.

Additional blocks. The 1,279 sampled addresses, randomly sorted within each block, were transmitted to the survey team. An additional 100 numbers from seven additional blocks, one in each of seven neighborhoods, were sampled and forwarded in October. It was necessary to open up these additional blocks because of low response rates.

In order to ensure that interviews were spread over the length of a large block where we had many numbers, sampled addresses were randomly sorted. Therefore, if interviewers had worked halfway through a list of numbers on a block, they were unlikely to have worked halfway down the geographic block.

Response rate. The total number of sampled addresses was 1,379. We obtained 704 completed interviews, for a response rate of at least 51%. (2)

Interviewed vs. non-interviewed addresses. From the sampled addresses, we had drawn a random subsample of six addresses per block. Photographs were taken of those addresses, and raters used closed-ended rating forms to rate housing conditions and territorial signage, using previously developed scales (%%675%%). These data permit us to contrast sampled-and-interviewed addresses with sampled-but-not-interviewed addresses.

Six conditions were rated by pairs of raters: gardening (intraclass correlation = .92), neatness (.74), ornamentation (.89), real barriers between public and private property (.97), symbolic barriers between public and private property (.94), and overall structural condition of the housing unit (.82). Ratings for each pictured address were averaged across the two raters. Inter-rater reliability, as shown by the intraclass correlations, was quite high. Using a Bonferroni-adjusted alpha level of .008, there were no significant differences between interviewed (n ranged from 183 to 205) and non-interviewed (n ranged from 196 to 242) addresses on amount of ornamentation (t < 1), presence of real barriers (t = -1.72, ns), structural upkeep (t < 1), or presence of symbolic barriers (t = -1.98, p < .05). There were significant differences on amount of gardening (t = -3.90, p < .001) and neatness (t = -2.74, p < .007), with interviewed addresses scoring higher. Thus there appear to be no upkeep or defensible space feature differences between interviewed and non-interviewed addresses, although territorial functioning (%%1%%) may have been slightly stronger at interviewed addresses, as reflected in the gardening and neatness differences.
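The comparisons reported above might be reconstructed along these lines; the rater pairs are hypothetical, and Welch's t is an assumption, since the text does not say which t-test variant was used.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (an assumption;
    the text reports t values without naming the test variant)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Six conditions were tested, so the Bonferroni-adjusted alpha is:
bonferroni_alpha = 0.05 / 6   # approximately .008, as in the text

# Each address's score is the mean of its two raters (hypothetical pairs):
gardening_interviewed = [mean(pair) for pair in [(3, 4), (4, 4), (5, 4)]]
gardening_not = [mean(pair) for pair in [(2, 3), (3, 3), (4, 3)]]
t = welch_t(gardening_not, gardening_interviewed)
```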

Respondent selection was the same in 1994 as in 1982. If there was one household head, he/she was selected. If there were multiple household heads or spouses of heads, they were listed by decreasing age and one was randomly sampled using a Kish procedure.

Interviewing. Interviewing began in early September 1994 and was completed in early November 1994. All interviews were completed by phone. Data were processed using a CATI system.




Predictors and Centering

At Level 1, we always included gender (0 = male, 1 = female), length of residence, and years of education. Length of residence and years of education were always group mean centered, so the length variable reflects how much longer or shorter the resident had lived in the neighborhood compared to his/her neighbors, and the education variable reflects how many more or fewer years of schooling the respondent had than his/her average neighbor. Two models also include married vs. unmarried household. Deleting that variable from the two models in which it appears in the table made no difference; adding it to the four models where it is not shown had no significant impact on the coefficients for the other variables.
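Group mean centering, as applied to length of residence and years of education, can be sketched as follows; the respondent values are hypothetical.

```python
from collections import defaultdict

def group_mean_center(values, groups):
    """Center each respondent's value on his or her neighborhood's mean,
    as described in the text for length of residence and education."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    return [v - means[g] for v, g in zip(values, groups)]

# Hypothetical years of residence for respondents in two neighborhoods:
years = [10, 20, 30, 5, 15]
hood = ["A", "A", "A", "B", "B"]
centered = group_mean_center(years, hood)   # [-10.0, 0.0, 10.0, -5.0, 5.0]
```

A positive centered value means the respondent had lived in the neighborhood longer than his or her average neighbor.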

At Level 2, we had several possible incivilities that could enter: average perceived neighborhood incivilities in 1982, assessed graffiti in the neighborhood in 1981, and counts of vacant, boarded residential housing in 1981. Most of the models shown include only the earlier graffiti measure. Some also include the earlier perceived incivilities measure. Although these three measures did not correlate substantially with one another, their correlations with other structural variables ranged from weak to relatively strong. We tried models for each of the outcomes with different combinations of incivilities. The models shown here were the ones resulting in the highest levels of explained variance. Including additional incivilities usually did not affect the significance or non-significance of other structural predictors.

At Level 1 we successfully entered two separate indices for perceived incivilities, one social, one physical. We were unable to do so at Level 2 because the two indices correlated too strongly with one another.

For all models save dangerous places to avoid, we used the 1980 robbery percentile as our earlier crime indicator. For the avoid outcome, we used the 1980 aggravated assault percentile. We repeated those models substituting robbery for aggravated assault. Results were comparable, except that the explained Level 2 variance was smaller; no shifts in predictor significance were observed.


Estimation Models

For the neighborhood night-time fear outcome, we used standard linear estimation; the outcome was approximately normally distributed. For the other three fear items we had skewed outcomes -- many more people felt safe than unsafe -- and the distributional assumptions most closely approximating these outcomes were those of a Poisson model with over-dispersion. So we used generalized HLM, with that distribution, for those outcomes.

Our "dangerous places to avoid" item was, of course, a binary outcome, so we used GHLM with a Bernoulli distribution.

The moving intention outcome was also not normally distributed, with many respondents scoring in the lowest category, but its distribution was much flatter than those for the daytime fear items and the block night-time fear items. Nonetheless, we repeated the analyses for moving intention using GHLM with a Poisson distribution with over-dispersion. The following differences from the results reported in Table 3 surfaced. In the analysis controlling for demographics, earlier structure, and prior outcome levels, 1980 % owner occupied became slightly less significant (p < .10) than it was in the linear model (p < .05). At Level 1, the coefficient for length of residence became slightly more significant (p < .10 in the linear model, p < .05 in the Poisson model). In the last analysis, adding in incivilities, at Level 2 racial composition remained slightly less significant (p < .10) than it was in the linear model (p < .05). At Level 1, length of residence, nonsignificant in the linear model, was slightly significant (p < .10) in the Poisson model. No differences were observed between the linear and Poisson models in the significance of the remaining Level 2 residual variance.



The version of hierarchical linear modeling (HLM) used for these analyses (4.03) did not permit the use of a weighted sufficient statistics matrix in combination with generalized linear estimation procedures for binary and skewed outcomes. Therefore the analyses here used only unweighted data. It would have been possible to use weighted data for the one outcome using a linear analysis, but that would have made those results non-comparable to the results for the other outcomes. In analyzing neighborhood mean changes on each outcome from 1982 to 1994 we completed both analyses using unweighted data and weighted data correcting both for relative neighborhood population size, and the over-representation of owners in the sample. The weighted and unweighted change analyses yielded essentially similar results.

1. We recognize that the averaging of ranks is, strictly speaking, an inappropriate treatment for ordinal data. Nevertheless, given the small n of crimes in each index (3 or 4), we felt the average might prove less misleading.

We also looked at crime changes in two other ways. We z-scored crime shift scores (early-90s count / early-80s count) and averaged the z-scored shift scores to get a violent crime change and a property crime change. Finally, we averaged early-80s violent crime scores, used them to predict early-90s crime scores, and examined the residuals from that prediction, doing the same for property crimes.

These alternate approaches did not organize the neighborhoods in a way that was markedly different from the change in rank approach used here.
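The z-score variant described in this note might be sketched as follows; the shift scores are hypothetical, and the use of the population standard deviation is an assumption, since the note does not say which denominator was applied.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize shift scores across neighborhoods. The population
    standard deviation is used here; the text does not specify which
    denominator was applied."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical shift scores (early-90s count / early-80s count) for three
# neighborhoods on two violent crimes, averaged into one change index:
robbery_z = z_scores([1.1, 0.8, 3.0])
assault_z = z_scores([0.9, 1.2, 2.1])
violent_change = [mean(pair) for pair in zip(robbery_z, assault_z)]
```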

2. Many of the sampled addresses were not contacted because quotas on the block or neighborhood had already been filled. If we looked just at the fraction of contacted addresses resulting in an interview, the response rate would be higher.