The Effectiveness of Neighborhood Watch

Neighborhood watch (also known as block watch, apartment watch, home watch and community watch) grew out of a movement in the US during the late 1960s that promoted greater involvement of citizens in the prevention of crime. Since then, interest in neighborhood watch has grown considerably and recent estimates suggest that over a quarter of the UK population and over forty per cent of the US population live in areas covered by neighborhood watch schemes. The primary aim of this review is to assess the effectiveness of neighborhood watch in reducing crime.


BACKGROUND FOR THE REVIEW

Introduction
Neighborhood watch (also known as block watch, apartment watch, home watch and community watch) grew out of a movement in the US that promoted greater involvement of citizens in the prevention of crime (Titus, 1984). One of the first recorded neighborhood watch programs in the US was the Seattle Community Crime Prevention Project launched in 1973 (Cirel et al., 1977). One of the first recorded neighborhood watch schemes in the UK was the Home Watch program implemented in 1982 in Cheshire (Anderton, 1985).
Since the 1980s, the number of neighborhood watch schemes in the UK has expanded considerably. The report of the 2000 British Crime Survey estimated that over a quarter (27%) of all households (approximately six million households) in England and Wales were members of a neighborhood watch scheme (Sims, 2001). This amounted to over 155,000 active schemes. A similar expansion has occurred in the US. The report of The 2000 National Crime Prevention Survey (National Crime Prevention Council, 2001) estimated that 41 per cent of the American population lived in communities covered by neighborhood watch. The report concluded, 'This makes Neighborhood Watch the largest single organized crime prevention activity in the nation' (p. 39). Considering such large investments in terms of resources and community involvement, it is reasonable for researchers to ask whether neighborhood watch is effective in reducing crime.

The theory of neighborhood watch
The most frequently recorded mechanism by which neighborhood watch is supposed to reduce crime is by residents looking out for suspicious activities and reporting these to the police. The link between reporting and crime reduction is not usually elaborated in the literature. However, it has been argued that visible surveillance might reduce crime as a result of its effect on the perceptions and decision making of potential offenders. Hence, watching and reporting might deter offenders if they are aware of the propensity of the local residents to report suspicious behavior and if they perceive this as increasing the risks of being caught.
Neighborhood watch might also lead to a reduction in crime through the reduction in opportunities for crime. One method discussed in the literature is through the creation of signs of occupancy. Some of the methods by which members of neighborhood watch schemes might create signs of occupancy were discussed in the report of the Seattle scheme (Cirel et al., 1977). These include removing newspapers and milk from outside neighbors' homes when they are away, mowing the lawn, and filling up trash cans. The way in which signs of occupancy might reduce crime might be through the effect that this has on the perceptions of potential offenders in terms of their likelihood of getting caught.
Neighborhood watch might also lead to a reduction in crime through the various mechanisms of social control. Informal social control is not one of the mechanisms for reducing crime stated in the publicity material of these schemes. Nevertheless, they might indirectly serve to enhance community cohesion and increase the ability of communities to control crime (Greenberg, Rohe, and Williams, 1985). Informal social control can affect community crime through the generation of acceptable norms of behavior and by direct intervention by residents.
It is also possible that neighborhood watch schemes might reduce crime through enhancing police detection. Neighborhood watch might serve to increase the flow of useful information from the public to the police. An increase in information concerning crimes in progress and suspicious persons and events might lead to a greater number of arrests and convictions and result (when a custodial sentence is passed) in a reduction in crime through the incapacitation of local offenders (Bennett, 1990).
It is also feasible that neighborhood watch might reduce crime through the other components of the program package. It has been argued that property marking might lead to a reduction in crime as a result of making the disposal of marked property more difficult (Laycock, 1985). This might reduce offending rates if potential offenders viewed marked property as increasing the risk of detection. Home security surveys might lead to a reduction in crime as a result of making it physically more difficult for an offender to enter the property (Bennett and Wright, 1984).

Program elements
Neighborhood watch is often implemented as part of a comprehensive package. The typical package is sometimes referred to as the 'big three' and includes neighborhood watch, property-marking and home security surveys (Titus, 1984). Some programs include a third or fourth element such as a recruitment drive for special constables, increased regular foot patrols, citizen patrols, educational programs for young people, auxiliary police units, and victim support services.
Neighborhood watch schemes vary in terms of the size of the area covered. Some of the earlier schemes in the US and the UK were based on areas covering just a few households. More recent schemes sometimes cover many thousands of households. One of the smallest schemes included in the review was the 'cocoon' neighborhood watch program in Rochdale in England covering just one dwelling and its immediate neighbors (Forrester, Frenz, O'Connell, and Pease, 1990). One of the largest was the Manhattan Beach neighborhood watch scheme in Los Angeles covering a population of over 30,000 residents.
Neighborhood watch schemes can be both public and police initiated. Schemes launched in the UK during the early period of a program tended to be police initiated (e.g. the early neighborhood watch schemes in the Metropolitan Police District). More recently, neighborhood watch schemes have been launched mainly at the request of the public. Some police departments continue initiating their own schemes, even when the program is fully developed. A program implemented in Detroit, for example, maintained a section of police-initiated schemes in order to promote neighborhood watch in areas that were unlikely to generate public-initiated requests (Turner and Barker, 1983).
In the US, block watches are usually run by a block captain who is responsible to a block co-ordinator or block organizer. The block co-ordinator acts as the liaison person to the local police department. Neighborhood watch schemes in the UK often include street co-ordinators (equivalent to block captains) and area co-ordinators (equivalent to the block organizer).
There is little information in the literature on the number and type of neighborhood watch meetings. The evidence that does exist suggests that some schemes have public meetings that involve all of the residents participating in the scheme, while others have meetings that involve only the organizers of the scheme (Bennett, 1990).
The funding of neighborhood watch schemes is nearly always a joint venture between the local police department and the scheme members through their fund-raising activities. The relative contribution of the two sources varies considerably. Some schemes in the United States are provided with no more than an information package from the local police. Others are provided with police facilities for the production of newsletters and the use of police premises for meetings (Turner and Barker, 1983). Apart from police funding, the majority of schemes are encouraged to raise some funds from other sources such as voluntary contributions, local businesses, and the proceeds of fêtes and raffles.

Crimes targeted
There is a consensus in the literature that the main aim of neighborhood watch is crime prevention. There are small variations among the programs in terms of which crimes are targeted. The vast majority of programs identify residential burglary as the sole or most important target crime of neighborhood watch. Some programs list other offences that it is hoped neighborhood watch will reduce. The list of other offences is sometimes specific (e.g. 'street robberies, auto thefts and vandalism') and sometimes general (e.g. 'street crime' and 'property crime').

Previous reviews
There are several reviews of evaluations of neighborhood watch programs. One of the earliest conducted in the US was by Titus (1984) who summarized the results of nearly forty community crime prevention programs. Most of these included elements of neighborhood watch. The majority of studies were conducted by police departments or included data from police departments. Nearly all of the studies found that neighborhood watch areas were associated with lower levels of crime. However, most of the evaluations were described as 'weak' on the grounds that they offered no comparison group.
Another review of the literature looked mainly at community watch programs in the UK. The study reviewed the results of nine existing evaluations and conducted an original analysis of community watch in six additional locations using police-recorded crime data. The review of existing evaluations concluded that there was little evidence that neighborhood watch prevented crime.
One of the most recent reviews of the literature on the effectiveness of community watch programs selected only evaluations with the strongest research designs. The authors included only studies that used random assignment or studies that monitored both watch areas and similar comparison areas without community watch. The review found just four evaluations that matched these criteria. The results of these evaluations were largely negative. The authors concluded, 'The oldest and best-known community policing program, Neighborhood Watch, is ineffective at preventing crime' (Sherman et al., 1997, p. 353; see also Sherman and Eck, 2002).

Objectives of the review
The primary aim of this review is to assess the effects of neighborhood watch on crime.
The primary objectives of the review are:
1) To operationalize the inputs (e.g. neighborhood watch) and the outcomes (e.g. crime) for the purpose of conducting the review.
2) To identify studies that have evaluated the effect of neighborhood watch on crime.
3) To identify a list of studies that meet the minimum criteria of scientific rigor.
4) To obtain a comparable measure of effect size for the most rigorous studies.
5) To arrive at a conclusion about the effectiveness of neighborhood watch.

Types of intervention
Neighborhood watch is often implemented alongside other programs. In practice, this is done in two main ways:
a) Neighborhood watch schemes often include elements of other programs within the project. Watch schemes are sometimes described as comprising 'the big three' (neighborhood watch, property marking, and security surveys). The additional elements (property marking and security surveys) are viewed as part of neighborhood watch when implemented as part of a package.
b) Neighborhood watch schemes (either single watch schemes or 'the big three') are sometimes implemented alongside other related schemes (such as environmental improvements and neighborhood organizing programs) as part of a comprehensive (multi-project) program.
The following types of intervention will be included in the review:
a) stand-alone neighborhood watch schemes (comprising solely a watch component).
b) neighborhood watch schemes that include 'the big three' (neighborhood watch, property marking and security surveys) as long as there is a watch component.
c) neighborhood watch schemes that include two components of 'the big three' as long as there is a watch component.
In other words, for the purpose of the review we are defining neighborhood watch to mean stand-alone neighborhood watch schemes and neighborhood schemes with additional related elements.

Types of participants
Watch programs can be based on a diversity of populations, including boat owners, farmers, and business employees, and a diversity of locations, including car parks, yacht marinas, and the countryside. The current review is based on schemes involving residents living in neighborhoods.

Types of mediating processes
One of the most important defining elements of neighborhood watch is the mechanism by which the project aims to reduce crime. The main mechanisms of the 'watch' part of neighborhood watch schemes are:
a) residents operating as the 'eyes and ears' of the police (i.e. surveillance)
b) residents reporting suspicious behavior to the police or neighborhood co-ordinator
c) residents interacting and working together to solve problems (which might strengthen social cohesion, collective efficacy, community activism, and other mechanisms of informal social control)
The mechanisms described above rule out neighborhood wardens and similar citizen patrols. Citizen patrols are based (a) on the appointment of residents to a particular role, and (b) on agreement to conduct particular duties such as patrolling the streets. Watch schemes are based solely on residents operating in their capacity as residents.

Types of outcome
The review focuses mainly on the impact of neighborhood watch schemes on crime. The types of crimes covered in the review are those that neighborhood watch might be able to reduce. These include the following:
a) crimes against residents
b) crimes against dwellings
c) other (street) crimes occurring in the watch area
When crime measures are based on police recorded crimes, the main outcome measure is the total number of crimes recorded in the areas studied. When crime measures are based on victimization surveys, the main outcome measure is the prevalence of victimization.

Types of evaluation design
The criteria for selecting rigorous evaluations are based on the Maryland Scientific Methods Scale (SMS) (Sherman and Eck, 2002). This is a five-point scale ranging from level 1 (the weakest design) to level 5 (the strongest design) in terms of overall internal validity. Sherman and Eck (2002) argue that evaluations should be at least level 3 in order to conclude, with a reasonable level of certainty, that the program worked. The current review of evaluations also uses this level as the minimum acceptable for inclusion in the review. This level requires that the evaluation must comprise at least a comparison of one or more experimental units and one or more comparable control units over time. Hence, the minimum requirement for inclusion of evaluations in the review of neighborhood watch is that they are based on both before and after surveys and experimental and comparison areas.

Search strategy for identification of studies
Criteria for selecting studies
The review included published and unpublished literature. It was based on documented evaluations. There was no restriction on country of origin. The evaluations had to be available in English. There was no restriction on source sector (e.g. academic, government, policy, voluntary, etc.). There was no restriction in terms of year (e.g. year of implementation, study, or publication). There was no restriction of the time period covered by the evaluation (e.g. short-term or long-term effects).

Sources used for selecting studies
The following search strategies were used:

Search terms
The following search terms were used in the database searches: neighborhood watch, neighbourhood watch, street watch, block watch, apartment watch, home watch, community watch, home alert, block association, crime alert, block clubs, crime watch, 'big three'.

Description of methods used in primary research
The main types of research design used for evaluating neighborhood watch schemes have been discussed in previous sections. The most common is some kind of quasi-experimental design. The review includes only the strongest of these designs. In practice, quasi-experimental designs were selected only if they included before and after measures in experimental and comparable control areas.

Criteria for determination of independent findings
Evaluations sometimes produce multiple outcome measures. These can occur when (1) there are multiple methods of measuring the same outcome, and (2) the same outcome is measured at multiple points in time.
When multiple outcome measures are provided (e.g. multiple outcome measures of crime) we listed the results for each measure. However, the analysis is based on only one measure. The measure chosen is based on a system for prioritizing the results (i.e. burglary first, followed by all property crimes and then all crimes). When the same outcome is measured at multiple points in time, we have selected the year before and the year after the implementation of the scheme as the first choice. Failing this, we chose other periods in accordance with the above priority system (i.e. periods nearest to the point of implementation were chosen first).
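The prioritization rule for outcome measures can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the procedure the reviewers actually coded; the outcome labels, the function name `select_outcome`, and the example findings are all hypothetical.

```python
# Priority order taken from the text: burglary first, then all property
# crimes, then all crimes. The label strings themselves are hypothetical.
PRIORITY = ["burglary", "property crime", "all crime"]


def select_outcome(findings):
    """Return the (label, result) pair with the highest-priority label.

    `findings` maps an outcome label to whatever result an evaluation
    reported for it. Returns None if no label in PRIORITY is present.
    """
    for label in PRIORITY:
        if label in findings:
            return label, findings[label]
    return None


# Hypothetical evaluation that reports two outcome measures but no
# burglary figure: the property crime finding is the one analyzed.
example = {"all crime": "finding A", "property crime": "finding B"}
chosen = select_outcome(example)
```

All reported measures would still be listed in the results; the function only picks the single measure that enters the analysis.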

Details of coding categories
The information extracted included: author, publication date, study date, location, physical context of the intervention, type of intervention, duration of the intervention, duration of the evaluation, sample size, other interventions employed at the time, outcome measures, data source, research design, results, author(s) conclusion.

Statistical procedure and conventions
Meta-analyses were carried out to determine an overall effect size. Odds ratios were calculated for each evaluation and a weighted mean odds ratio was calculated for all studies combined based on the guidelines summarized in Lipsey and Wilson (2001). See the meta-analysis section for details of the statistical methods used.
Table 1 presents the results of the literature searches. The first section of the table displays the number of publications identified (i.e. the number of 'hits') from the literature searches described above. A total of 1,595 publications were identified from the searches.
The second section of the table shows the number of publications that were provisionally selected for inclusion. Overall, 335 publications were selected as potentially relevant evaluations. Criteria for selecting publications at this stage were based on a review of titles and abstracts. Publications that were clearly not evaluations of neighborhood watch were excluded. The 335 potentially relevant publications included 110 publications that had been identified previously. Hence, 225 unique publications were selected for potential inclusion in the review.
The third section of the table displays the number of selected (non-duplicated) publications that were obtained and not obtained. Of the 225 selected publications, 137 (61%) were obtained. The main reason for not obtaining publications was that they could not be located following various attempts to obtain them by inter-library loan, through the Internet, or by contacting the authors.
The fourth section of the table shows publications eligible for inclusion. Thirty publications were eligible and 107 were ineligible. The main reason for ineligibility (n=60) was that the publication did not include an evaluation of neighborhood watch (see Table A1 for details of all ineligible publications). Eleven of the eligible publications presented results that were included in another eligible publication. In each of these cases, the most detailed publication was selected for inclusion in the review. This resulted in 19 publications that presented findings from 19 unique studies. Some studies included evaluations of more than one neighborhood watch program. In total, these 19 studies covered evaluations of 43 separate neighborhood watch schemes.
The last section of the table shows the number of studies that were included in the meta-analysis. Of the 19 studies, 12 were suitable for inclusion in the meta-analysis on the grounds that they provided sufficient data to conduct the analyses required. In total, these 12 studies covered evaluations of 18 separate neighborhood watch schemes.
Notes: Publication = a published document. Study = a research project. Evaluation = an evaluation of a single neighborhood watch scheme. The results of a research project (study) might be reported in more than one publication. A study might present the results of more than one evaluation.
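The counts reported for Table 1 can be checked against one another with simple arithmetic. The short Python sketch below does only that; the variable names are our own, and every number is taken from the description of the table above.

```python
# Counts as reported in the description of Table 1.
hits = 1595                      # publications identified by the searches
provisionally_selected = 335     # potentially relevant on title/abstract
previously_identified = 110      # duplicates among the 335
obtained = 137                   # selected publications actually obtained
eligible = 30
ineligible = 107
overlapping = 11                 # eligible, but results reported elsewhere

# Unique publications selected for potential inclusion.
unique_selected = provisionally_selected - previously_identified

# Share of the unique selected publications that were obtained (per cent).
obtained_share = round(100 * obtained / unique_selected)

# Unique studies remaining after dropping overlapping publications.
unique_studies = eligible - overlapping

print(unique_selected, obtained_share, unique_studies)
```

The derived values (225 unique publications, 61 per cent obtained, 19 unique studies) agree with the figures stated in the text, and the eligible and ineligible counts sum to the 137 publications obtained.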

Description of studies meeting the eligibility criteria
The searches described above resulted in 19 studies eligible for inclusion in the review covering 43 evaluations of neighborhood watch schemes. Table 2 provides a description of these studies.
The first column provides details about the author and year of publication. The majority of studies (n=12) were published in the 1980s, when interest in evaluating neighborhood watch was at its height. The second column shows that nine studies reported findings about neighborhood watch schemes operating in the UK and eight reported findings about schemes operating in the US. The two remaining studies reported findings from Canada and Australia. The third column shows that findings from 18 of the 43 separate evaluations were included in the meta-analysis. The fourth column indicates that the majority of evaluations were based on neighborhood watch schemes combined with at least one other element (n=30). Eight of these included the 'big three' (i.e. neighborhood watch, property marking and security surveys). The fifth column presents information about the research design. All of the evaluations used a pre-test post-test experimental-control design. The sixth column shows the size of the scheme area (i.e. the number of residents, dwellings, roads or census tracts). The seventh column presents information about the characteristics of the area in which the neighborhood watch scheme was operating (i.e. the experimental area). The last column in the table describes the comparison (or control) area in which neighborhood watch schemes were NOT operating.

RESULTS
Two methods can be used to summarize the results of the selected studies. The first is a narrative review, which presents details of the studies and the results obtained. The findings are presented in the form of the relative percentage change in crime in the experimental and control areas. The review also includes the author(s) conclusion and other textual comments found in the research publication. The second method is a meta-analysis, which involves recalculating the published findings to produce a common effect size across studies. The main advantage of a narrative review is that it is possible to include more studies in the review. The main disadvantage is that it is difficult to obtain an overall finding for all studies combined. The main advantage of a meta-analysis is that a single weighted mean effect size can be calculated for groups of studies or all studies combined. The main disadvantage is that it can only be used when there is sufficient information provided in the original report to conduct the analysis. In the following section we present the findings of both methods.
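To make the idea of a single weighted mean effect size concrete, the following Python sketch computes an odds ratio for each evaluation and combines them with inverse-variance weights on the log scale, in the general spirit of Lipsey and Wilson (2001). Several points are assumptions on our part rather than details given in this section: the orientation of the odds ratio (values above 1 favor the experimental area), the variance formula for the log odds ratio, the fixed-effect weighting, and the crime counts, which are invented purely for illustration.

```python
import math


def odds_ratio(exp_before, exp_after, ctl_before, ctl_after):
    """Odds ratio from before/after crime counts in two areas.

    Assumed orientation: values above 1 mean crime fell more (or rose
    less) in the experimental area than in the control area.
    """
    return (exp_before * ctl_after) / (exp_after * ctl_before)


def log_or_variance(exp_before, exp_after, ctl_before, ctl_after):
    """Approximate variance of the log odds ratio (assumed formula).

    Undefined when any count is zero; such cases would need separate
    handling that is not shown here.
    """
    return 1 / exp_before + 1 / exp_after + 1 / ctl_before + 1 / ctl_after


def weighted_mean_or(evaluations):
    """Inverse-variance weighted mean odds ratio (fixed-effect sketch)."""
    numerator = 0.0
    denominator = 0.0
    for counts in evaluations:
        weight = 1.0 / log_or_variance(*counts)
        numerator += weight * math.log(odds_ratio(*counts))
        denominator += weight
    return math.exp(numerator / denominator)


# Invented counts: (exp_before, exp_after, ctl_before, ctl_after).
illustrative = [(100, 80, 100, 100), (200, 150, 200, 210)]
pooled = weighted_mean_or(illustrative)
```

With these invented counts the two individual odds ratios are 1.25 and 1.4, and the pooled value lies between them, closer to the evaluation with the larger counts because it receives the larger weight.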

Narrative review
The results of the narrative review are presented in Table 3 below. This is followed by short paragraphs describing the methods and findings of each study in turn.
In order to determine the overall effectiveness of neighborhood watch for the narrative review, it was necessary to determine whether or not the program was effective in reducing crime. In the current review, effectiveness was determined by calculating the relative percentage change of the experimental and comparison area over time. Studies were excluded from the analysis if they did not provide the data that would enable the percentage change to be calculated (e.g. if the results were presented in graphical form only). In total, 24 of the 43 evaluations presented the necessary data. If the experimental area outperformed the comparison area (i.e. crime decreased by more or increased by less), the program was deemed to have a positive effect on crime. If the comparison area outperformed the experimental area, the program was deemed to have a negative effect.
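The classification rule can be illustrated with a small Python sketch. It uses the burglary counts reported for Henig (1984) later in this section (4 to 0 in the participating blocks, 2745 to 1778 in the police district as a whole). The function names and the "no difference" label for an exact tie are our own assumptions; the review does not say how ties were treated.

```python
def percent_change(before, after):
    """Percentage change in crime from the before period to the after period."""
    return 100.0 * (after - before) / before


def classify_effect(exp_before, exp_after, cmp_before, cmp_after):
    """Classify an evaluation by relative percentage change.

    'positive' means the experimental area outperformed the comparison
    area (crime decreased by more or increased by less).
    """
    exp_change = percent_change(exp_before, exp_after)
    cmp_change = percent_change(cmp_before, cmp_after)
    if exp_change < cmp_change:
        return "positive"
    if exp_change > cmp_change:
        return "negative"
    return "no difference"  # assumed label; ties are not discussed in the text


# Burglary counts reported for Henig (1984).
henig_exp = percent_change(4, 0)
henig_cmp = percent_change(2745, 1778)
henig_effect = classify_effect(4, 0, 2745, 1778)
```

For these counts the experimental area shows a 100 per cent decrease against roughly 35 per cent in the comparison area, so the evaluation is classed as showing a positive effect.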
In those cases where there was more than one eligible finding presented within an evaluation, one finding was selected using the system of prioritization outlined above (i.e. findings relating to residential burglary were selected as first priority, followed by findings for property crime and findings for all crimes).
The results of the narrative review show that 19 of the 24 evaluations included in the analysis found that neighborhood watch was associated with a reduction in crime based on relative percentage change (as described above). Conversely, five evaluations found that neighborhood watch was associated with an increase in crime. Technically, it is possible that neighborhood watch could have caused the increase in crime. This could be because watch programs attract offenders as they might suggest that there is something worth stealing in the area. However, it is also possible that watch programs increase recorded crime as a result of increases in reporting rates among residents. Overall, the majority of studies included in the narrative review show that neighborhood watch is associated with a reduction in crime.

Anderton (1985)
Anderton (1985) conducted an evaluation of a 'Home Watch' scheme in Northwich in Cheshire. This was one of the first evaluations of neighbourhood watch in the UK. The study was based on a comparison of police-recorded crimes measured 18 months before and 30 months after the launch of the scheme. The crime rates for Northwich were compared with the crime rates for Cheshire as a whole. The results showed that the number of burglaries in Northwich decreased by 10 per cent, compared with an increase of three per cent across the county as a whole. Anderton (1985) concluded that, 'It appears from the experience in Cheshire so far that Home Watch is one of the most effective, efficient and successful crime prevention initiatives ever undertaken' (p.53).

Bennett (1990)
Bennett (1990) evaluated the effectiveness of neighbourhood watch schemes in two areas of London (Wimbledon and Acton). The evaluations were based on crime and public attitude surveys in the two areas before the schemes were implemented and again one year after their implementation. Similar surveys were conducted in matched comparison areas some distance from the experimental areas. In Wimbledon, crime decreased by a greater amount in the control area than in the experimental area (28 per cent compared with 22 per cent). In Acton, crime increased by 37 per cent in the experimental area and decreased by 28 per cent in the control area. The author concluded that the findings were 'not encouraging' (p.110). Overall, the results suggested that residents in the neighbourhood watch areas experienced either no better or worse rates of victimization than in the comparison areas.

Bennett and Lavrakas (1989)
Bennett and Lavrakas (1989) investigated the effectiveness of neighbourhood watch schemes in 10 US cities (Baltimore, Boston, Bronx, Brooklyn, Cleveland, Miami, Minneapolis, Newark, Philadelphia and Washington). The research was based on a pretest-posttest design with non-equivalent control groups. The comparison areas were selected by drawing a 'ring' around the experimental area approximately two census tracts wide. Monthly crime statistics revealed no differences between the experimental and control areas in seven of the ten evaluations and a negative differential change (where crime decreased less in the experimental area than in the comparison area) in two of the cities. Only one area showed a positive differential change (where the experimental area experienced a larger decrease in crime than the control). The authors concluded that the programs 'did not seem to achieve the 'ultimate' goal of crime reduction' (p.361).

Cirel et al. (1977)
Cirel et al. (1977) conducted one of the first evaluations of the effectiveness of neighbourhood watch in the United States. The evaluation, based in Seattle, Washington, included telephone and door-to-door surveys of residents one year before the launch of the scheme and one year after. Two census tracts adjacent to the neighbourhood watch area were used as a comparison. The results showed that the rate of burglary decreased by a substantially greater amount in the experimental areas than in the control areas (61 per cent compared with 4 per cent). The authors concluded that participating in community crime prevention 'significantly reduces the risk of residential burglary victimization' (p.79).

Forrester, Chatterton and Pease (1988)
Forrester, Chatterton and Pease (1988) evaluated a burglary prevention project in Kirkholt, an area of public housing near Rochdale (a town 10 miles north of Manchester) in the UK. A package of measures was introduced as part of the project, including 'cocoon' neighbourhood watch. The evaluation was based on the analysis of pre- and post-test police-recorded crime rates in the experimental area (Kirkholt) which were compared with crime rates in the remainder of the police sub-division. The results showed that domestic burglaries decreased by 38 per cent in the experimental area compared with one per cent in the remainder of the sub-division. The authors concluded that there had been a 'large absolute and proportionate reduction in domestic burglary during the initiative' (p. 19).

Henig (1984)
Henig (1984) conducted an evaluation of neighbourhood watch in a police district in Washington, DC. The impact of block watch on crime was assessed by examining the levels of police-recorded crime in the sample of participating blocks in the year before and after the scheme had been launched. This was compared with crime rates for the police district as a whole and for the city as a whole. The results showed that over the evaluation period the level of burglary decreased by 100 per cent (from 4 to 0 burglaries) in the sample area and by 35 per cent (from 2745 to 1778 burglaries) in the police district as a whole. The author concluded that neighbourhood watch was associated with a reduction in burglary among participating blocks.

Hulin (1979)
Hulin (1979) evaluated the effectiveness of a neighbourhood watch scheme in a high crime area of Fontana, California. Using police-recorded crime data for the year before and the year after the scheme, the author compared changes in residential burglary rates in Fontana with changes in burglary in four demographically similar control areas with similar pre-test crime rates. The results showed a decrease in residential burglary of more than 25 per cent in the experimental area compared with increases ranging from 10 to 25 per cent in each of the control areas. The author concluded that the results were 'positive' and indicated that neighbourhood watch was 'an effective crime prevention instrument' (p.30).

Husain (1990)
Husain (1990) evaluated the effectiveness of neighborhood watch schemes in six UK cities (Birmingham, Brighton, Burnley, Manchester, Preston and Sutton Coldfield). In each evaluation, police-recorded crime data were used to compare crime rates before and after implementation of neighborhood watch in the experimental areas. These changes were then compared with changes in crime rates in six control areas. The findings were presented mainly in graphical form showing relative percentage change in crime in the program areas.
In three of the six areas, the implementation of neighborhood watch was accompanied by an improvement in the crime situation. In the other three areas, there were no improvements. The author concluded '...in three of the six areas the introduction of NW has been accompanied by some improvement in the crime situation' (p.66), while '...the results from three other areas are less convincing' (p.67).

Jenkins and Latimer (1986)
Jenkins and Latimer (1986) conducted evaluations of neighbourhood watch schemes in four areas of Merseyside in the UK. Each of the four evaluations examined the number of crimes recorded by the police in the year before and the year after the scheme had been implemented. In three of the four areas, the experimental area experienced larger decreases in the number of burglaries than in the sub-division as a whole. In the fourth area (Burford Avenue), burglary increased by more than 1,000 per cent (from 1 to 12). The authors concluded that there is, 'an indication that Homewatch is having an effect, certainly initially, in reducing the instances of burglary within an area and to a lesser extent the total crime' (p.12). However, they warned that results of the Burford Avenue scheme 'should not be ignored and indicate that Homewatch is not a panacea for reducing crime' (p.12).

Knowles, Lesser and McKewen (1983)
Knowles, Lesser and McKewen (1983) evaluated the effectiveness of a neighbourhood watch programme in a residential suburb on the western boundary of Los Angeles County in the USA. The evaluation examined changes in the rate of police-recorded residential burglaries in the 12 months before and the 12 months after the programme had been implemented. These were compared with burglary rates in a comparison area (comprising eight neighbouring jurisdictions). The results showed a decrease in burglary of 28 per cent in the experimental area, compared with an increase of 13 per cent in the comparison area. The authors explained that the atmosphere of co-operation fostered by the programme 'provided for the achievement of a common goal - crime control' (p.38).

Latessa and Travis (1987)
Latessa and Travis (1987) conducted an evaluation of a block watch programme implemented in the College Hill area of Cincinnati in the USA. College Hill is described by the authors as the fifth largest community in the city with a population of over 17,000 residents. Using police-recorded crime data, burglary rates in College Hill in the year before and after the scheme were compared with burglary rates in the city of Cincinnati as a whole. The figures show that burglary in the experimental area decreased by 11 per cent, while burglary in Cincinnati as a whole decreased by two per cent. The authors concluded that College Hill experienced a decrease in the amount of recorded crime during the course of the programme.

Lewis, Grant and Rosenbaum (1988)
In another US study, Lewis, Grant and Rosenbaum (1988) evaluated the effectiveness of five block watch schemes in Chicago, Illinois. Crime and public attitude surveys were conducted in the experimental and matched control areas before the launch of the schemes and again one year after the launch. Only one of the five experimental areas experienced a reduction in victimizations. Two of the experimental areas, however, showed a statistically significant increase in victimizations per respondent. The authors concluded in their original report that the results 'force us to seriously address the possibility of both theory failure and program failure in this field' (Rosenbaum, Lewis and Grant 1985, p.170).

Lowman (1983)
Lowman (1983) investigated the effectiveness of neighbourhood watch in a residential district of Vancouver, Canada. The evaluation was based on a comparison of crime rates in an experimental area (the neighbourhood watch pilot project area) and three control areas in which neighbourhood watch had not been implemented. The results showed that the number of burglaries decreased by 33 per cent in the experimental area with no change in the comparison areas. The author concluded that the reduction in the experimental area 'may be indicative of a deterrent effect of the program' (p.295).
A further study evaluated a neighbourhood watch scheme in the New Parks area of Leicester in the UK. Police-recorded crime data were used to determine changes in crime rates in the experimental area in the 12 months before and after the launch of the scheme. Comparable data were obtained for seven nearby control areas. The results showed that the number of burglaries decreased in the experimental area and increased in the control areas. However, in the following year the rate of burglary increased. The authors explained that the reduction in burglary was 'welcome' but somewhat 'short-lived' (p.67).

Matthews and Trickey (1994)
In a second evaluation of neighbourhood watch in Leicester, Matthews and Trickey (1994) evaluated the effectiveness of a neighbourhood watch scheme on the Eyres Monsell housing estate. Police data were used to examine changes in the number of burglaries in the year before the launch of the scheme and in the year following implementation. Data were also collected for four other housing estates in the area close to the Eyres Monsell estate. Over the study period, the number of burglaries on the Eyres Monsell estate increased by 24 per cent. The number of burglaries on the Saffron Lane estate (the estate with the most similar pre-test burglary rate) also increased over the study period, but the increase was approximately half that of the experimental area (12 per cent). The authors concluded that the outcome of the project as a whole was positive, although 'not particularly remarkable' (p.50). However, the rapid increase in the number of burglaries in 1994 was 'a cause of considerable concern' (p.50).

Mukherjee and Wilson (1988)
Mukherjee and Wilson (1988) evaluated the effectiveness of neighborhood watch in Australia, focusing on the state of Victoria. Using police data, the authors compared changes in crime rates in areas with high levels of neighborhood watch with areas that had medium levels, low levels, or no neighborhood watch. The findings were presented in terms of whether neighborhood watch had a 'good', 'average' or 'poor' effect on reducing crime over the two-year evaluation period. The results showed that police divisions with high levels of neighborhood watch showed greater reductions in residential burglary than those with low levels or no neighborhood watch. The authors concluded that their findings 'lend very reasonable support to the objective of neighborhood watch in suppressing residential burglary' (p.5).

Research and Forecasts Incorporated (1983)
In a US study, Research and Forecasts Incorporated (1983) conducted an evaluation of neighbourhood watch in a residential area of Detroit, Michigan. The study used police data to compare changes in crime rates in a 155-block experimental area (Crary-St Mary's) with changes in a matched control area four miles away. In both neighbourhoods, crime rates for the 12-month periods before and after implementation of neighbourhood watch were examined. The results showed that burglary rates decreased by a substantially greater amount in the experimental area than in the control area (48 per cent compared with 4 per cent). The authors explained that reported crime statistics showed 'a substantial reduction in Crary-St Mary's that is not matched by the statistics for the control neighbourhood' (p.34).

Tilley and Webb (1994)
Tilley and Webb (1994) present findings from 11 evaluations of individual burglary reduction schemes implemented as part of the Safer Cities Program in the UK. Three of the 11 evaluations (in Birmingham-Primrose estate, Rochdale-Belfield estate and Rochdale-Back O'Th'Moss estate) met the eligibility criteria for inclusion in this review. Each evaluation employed a pretest-posttest research design and compared crime rates in the experimental area with crime rates in either the remainder of the sub-division or in the city as a whole. In all three evaluations, the experimental area outperformed the control area. In the two Rochdale evaluations, the control area experienced an increase in crime, while the experimental area experienced a decrease or remained stable. In Birmingham, both the experimental and control areas experienced a decrease in crime, but the decrease was greater in the experimental area (41 per cent compared with 11 per cent). The authors described the schemes as a, 'great success' in terms of 'reducing crime and as an example of crime prevention work' (p.4).

Veater (1984)
In an early UK study, Veater (1984) evaluated a neighbourhood watch scheme in Kingsdown, Bristol. The evaluation was based on pre-test and post-test victim and public attitude surveys conducted in the scheme area. Police-recorded crime data were also used to compare crime rates in the scheme area with those in an adjacent area. The findings showed that crime decreased by 25 per cent in the experimental area, but increased by 31 per cent in the comparison area. The author noted that the increase might be a result of crime displacement. He concluded that 'the neighbourhood watch concept has potential if adequate resources are made available ...' (p.5).

Meta-analysis
In order to carry out a meta-analysis of the effects of neighborhood watch, a comparable effect size measure is needed for each evaluation, together with its variance (see Lipsey and Wilson, 2001). All evaluations included in the review employed the same research design (pre-test and post-test measures for experimental and control groups). The majority (n=15) of evaluations used police-recorded data to provide an outcome measure of crime. The remainder (n=3) used self-report victimization surveys. The two types of data require different methods to obtain an odds ratio (OR). These methods are described below.
The outcome measure in each study was the number of crimes (i.e. burglaries, property crimes, or all crimes, in that order) recorded by the police (police-recorded crime data) or the number of people victimized (survey data). There were no evaluations included in the review that provided sufficient information (i.e. standard deviations) to allow ORs to be calculated from mean offending rates. Hence, the meta-analysis is based on ORs derived solely from frequencies or proportions.

Police-recorded crime data
The best measure of effect size for findings based on crimes and victimization is the OR. In practice, the differences in the levels of crime in experimental and control areas recorded in police crime data are not strictly speaking ORs as they are based on events rather than proportions of people experiencing the event. However, for simplicity and clarity the term has been used throughout to describe differences found in both police data and survey data.
The OR is calculated as shown in the following table.
              Before intervention   After intervention
Experimental  a                     b
Control       c                     d
where a, b, c, d are numbers of crimes
OR = (a*d)/(b*c)
The null, or no effect, value of the OR is 1.0. To the extent that the OR exceeds 1.0, it might be concluded that the intervention (i.e. neighborhood watch) was possibly beneficial. To the extent that the OR falls below 1.0, it might be concluded that the intervention was possibly harmful. It is technically possible that some schemes might serve to increase the number of recorded crimes (e.g. it has sometimes been argued that increased surveillance will lead to an increase in the number of crimes reported to the police).
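The calculation can be illustrated with a minimal sketch. The function name and the crime counts below are hypothetical and chosen only for illustration; they are not taken from any of the evaluations reviewed. Note that the OR is undefined when b or c is zero (as would happen, for example, where burglaries in the experimental area fall to zero after the intervention).

```python
def odds_ratio(a, b, c, d):
    """Return OR = (a*d)/(b*c) for a before/after, experimental/control table.

    a, b: crimes in the experimental area before and after the intervention
    c, d: crimes in the control area before and after the intervention
    """
    if b == 0 or c == 0:
        raise ValueError("OR is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts: the experimental area falls from 200 to 150 crimes,
# the control area falls from 1000 to 950 crimes.
or_value = odds_ratio(200, 150, 1000, 950)

# An OR above 1.0 suggests a possibly beneficial intervention,
# an OR below 1.0 a possibly harmful one.
direction = "possibly beneficial" if or_value > 1.0 else "no effect or possibly harmful"
print(f"OR = {or_value:.2f} ({direction})")
```

With these illustrative counts the experimental area shows the larger proportionate fall, so the OR exceeds 1.0.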
The variance of the OR is calculated from its natural logarithm (LOR).
VAR (LOR) = 1/a + 1/b + 1/c + 1/d
In order to produce a summary effect size in a meta-analysis, each effect size (here, LOR) is weighted by the inverse of its variance (1/V). This estimate of the variance is based on the assumption that total numbers of crimes (a, b, c, d) have a Poisson distribution. If the number of crimes has a Poisson distribution, its variance should be the same as its mean. However, the large number of changing extraneous factors may cause over-dispersion; that is, the variance of the number of crimes (VAR) may exceed the number of crimes (N). The analysis was therefore adjusted to deal with the problem of possible 'over-dispersion' (i.e. greater than expected variance). Hence, the standard formula for V(LOR) was multiplied by an overdispersion factor D, where D = VAR/N.
Farrington, Gill, Waples, and Argomaniz (2007) estimated VAR from monthly numbers of crimes and found the following equation:
D = .0008*N + 1.2
D increased linearly with N and was correlated (.77) with N. The median number of crimes in their study was 760, suggesting that the median value of D was about 2. However, Farrington et al. (2007) argued that this is an overestimate because the monthly variance is inflated by seasonal variations, which do not apply to N and VAR. Nevertheless, in order to obtain a conservative estimate, V(LOR), calculated from the usual formula above, was doubled in all cases involving police-recorded crime data. This adjustment corrects for overdispersion within studies, not for heterogeneity between studies. In order to test the effects of assuming different over-dispersion factors, the value of V(LOR) was also trebled (rather than doubled) and the results were recalculated. The results showed that there was no change in OR for the fixed effects method and only a very small change in OR for the random effects method. There was also either no change or a very small change in the confidence intervals across the methods.
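The variance adjustment can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the review: the function names and crime counts are hypothetical, and the 95 per cent confidence interval uses the conventional normal approximation (z = 1.96), which the text does not specify.

```python
import math


def var_lor(a, b, c, d):
    """Unadjusted variance of the log odds ratio: 1/a + 1/b + 1/c + 1/d."""
    return 1 / a + 1 / b + 1 / c + 1 / d


def overdispersion_factor(n):
    """D = .0008*N + 1.2, the equation reported by Farrington et al. (2007)."""
    return 0.0008 * n + 1.2


def adjusted_var_lor(a, b, c, d, factor=2.0):
    """Variance of LOR multiplied by a fixed factor (2 = doubled, 3 = trebled)."""
    return factor * var_lor(a, b, c, d)


def or_confidence_interval(a, b, c, d, factor=2.0, z=1.96):
    """Approximate CI for the OR (assumption: normal approximation on the log scale)."""
    lor = math.log((a * d) / (b * c))
    se = math.sqrt(adjusted_var_lor(a, b, c, d, factor))
    return math.exp(lor - z * se), math.exp(lor + z * se)


# D at the median number of crimes (760) quoted in the text.
d_median = overdispersion_factor(760)

# Hypothetical counts, used only to show the effect of doubling versus trebling.
counts = (200, 150, 1000, 950)
ci_doubled = or_confidence_interval(*counts, factor=2.0)
ci_trebled = or_confidence_interval(*counts, factor=3.0)
print(f"D = {d_median:.3f}; doubled CI = {ci_doubled}; trebled CI = {ci_trebled}")
```

Multiplying the variance by a constant leaves each study's OR unchanged and only widens its confidence interval, which is why the choice between doubling and trebling matters mainly for the intervals and weights.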

[Table: rows 'Crime' and 'No crime'; columns 'Before intervention' and 'After intervention']
The variance of LOR is calculated using the following formula:
VAR (LOR) = 1/a1 + 1/b1 + 1/c1 + 1/d1 + 1/a2 + 1/b2 + 1/c2 + 1/d2
This method is based on comparing before and after ORs. This was considered preferable to comparing after ORs only, as the latter would not control for pre-existing differences between the experimental and control areas.
Table 4 summarizes the 18 evaluations included in the meta-analysis. The table shows that 15 evaluations had an OR greater than one and three had an OR less than one. Hence, in the majority of evaluations, neighborhood watch was associated with a reduction in crime. Four of the 15 evaluations with an OR of greater than one were statistically significant (Research and Forecasts Inc. 1983, Anderton 1985, Veater 1984 and Forrester et al. 1988). This can be seen graphically in the forest plot shown in Figure 1. The graph shows a clear pattern of small positive effects.

Mean effect sizes
An important aim of a meta-analysis is to calculate a weighted mean effect size (here, the OR).
There are two ways of calculating the weighted mean effect size. In the case of the fixed effects (FE) method, each effect size is weighted by the inverse of its variance (1/VAR), so that studies based on larger samples are given greater weight than those based on smaller samples. The FE method is based on the assumption that the effect sizes are homogeneous in the sense that they are all drawn from a random distribution of effect sizes about some mean. However, the effect sizes might violate this assumption and be significantly heterogeneous. One method of addressing the problem of heterogeneity is to use the random effects (RE) method. The RE method allows for heterogeneity by adding a constant to the variance of each effect size (for the formula, see Lipsey and Wilson, 2001, p.119). We should emphasize that our estimates of the variance of ORs, while the best available at present, are not exact figures and may be slightly inaccurate. Therefore, there may be some inaccuracy in the weightings used in our meta-analyses. The main consequence of this is that the confidence intervals around the weighted mean effect sizes may be slightly inaccurate.
Notes: An odds ratio of 1.19 means that crime increased by 19% in the control area compared with the experimental area or decreased by 16% in the experimental area compared with the control area (1/OR). An odds ratio of 1.36 means that crime increased by 36% in the control area compared with the experimental area or decreased by 26% in the experimental area compared with the control area.
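The two weighting schemes, and the 1/OR interpretation given in the notes, can be sketched as follows. This is a minimal illustration, not the review's own computation: the RE constant is estimated here with the method-of-moments (DerSimonian-Laird) estimator, which is an assumption on our part, and the function names and the four effect sizes are hypothetical.

```python
import math


def fixed_effects(lors, variances):
    """Inverse-variance weighted mean LOR and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, lors)) / total
    return mean, 1.0 / total


def random_effects(lors, variances):
    """Weighted mean LOR after adding a between-study constant (tau2) to each variance.

    Assumption: tau2 is the method-of-moments estimate, truncated at zero.
    """
    if len(lors) < 2:
        raise ValueError("at least two effect sizes are needed")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fe_mean, _ = fixed_effects(lors, variances)
    q = sum(w * (y - fe_mean) ** 2 for w, y in zip(weights, lors))  # heterogeneity Q
    c = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - (len(lors) - 1)) / c)
    mean, var = fixed_effects(lors, [v + tau2 for v in variances])
    return mean, var, tau2


def relative_reduction(or_value):
    """Proportionate decrease in the experimental area relative to control: 1 - 1/OR."""
    return 1.0 - 1.0 / or_value


# Hypothetical effect sizes (ORs) and LOR variances for four studies.
lors = [math.log(x) for x in (1.1, 1.5, 0.9, 2.0)]
variances = [0.02, 0.05, 0.04, 0.03]

fe_mean, fe_var = fixed_effects(lors, variances)
re_mean, re_var, tau2 = random_effects(lors, variances)
print(f"FE OR = {math.exp(fe_mean):.2f}, RE OR = {math.exp(re_mean):.2f}, tau2 = {tau2:.3f}")
print(f"OR 1.19 -> {relative_reduction(1.19):.0%}; OR 1.36 -> {relative_reduction(1.36):.0%}")
```

Because the added constant is the same for every study, the RE weights are more nearly equal than the FE weights, so smaller studies count for relatively more and the pooled confidence interval is at least as wide.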

Moderator analyses
Overall, the meta-analysis has shown, using both the FE and RE models, that neighborhood watch was associated with a significant reduction in crime. However, it is possible that the results will vary by specific characteristics of the program being implemented or by the research design of the evaluation. The results of the moderator analyses which investigate this are presented in Table 5 below.

Type of comparison area
It is possible that there are differences in results depending on whether studies used nonequivalent or equivalent comparison areas. It could be argued that research based on non-matched areas is more likely to produce a positive result due to regression to the mean in the experimental area (which might have been selected at a time when crime was unusually high and likely to fall) but not in the comparison area (which might have been selected at a low point in its crime cycle and likely to rise). In order to test for this, the studies included in the meta-analysis were split into two groups based on the nature of the comparison area (i.e. whether it was 'matched' or 'not matched'). The meta-analysis was then repeated. The results showed that the difference between these ORs was statistically significant, with studies based on matched areas showing larger effect sizes than those based on unmatched areas. This finding is counter to the effect hypothesized above. One possible reason for this result is that the 'not matched' comparison area comprised the broader police area and sometimes included the experimental area. Under these circumstances it is conceivable that any movements in the broader area will be reflected in the experimental area, which would result in no apparent neighborhood watch effect. The differences could also be explained by other factors or unmeasured differences between the groups.

Type of data
It is also possible that the measured effectiveness of neighborhood watch varies with the type of data collected. It was argued earlier that the method of calculating ORs was slightly different using the police and survey data and that this might result in different findings. The data also differ in that survey data include offences not reported to the police. In order to test for this, the 15 evaluations that collected police data were compared with the three evaluations that collected data from self-report surveys. The results showed that the difference between the two ORs was not statistically significant. Hence, the effectiveness of neighborhood watch programs did not vary by the type of data collected. This provides a justification, therefore, for combining police and survey data in the overall analysis.

Type of scheme
It might be expected that NW schemes based on limited versions of the program would be less likely to show an effect than schemes based on more comprehensive versions. In order to test for this, the studies were split into two groups based on program type (i.e. whether it was NW alone or NW with one or more additional elements of the 'big three'). The results showed that the difference between the mean ORs was not significant. Hence, the type of program did not independently affect outcome.

Size of scheme area
It could be argued that larger schemes might be more effective than smaller schemes on the grounds that a greater number of neighbors are looking out for suspicious behavior. It could also be argued that smaller schemes might be more effective than larger schemes as the interaction between neighbors who know each other well might be more concentrated. Overall, there was no statistical difference in the ORs of larger and smaller schemes.

Year of publication
Early schemes might be more effective than later schemes on the grounds that motivation and interest in the program were highest at its inception. It is also possible that the reverse might be the case, with motivation and expertise increasing over time. The results show that there was no significant difference in the outcomes of earlier compared with later schemes.

Publication status
Another possible variation in results might relate to publication status. It has been hypothesized that publishers are more likely to publish evidence of success than evidence of no effect or failure. This is sometimes referred to as 'publication bias'. In order to test for this, evaluations were identified as published or unpublished. Research was defined as published if it was reported in a book, journal or official government report, as these were likely to have been externally reviewed before distribution. Evaluations were defined as unpublished if they were police reports or reports from survey research companies, as these were less likely to have been externally reviewed before distribution. The mean OR was then calculated for each group. The results showed that the difference between the mean ORs was statistically significant. In other words, the results support the publication bias thesis by showing that published evaluations tended to provide evidence of a stronger neighborhood watch effect than unpublished evaluations.

Country
Finally, it is possible that schemes operating in different countries have different effects as a result of a variety of factors including the environmental context, the nature of the program implemented or the methods of evaluation. The mean OR for studies conducted in the USA and Canada was 1.87 (n=4) compared with 1.18 for the UK (n=14). The difference between the ORs was statistically significant (p<0.05). In other words, evaluations of neighborhood watch conducted in the US and Canada were significantly more likely to show a reduction in crime than studies conducted in the UK. It is difficult to explain such variations because of the large number of factors that could potentially affect outcomes. The main measurable difference between the comparison countries was that there were proportionately more matched studies in the US and Canada (3 of 4) than in the UK (5 of 14). It was shown earlier that matched studies more frequently showed a favorable outcome than non-matched studies. However, there are many other plausible explanations for the difference.
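The review does not state which test was used to compare subgroup means. One common approach, sketched here as an assumption, is a z-test on the difference between the two mean log odds ratios. The mean ORs (1.87 and 1.18) are taken from the text above, but the standard errors are hypothetical placeholders, so the output illustrates the calculation only and does not reproduce the reported result.

```python
import math


def subgroup_difference_z(or_1, se_1, or_2, se_2):
    """z-test for the difference between two independent mean ORs.

    or_1, or_2: subgroup mean odds ratios
    se_1, se_2: standard errors of the corresponding mean LORs
    Returns (z, two-tailed p-value under a normal approximation).
    """
    diff = math.log(or_1) - math.log(or_2)
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return z, p


# Mean ORs from the text; the standard errors (0.15, 0.08) are HYPOTHETICAL.
z, p = subgroup_difference_z(1.87, 0.15, 1.18, 0.08)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The same function could be applied to any of the two-group moderator comparisons (for example matched versus non-matched areas), given the mean LOR and its standard error for each group.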

CONCLUSION
The results of previous systematic reviews of neighborhood watch presented in the introduction were divided in terms of the conclusions drawn. Titus (1984) concluded that neighborhood watch was effective, but noted that the research methods used to investigate this were weak. Another review concluded that there was little evidence that neighborhood watch worked. Sherman and Eck (2002) concluded that neighborhood watch was ineffective in reducing crime.
The main findings of our narrative review were that just over half of the schemes evaluated (19) showed that neighborhood watch was effective in reducing crime, while only six yielded negative effects. The main finding of the meta-analysis was that neighborhood watch was associated with a relative reduction in crime of between 16 and 26 per cent. The generally positive findings of the narrative review are consistent with the favorable effect found in the meta-analysis. Hence, the dominant finding of our review, using both methods, is that neighborhood watch is effective in reducing crime.
However, the limitations of both the methods used in the original studies and in the narrative review and meta-analysis need to be taken into account. A particular problem with the original studies is that the experimental and comparison areas were rarely wholly equivalent and sometimes not equivalent at all. The narrative review is limited in that it is based on simple counts of changes in reported crime that provide only a simplified measure of effectiveness. The main problem with the meta-analysis is that it is restricted to a not necessarily representative subset of all studies. Also, the estimates of the variance of ORs might be slightly inaccurate.
One notable problem is that the studies used in the narrative review and the meta-analysis were different. In order to examine the effect of this on outcome, we split the studies included in the narrative review into two groups. Studies included in the meta-analysis were more likely to show a positive effect (78% positive) than those excluded (28% positive). If all studies in the narrative review had been used in the meta-analysis, the result would have been less positive. Because the meta-analysis provides the stronger overall finding, it would then have been more difficult to conclude that neighborhood watch was effective.
It is also not immediately clear from the research evidence why neighborhood watch is associated with a reduction in crime. According to theory (see above) neighborhood watch might be effective in increasing surveillance, reducing opportunities and enhancing informal social control. Unfortunately, very few studies provide information on the mechanisms by which neighborhood watch might reduce crime. It is therefore difficult to determine from current research how neighborhood watch works.

Research implications
There are a number of implications that can be drawn from the review for future research on the effectiveness of neighborhood watch.
First, the review has drawn attention to the common problem of a relatively small number of good-quality studies in terms of research design. Among the 27 studies that were excluded on grounds of methodological quality, 19 had no comparison group and 8 presented only posttest data on crime.
Second, coupled with this, it is unclear why evaluations of neighborhood watch stopped abruptly in the mid 1990s. It is possible that researchers felt that the effectiveness or ineffectiveness of neighborhood watch was already established and that there was no need for further investigation. As a result, the effectiveness of neighborhood watch in more recent times is largely unknown. It would have been helpful if more recent evaluations of neighborhood watch had been conducted in order to determine current effectiveness.
Third, none of the studies was based on random allocation of areas to treatment or control conditions. Instead, all studies were based on some version of a quasi-experimental design. This is almost certainly a result of the difficulties involved in implementing community-based programs in areas where communities have not requested them. It is difficult to conduct a randomized experiment with areas as the unit of assignment. However, quasi-experimental designs are not ideal and some writers have argued that they can over-estimate the positive effects of schemes as a result of selection effects whereby the subjects or schemes most likely to change are included in the experimental group (for a discussion see Wilson, Mitchell, and MacKenzie, in press).
Fourth, a particularly important problem for the current review was that less than half of the eligible studies reported data that were suitable for a meta-analysis. This was either because studies presented the results using an unusual statistical notation or left out the data entirely (e.g. when the results were presented in graphical form only). It would be helpful if published evaluations included, at a minimum, raw data, cell sizes and other relevant information in order to facilitate future meta-analyses.
Finally, very few evaluations disaggregated the findings in a way that would show differential effects for subgroups and provide detailed information on the features of the program. As there might be variations in outcome according to the type of program implemented or the type of area in which it is implemented it is important that this information should be included in a research report.

Implications for policy
Neighborhood watch has often been described as one of the most widespread methods of reducing crime. It is supported by UK and US governments and is popular among the public and the police (Sims, 2001). The current review provides some support for this level of implementation. However, little is known about the factors that influence whether or not it is effective. The results of this review have shown that there is some variation across schemes in terms of the outcomes achieved. Governments and those responsible for crime prevention policy should investigate differences between more effective and less effective schemes in order to guide good practice.

STATEMENT CONCERNING CONFLICT OF INTEREST
There is no conflict of interest. However, one of the studies included in the review was conducted by Trevor Bennett.

Studies included in the narrative review
Report number