THE BASICS OF REGRESSION-DISCONTINUITY DESIGNS

Claudia Nieves Velasquez ©


Evaluators have developed extensive experience with research designs that can help assess outcomes when randomized experiments are not feasible. These methods are often called quasi-experimental designs. Their main difference from true experiments is that they do not use random assignment to determine which conditions people receive, but they do use a pre-test and post-test and a no-treatment comparison group. There are many variations of this type of design, each with different strengths and weaknesses (Trochim, 1991. Developing evaluation culture for international agriculture research).

The regression-discontinuity design is one of these variations, and it uses the traditional pretest-posttest program-comparison group strategy. Some authors describe this design as a “bridge” between traditional randomized experiments and quasi-experiments (Judd, C.M. and Kenny, D.A. 1981. Estimating the effects of social interventions; Chapter 5). This design was first used in the mid-1970s in the nationwide evaluation system for compensatory education programs funded under Title I of the Elementary and Secondary Education Act (ESEA) of 1965, and in recent years it has been used mostly in medical trials and sometimes in social program interventions. In RD designs, as they are usually labeled, participants are assigned to either the program or the comparison group on the basis of a cut-off score on a specific pre-test measure. The design can also be used when two alternative programs are being compared. The typical assignment rule is that those scoring at or above a certain value on some pre-treatment measure receive the treatment, and those scoring below that value do not.

This tutorial explains the basics of this type of design with an example that should help make the material clearer. If you are interested in detailed information about the statistics of this design, see Professor Bill Trochim's "Statistical Analysis of Regression Discontinuity Design".

The following example will give you a better idea of the basic components of this design and of its advantages and disadvantages:

Dr. C. Penagos is a well-known cardiologist who works with patients who suffer from different cardiac problems, including high levels of blood cholesterol. Last year he published the results of a study he conducted with some of his patients to test the effectiveness of a special low-cholesterol diet he developed, which helps patients lower their plasma lipid levels and avoid atherosclerosis, a very common condition. The following is the story of how he selected a group of patients to follow this diet and evaluated its effects on cholesterol levels.

The Study: As stated before, Dr. Penagos was interested in studying the effects of a diet low in saturated fat and cholesterol on elevated plasma cholesterol levels. Since the diet is a very strict one, he did not want to give it to all of his patients, but only to those who really needed to lower their cholesterol levels to avoid a higher risk of developing atherosclerosis, angina or other cardiac complications. He also wanted a study that would allow him to observe the effects of the diet clearly and to conclude that the diet, and not some other factor, was lowering cholesterol. In other words, he was looking for a design with strong internal validity. Therefore, he decided to use a regression discontinuity design, whose internal validity is high and almost comparable to that of randomized experiments. This design is highly appropriate when we wish to target a program or treatment to those who most need or deserve it, like Dr. Penagos's patients with high cholesterol.

The Cut-off: As part of an annual exam, all of Dr. Penagos's patients have their fasting plasma cholesterol levels analyzed in the laboratory. The doctor reviewed last year's files to determine which patients had cholesterol levels above 200 mg/dl and which were below that. Cholesterol levels higher than 200 mg/dl are considered above normal by the National Cholesterol Education Program (NCEP).

The Groups: With these results, he divided the patients into two groups. Those with cholesterol levels above 200 mg/dl were labeled the “low fat diet” group, and those with values below the cut-off were labeled the “control diet” group. To differentiate patients in one group from those in the other, a dichotomous treatment variable (dummy variable), “X”, was used: those who received the treatment were coded "1" and those who did not were coded "0".
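The grouping rule can be written down in a few lines. This is only a minimal sketch: the patient names and readings are made up for illustration, and the function name assign_group is hypothetical. It follows the "above 200 mg/dl" wording used for Dr. Penagos's study, so a patient at exactly 200 would fall in the control group here, which is an assumption.

```python
CUTOFF = 200  # Zo: the cut-off in mg/dl


def assign_group(z, cutoff=CUTOFF):
    """Return the dummy variable X for a pre-test cholesterol level z.

    1 = "low fat diet" group, 0 = "control diet" group.
    Assumption: strictly above the cut-off means treatment.
    """
    return 1 if z > cutoff else 0


# Hypothetical pre-test readings (mg/dl), not real patient data
pre_test = {"patient_a": 250, "patient_b": 185, "patient_c": 201}

groups = {name: assign_group(z) for name, z in pre_test.items()}
print(groups)
```

Because group membership depends only on the pre-test score and the cut-off, anyone who knows Z also knows X. That fully known assignment rule is what the design relies on.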

The Treatment: Patients in the experimental group were invited to a special event at the clinic, where they received special instructions concerning the diet. The patients followed the diet for 6 months and attended periodic meetings to discuss questions or concerns about how to follow it.

The Measures: Since cholesterol levels were available in last year's files for all patients, those data were used as the pre-test measure. We will call this the assignment variable, “Z”. In RD designs this is frequently a pre-test measure of the outcome variable, and it is the variable on which the cut-off point, which we will call Zo, is defined. Six months after the initial meeting, plasma cholesterol levels were determined again for both groups in order to assess the effectiveness of the diet. This second measure will be called “Y”, the outcome variable: the variable in which we expect treatment effects or, to put it more simply, the post-treatment measure.

The Effect or Results: To describe the results of a classical RD design, a scatter plot of the assignment variable (Z) against the outcome variable (Y) is used, and a vertical line is drawn through Zo to show where the cut-off point is. Next, parallel regression lines are fitted separately to the data above and below Zo and are extrapolated to the cut-off point. The difference in “Y” between the two lines at this point is the measure of the treatment effect. In other words, the difference between groups can be viewed as the difference between the “Y” intercepts of the comparison and treatment groups. If there is a program effect, a jump or discontinuity is observed in the scatter plot at the exact point of the cut-off. If there is no program effect, a continuous line is present in the plot and no jump is observed. This is clearer if you look at the graphs below:

Figure 1.0

If no special diet were administered and cholesterol levels were measured at 0 and 6 months in both groups, the data might look like the bivariate distribution shown in Figure 1.0. In this figure, the horizontal axis shows the cholesterol measure at “0” months, which we called the assignment variable, and the vertical axis shows the measure at “6” months, the outcome variable. The cut-off point is the black line in the middle of the graph, at the value of 200 mg/dl. Patients with a high level of cholesterol will remain high, assuming that no other treatment, such as pills, is given, and patients with cholesterol below 200 mg/dl will remain low.



This is Mrs. Barillas! Read her story...



Mrs. Barillas is a 52-year-old woman who has always had trouble controlling her cholesterol and her diet. Her file for last year reported a cholesterol level of 250 mg/dl. Assuming that no treatment is given to her, her cholesterol level 6 months from now will be around the same value. Her case can be seen in the graph as point “A”.

- Figure 2.0 shows the situation in which the patients follow the special diet. Assuming that every individual in the treated group followed the diet correctly and that everyone experienced a 20 mg/dl drop in cholesterol level, all points to the right of the cut-off will drop by 20 points on the vertical axis, and all points in the control group will remain the same. Mrs. Barillas was part of the treatment group because of her high cholesterol level, so after 6 months we would expect her to be at 230 mg/dl. Even though she is not yet at the normal level, her initial level has decreased and may continue to do so if the treatment is continued. Her dot can be found again in Figure 2.0.

- The dashed line in the figure shows the line that would be expected if there were no special diet; in that case the plot would look exactly like the one in Figure 1.0.
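The situation in Figure 2.0, together with the regression analysis described under "The effect or results", can be tried out on simulated data. What follows is a rough sketch and not Dr. Penagos's actual analysis: the 400 patients, the spread of cholesterol values and the noise level are all invented, and a single common slope is assumed for both groups (the parallel-lines case). Centering Z at Zo makes the coefficient on X equal to the jump between the two lines at the cut-off.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

CUTOFF = 200.0       # Zo, in mg/dl
DIET_EFFECT = -20.0  # assumed uniform change for treated patients (Figure 2.0)
N_PATIENTS = 400     # hypothetical sample size

# Z: simulated pre-test cholesterol; X: dummy variable (1 = low fat diet group)
z = rng.normal(200.0, 30.0, size=N_PATIENTS)
x = (z > CUTOFF).astype(float)

# Y: post-test cholesterol. Without the diet Y stays near Z (Figure 1.0);
# with the diet, treated patients shift by DIET_EFFECT (Figure 2.0).
y = z + DIET_EFFECT * x + rng.normal(0.0, 5.0, size=N_PATIENTS)

# Fit Y = b0 + b1 * (Z - Zo) + b2 * X by ordinary least squares.
design = np.column_stack([np.ones(N_PATIENTS), z - CUTOFF, x])
coef = np.linalg.lstsq(design, y, rcond=None)[0]
b0, b1, b2 = coef

# b2 is the estimated discontinuity (treatment effect) at the cut-off.
print(f"estimated jump at the cut-off: {b2:.1f} mg/dl")

# Fitted post-test value for a treated patient who started at 250 mg/dl,
# like Mrs. Barillas in the example.
predicted_at_250 = b0 + b1 * (250.0 - CUTOFF) + b2
print(f"fitted 6-month level for Z = 250: {predicted_at_250:.1f} mg/dl")
```

With data generated this way, the estimated jump should land close to the assumed 20 mg/dl drop, and the fitted value for a patient starting at 250 mg/dl should land close to 230 mg/dl. Setting DIET_EFFECT to 0 reproduces the no-jump situation of Figure 1.0.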

Although the inferences drawn from a regression discontinuity design are almost as valid as those from a randomized experiment, the conclusion validity is lower:

- The RD design requires many more subjects than a randomized trial to achieve equal power, and as the cut-off point becomes more extreme, power decreases further.

- The more uneven the sizes of the treatment groups, the lower the power.
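The power point can be illustrated with a small simulation. This is a sketch under invented assumptions (200 patients per study, 500 repetitions, normally distributed cholesterol and noise), not a formal power analysis. It repeats the same study many times, once with cut-off assignment and once with coin-flip assignment, and compares how much the estimated effect varies. A larger spread at the same sample size means lower power.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable

CUTOFF = 200.0   # Zo, in mg/dl
EFFECT = -20.0   # assumed true treatment effect
N = 200          # hypothetical patients per simulated study
REPS = 500       # number of simulated studies


def estimate_effect(z, x, y):
    """Least-squares estimate of b2 in Y = b0 + b1 * (Z - Zo) + b2 * X."""
    design = np.column_stack([np.ones(len(z)), z - CUTOFF, x])
    return np.linalg.lstsq(design, y, rcond=None)[0][2]


rd_estimates = []
random_estimates = []
for _ in range(REPS):
    z = rng.normal(200.0, 30.0, size=N)
    noise = rng.normal(0.0, 10.0, size=N)

    x_rd = (z > CUTOFF).astype(float)                    # cut-off assignment
    x_random = rng.integers(0, 2, size=N).astype(float)  # coin-flip assignment

    rd_estimates.append(estimate_effect(z, x_rd, z + EFFECT * x_rd + noise))
    random_estimates.append(
        estimate_effect(z, x_random, z + EFFECT * x_random + noise)
    )

rd_spread = np.std(rd_estimates)
random_spread = np.std(random_estimates)
print(f"spread of RD estimates:         {rd_spread:.2f} mg/dl")
print(f"spread of randomized estimates: {random_spread:.2f} mg/dl")
```

In the RD version, X is almost determined by Z, so the regression has less independent information for separating the jump from the slope. Moving CUTOFF away from 200 in this sketch makes the groups more uneven and is one way to explore the second point in the list.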

The Threat: Finally, it is useful to look at a possible threat to internal validity in this design and at how Dr. Penagos avoided it. One of the social interaction threats to validity is called "compensatory rivalry". It occurs when the control group knows that the treatment group is receiving something special that will help them, and that they themselves are not. In this case, if the no-diet group knows about the low-fat diet group, its members might feel that they too have to follow a low-fat diet to show the doctor that they can do it without his help. Since Dr. Penagos has 3 clinics located in different zones around the capital city, he decided to avoid this threat by selecting people from one zone for the control group and people from another zone for the experimental group. This way, he reduced the chance that people knew each other or saw each other in the clinic and talked about the treatment. Pretty smart, huh?
