Experiments, if conducted correctly, can enable a better understanding of the relationship between a causal hypothesis and a particular phenomenon of theoretical or practical interest. One of the biggest challenges is deciding which research methodology to use. “Research that tests the adequacy of research methods does not prove which technique is better; it simply provides evidence relating to the potential strengths and limitations of each approach” (Howard, 1985).
In research and evaluation, a true experimental design (also known as random experimental design) is the preferred method of research. It provides the highest degree of control over an experiment, enabling the researcher to draw causal inferences with a high degree of confidence.
A true experimental design is a design in which subjects are randomly assigned to program and control groups. With this technique, every subject in the study has an equal chance of being assigned to either group. (This is distinct from random sampling, in which every member of the target population has an equal chance of being selected for the sample.) Because chance alone determines group membership, this design is the strongest method for establishing equivalence between a program and control group.
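Random assignment can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `randomly_assign`, the subject labels, and the seed are all assumptions made up for the example.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into program and control groups.

    Hypothetical helper for illustration. With an odd number of subjects,
    the control group receives the extra subject.
    """
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject identifiers, invented for this example.
subjects = [f"student_{i:02d}" for i in range(1, 21)]
program, control = randomly_assign(subjects, seed=42)

print("Program group:", sorted(program))
print("Control group:", sorted(control))
```

Because the shuffle, and not the researcher or the subjects themselves, decides who lands in which group, any pre-existing differences between subjects should be spread across both groups by chance alone.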
Quasi-experimental group design differs from true experimental group design by the omission of random assignment of subjects to a program and control group. As a result, you cannot be sure that the program and the control group are equivalent.
The use of random experimental design to randomly assign subjects to a program and control group controls for most threats to internal validity. Issues of internal validity arise when groups in the study are nonequivalent, because your ability as a researcher to say that your treatment caused the effect is then compromised.
In most causal hypothesis tests, the central inferential question is whether any observed outcome differences between groups are attributable to the program or instead to some other factor. In order to argue for the internal validity of an inference, the analyst must attempt to demonstrate that the program, and not some plausible alternative explanation, is responsible for the effect. In the literature on internal validity, these plausible alternative explanations or factors are often termed "threats to internal validity" (Trochim, 1997).
Let us consider an instance in which an investigator wishes to determine whether a program designed to reduce prejudice is effective. In this instance, the independent variable is a lecture on prejudice for grammar school students. For the dependent measure, the researcher will use a standard self-report test of prejudice. To conduct the study, the researcher selects a group of students from a local grammar school and administers the prejudice questionnaire to all of them. A week later, all the students receive the lecture on prejudice and, after the lecture, are tested again. The next step is to find out whether the prejudice scores collected before the intervention (call them the pretest scores) are substantially higher than the scores obtained following the lecture (the posttest scores). The researcher might conclude that, if the posttest responses are lower than the pretest responses, the intervention has reduced subjects' prejudice. As you can see, the researcher has assumed that changes in the dependent variable were caused by the introduction of the independent variable. But what possibilities other than the operation of the independent variable on the dependent variable might explain the observed relationship (Campbell & Stanley, 1963)? The section on experimental design explains several such threats to internal validity.
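The weakness of the one-group pretest-posttest comparison can be made concrete with a small Python sketch. Everything here is hypothetical: the scores are invented purely for illustration, and the control group is an assumption added to the example (the study described above has none). The helper name `mean_change` is likewise made up.

```python
from statistics import mean


def mean_change(pretest, posttest):
    """Average posttest-minus-pretest change across paired scores."""
    if len(pretest) != len(posttest):
        raise ValueError("pretest and posttest must have the same length")
    return mean(post - pre for pre, post in zip(pretest, posttest))


# Hypothetical prejudice scores (higher = more prejudice), invented for
# illustration only. They are not data from any real study.
lecture_pre = [30, 28, 35, 32, 27]
lecture_post = [24, 25, 29, 27, 22]

# Assumed control group that took both tests but received no lecture.
control_pre = [31, 29, 34, 30, 28]
control_post = [29, 27, 33, 28, 26]

lecture_change = mean_change(lecture_pre, lecture_post)
control_change = mean_change(control_pre, control_post)

# The change in the control group estimates what would have happened
# without the lecture (retesting effects, outside events, and so on).
net_change = lecture_change - control_change

print("Mean change, lecture group:", lecture_change)
print("Mean change, control group:", control_change)
print("Change beyond the control group:", round(net_change, 2))
```

In this made-up data the control group's scores also drop, even though it never heard the lecture. A researcher who looked only at the lecture group's pretest and posttest scores would credit the lecture with the whole change, when part of it may reflect some other factor.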
This is an important point to note. The research designs and methods used in an evaluation have a direct effect on whether or not a program is perceived as effective. Did the cause really produce the effect, or was it some other plausible explanation? If the cause produced the effect, can it be generalized to a different group in another location? These are questions of validity. "The first thing we have to ask is: 'validity of what?' When we think about validity in research, most of us think about research components. We might say that a measure is a valid one, or that a valid sample was drawn, or that the design had strong validity. All of those statements are technically incorrect. Measures, samples and designs don't 'have' validity -- only propositions can be said to be valid. Technically, we should say that a measure leads to valid conclusions or that a sample enables valid inferences, and so on. It is a proposition, inference or conclusion that can 'have' validity" (Trochim, 1997).