Research on the effects of such a curriculum on students is badly needed (Aikenhead, 1994). Some studies show positive effects (Aikenhead, 1994), while others show negative effects (Mitchner & Anderson, 1989). Which conclusion is valid? Which research designs were used, and how?
My intention in this paper is to suggest a pattern of research designs that can be used to provide more valid data. I explain it in such a way that it forms a tutorial exercise for teachers. Let's start:
O X O (1)
In program evaluation, the situation in (1) represents what is called a Non-Experimental Research Design. Here the first "O" represents a pretest, "X" the STS science curriculum, and the second "O" a posttest. It is non-experimental because you have not assigned the students in your class to different groups. It is a design because a program is implemented (intervention) over time (temporal precedence). Basically, you are looking at the effects of the STS science curriculum relative to students' prior knowledge: the higher their posttest scores relative to their pretest scores, the more effective the program. In other words, you have ensured that other sources of information about STS issues, like the TV program "Science and Technology Weekend" (CNN), do not account for the observed effects; only the STS science curriculum does. You can therefore conclude that there is no history threat to your conclusion. But for the results to be more valid, you also need to account for other threats such as maturation, testing, instrumentation and mortality. For example, you cannot convince your critics that giving more of the program would produce more effect. To account for all these threats, you need at least variation across two groups. This takes us to the next step to do in your class.
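Before moving on, the comparison in design (1) can be sketched in a few lines of Python. This is only an illustration: the function name mean_gain and all the scores are invented for the example, and a real evaluation would also need a test of statistical significance.

```python
from statistics import mean

def mean_gain(pretest, posttest):
    """Average per-student change from pretest to posttest (design O X O)."""
    if len(pretest) != len(posttest):
        raise ValueError("each student needs both a pretest and a posttest score")
    return mean([post - pre for pre, post in zip(pretest, posttest)])

# Hypothetical percentage scores for five students, invented for illustration.
pretest = [40, 55, 62, 48, 70]
posttest = [52, 60, 70, 59, 74]

gain = mean_gain(pretest, posttest)
print(f"Mean gain: {gain:.1f} percentage points")
```

A positive mean gain is what the design reads as a program effect; on its own, the number cannot tell the program apart from maturation, testing or the other threats listed above.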
C O X O (2)
C O O (3)
where in (2), C represents the low achievers, called the treatment group because they are the ones who receive the treatment this time, the first "O" represents their first posttest scores, X still represents the STS science curriculum, and the second "O" represents their second posttest scores; in (3), C represents the high achievers, called the control group because we use them as a check on the treatment group, the first "O" represents their first posttest scores and the second "O" their second posttest scores; and movement from left to right in the equations represents two months. Note that in (3), the control group is re-tested without the program being re-implemented. (This helps improve the reliability of the test and is called the Test-Retest Method of Reliability.)
The situation depicted by the equations above represents a type of research design called a Quasi-Experimental Design. In this design, students are assigned to groups according to a cut-off value (in this case, a score of 50% or less places a student in the treatment group), their performance is tested, and the data are analysed statistically. Suppose the treatment group's scores now closely match the control group's scores. The regression line for the two groups on a graph, which was initially a single straight line, will now show a discontinuity at the cut-off value (50%). Because of this, the design is specifically called the Regression Discontinuity Quasi-Experimental Design.
It ensures that the program is received by those who need it (an ethical advantage) on the basis of a stated criterion (accountability). Even though the students seem to have been assigned to groups in a biased way, this does not affect the discontinuity of the regression, meaning that the effect is shown only by the improved performance of low achievers relative to high achievers. The design also shows strong internal validity because two groups in the same class are compared. Normal growth of students cannot be considered a threat because it is controlled by comparison with the control group, and the reliability of the instrument is ensured by the second testing of the control group.
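One way to picture the discontinuity is to fit a separate straight line to each group and measure the gap between the two lines at the cut-off. The sketch below does this with ordinary least squares. The function names, the 50% default and the scores are assumptions made for illustration, not the analysis used in any cited study.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Needs at least two distinct x values, otherwise the slope is undefined.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def discontinuity_at_cutoff(first, second, cutoff=50):
    """Gap between the two fitted lines at the cut-off (treatment minus control)."""
    pairs = list(zip(first, second))
    treated = [(f, s) for f, s in pairs if f <= cutoff]   # low achievers, receive X
    control = [(f, s) for f, s in pairs if f > cutoff]    # high achievers, no X
    a_t, b_t = fit_line(*zip(*treated))
    a_c, b_c = fit_line(*zip(*control))
    return (a_t + b_t * cutoff) - (a_c + b_c * cutoff)

# Hypothetical first and second test scores (percent), invented for illustration.
first_scores = [30, 40, 50, 60, 70, 80]
second_scores = [40, 50, 60, 60, 70, 80]

jump = discontinuity_at_cutoff(first_scores, second_scores)
print(f"Estimated jump at the cut-off: {jump:.1f} percentage points")
```

With no program effect the two fitted lines would meet at the cut-off and the gap would be close to zero; a clearly positive gap is the discontinuity the design looks for.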
But the regression discontinuity design does not account for the apparent (and not real) regression towards the mean by the two groups (regression towards the mean threat). Also, the improved performance of the treatment group might be due to rivalry with the high achievers (compensatory rivalry); or the low achievers might have done even better had they not been compared with the high achievers (resentful demoralization); or you may in fact be the one who causes the improvement because of your ability to motivate students (motivation threat). To account for these threats, you need to expand across the boundaries of your classroom. This takes us to the next step.
N C O X O (2)
N C O O (3)
N C O X O (4)
N C O O (5)
where (2) and (3) are the same as above, except that the leading N represents your whole class, which is now called the non-equivalent control group; in (4), N represents the non-equivalent treatment group, which is your colleague's class, and the remaining symbols mean the same as for your class, but now for your colleague's class. The situation depicted in the equations is called the Non-Equivalent Separate Pre-Post Samples Quasi-Experimental Design. The N groups are not controlled by you (the researcher). Suppose, then, that both groups (the N's) have done well. You can conclude that the program is effective regardless of who implements it. In other words, this design takes care of the motivation, history, maturation, testing, instrumentation, regression to the mean, compensatory rivalry, resentful demoralization, and diffusion or imitation threats. But the positive effect might have been caused by the natural intelligence of the two groups (maybe the admission tests in your school are good, as in private schools). This introduces a selection bias threat. How can you take care of this threat? In the last step, an attempt is made to address it by moving across the boundaries of your school.
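Before that last step, the two-class comparison can be sketched as follows. The sketch assumes that the effect in each class is summarised as the treatment group's gain minus the control group's gain, which is one simple choice rather than the only possible analysis. The function names and all the scores are invented for illustration.

```python
from statistics import mean

def gain(first, second):
    """Change in the group mean between the first and second test."""
    return mean(second) - mean(first)

def program_effect(group_scores):
    """Treatment-group gain minus control-group gain within one class."""
    t_first, t_second = group_scores["treatment"]
    c_first, c_second = group_scores["control"]
    return gain(t_first, t_second) - gain(c_first, c_second)

# Hypothetical (first test, second test) scores for each group in each class.
classes = {
    "your class": {
        "treatment": ([40, 45, 50], [52, 55, 61]),
        "control": ([60, 70, 80], [61, 72, 80]),
    },
    "colleague's class": {
        "treatment": ([35, 42, 49], [44, 50, 59]),
        "control": ([55, 65, 75], [57, 66, 75]),
    },
}

effects = {name: program_effect(scores) for name, scores in classes.items()}
consistent = all(effect > 0 for effect in effects.values())

for name, effect in effects.items():
    print(f"{name}: effect of {effect} percentage points")
print("Positive in both classes:", consistent)
```

If the effect is positive in both classes, the result is harder to attribute to one teacher's ability to motivate students, which is the point of adding the second class.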
R X O (6)
R O (7)
where in (6), R represents the treatment group, "X" the STS science curriculum, and "O" their posttest; and in (7), R represents the control group and "O" their posttest. This situation is called a True Experimental Design. Students have been assigned randomly to groups. Take note that there is no pretest. This is because random assignment is assumed to ensure probabilistic equivalence (exact knowledge of the chances of observing a difference between the two groups). Clearly, this design takes care of the selection bias threat.
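Random assignment and the posttest-only comparison in (6) and (7) can be sketched as below. The student labels, the seed, the function names and the posttest scores are all hypothetical, and a real study would follow the comparison of means with a significance test.

```python
import random
from statistics import mean

def assign_randomly(students, seed=None):
    """Shuffle the students and split them into treatment and control halves."""
    rng = random.Random(seed)      # a fixed seed makes the assignment repeatable
    shuffled = list(students)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def posttest_difference(treatment_scores, control_scores):
    """Difference in mean posttest score (treatment minus control)."""
    return mean(treatment_scores) - mean(control_scores)

# Hypothetical roster of twenty students.
students = [f"student_{i}" for i in range(1, 21)]
treatment, control = assign_randomly(students, seed=1)
print("Treatment group:", treatment)
print("Control group:", control)

# Hypothetical posttest scores (percent) for a few students in each group.
difference = posttest_difference([68, 72, 76], [60, 66, 63])
print(f"Difference in posttest means: {difference} percentage points")
```

Because chance alone decides who lands in each group, the two groups are expected to be alike before the program, which is why the design can do without a pretest.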