If you are interested in the school effect in addition to the instruction methods, the Remedial Mathematics Program study becomes a 2 (instruction method) * 2 (exam performance) * 2 (school area) design. The data then form a set of two 2*2 tables, as shown in <Table 2>.
| School Area | Instruction | Pass | Fail | Total |
| Urban | CAI | 34 | 6 | 40 |
| Urban | Tutoring | 18 | 2 | 20 |
| Rural | CAI | 21 | 4 | 25 |
| Rural | Tutoring | 14 | 1 | 15 |
The interest in this study is in whether there are overall differences in the rate of passing the examination. Because the student populations come from different areas, there is a concern that school area may influence performance. By including the school area variable, the researcher can examine the association between instruction method and exam performance while adjusting (controlling) for the effect of school area.
In this case, the school area variable works as a control variable: it lets the researcher hold constant a school area effect that might otherwise distort the relationship between instruction method and exam performance.
Including a control variable in categorical data analysis requires more analysis than the 2*2 contingency table does. In the 2*2 table, the chi-square statistic is calculated to test the association between the explanatory and the response variable; in addition, the difference of proportions, the relative risk, and the odds ratio are calculated to measure the strength of association. In the 2*2*2 table, where the control variable is included, three further statistics are used to investigate the overall association: the Cochran-Mantel-Haenszel test, the Mantel-Haenszel test, and the Breslow-Day test.
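As a starting point, the 2*2 measures of association can be computed separately within each school area. The following is a minimal sketch using the counts in Table 2; the names `tables` and `two_by_two_measures` are illustrative and do not come from the original study.

```python
# Counts from Table 2: (pass, fail) for each instruction method, by school area.
tables = {
    "Urban": {"CAI": (34, 6), "Tutoring": (18, 2)},
    "Rural": {"CAI": (21, 4), "Tutoring": (14, 1)},
}


def two_by_two_measures(row1, row2):
    """Return (difference of proportions, relative risk, odds ratio) for one 2*2 table."""
    a, b = row1  # row 1: pass, fail
    c, d = row2  # row 2: pass, fail
    p1 = a / (a + b)  # pass rate in row 1
    p2 = c / (c + d)  # pass rate in row 2
    return p1 - p2, p1 / p2, (a * d) / (b * c)


# One set of measures per school area (CAI is row 1, Tutoring is row 2).
measures = {
    area: two_by_two_measures(t["CAI"], t["Tutoring"]) for area, t in tables.items()
}

for area, (diff, rr, odds_ratio) in measures.items():
    print(f"{area}: difference={diff:.3f}, relative risk={rr:.3f}, odds ratio={odds_ratio:.3f}")
```

These stratum-specific values are the conditional associations that the three statistics described next test and summarize.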
First, the Cochran-Mantel-Haenszel (CMH) statistic assumes a common odds ratio and tests the null hypothesis that X and Y are conditionally independent, given Z.
In the Remedial Math Program example, the CMH test evaluates whether the conditional odds ratios between instruction method and performance, one for urban schools and one for rural schools, both equal 1. A small p-value (p < .05) means that X (instruction method) and Y (performance) are not conditionally independent, controlling for Z (school area).
In short, the purpose of the CMH is to test whether the response is conditionally independent of the explanatory variable when adjusting for the control variable.
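The CMH statistic can be sketched directly from its usual definition: in each stratum, compare the observed count in one cell with its expectation under conditional independence, then divide the squared sum of those differences by the sum of their hypergeometric variances. The sketch below assumes no continuity correction (statistical packages differ on this) and relies on the statistic having 1 degree of freedom, so the p-value can be obtained from `math.erfc`. The names `strata` and `cmh_test` are illustrative.

```python
import math

# Each stratum is (a, b, c, d): CAI pass, CAI fail, Tutoring pass, Tutoring fail.
strata = [
    (34, 6, 18, 2),  # Urban
    (21, 4, 14, 1),  # Rural
]


def cmh_test(strata):
    """CMH chi-square statistic (1 df, no continuity correction) and its p-value."""
    diff_sum = 0.0
    var_sum = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        expected = row1 * col1 / n  # E(a) under conditional independence
        variance = row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
        diff_sum += a - expected
        var_sum += variance
    statistic = diff_sum ** 2 / var_sum
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


cmh_statistic, cmh_p = cmh_test(strata)
print(f"CMH = {cmh_statistic:.3f}, p = {cmh_p:.3f}")
```

For the Table 2 counts, this sketch gives a statistic of roughly 0.91, which is not significant at the .05 level, so these particular data would not lead to rejecting conditional independence.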
Second, the Mantel-Haenszel test measures the strength of association by estimating the common odds ratio. In the 2*2 table, a single odds ratio compares the odds of success in row 1 with those in row 2. In the 2*2*2 table, by contrast, there are two odds ratios, one for each category of the control variable, so an overall odds ratio must be calculated to measure the strength of association.
In the Remedial Math Program example, the MH estimates the average conditional association between the instruction method and the performance.
In short, the purpose of the MH is to estimate the average conditional association between the explanatory and the response variable.
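The Mantel-Haenszel estimate of the common odds ratio is a weighted combination of the stratum tables: the sum of a*d/n over strata divided by the sum of b*c/n. A minimal sketch with the Table 2 counts follows; `strata` and `mantel_haenszel_odds_ratio` are illustrative names.

```python
# Each stratum is (a, b, c, d): CAI pass, CAI fail, Tutoring pass, Tutoring fail.
strata = [
    (34, 6, 18, 2),  # Urban
    (21, 4, 14, 1),  # Rural
]


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel estimate of the common (pooled) odds ratio."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


common_or = mantel_haenszel_odds_ratio(strata)
print(f"Mantel-Haenszel common odds ratio = {common_or:.3f}")
```

For these counts the estimate is about 0.52, which lies between the urban odds ratio (about 0.63) and the rural odds ratio (about 0.38).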
Third, the Breslow-Day (B-D) statistic tests the null hypothesis of a homogeneous odds ratio; that is, it tests whether the odds ratio between X and Y is the same across the categories of Z. It is a test of homogeneous association.
In the Remedial Math Program example, the B-D statistic tests whether the true odds ratio between instruction method and performance is the same in urban and rural school areas. If the B-D test fails to reject the null hypothesis of homogeneity (a high p-value), it is reasonable to summarize the conditional association with a single odds ratio. A low p-value means that the association is not homogeneous across urban and rural school areas, so it cannot be summarized with one odds ratio.
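The Breslow-Day computation can be sketched as follows: take the Mantel-Haenszel common odds ratio, find for each stratum the cell counts that match the observed margins and have exactly that odds ratio, and compare the observed counts with these fitted counts. This sketch assumes the uncorrected statistic (no Tarone adjustment) and exactly two strata, so there is 1 degree of freedom. With fail counts as small as those in Table 2, the chi-square approximation may be rough. All function names are illustrative.

```python
import math

# Each stratum is (a, b, c, d): CAI pass, CAI fail, Tutoring pass, Tutoring fail.
strata = [
    (34, 6, 18, 2),  # Urban
    (21, 4, 14, 1),  # Rural
]


def mh_odds_ratio(strata):
    """Mantel-Haenszel estimate of the common odds ratio."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def fitted_cells(a, b, c, d, psi):
    """Cell counts with the observed margins whose odds ratio equals psi."""
    n1, n2, m1 = a + b, c + d, a + c
    lower, upper = max(0, m1 - n2), min(n1, m1)
    if psi == 1:
        fa = n1 * m1 / (n1 + n2)
    else:
        # Solve fa * (n2 - m1 + fa) = psi * (n1 - fa) * (m1 - fa) for fa.
        qa = 1 - psi
        qb = n2 - m1 + psi * (n1 + m1)
        qc = -psi * n1 * m1
        root = math.sqrt(qb ** 2 - 4 * qa * qc)
        candidates = [(-qb + root) / (2 * qa), (-qb - root) / (2 * qa)]
        # Keep the root that yields positive counts in all four cells.
        fa = next(x for x in candidates if lower < x < upper)
    return fa, n1 - fa, m1 - fa, n2 - m1 + fa


def breslow_day(strata):
    """Uncorrected Breslow-Day statistic and its degrees of freedom."""
    psi = mh_odds_ratio(strata)
    statistic = 0.0
    for a, b, c, d in strata:
        fitted = fitted_cells(a, b, c, d, psi)
        variance = 1 / sum(1 / cell for cell in fitted)
        statistic += (a - fitted[0]) ** 2 / variance
    return statistic, len(strata) - 1


bd_statistic, bd_df = breslow_day(strata)
# Valid only because bd_df == 1 here: chi-square upper tail via erfc.
bd_p = math.erfc(math.sqrt(bd_statistic / 2))
print(f"Breslow-Day = {bd_statistic:.3f}, df = {bd_df}, p = {bd_p:.3f}")
```

For the Table 2 counts the statistic comes out at roughly 0.13 with a p-value well above .05, so this sketch gives no evidence against a homogeneous odds ratio in these data.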
In short, when there is more than one explanatory variable (usually one is the explanatory variable (X) and the other is the control variable (Z)), three tests should be done: one to test the conditional independence of X and Y (CMH test), one to estimate the strength of their association (M-H test), and one to test the homogeneity of the odds ratio (B-D test).
Copyright © 1997, Hee-Jae Cho