Research Designs with Categorical Data


1. Research Designs in terms of Controlling Methods

2. Research Designs in terms of Time Sequences

3. Research Designs in terms of Sampling Methods


1. Research Designs in terms of Controlling Methods

Depending on how the researcher controls the study, research designs are categorized into experimental, quasi-experimental, and observational designs. The research methods are the same as those used with continuous data, and each design has an equivalent for continuous data. The only difference is the nature of the data, which is categorical.

Experimental Design

The researcher plans to measure the response variable as a function of the explanatory variable. The most important element of an experimental design is randomization. A clinical trial is one example of a categorical experimental design.

Quasi-Experimental Design

Although the researcher plans to measure the response variable as a function of the explanatory variable, a quasi-experimental design lacks randomization.

Observational Design

An observational study may use either a prospective or a retrospective design. If the researcher begins observing and waits for the results, it is a prospective design. If she gathers the data at one time and traces the differences back into the past, it is a retrospective design.

<Table 4> Relationships Among Research Designs

  


2. Research Designs in terms of Time Sequences

Prospective Design

In a prospective design, the researcher follows the participants and measures or observes their behavior. Depending on whether randomization is used, a prospective design is categorized as a clinical trial or a cohort design. In both designs, the researcher waits for future events.

Retrospective Design

In a retrospective design, the researcher gathers the data at once and classifies the participants simultaneously into the group categories. If there are only two categories, such as a yes (case) group and a no (control) group, it is called a case-control study. If there are more than two categories, it is called a cross-sectional study.

 


3. Research Designs in terms of Sampling Methods

Clinical Trial

In clinical trials, the researcher randomly allocates participants to the various groups of interest and measures differences in the future.

For instance, the researcher randomly assigns students to two different math programs. At the end of the semester, she counts the number of students who pass or fail the qualifying math exam in each program.

                    Pass   Fail
New Math Program     178     25
Old Math Program     160     37

The odds ratio is (178*37)/(160*25) = 1.6465. The odds of passing in the new program are 1.6465 times the odds of passing in the old program. The new program seems to be more effective than the old program.
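The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name odds_ratio and the variable names are chosen here for illustration and are not part of the original example.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table.

    Row 1 (comparison group): a successes, b failures.
    Row 2 (baseline group):   c successes, d failures.
    """
    return (a * d) / (b * c)


# Counts from the math program table above.
new_pass, new_fail = 178, 25
old_pass, old_fail = 160, 37

or_new_vs_old = odds_ratio(new_pass, new_fail, old_pass, old_fail)
print(round(or_new_vs_old, 4))
```

A value above 1 means the odds of passing are higher in Row 1 (the new program) than in Row 2 (the old program).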

 

Cohort Study

In cohort studies, there is no random assignment. The participants choose the group they want to join, and the researcher measures differences between the groups without randomization.

For example, the researcher wants to measure the effect of folic acid on reducing the risk of stroke among people who have already suffered a stroke. Because the treatment is new, she does not want to risk the health of the participants, so she recruits volunteers who are willing to take folic acid. During a follow-up period averaging a couple of years, the number of deaths due to heart-related disease is counted to measure the effect of folic acid on stroke.

                    Death   Survive
Folic-Acid Group       15       672
Vitamin Group          34       682

The odds ratio is (15*682)/(34*672) = 0.45. The odds of death in the folic-acid group are 0.45 times the odds of death in the vitamin group. Taking folic acid seems to reduce the chance of having a stroke again.
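To see where the odds ratio comes from, it can help to compute the odds of death in each group first and then divide one by the other. The following sketch does that with the counts from the table and confirms that the result matches the cross-product ratio. The variable names are illustrative assumptions.

```python
# Counts from the folic-acid table above.
folic_death, folic_survive = 15, 672
vitamin_death, vitamin_survive = 34, 682

# Odds of death within each group: deaths divided by survivors.
odds_folic = folic_death / folic_survive
odds_vitamin = vitamin_death / vitamin_survive

# The odds ratio is the ratio of the two odds ...
or_from_odds = odds_folic / odds_vitamin
# ... which is algebraically the same as the cross-product ratio.
or_cross_product = (folic_death * vitamin_survive) / (vitamin_death * folic_survive)

print(round(odds_folic, 4), round(odds_vitamin, 4), round(or_from_odds, 2))
```

A value below 1 means the odds of death are lower in Row 1 (the folic-acid group) than in Row 2.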

There are two important points to keep in mind when designing a cohort study. The first is an ethical problem. When designing the study, it is important to consider whether the treatment may cause long-term side effects. As in the example above, when a medicine is newly invented and no research has been done with human subjects, a cohort design should be considered.

The second is a validity problem. Because the cohort design lacks randomization, the groups may differ in ways other than the treatment, so its internal validity is lower than that of a clinical trial.

  

Case-Control Study

In case-control studies, the researcher gathers the data at once and then looks into the past of the participants to classify them.

For instance, she wants to know whether smoking status is related to lung cancer. She gathers the data at once and then classifies the participants simultaneously into four categories (smoker-cancer group; nonsmoker-cancer group; smoker-no cancer group; nonsmoker-no cancer group). She then counts the number of participants who belong to each of the four categories.

Smoking   Case (Lung Cancer)   Control (No Cancer)
Yes                      200                   205
No                       110                   370

The odds ratio is (200*370)/(110*205) = 3.28. The odds of having lung cancer in the smoking group are 3.28 times the odds of having lung cancer in the non-smoking group. Based on this result, smoking seems to be related to lung cancer.
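The classify-and-count step can be sketched in Python as follows. The individual records here are hypothetical: they are rebuilt from the cell counts in the table only to illustrate how participants are sorted into the four categories before the odds ratio is computed.

```python
from collections import Counter

# Hypothetical individual records rebuilt from the table above:
# each record is (is_smoker, has_lung_cancer).
records = (
    [(True, True)] * 200      # smoker, cancer (case)
    + [(True, False)] * 205   # smoker, no cancer (control)
    + [(False, True)] * 110   # nonsmoker, cancer (case)
    + [(False, False)] * 370  # nonsmoker, no cancer (control)
)

# Classify every participant into one of the four cells in a single pass.
cells = Counter(records)

smoker_case = cells[(True, True)]
smoker_control = cells[(True, False)]
nonsmoker_case = cells[(False, True)]
nonsmoker_control = cells[(False, False)]

# Cross-product ratio: smokers are Row 1, nonsmokers are Row 2.
odds_ratio = (smoker_case * nonsmoker_control) / (nonsmoker_case * smoker_control)
print(round(odds_ratio, 2))
```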

  

Cross-Sectional Study

In cross-sectional studies, the researcher gathers the data at once, as in case-control studies, and then classifies the participants simultaneously on the classification variable (more than two categories) and their current responses.

For instance, the researcher wants to know the relationship between mathematics performance and the number of hours spent watching television a day. She gathers information on the average number of hours spent watching TV and on performance in the qualifying mathematics examination. Because she gathers the data at once and classifies the students by the number of hours (more than two categories) and by math performance, it is a cross-sectional design.

Time                                  Pass   Fail
Less than 1 hour a day (Group 1)        20      2
2 hours a day (Group 2)                 39      5
3 hours a day (Group 3)                 46      7
4 hours a day (Group 4)                 57      9
More than 5 hours a day (Group 5)       33     10

In this case, four odds ratios are calculated. If we regard Group 1 as the baseline group, we can calculate the odds ratios of the other four groups relative to the baseline group (Group 1). One thing to keep in mind when calculating the odds ratio is that the baseline group is always treated as Row 2 in the interpretation. The interpretation is "the odds of success (yes, pass, etc.) in Row 1 (comparison group) are 0.** times the odds of success in Row 2 (baseline group)." What the odds ratio measures is the odds of success in Row 1 relative to the odds of success in Row 2.

Let's calculate the odds ratios with Group 1 as the baseline. I use the cross-product ratio, which is a convenient way to calculate the odds ratio.

The odds ratio of Group 2 relative to Group 1 is (39*2)/(20*5) = 0.78. It means that the odds of passing the exam in Group 2 are 0.78 times the odds of passing the exam in Group 1.

The odds ratio of Group 3 relative to Group 1 is (46*2)/(20*7) = 0.66. It means that the odds of passing the exam in Group 3 are 0.66 times the odds of passing the exam in Group 1.

The odds ratio of Group 4 relative to Group 1 is (57*2)/(20*9) = 0.63. It means that the odds of passing the exam in Group 4 are 0.63 times the odds of passing the exam in Group 1.

The odds ratio of Group 5 relative to Group 1 is (33*2)/(20*10) = 0.33. It means that the odds of passing the exam in Group 5 are 0.33 times the odds of passing the exam in Group 1.

We conclude that the odds of passing the exam decrease as the number of hours spent watching TV increases.

If you decide to use Group 5 as the baseline, all other groups are compared with Group 5. In other words, Group 5 is treated as Row 2 in the interpretation.
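The effect of choosing a baseline can be sketched with a small Python helper. The function name odds_ratios_vs_baseline and the dictionary layout are assumptions made for this illustration. The comparison group always plays the role of Row 1 and the baseline group the role of Row 2.

```python
# Pass/fail counts from the TV-watching table above, keyed by group number.
counts = {
    1: (20, 2),
    2: (39, 5),
    3: (46, 7),
    4: (57, 9),
    5: (33, 10),
}


def odds_ratios_vs_baseline(counts, baseline):
    """Odds ratio of passing for each group relative to the baseline group.

    Uses the cross-product ratio with the comparison group as Row 1
    and the baseline group as Row 2.
    """
    base_pass, base_fail = counts[baseline]
    return {
        group: (passed * base_fail) / (failed * base_pass)
        for group, (passed, failed) in counts.items()
        if group != baseline
    }


vs_group1 = odds_ratios_vs_baseline(counts, baseline=1)
vs_group5 = odds_ratios_vs_baseline(counts, baseline=5)

for group, value in vs_group1.items():
    print(f"Group {group} vs Group 1: {value:.2f}")
for group, value in vs_group5.items():
    print(f"Group {group} vs Group 5: {value:.2f}")
```

Switching the baseline inverts the comparison: the odds ratio of Group 1 relative to Group 5 is the reciprocal of the odds ratio of Group 5 relative to Group 1.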

 




Copyright © 1997, Hee-Jae Cho