Study designs

Overview

Teaching: 30 min
Exercises: 20 min
Questions
  • How many epidemiological study designs are you familiar with?

  • What is a difference between experimental and observational studies?

  • What are the weaknesses and strengths for each study design?

Objectives
  • By the end of this episode, you will be able to recall the most common study designs used by epidemiologists today,

  • and to discuss appropriate study designs given a research question.



Introduction

There are several types of study designs in epidemiological research. Each study design represents a different way of harvesting information. The selection of one design over another depends on the particular research question, concerns about validity and efficiency, and practical and ethical considerations. Figure 1 represents the most common study designs used by epidemiologists today.


Table 1: Summary of epidemiological study types

Epidemiology study types

Adapted from 1


Experimental studies, also known as trials, investigate the role of some factor or agent in the prevention or treatment of a disease. In this type of study, the investigator assigns individuals to two or more groups that either receive or do not receive the preventive or therapeutic agent. Because experimental studies closely resemble controlled laboratory investigations and leave little to chance, they are considered the “gold standard” for producing reliable evidence. However, there is a growing realization that such research is not perfect and that many questions simply cannot be studied using this approach. Experimental studies are often infeasible because of difficulties enrolling participants, high costs, and ethical issues.

The two principal types of observational studies are cohort and case–control studies. Additional observational study designs include cross-sectional studies and ecological studies.

Most epidemiological research is conducted using observational studies, which are considered “natural” experiments because the investigator lets nature take its course. Observational studies take advantage of the fact that people are exposed to noxious and/or healthy substances through their personal habits, occupation, place of residence, and so on. The studies provide information on exposures that occur in natural settings, and they are not limited to preventions and treatments. Furthermore, they do not suffer from the ethical and feasibility issues of experimental studies. For example, although it is unethical to conduct an experimental study of the effect of drinking alcohol on the developing fetus by assigning newly pregnant women to either a drinking or nondrinking group, it is perfectly ethical to conduct an observational study by comparing women who choose to drink during pregnancy with those who choose not to.


Note

  • In experimental studies, the investigator assigns individuals to two or more groups. The treatment group receives a preventive or therapeutic agent. The comparison group does not receive the agent, or receives a placebo or another active treatment.
  • In observational studies, the investigator acts as a passive observer, merely letting nature take its course.

Study types

In this module, we shall focus on experimental studies, as well as the observational cohort, case-control, and cross-sectional studies.


Figure 1: Overview of common epidemiological study types

Epidemiology study types

1. Experimental studies (trials)

An experimental study, also known as a trial, investigates the role of some agent in the prevention or treatment of a disease. In this type of study, the investigator assigns individuals to two or more groups that either receive or do not receive the preventive or therapeutic agent. The group that is allocated the agent under study is generally called the treatment group, and the group that is not allocated the agent under study is called the comparison group. Depending on the purpose of the trial, the comparison group may receive no treatment at all, an inactive treatment such as a placebo, or another active treatment.

The active manipulation of the agent by the investigator is the hallmark that distinguishes experimental from observational studies.

Experimental studies are commonly classified by their objective, that is, by whether they investigate a measure that prevents disease occurrence (preventive or prophylactic trial) or a measure that treats an existing condition (therapeutic or clinical trial).


Question

A clinical trial is an experimental study that investigates a measure that prevents disease occurrence. True or false?

Answer

False - a clinical trial is an experimental study that investigates a measure that treats an existing condition.


Hydroxyurea in Sickle Cell Disease: Clinical trials

Hydroxyurea (HU) is a medicine that is used to treat some cancers. But it can also help children and adults who have sickle cell disease. The evidence that HU could be helpful in SCD came from clinical trials that were conducted in the USA in the 1980s 2. Multiple clinical trials have since proven the efficacy and safety of HU in treating SCD patients including the REACH 3 and NOHARM 4 trials in Africa, and the BABY-HUG 5 and HUSOFT 6 trials in the USA.


Selection of Study Population

During the recruitment phase of an experimental study, the study population, which is also called the experimental population, is enrolled on the basis of eligibility criteria that reflect the purpose of the trial as well as scientific, safety, and practical considerations. For example, healthy or high-risk individuals are enrolled in prevention trials, whereas individuals with specific diseases are enrolled in therapeutic trials. Additional inclusion and exclusion criteria may be used to restrict the study population by factors such as gender and age (see Standardization of rates under the episode “Measuring and comparing disease frequencies”). The study population must include an adequate number of individuals to determine whether there is a true difference between the treatment and comparison groups. An investigator determines how many subjects to include by using formulas that take into account the anticipated difference between the groups, the background rate of the outcome, and the probability of making certain statistical errors (see the episode “Measuring and comparing disease frequencies”). In general, smaller anticipated differences between the treatment and comparison groups require larger sample sizes.
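The relationship between the anticipated difference and the required sample size can be illustrated with a short calculation. The sketch below assumes a binary outcome, two equally sized groups, and a common normal-approximation formula for comparing two proportions; the function name, the default error probabilities, and the example outcome rates are illustrative assumptions, not values from this episode.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_comparison, p_treatment, alpha=0.05, power=0.80):
    """Approximate number of participants needed in each group.

    p_comparison: anticipated outcome rate in the comparison group (background rate)
    p_treatment:  anticipated outcome rate in the treatment group
    alpha:        probability of a false-positive result (two-sided test)
    power:        1 minus the probability of missing a true difference
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_comparison * (1 - p_comparison) + p_treatment * (1 - p_treatment)
    difference = p_comparison - p_treatment
    return math.ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


# Hypothetical scenarios: the background rate is 30% in both cases.
n_large_difference = sample_size_per_group(0.30, 0.15)  # treatment halves the rate
n_small_difference = sample_size_per_group(0.30, 0.25)  # treatment has a modest effect

print("Per group, large anticipated difference:", n_large_difference)
print("Per group, small anticipated difference:", n_small_difference)
```

Under these assumptions, the scenario with the smaller anticipated difference needs roughly ten times as many participants per group. A real trial would normally rely on a statistician and dedicated software, and would also allow for noncompliance and losses to follow-up.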

All eligible and willing individuals must give consent to participate in an experimental study. The process of gaining their agreement is known as informed consent. The investigator must describe the nature and objectives of the study, the tasks required of the participants, and the benefits and risks of participating. Individuals are then assigned to receive one of the two or more treatments being compared. The investigator can decide whether randomization is appropriate. Randomization is “an act of assigning or ordering that is the result of a random process.”
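As a minimal sketch of what random assignment means in practice, the example below allocates each consenting participant to a group independently and with equal probability (often called simple randomization). The participant identifiers, group labels, and function name are hypothetical, and the fixed seed is used only so that the example is reproducible.

```python
import random


def randomize_participants(participant_ids, groups=("treatment", "comparison"), seed=None):
    """Assign each participant to one group, independently and with equal probability."""
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    return {pid: rng.choice(groups) for pid in participant_ids}


# Ten hypothetical participants who have given informed consent.
participants = [f"P{i:03d}" for i in range(1, 11)]
allocation = randomize_participants(participants, seed=42)

for pid, group in allocation.items():
    print(pid, group)
```

With simple randomization the groups can end up with unequal sizes, especially in small trials. Actual trials typically use a documented allocation procedure prepared before enrollment starts rather than an ad hoc script.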


Question: Select the correct option.

In a trial, the investigator manipulates a therapeutic agent in the

Options

  • (A) agent group
  • (B) comparison group
  • (C) experimental group
  • (D) treatment group

Answer

(D) treatment group

Treatment Administration

In the next phase of a trial, the treatments are administered according to a specific protocol. For example, in a therapeutic trial, participants may be asked to take either an active drug or an inactive drug known as a placebo. The purpose of placebos is to match as closely as possible the experience of the comparison group with that of the treatment group. Also, the use of placebos means that study participants and investigators can be prevented from knowing who receives the active agent (masking), which helps to prevent biases.

Maintenance and Assessment of Compliance

All experimental studies require the active involvement and cooperation of participants. Although participants are apprised of the study requirements when they enroll, many fail to follow the protocol exactly as required as the trial proceeds. The failure to observe the requirements of the protocol is known as noncompliance, and this may occur in the treatment group, the comparison group, or both. Reasons for not complying include toxic reactions to the treatment, waning interest, and desire to seek other therapies. Noncompliance is problematic because it results in a smaller difference between the treatment and comparison groups than truly exists, thereby diluting the real effect of a treatment. A run-in period is usually used before enrollment and randomization to ascertain which potential participants are able to comply with the study regimen. During this period, participants are placed on the test or comparison treatment to assess their tolerance and acceptance and to obtain information on compliance.

Ascertaining the Outcomes

During the follow-up stage of an experimental study, the treatment and comparison groups are monitored for the outcomes under study. If the study’s goal is to prevent the occurrence of disease, the outcomes may include the precursors of disease or the first occurrence of disease (i.e., incidence). If the study is investigating a new treatment among individuals who already have a disease, the outcomes may include disease recurrence, symptom improvement, length of survival, or side effects. The length of follow-up depends on the particular outcome under study. It can range from a few months to a few decades.

Analysis

The classic analytic approach for an experimental study is known as an intent-to-treat or treatment assignment analysis. In this analysis, all individuals who were randomly allocated to a treatment are analyzed regardless of whether they completed the regimen or received the treatment. An intent-to-treat analysis gives information on the effectiveness of a treatment under everyday practice conditions. The alternative to an intent-to-treat analysis is known as an efficacy analysis, which determines the treatment effects under ideal conditions, such as when participants take the full treatment exactly as directed.
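The difference between the two analyses can be sketched with a small, entirely hypothetical data set. Here the intent-to-treat analysis counts everyone in the group they were assigned to, while the efficacy analysis is approximated by restricting to participants who followed the protocol as directed. The record fields, counts, and function name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    assigned: str    # group allocated at randomization
    completed: bool  # took the treatment exactly as directed
    improved: bool   # outcome of interest


def proportion_improved(records, group, compliers_only=False):
    """Proportion improved in a group, optionally restricted to compliers."""
    selected = [
        r for r in records
        if r.assigned == group and (r.completed or not compliers_only)
    ]
    return sum(r.improved for r in selected) / len(selected)


# Hypothetical trial: 10 participants per group, 2 noncompliers in the treatment group.
records = (
    [Participant("treatment", True, True)] * 6
    + [Participant("treatment", True, False)] * 2
    + [Participant("treatment", False, False)] * 2
    + [Participant("comparison", True, True)] * 3
    + [Participant("comparison", True, False)] * 7
)

itt_treatment = proportion_improved(records, "treatment")
efficacy_treatment = proportion_improved(records, "treatment", compliers_only=True)
itt_comparison = proportion_improved(records, "comparison")

print(f"Intent-to-treat: {itt_treatment:.2f} vs {itt_comparison:.2f}")
print(f"Efficacy (compliers only): {efficacy_treatment:.2f} vs {itt_comparison:.2f}")
```

In this made-up example the intent-to-treat comparison shows a smaller difference between groups than the compliers-only comparison, which mirrors how noncompliance dilutes an observed treatment effect. Restricting to compliers also gives up the protection of randomization, so such results need cautious interpretation.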

Strengths

  • Provides the most reliable evidence from clinical research
  • Randomization offers the ability to control confounders
  • Can conclude a causal relationship

Limitations

  • Costly and time-consuming
  • May be limited in generalizability


Question

What is masking in clinical trials, and how is it useful?

Answer

Masking is the act of concealing the true identity of an active agent from the participants and investigators in a clinical trial. It helps prevent biases in the ascertainment of outcomes.


2. Observational studies

2.1. Cohort Studies

A cohort is defined as a group of people with a common characteristic or experience. In a cohort study, healthy subjects are defined according to their exposure status and followed over time to determine the incidence of symptoms, disease, or death. The common characteristic for grouping subjects is their exposure level. Usually, two groups are compared: an exposed and an unexposed group. The unexposed group is called the reference, referent, or comparison group. Cohort study is the term that is typically used to describe an epidemiological investigation that follows groups with common characteristics. Other expressions that are used include follow-up, incidence, or longitudinal study. The term fixed cohort is used when the cohort is formed on the basis of an irrevocable event, such as undergoing a medical procedure. Thus, an individual’s exposure in a fixed cohort does not change over time. A closed cohort is used to describe a fixed cohort with no losses to follow-up. In contrast, a cohort study conducted in an open population, also known as a dynamic population, is defined by exposures that can change over time, such as cigarette smoking.

Timing of cohort studies

Three terms are used to describe the timing of events in a cohort study in relation to the initiation of the study: prospective, retrospective, and ambidirectional. At the initiation of a prospective cohort study, participants are grouped on the basis of past or current exposure and are followed into the future to observe the outcomes of interest. When the study commences, the outcomes have not yet developed, and the investigator must wait for them to occur. In a retrospective cohort study, both the exposures and outcomes have already occurred when the study begins. Thus, this type of investigation studies only prior and not future outcomes. An ambidirectional cohort study has both prospective and retrospective components.


Question: Choose the correct option.

In a retrospective cohort study

  • (A) past and current exposures, as well as outcome are not known
  • (B) only outcome and future exposure are known
  • (C) exposure and outcome are known
  • (D) exposures are known but not outcome

Answer

C: exposure and outcome are known


Selection of the exposed population

The choice of the exposed group in a cohort study depends on the hypothesis being tested; the exposure frequency; and feasibility considerations, such as the availability of records and ease of follow-up. Special cohorts are used to study the health effects of rare exposures, such as uncommon workplace chemicals, unusual diets, and uncommon lifestyles. General cohorts are typically assembled for common exposures, such as cigarette smoking and alcohol consumption. These cohorts are often selected from professional groups, such as nurses, or from well-defined geographic areas to facilitate follow-up.

Selection of comparison group

There are three sources for the comparison group in a cohort study: an internal comparison group, the general population, and a comparison cohort. An internal comparison group consists of unexposed members of the same cohort. An internal comparison group should be used whenever possible because its characteristics will be the most similar to the exposed group. The general population is used for comparison when it is not possible to find a comparable internal comparison group. The general population comparison is based on preexisting population data on disease incidence and mortality. A comparison cohort consists of members of another cohort. It is the least desirable option because the comparison cohort, although not exposed to the exposure under study, is often exposed to other potentially harmful substances and therefore the results can be difficult to interpret.

Sources of information in cohort studies

Cohort study investigators typically rely on many sources for information on exposures, outcomes, and other key variables. They include medical and employment records, interviews, direct physical examinations, laboratory tests, biological specimens, and environmental monitoring. Some of these sources are preexisting, and others are designed specifically for the study. Because each type of source has advantages and disadvantages, investigators often use several sources to piece together all the necessary information.

Approaches to follow-up

Loss to follow-up occurs when a participant either no longer wishes to take part in the study or cannot be located. Because high rates of follow-up are critical to the success of a cohort study, investigators have developed many methods to maximize retention and trace study participants. For prospective cohort studies, strategies include collection of information (such as full name, Social Security number, and date of birth) that helps locate participants as the study progresses. In addition, regular contact is recommended for participants in prospective studies.

When participants are truly lost to follow-up, investigators employ a number of strategies. They include sending letters to the last known address with “Address Correction Requested”; checking telephone directories, directory assistance, and Internet resources, such as whitepages.com; searching vital statistics records, driver’s license rosters, and voter registration records; and contacting relatives, friends, and physicians identified at baseline.

Analysis

The primary objective of analyzing cohort study data is to compare the occurrence of symptoms, disease, and death in the exposed and unexposed groups. If it is not possible to find a completely unexposed group to serve as the comparison, then the least exposed group is used.
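A minimal sketch of this comparison, assuming a closed cohort with complete follow-up and a binary outcome, is shown below. It computes the cumulative incidence in each group, then the relative risk (their ratio) and the attributable risk (their difference). The counts and the function name are hypothetical.

```python
def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of an initially disease-free group that develops the outcome."""
    return new_cases / population_at_risk


# Hypothetical cohort followed over the same period in both groups.
incidence_exposed = cumulative_incidence(new_cases=40, population_at_risk=200)
incidence_unexposed = cumulative_incidence(new_cases=20, population_at_risk=200)

relative_risk = incidence_exposed / incidence_unexposed      # ratio measure
attributable_risk = incidence_exposed - incidence_unexposed  # difference measure

print(f"Incidence among exposed:   {incidence_exposed:.2f}")
print(f"Incidence among unexposed: {incidence_unexposed:.2f}")
print(f"Relative risk:             {relative_risk:.2f}")
print(f"Attributable risk:         {attributable_risk:.2f}")
```

A relative risk above 1 suggests that the outcome occurs more often among the exposed, and a value below 1 suggests that it occurs less often. Studies with unequal follow-up time would usually work with incidence rates based on person-time instead.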

Strengths

  • Establishes time sequence of events
  • Several outcomes can be assessed
  • Allows assessment of incidence and natural history of disease
  • Yields incidence, relative risk, and attributable risk

Limitations

  • Large samples often required
  • May not be feasible in terms of time and money
  • Not feasible with rare outcomes
  • Potential bias caused by loss to follow-up

2.2. Case-Control Studies

The case–control study has traditionally been viewed as an inferior alternative to the cohort study. In the traditional view, subjects are selected on the basis of whether they have or do not have the disease. An individual who has the disease is termed a case, and someone who does not have the disease is termed a control. The exposure histories of cases and controls are then obtained and compared. Thus, the central feature of the traditional view is the comparison of the exposure histories of the cases and controls. This differs from the logic of experimental and cohort study designs in which the key comparison is disease incidence between the exposed and unexposed (or least exposed) groups. More specifically, a case–control study is a method of sampling a population in which researchers identify and enroll cases of disease and a sample of the source population that gave rise to the cases.

Selection of Cases and controls

The first step in the selection of cases for a case–control study is the formulation of a disease or case definition. A case definition is usually based on a combination of signs and symptoms, physical and pathological examinations, and results of diagnostic tests. Once investigators have created a case definition, they can begin case identification and enrollment. Typical sources for identifying cases are hospital or clinic patient rosters; death certificates; special surveys; and reporting systems, such as cancer or birth defects registries. Another important issue in selecting cases is whether they should be incident or prevalent. Researchers who study the causes of disease prefer incident cases because they are usually interested in the factors that lead to developing a disease rather than factors that affect its duration.

Controls are a sample of the population that produced the cases. The guiding principle for the valid selection of controls is that they come from the same base population as the cases. If this condition is met, then a member of the control group who gets the disease under study would end up as a case in the study. This concept is known as “the would criterion,” and its fulfillment is crucial to the validity of a case–control study. Another important principle is that controls must be sampled independently of exposure status. In other words, exposed and unexposed controls should have the same probability of selection. Epidemiologists use several sources for identifying controls in case–control studies. They may sample (1) individuals from the general population, (2) individuals attending a hospital or clinic, (3) friends or relatives identified by the cases, or (4) individuals who have died.

Analysis

Recall that controls are a sample of the population that produced the cases. However, in most instances, the sampling fraction is not known; therefore, the investigator cannot fill in the total population in the margin of a two-by-two table or obtain the rates and risks of disease. Instead, the researcher obtains a number called an odds, which functions as a rate or risk. In a case–control study, epidemiologists typically calculate the odds of being a case among the exposed (\(\frac{a}{b}\)) compared to the odds of being a case among the nonexposed (\(\frac{c}{d}\)), where \(a\) and \(b\) are the numbers of exposed cases and exposed controls, and \(c\) and \(d\) are the numbers of nonexposed cases and nonexposed controls. The ratio of these two odds is expressed as follows: \(\frac{\frac{a}{b}}{\frac{c}{d}} = \frac{ad}{bc}\)

This ratio, known as the disease odds ratio, provides an estimate of the relative risk just as the incidence rate ratio and cumulative incidence ratio do. Refer to Measuring and comparing disease frequencies for more on this.
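The calculation can be sketched with a hypothetical two-by-two table, taking a and b as the numbers of exposed cases and exposed controls, and c and d as the numbers of nonexposed cases and nonexposed controls. The counts and the function name below are made up for illustration.

```python
def odds_ratio(a, b, c, d):
    """Disease odds ratio from a two-by-two table.

    a: exposed cases       b: exposed controls
    c: nonexposed cases    d: nonexposed controls
    """
    odds_exposed = a / b      # odds of being a case among the exposed
    odds_nonexposed = c / d   # odds of being a case among the nonexposed
    return odds_exposed / odds_nonexposed


# Hypothetical case-control study with 100 cases and 100 controls.
a, b, c, d = 40, 20, 60, 80
result = odds_ratio(a, b, c, d)
cross_product = (a * d) / (b * c)  # the equivalent cross-product form ad/bc

print(f"Odds ratio: {result:.2f}")
print(f"Cross-product form: {cross_product:.2f}")
```

An odds ratio above 1 points to a positive association between exposure and disease, a value of 1 to no association, and a value below 1 to a protective association. In practice the estimate would be reported with a confidence interval, which this sketch leaves out.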

Strengths

  • Effective for rare outcomes
  • Compared with a cohort study, it requires less time and money
  • Yields the odds ratio

Limitations

  • Limited to one outcome condition
  • Does not provide incidence, relative risk, or natural history
  • Less effective than a cohort study at establishing time sequence of events
  • Potential recall and interviewer bias


Question:

In a case-control study, a highly protective locus is said to have

  • (A) OR ≪ 1
  • (B) OR = 1
  • (C) OR < 1
  • (D) OR > 1

Answer

A: OR ≪ 1


2.3 Cross-sectional studies

A cross-sectional study “examines the relationship between diseases (or other health-related characteristics) and other variables of interest as they exist in a defined population at one particular time.” Unlike populations studied in cohort and case–control studies, cross-sectional study populations are commonly selected without regard to exposure or disease status. Cross-sectional studies typically take a snapshot of a population at a single point in time and therefore usually measure the disease prevalence in relation to the exposure prevalence. In other words, current disease status is usually examined in relation to current exposure level.
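As an illustration of examining disease prevalence in relation to exposure, the sketch below uses a hypothetical survey in which each respondent reports current exposure and current disease status at the same visit. The data, the function name, and the use of a prevalence ratio as the summary measure are assumptions made for this example.

```python
def prevalence(records, exposed):
    """Proportion with disease among respondents with the given exposure status."""
    group = [diseased for is_exposed, diseased in records if is_exposed == exposed]
    return sum(group) / len(group)


# Hypothetical survey records: (currently exposed, currently has disease).
survey = (
    [(True, True)] * 15 + [(True, False)] * 85
    + [(False, True)] * 5 + [(False, False)] * 95
)

prevalence_exposed = prevalence(survey, exposed=True)     # 15 of 100 respondents
prevalence_unexposed = prevalence(survey, exposed=False)  # 5 of 100 respondents
prevalence_ratio = prevalence_exposed / prevalence_unexposed

print(f"Prevalence among exposed:   {prevalence_exposed:.2f}")
print(f"Prevalence among unexposed: {prevalence_unexposed:.2f}")
print(f"Prevalence ratio:           {prevalence_ratio:.2f}")
```

Because exposure and disease are measured at the same time, a higher prevalence among the exposed does not by itself show that the exposure came first.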

Strengths

  • Control over study population
  • Control over measurements
  • Several associations between variables can be studied at the same time
  • Short time period required
  • Complete data collection
  • Produces prevalence

Limitations

  • No data on the time relationship between exposure and outcome development
  • Not feasible with rare exposures or outcomes
  • Does not yield incidence or relative risk
  • No causal relationship can be established

When is it desirable to use a particular study design?

The goal of every epidemiological study is to gather correct and sharply defined data on the relationship between an exposure and a health-related state or an event in a population. The three main study designs represent different ways of gathering this information.

Figure 2. Decision tree for choosing among study designs

Epidemiology study types

Adapted from 1


Quiz

  1. What is the hallmark that distinguishes experimental from observational studies?

  2. Name one example of a trial on SCD in Africa.

  3. What is the treatment in this trial?

  4. In a cross-sectional study, it is possible to tell whether exposure came before disease when exposure is not a changeable characteristic.

    • (A) True
    • (B) False

Answers

  1. Active manipulation of a therapeutic agent (treatment) by an investigator in experimental studies but not observational studies.
  2. REACH/NOHARM trial
  3. Hydroxyurea (HU)
  4. (A) True

Take home

  • Experimental study: active manipulation of a therapeutic agent by an investigator.
  • Observational study: nature takes its course.
  • Cross-sectional studies: most governmental public health surveys.
  • Ecological studies: comparisons are made among populations (groups) rather than individuals.

References

  1. Ann Aschengrau and George R. Seage III, Essentials of epidemiology in public health, Fourth edition (2020, Jones & Bartlett Learning)

  2. Agrawal, R. K., Patel, R. K., Shah, V., Nainiwal, L., & Trivedi, B. (2014). Hydroxyurea in sickle cell disease: drug review. Indian journal of hematology & blood transfusion : an official journal of Indian Society of Hematology and Blood Transfusion, 30(2), 91–96. https://doi.org/10.1007/s12288-013-0261-4 

  3. McGann, P. T., Williams, T. N., Olupot-Olupot, P., Tomlinson, G. A., Lane, A., Luís Reis da Fonseca, J., Kitenge, R., Mochamah, G., Wabwire, H., Stuber, S., Howard, T. A., McElhinney, K., Aygun, B., Latham, T., Santos, B., Tshilolo, L., Ware, R. E., & REACH Investigators (2018). Realizing effectiveness across continents with hydroxyurea: Enrollment and baseline characteristics of the multicenter REACH study in Sub-Saharan Africa. American journal of hematology, 93(4), 537–545. https://doi.org/10.1002/ajh.25034 

  4. Opoka, R. O., Ndugwa, C. M., Latham, T. S., Lane, A., Hume, H. A., Kasirye, P., Hodges, J. S., Ware, R. E., & John, C. C. (2017). Novel use Of Hydroxyurea in an African Region with Malaria (NOHARM): a trial for children with sickle cell anemia. Blood, 130(24), 2585–2593. https://doi.org/10.1182/blood-2017-06-788935 

  5. Wang, W. C., Ware, R. E., Miller, S. T., Iyer, R. V., Casella, J. F., Minniti, C. P., Rana, S., Thornburg, C. D., Rogers, Z. R., Kalpatthi, R. V., Barredo, J. C., Brown, R. C., Sarnaik, S. A., Howard, T. H., Wynn, L. W., Kutlar, A., Armstrong, F. D., Files, B. A., Goldsmith, J. C., Waclawiw, M. A., … BABY HUG investigators (2011). Hydroxycarbamide in very young children with sickle-cell anaemia: a multicentre, randomised, controlled trial (BABY HUG). Lancet (London, England), 377(9778), 1663–1672. https://doi.org/10.1016/S0140-6736(11)60355-3 

  6. Hankins, J. S., Ware, R. E., Rogers, Z. R., Wynn, L. W., Lane, P. A., Scott, J. P., & Wang, W. C. (2005). Long-term hydroxyurea therapy for infants with sickle cell anemia: the HUSOFT extension study. Blood, 106(7), 2269–2275. https://doi.org/10.1182/Blood-2004-12-4973 

Key Points

  • In experimental studies, an investigator actively manipulates a therapeutic agent.

  • Informed consent is critical for experimental studies.

  • In cohort studies, the starting point is exposure and the endpoint is outcome.

  • In case-control studies, the starting point is outcome and endpoint is exposure.

  • Case definition is an important aspect of case-control studies.

  • Cross-sectional studies take a snapshot of disease prevalence in relation to exposure prevalence at a particular time.

  • In ecological studies, the units of analysis are populations rather than individuals.