Bias, Confounding and Interaction in Epidemiology

In an epidemiological study, the results ideally reflect the real association between an exposure and the development of an outcome. However, the findings could also have an alternative explanation arising from random error (chance), bias, or confounding. These can lead the researcher to false conclusions, such as finding a statistical association where none exists, or vice versa. Chance, bias, and confounding are especially prevalent in observational study designs. It is therefore essential to consider them during the design and analysis phases to reduce their impact on an epidemiological study.

Errors due to bias can occur at multiple points in an epidemiological investigation and affect both the internal and external validity of the results. Bias, confounding variables, and interaction between variables also influence whether an association or causal relationship can be established and how large it is estimated to be.

In this context, researchers, epidemiologists, and public health personnel should stay alert to bias and work to reduce or avoid it, to ensure the reliability and validity of research findings.

What is Bias in Epidemiology?

In observational epidemiological studies, bias refers to systematic error in study design, data collection, or analysis that distorts the estimated effect of an exposure on an outcome of interest. Unlike random error, it shifts results consistently in one direction and so leads to incorrect measures of association and interaction.

Types of Bias

More than fifty types of bias have been reported in epidemiological studies, arising at every stage from the inception of an investigation to the reporting of its results. The most common forms are described below:

Selection Bias

Selection bias is a systematic error that occurs during the selection, identification, or screening of the study population according to exposure and health outcome. It can lead to false conclusions about the research hypothesis and undermines external validity: the results may not apply to other populations and will not accurately represent the true relationship.

This type of bias is introduced when some characteristics of the people included in the study differ from those of the population to which the study’s findings are intended to be applied, and those characteristics are associated with the outcome or the exposure under study. All types of selection bias share one feature: the relationship between exposure and outcome differs between the participants included in the research and those who were eligible but did not take part.

In case-control studies: Selection bias is a common problem in case-control studies, where it results in non-comparability between cases and controls. Controls are meant to represent the same population as the cases, so errors arise when the population selected for controls does not truly represent the population that produced the cases.

In cohort studies: The exposed and unexposed groups are selected before the outcome of interest has developed, so selection bias is less likely to occur. Nonetheless, bias may arise if follow-up or the identification of cases varies across exposure categories.

In randomized trials: Participants are randomly allocated to the study groups, so in theory selection bias is less of a problem. However, withdrawals and refusals can influence the results if the reasons for them are related to the exposure or the outcome.
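To make the case-control situation concrete, here is a minimal sketch using entirely hypothetical counts (not taken from any real study). It compares the odds ratio obtained when controls are sampled independently of exposure with the one obtained when exposed non-cases are more likely to be recruited. The function names and sampling fractions are illustrative assumptions.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio from the four cells of a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)


def sample_controls(non_cases, frac_exposed, frac_unexposed):
    """Expected number of controls when sampling fractions may depend on exposure."""
    return {
        "exposed": non_cases["exposed"] * frac_exposed,
        "unexposed": non_cases["unexposed"] * frac_unexposed,
    }


# Hypothetical source population (illustrative numbers only).
cases = {"exposed": 200, "unexposed": 100}
non_cases = {"exposed": 2000, "unexposed": 2000}

# Controls sampled independently of exposure: they represent the source population.
fair = sample_controls(non_cases, 0.10, 0.10)
or_fair = odds_ratio(cases["exposed"], cases["unexposed"],
                     fair["exposed"], fair["unexposed"])

# Assumed flaw: exposed non-cases are twice as likely to be recruited as controls.
skewed = sample_controls(non_cases, 0.10, 0.05)
or_skewed = odds_ratio(cases["exposed"], cases["unexposed"],
                       skewed["exposed"], skewed["unexposed"])

print(f"OR with representative controls: {or_fair:.2f}")
print(f"OR with exposure-dependent control selection: {or_skewed:.2f}")
```

With these assumed numbers, an odds ratio of 2 in the source population shrinks to 1 purely because of how the controls were recruited.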

Information bias

Information bias, also called measurement bias, is error that arises during data collection. It distorts the measured effect through inaccurate measurement or classification of key variables such as exposure, outcome, or confounders.

Misclassification bias

Misclassification bias occurs when people are assigned to the wrong category with respect to their exposure or outcome. For instance, exposed subjects could be classified as unexposed and vice versa, which reflects imperfect sensitivity and specificity of the method used to detect exposure or outcome. It can also result from missing data or random errors in data entry. The types of misclassification bias are:

Differential misclassification bias

Differential misclassification bias results when the extent of misclassification differs between the groups being compared. Here, the misclassification of one variable (exposure or outcome) is related to the participant's status on the other (outcome or exposure).

Non-differential misclassification bias

Non-differential misclassification bias results when the extent of misclassification is the same in the groups being compared. Here, the misclassification of one variable (exposure or outcome) is unrelated to the participant's status on the other (outcome or exposure).
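The difference between the two types can be illustrated with a small calculation. The sketch below assumes a hypothetical case-control table and assumed sensitivity and specificity values for exposure measurement; none of the numbers come from the article's sources. It shows non-differential misclassification of a binary exposure pulling the odds ratio towards the null, and one differential scenario (cases over-reporting exposure) pushing it away from the null. Differential misclassification can bias in either direction, so the second scenario is only one possibility.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts under imperfect exposure measurement."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio from the four cells of a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)


# Hypothetical true exposure counts as (exposed, unexposed); illustrative only.
true_cases = (200, 100)
true_controls = (100, 200)
or_true = odds_ratio(*true_cases, *true_controls)

# Non-differential: the same (assumed) sensitivity and specificity in both groups.
nd_cases = misclassify(*true_cases, sensitivity=0.8, specificity=0.9)
nd_controls = misclassify(*true_controls, sensitivity=0.8, specificity=0.9)
or_nondiff = odds_ratio(*nd_cases, *nd_controls)

# Differential: cases over-report exposure, controls under-report it (assumed).
d_cases = misclassify(*true_cases, sensitivity=1.0, specificity=0.8)
d_controls = misclassify(*true_controls, sensitivity=0.8, specificity=1.0)
or_diff = odds_ratio(*d_cases, *d_controls)

print(f"True OR:                         {or_true:.2f}")
print(f"Non-differential misclassified:  {or_nondiff:.2f}")
print(f"Differential misclassified:      {or_diff:.2f}")
```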

Detection Bias

Detection bias is more likely to occur in studies with follow-up, such as cohort studies and clinical trials. In trials, it occurs when the method of collecting outcome information or the process of verifying outcomes varies between groups. It can lead to either overestimation or underestimation of the size of the effect. For example, obese men tend to have larger prostates, which makes prostate cancer harder to detect accurately by biopsy. As a result, the true association between obesity and the risk of prostate cancer may be underestimated.

Interviewer or Observer Bias

Several factors contribute to interviewer or observer bias. It arises mainly when exposure history is not examined equally thoroughly in cases and controls, or when outcomes are not measured equally carefully in exposed and unexposed groups. Knowledge of the study hypothesis, the medium of the interview, emphasis on one question over another, and awareness of a participant's exposure, outcome, or intervention status can all affect how data are recorded. For example, when one group of diabetic adults is given a new drug treatment and another group is given the usual drug treatment, bias occurs if the investigator or observer evaluates the group receiving the new drug more carefully than the other group.

Recall Bias

Recall bias is a type of information bias that commonly occurs in case-control studies. It can arise when recollection of possible causes is influenced by the presence of the disease, when reporting of outcomes is influenced by knowledge of the exposure, or when individuals in a trial already know which treatment they are receiving. Recall bias stems from differences in the accuracy of recall between cases and non-cases or between exposed and unexposed groups. Individuals who have experienced a health problem may recall past exposures more thoroughly because of their health concerns. Meanwhile, individuals who know they were exposed to a certain factor may be more likely to report symptoms or health outcomes, either accurately or with exaggeration, because of their awareness of the exposure.

Reporting Bias

Reporting bias results when participants' answers are influenced by the researcher in order to obtain desired results, or when sensitive questions about stigmatized diseases, socially undesirable behaviors, or family information affect the way participants answer and react.

What is Confounding in Epidemiology?

Confounding is a distortion in which the observed measure of association between an exposure and an outcome is altered by the presence of a third, extraneous variable, called a confounder. A confounder can cause substantial distortion and may even change the direction of the effect. Confounding is of two types: positive confounding, when the observed association is biased away from the null, and negative confounding, when it is biased towards the null.

Confounding Variable

A confounder, or confounding variable, is a factor associated with both the dependent variable (the disease) and the independent variable (the factor being studied). It affects the risk of disease and distorts the apparent effects of other variables on the disease under study. When such a variable is present and unaccounted for, the study fails to show the true association between exposure and outcome, exaggerating or diluting the actual relationship between the variables under study. Factors such as age, gender, lifestyle, socioeconomic status, and ethnic group that have direct causal links with the health outcome are potential confounders. However, a factor is considered a confounder only when it meets the following three criteria:

  1. It should not be a consequence of the exposure that in turn leads to the disease, i.e., it does not lie on the causal pathway.
  2. It should be associated with both the dependent and independent variables, i.e., it either causes the exposure or is related to it without causing it, and it must be a cause of the outcome.
  3. The distribution and influence of the variable should be unequal in the two comparison groups.

For example, a study testing the hypothesis that coffee drinkers are more prone to heart disease than people who don't drink coffee could be influenced by a third factor, smoking. Coffee drinkers may be more likely to smoke than non-coffee drinkers, so smoking is a confounder that distorts the association between coffee drinking and heart disease. Heart disease might therefore be an outcome of smoking rather than of coffee.
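The coffee and smoking example can be sketched numerically. The counts below are hypothetical and chosen so that coffee has no effect on heart disease within either smoking group, while smokers both drink more coffee and have a higher risk of disease. The names `strata` and `risk_ratio` are illustrative.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical cohort: (heart disease cases, group size) by coffee habit,
# within each smoking stratum. Illustrative numbers only.
strata = {
    "smokers": {"coffee": (160, 800), "no_coffee": (40, 200)},
    "non_smokers": {"coffee": (10, 200), "no_coffee": (40, 800)},
}

# Association between coffee and heart disease within each smoking stratum.
stratum_rr = {
    name: risk_ratio(*groups["coffee"], *groups["no_coffee"])
    for name, groups in strata.items()
}

# Crude association: collapse the strata and ignore smoking altogether.
coffee_cases = sum(g["coffee"][0] for g in strata.values())
coffee_total = sum(g["coffee"][1] for g in strata.values())
no_coffee_cases = sum(g["no_coffee"][0] for g in strata.values())
no_coffee_total = sum(g["no_coffee"][1] for g in strata.values())
crude_rr = risk_ratio(coffee_cases, coffee_total, no_coffee_cases, no_coffee_total)

for name, rr in stratum_rr.items():
    print(f"RR among {name}: {rr:.2f}")
print(f"Crude RR ignoring smoking: {crude_rr:.3f}")
```

In this constructed example the crude risk ratio is about 2.1 even though the risk ratio is 1 in both smokers and non-smokers, which is the pattern expected when smoking confounds the association (positive confounding, away from the null).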

Effect Modification and Interaction

Unlike bias and confounding, effect modification is a biological phenomenon that reflects a true causal effect: one exposure variable modifies the impact of another exposure variable on a specific outcome. When effect modification is present, different population groups have different risk estimates. The terms effect modification and interaction are similar and are often used interchangeably, although they are sometimes defined as distinct concepts. Interaction is a statistical phenomenon that occurs when the combined impact of two factors is larger than expected from their individual effects. It is seen when a third variable changes the magnitude or direction of an association between two other variables. For example, a drug used to treat a viral disease may work in adults but be ineffective in children; here, the effectiveness of the drug is modified by age. Examining the association at each level of the third variable is an efficient way to deal with interaction.
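The drug example can be sketched with hypothetical trial counts. The numbers below are assumptions made for illustration: the drug doubles the chance of recovery in adults and does nothing in children. The sketch examines the association at each level of the third variable (age group) and compares it with a single pooled estimate.

```python
def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Proportion with the event in the exposed divided by that in the unexposed."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


# Hypothetical trial: (recovered, group size) by treatment arm within each age group.
trial = {
    "adults": {"drug": (80, 100), "comparison": (40, 100)},
    "children": {"drug": (40, 100), "comparison": (40, 100)},
}

# Stratum-specific estimates: the association at each level of the modifier.
rr_by_age = {
    age: risk_ratio(*arms["drug"], *arms["comparison"])
    for age, arms in trial.items()
}

# Pooled estimate that ignores age.
drug_recovered = sum(arms["drug"][0] for arms in trial.values())
drug_total = sum(arms["drug"][1] for arms in trial.values())
comp_recovered = sum(arms["comparison"][0] for arms in trial.values())
comp_total = sum(arms["comparison"][1] for arms in trial.values())
pooled_rr = risk_ratio(drug_recovered, drug_total, comp_recovered, comp_total)

for age, rr in rr_by_age.items():
    print(f"Recovery ratio in {age}: {rr:.2f}")
print(f"Pooled recovery ratio: {pooled_rr:.2f}")
```

Here the stratum-specific estimates differ from each other (2.0 in adults, 1.0 in children), so a single pooled figure of 1.5 describes neither group. With effect modification the sensible report is usually the separate estimates, whereas with confounding the stratum estimates agree with each other and differ from the crude one.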

Strategies to minimize bias and confounding

Errors and bias lead to inaccurate measures of association and are common in epidemiological studies. Their effects on internal and external validity can render a study useless, or usable only with great caution. To preserve the reliability and accuracy of a study, strategies to minimize bias and confounding should be implemented. Ways to minimize bias include:

  • Developing well-standardized protocols that are carried out by trained interviewers and researchers,
  • Using standard questionnaires with appropriate close-ended questions and specific response options, and keeping the level of questioning consistent for both comparison groups,
  • Verifying the data obtained by checking them against pre-existing documentation and records or by evaluating biomarkers,
  • Conducting pilot studies to identify problems in questionnaires and other measurement tools,
  • Estimating the likelihood of misclassification to gauge how much bias may have occurred.

Similarly, there are several methods to reduce confounding at both the design phase and data analysis phase of the study. The methods used at the design stage are briefly described below: 

  • Randomization: The ideal method in clinical trials, randomization allocates participants to groups at random so that potential confounders tend to be equally distributed across the groups.
  • Restriction: Participation in the study is restricted to individuals who are alike with respect to the confounding factor.
  • Matching: The controls are selected so that the distribution of potential confounders among them is similar to that among the cases. It is done either by pair matching or by frequency matching.
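Of the design-stage methods above, randomization is the simplest to illustrate in code. The following is a minimal sketch, assuming a hypothetical list of participants of whom 30% are smokers. The helper names (`randomize`, `prevalence`) and the fixed seed are assumptions for illustration. It shows that random allocation tends to give both arms a similar share of a potential confounder without that confounder being used in the allocation.

```python
import random


def randomize(participants, seed):
    """Shuffle participants and split them into two equal-sized arms."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def prevalence(group, key):
    """Proportion of a group for which the given boolean attribute is true."""
    return sum(1 for person in group if person[key]) / len(group)


# Hypothetical participants: 30% are smokers (a potential confounder).
participants = [{"id": i, "smoker": i % 10 < 3} for i in range(10_000)]

treatment, control = randomize(participants, seed=42)
difference = abs(prevalence(treatment, "smoker") - prevalence(control, "smoker"))

print(f"Smokers in treatment arm: {prevalence(treatment, 'smoker'):.3f}")
print(f"Smokers in control arm:   {prevalence(control, 'smoker'):.3f}")
print(f"Absolute difference:      {difference:.3f}")
```

The balance is a tendency rather than a guarantee: in small trials, chance imbalances between arms can still be appreciable.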

The methods used at the analysis phase are briefly described below: 

  • Stratification: It involves evaluating the association between exposure and outcome within different levels of the confounder, such as age or gender.
  • Multivariable analysis: It involves statistical modeling to adjust for more than one confounding variable at the same time, followed by evaluation of each confounder and its effects.
  • Standardization: It involves the use of a standard reference population to neutralize the effect of confounders between the study groups.
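Two of the analysis-stage methods can be sketched on hypothetical stratified cohort data. The block below computes a Mantel-Haenszel pooled risk ratio, one common way of combining stratum-specific results after stratification, and directly standardized risks that use the combined study population as the standard. The choice of Mantel-Haenszel, the choice of standard population, and all counts are assumptions for illustration, not methods prescribed by the article.

```python
# Hypothetical strata of a confounder; each entry holds (cases, group size)
# for the exposed and the unexposed. Illustrative numbers only.
strata = [
    {"exposed": (160, 800), "unexposed": (40, 200)},  # e.g. smokers
    {"exposed": (10, 200), "unexposed": (40, 800)},   # e.g. non-smokers
]


def crude_rr(strata):
    """Risk ratio after collapsing all strata together."""
    a = sum(s["exposed"][0] for s in strata)
    n1 = sum(s["exposed"][1] for s in strata)
    c = sum(s["unexposed"][0] for s in strata)
    n0 = sum(s["unexposed"][1] for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        a, n1 = s["exposed"]
        c, n0 = s["unexposed"]
        n = n1 + n0
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


def standardized_risk(strata, group):
    """Risk expected in `group` if it had the stratum distribution of the
    whole study population (the standard chosen for this sketch)."""
    weights = [s["exposed"][1] + s["unexposed"][1] for s in strata]
    risks = [s[group][0] / s[group][1] for s in strata]
    return sum(w * r for w, r in zip(weights, risks)) / sum(weights)


rr_crude = crude_rr(strata)
rr_mh = mantel_haenszel_rr(strata)
rr_std = standardized_risk(strata, "exposed") / standardized_risk(strata, "unexposed")

print(f"Crude RR:            {rr_crude:.3f}")
print(f"Mantel-Haenszel RR:  {rr_mh:.3f}")
print(f"Standardized RR:     {rr_std:.3f}")
```

In this constructed example both adjusted estimates equal 1 while the crude risk ratio is above 2, which suggests that the crude association was produced by the stratifying variable.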

About Author

Dipika Shrestha

Dipika Shrestha is a BSc. Microbiology graduate from St. Xavier's College, Kathmandu. She has a strong grounding in academic research and writing. Over the years, through her involvement in research, she has developed an interest in Epidemiology, Antimicrobial Resistance (AMR), and Public Health. She is passionate about contributing to scientific advancements and leveraging her skills to drive impactful results to build a sustainable community.
