Types of Bias in Epidemiology

Bias is any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth.
Bias is a major consideration in any type of epidemiologic study design.
It has been defined as “any systematic error in the design, conduct or analysis of a study that results in a mistaken estimate of an exposure’s effect on the risk of disease.”
Bias results from systematic errors in the research methodology.
The effect of bias will be an estimate either above or below the true value, depending on the direction of the systematic error.
The magnitude of the bias is generally difficult to quantify, and limited scope exists for the adjustment of most forms of bias at the analysis stage. As a result, careful consideration and control of the ways in which bias may be introduced during the design and conduct of the study are essential in order to limit the effects on the validity of the study results.

Common Types of Bias

Types of bias include selection bias, detection bias, information (observation) bias, misclassification, and recall bias.

Selection bias can result when the selection of subjects into a study or their likelihood of being retained in the study leads to a result that is different from what you would have gotten if you had enrolled the entire target population.
If one enrolled the entire population and collected accurate data on exposure and outcome, one could compute the true measure of association. In practice, studies enroll samples rather than the entire population, and this creates the opportunity for selection bias.
Selection bias occurs when there is a systematic difference between either:
- Those who participate in the study and those who do not (affecting generalizability) or
- Those in the treatment arm of a study and those in the control group (affecting comparability between groups).
That is, there are differences in the characteristics between study groups, and those characteristics are related to either the exposure or outcome under investigation. Selection bias can occur for a number of reasons.
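As a rough illustration of how selective participation can distort a measure of association, the sketch below applies participation probabilities to a hypothetical population in which exposure and disease are unrelated (true odds ratio of 1). All counts, probabilities, and function names are assumptions made up for this example; the calculation uses expected counts rather than random sampling so the result is reproducible.

```python
def odds_ratio(table):
    """Odds ratio from a 2x2 table given as a dict of four cell counts."""
    return (table["exposed_case"] * table["unexposed_noncase"]) / (
        table["unexposed_case"] * table["exposed_noncase"]
    )


def expected_sample(population, participation):
    """Expected number of participants per cell, given participation probabilities."""
    return {cell: population[cell] * participation[cell] for cell in population}


# Hypothetical population: disease risk is 10% in both exposure groups.
population = {
    "exposed_case": 100,
    "unexposed_case": 100,
    "exposed_noncase": 900,
    "unexposed_noncase": 900,
}

# Scenario 1 (assumed): everyone is equally likely to take part.
uniform = {cell: 0.5 for cell in population}

# Scenario 2 (assumed): exposed people with the disease are more likely to take part.
selective = dict(uniform, exposed_case=0.9)

true_or = odds_ratio(population)
uniform_or = odds_ratio(expected_sample(population, uniform))
selective_or = odds_ratio(expected_sample(population, selective))

print(f"True OR in the population:        {true_or:.2f}")
print(f"OR with uniform participation:    {uniform_or:.2f}")
print(f"OR with selective participation:  {selective_or:.2f}")
```

Under these assumed numbers, uniform participation leaves the odds ratio unchanged, while over-representation of exposed cases produces an apparent association where none exists.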

Detection bias occurs where the way in which outcome information is collected differs between groups.
A test or treatment for a disease may perform differently according to some characteristic of the study participant, which itself may influence the likelihood of disease detection or the effectiveness of the treatment.
Detection bias can occur in trials when groups differ in the way outcome information is collected or the way outcomes are verified.

Information bias results from systematic differences in the way data on exposure or outcome are obtained from the various study groups.
This may mean that individuals are assigned to the wrong outcome category, leading to an incorrect estimate of the association between exposure and outcome.
Information bias occurs when information is collected differently between two groups, leading to an error in the conclusion of the association.
Observer bias may be a result of the investigator’s prior knowledge of the hypothesis under investigation or knowledge of an individual’s exposure or disease status.
Such knowledge may influence the way information is collected, measured, or interpreted by the investigator for each of the study groups.
Interviewer bias occurs where an interviewer asks leading questions that may systematically influence the responses given by interviewees.

Misclassification refers to the classification of an individual, a value or an attribute into a category other than that to which it should be assigned.
The misclassification of exposure or disease status can be considered as either differential or non-differential.

a) Non-differential (random) misclassification

This exists when misclassifications of disease status or exposure occur with equal probability in all study participants, regardless of the groups being compared.
That is, the probability of exposure being misclassified is independent of disease status, and the probability of disease status being misclassified is independent of exposure status.
Non-differential misclassification increases the similarity between the exposed and non-exposed groups and may result in an underestimate (dilution) of the true strength of an association between exposure and disease.
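The dilution effect can be shown with a small worked calculation. The sketch below takes a hypothetical case-control table with a true odds ratio of 4 and applies the same exposure-measurement error (sensitivity 0.8, specificity 0.9) to cases and controls alike. The counts, the error rates, and the helper names are illustrative assumptions, and expected counts are used in place of random draws.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after imperfect exposure measurement.

    sensitivity: probability that a truly exposed person is recorded as exposed.
    specificity: probability that a truly unexposed person is recorded as unexposed.
    """
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed


# Hypothetical true exposure status among 300 cases and 300 controls.
true_cases = (200, 100)      # (exposed, unexposed)
true_controls = (100, 200)   # (exposed, unexposed)

# Non-differential: the same error rates apply regardless of disease status.
SENSITIVITY, SPECIFICITY = 0.8, 0.9
obs_cases = misclassify_exposure(*true_cases, SENSITIVITY, SPECIFICITY)
obs_controls = misclassify_exposure(*true_controls, SENSITIVITY, SPECIFICITY)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"True OR:     {true_or:.2f}")
print(f"Observed OR: {observed_or:.2f}")
```

With these assumed values the observed odds ratio falls between 1 and the true value of 4, which is the underestimate described above.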

b) Differential (non-random) misclassification

This occurs when the proportion of subjects being misclassified differs between the study groups. That is, the probability of exposure being misclassified is dependent on disease status, or the probability of disease status being misclassified is dependent on exposure status.
This type of error is considered a more serious problem because it may result in an under- or overestimation of the true association.

In a case-control study, data on exposure are collected retrospectively. The quality of the data is therefore determined to a large extent by the patient’s ability to accurately recall past exposures.
Recall bias may occur when the information provided on exposure differs between the cases and controls.
For example, an individual with the outcome under investigation (case) may report their exposure experience differently than an individual without the outcome (control).
Recall bias may result in either an underestimate or overestimate of the association between exposure and outcome.
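To see how differing recall can push the estimate in either direction, the sketch below starts from a hypothetical case-control table with no true association (odds ratio of 1) and lets cases and controls report true exposures with different accuracy. The counts, recall probabilities, and function names are assumptions chosen for illustration only; unexposed participants are assumed never to report a false exposure.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def reported_exposure(exposed, unexposed, recall_probability):
    """Expected (exposed, unexposed) counts when only a fraction of true exposures is recalled."""
    recalled = exposed * recall_probability
    forgotten = exposed - recalled
    return recalled, unexposed + forgotten


def observed_or(case_recall, control_recall, cases=(100, 200), controls=(100, 200)):
    """Observed odds ratio given recall probabilities for cases and controls.

    The default tables are hypothetical and have a true odds ratio of 1.
    """
    obs_cases = reported_exposure(*cases, case_recall)
    obs_controls = reported_exposure(*controls, control_recall)
    return odds_ratio(*obs_cases, *obs_controls)


# Assumed scenario A: cases search their memory harder than controls do.
or_cases_recall_better = observed_or(case_recall=0.95, control_recall=0.70)

# Assumed scenario B: cases under-report, for example because the exposure is stigmatized.
or_cases_recall_worse = observed_or(case_recall=0.70, control_recall=0.95)

# Reference: equal recall in both groups leaves the null association intact.
or_equal_recall = observed_or(case_recall=0.80, control_recall=0.80)

print(f"Cases recall better: OR = {or_cases_recall_better:.2f}")
print(f"Cases recall worse:  OR = {or_cases_recall_worse:.2f}")
print(f"Equal recall:        OR = {or_equal_recall:.2f}")
```

Under these assumptions, better recall among cases yields an odds ratio above 1 and poorer recall among cases yields one below 1, even though the true odds ratio is 1 in both scenarios.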

Gordis, L. (2014). Epidemiology (Fifth edition.). Philadelphia, PA: Elsevier Saunders.
White, F., Stallones, L., & Last, J. M. (2013). Global public health: Ecological foundations. New York, NY: Oxford University Press.
https://www.jclinepi.com/article/S0895-4356(05)00340-9/abstract
https://www.healthknowledge.org.uk/public-health-textbook/research-methods/1a-epidemiology/biases
Park, K. (n.d.). Park’s textbook of preventive and social medicine.
http://ocw.jhsph.edu/courses/FundEpiII/PDFs/Lecture18.pdf
http://www.ump.edu.pl/files/8_483_errors_in_epidemiological_studies.pdf
https://jech.bmj.com/content/58/8/635