# Error: Types, Sources, and Control

• Most epidemiological studies attempt to establish the presence or absence of a causal relationship, and their results are an estimate of the actual effect or degree of association.
• All studies are subject to error, which can obscure or distort the truth, that is, the size and nature of a causal relationship.
• Understanding common errors and the means to reduce them improves the precision and validity of estimates.

## Types of Error in Epidemiological Studies

There are two basic types of error in epidemiological studies: random error and systematic error.

## A. Random Error

• Random error or chance refers to the fluctuations around a true value.
• The effect of random error may produce an estimate that is different from the true underlying value.
• Note that the effect of random error may result in either an underestimation or overestimation of the true value.
• It can produce type 1 or type 2 errors.
• Type 1: observing a difference when in truth there is none.
• Type 2: failing to observe a difference when there is one.
• Chance is random error that appears to produce an association between an exposure and an outcome.
• Random error does not often severely affect the results of a study: if it is truly random, it is randomly distributed among all groups, that is, the exposed and the non-exposed, and those with and without the outcome of interest.
• Random error occurs because of biologic variation, sampling error, and measurement error.
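
The two error types can be illustrated with a small simulation. The sketch below is not from the cited sources: it assumes a simple pooled two-proportion z-test, and the risks (20% and 30%), the group size of 100, and the function names are hypothetical values chosen only for illustration. It repeats a study many times and counts how often chance alone produces a "significant" difference (type 1) and how often a real difference is missed (type 2).

```python
import math
import random

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided tail area of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

def rejection_rate(risk_exposed, risk_unexposed, n_per_group, n_studies, rng, alpha=0.05):
    """Share of simulated studies that declare a significant difference."""
    rejections = 0
    for _ in range(n_studies):
        cases_exposed = sum(rng.random() < risk_exposed for _ in range(n_per_group))
        cases_unexposed = sum(rng.random() < risk_unexposed for _ in range(n_per_group))
        p = two_proportion_p_value(cases_exposed, n_per_group, cases_unexposed, n_per_group)
        if p < alpha:
            rejections += 1
    return rejections / n_studies

rng = random.Random(42)  # fixed seed so the run is repeatable

# No true difference (20% vs 20%): every rejection is a type 1 error.
type1_rate = rejection_rate(0.20, 0.20, n_per_group=100, n_studies=2000, rng=rng)

# True difference (30% vs 20%): every non-rejection is a type 2 error.
type2_rate = 1 - rejection_rate(0.30, 0.20, n_per_group=100, n_studies=2000, rng=rng)

print(f"type 1 error rate: {type1_rate:.3f}")
print(f"type 2 error rate: {type2_rate:.3f}")
```

Under these assumptions the type 1 rate should sit near the chosen significance level of 0.05, while the type 2 rate depends on the size of the true difference and on the number of subjects per group.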

## Sources of Random Error

1. Biologic Variation: The fluctuation in biological processes in the same individual over time.
2. Sampling Error: The part of the total estimation error caused by random influences on who or what is selected for the study.
3. Measurement Error: The error resulting from random fluctuations in measurement.

## Minimizing Random Errors

• Some random errors can be addressed.
• Sampling error can be reduced by increasing the size of a sample population – the more individuals drawn from a population, the more likely it is that the sample will reflect the true composition of that population. There are statistical calculations that will provide optimum sample sizes.
• Measurement error can be minimized in the planning of the study by ensuring that optimal instruments are used to provide the most accurate measurement of the exposure (for example, alcohol consumption or cigarettes smoked) and the outcomes (for example, disease, injury, state of health or function).
• The resources required to reduce random error are often balanced by the increased precision in the study results.
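
As a rough illustration of the sample size point above, the sketch below uses one common textbook approximation for comparing two proportions; it is not necessarily the calculation the cited sources have in mind. The function names, the example risks of 30% and 20%, the 5% significance level, and the 80% power are all assumptions made for the example.

```python
import math
from statistics import NormalDist

def standard_error_of_proportion(p, n):
    """Standard error of a proportion estimated from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)

# Sampling error shrinks with the square root of the sample size.
for n in (100, 400, 1600):
    print(n, round(standard_error_of_proportion(0.20, n), 4))

# Hypothetical planning question: 30% risk in the exposed vs 20% in the unexposed.
n_per_group = sample_size_two_proportions(0.30, 0.20)
print("subjects needed per group:", n_per_group)
```

Quadrupling the sample only halves the standard error, which is why the gain in precision has to be weighed against the resources required.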

## B. Systematic Error

• Systematic error refers to any difference between the true value and the actual value obtained in the study that is not the result of random error.
• One example is the use of an invalid measure that misclassifies cases in one direction and controls in another.
• It occurs when there is a tendency to produce results that differ in a systematic manner from the true values.
• Systematic error, or bias, is more problematic, as it can significantly affect the validity of a study.
• Error is systematic when it is not randomly distributed between exposed and unexposed subjects in a study.
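
A small arithmetic sketch may help show why this kind of error is not fixed by enrolling more subjects. It works with expected counts in a hypothetical case-control study, so there is no random error at all. The exposure prevalence (40%), the misclassification rates (10%), and all names in the code are assumptions made for illustration, not values from the cited sources.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)

def observed_table(n_per_group, exposure_prevalence, case_over_report, control_under_report):
    """Expected recorded counts when exposure is misclassified differently in cases and controls."""
    exposed = n_per_group * exposure_prevalence
    unexposed = n_per_group - exposed
    # Cases: a fraction of truly unexposed cases are recorded as exposed.
    a = exposed + unexposed * case_over_report
    b = unexposed * (1 - case_over_report)
    # Controls: a fraction of truly exposed controls are recorded as unexposed.
    c = exposed * (1 - control_under_report)
    d = unexposed + exposed * control_under_report
    return a, b, c, d

# Cases and controls share the same true exposure prevalence, so the true odds ratio is 1.
true_or = odds_ratio(*observed_table(100, 0.4, 0.0, 0.0))

# The same misclassification applied to a small and a much larger study.
biased_or_small = odds_ratio(*observed_table(100, 0.4, 0.1, 0.1))
biased_or_large = odds_ratio(*observed_table(10_000, 0.4, 0.1, 0.1))

print(f"true odds ratio:            {true_or:.2f}")
print(f"observed, 100 per group:    {biased_or_small:.2f}")
print(f"observed, 10,000 per group: {biased_or_large:.2f}")
```

In this sketch the observed odds ratio moves away from 1 even though no association exists, and it is the same for both study sizes: a larger sample improves precision but leaves the bias unchanged.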

## Main Sources of Systematic Error

1. Selection bias
• Selection bias can result when the selection of subjects into a study or their likelihood of being retained in the study leads to a result that is different from what you would have gotten if you had enrolled the entire target population.
• If one enrolled the entire population and collected accurate data on exposure and outcome, then one could compute the true measure of association. But we generally don’t enroll the entire population; instead, we take samples, and sampling creates the opportunity for selection bias.
• Selection bias occurs when there is a systematic difference between either:
• Those who participate in the study and those who do not (affecting generalizability) or
• Those in the treatment arm of a study and those in the control group (affecting comparability between groups).
• That is, there are differences in the characteristics between study groups, and those characteristics are related to either the exposure or outcome under investigation. Selection bias can occur for a number of reasons.
2. Information bias
• Information bias results from systematic differences in the way data on exposure or outcome are obtained from the various study groups.
• This may mean that individuals are assigned to the wrong outcome category, leading to an incorrect estimate of the association between exposure and outcome.
• Information bias occurs when information is collected differently between two groups, leading to an error in the conclusion of the association.
• Observer bias may be a result of the investigator’s prior knowledge of the hypothesis under investigation or knowledge of an individual’s exposure or disease status.
• Such knowledge may influence the way information is collected, measured or interpreted by the investigator for each of the study groups.
• Interviewer bias occurs when an interviewer asks leading questions that may systematically influence the responses given by interviewees.
3. Confounding
• The word comes from the Latin confundere meaning to mix together.
• It occurs when an unstudied risk factor is associated with both the study exposure and the outcome, which results in a distortion of the estimated effect of an exposure on an outcome.
• A variable is a confounder if the following conditions are met:
1. It is independently associated with the outcome (i.e. is a risk factor).
2. It is associated with the exposure under study in the source population.
3. It does not lie on the causal pathway between exposure and disease.
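
The distortion caused by a confounder can be seen in a worked example. The counts below are hypothetical and were chosen so that the confounder meets the conditions listed above: it raises the risk of the outcome, it is more common among the exposed, and it is assumed not to lie on the causal pathway. The sketch compares the crude risk ratio with stratum-specific risk ratios and with a Mantel-Haenszel summary risk ratio, a standard stratified estimator that the text above does not itself describe.

```python
import math

# Hypothetical counts per stratum of the confounder:
# (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
strata = {
    "confounder present": (24, 80, 6, 20),
    "confounder absent": (2, 20, 8, 80),
}

def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    totals = [sum(column) for column in zip(*strata.values())]
    return risk_ratio(*totals)

def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator

crude_rr = crude_risk_ratio(strata)
stratum_rrs = {name: risk_ratio(*counts) for name, counts in strata.items()}
adjusted_rr = mantel_haenszel_risk_ratio(strata)

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rrs.items():
    print(f"  {name}: {rr:.2f}")
print(f"Mantel-Haenszel risk ratio: {adjusted_rr:.2f}")
```

With these made-up counts the crude analysis suggests that exposure nearly doubles the risk, while within each stratum of the confounder the exposed and unexposed have identical risk. The apparent effect comes entirely from the uneven distribution of the confounder between the exposure groups.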

## Minimizing Systematic Errors

• Be purposeful in the study design to minimize the chance of bias. Example: use more than one control group
• Clear definition of the study population
• Explicit case, control, and exposure definitions.
• Define, a priori, who is a case or what constitutes exposure so that there is no overlap
• Set up strict guidelines for data collection
• Train observers or interviewers to obtain data in the same fashion
• Randomly allocate observers/interviewers data collection assignments
• Use multiple sources of information
• Institute a masking process if appropriate
• Build in methods to minimize loss to follow-up
• Standardize measurement instruments
