These databases provide a low-cost means of accessing rich longitudinal data on large populations for epidemiologic research.
Not simply a digital version of a paper record, EHRs can be linked to contextual data using geographic information systems (GIS) and combined with self-reported data to address questions about complex networks of causation. Such work has the potential to evolve epidemiologic theory in the twenty-first century (69). In this review we describe the nature of EHRs and how they have been used in epidemiologic research. Since their recent inception, EHRs have contributed considerably to a broad range of population health scholarship, from infectious disease research to social epidemiology.
We summarize this literature and then contrast traditional and EHR-based studies to highlight specific strengths and weaknesses of each with the goal of informing future research.
EHRs were originally developed for billing purposes. Financial incentives to professionals and hospitals for EHR use are tied to existing and emerging requirements. These requirements include standard capture of vital statistics, an up-to-date problem list, and others relevant to patient engagement and data sharing (34). The implementation of meaningful use will likely accelerate capture and standardization of data and benefit epidemiologic research. Parallel changes have unfolded in other industrialized countries, and current usage ranges from lower levels in China and South Korea to nearly universal adoption in Australia, New Zealand, and northern Europe. Although the focus of this article is primarily on the use of EHR data for research in the United States, we draw on relevant research elsewhere.
Data included in EHRs are intended for clinical and administrative use. As discussed below, these data can be used effectively for research purposes, but doing so requires some caution and creativity.
Unlike standardized primary data collection in epidemiologic research, EHR data are collected for the purposes of the clinical encounter. Rather than being driven by research needs, the data collected are directly influenced by patient health status, by how and when patients seek care, and by variation in physician care practices and documentation. Accordingly, the patient and physician, not the researcher, determine the amount of time a patient is under observation (person-time), which affects the calculation of prevalence, incidence, and risk ratios.
EHRs used by different health systems vary in the number of domains they capture. Over time, systems tend to add functionality to their EHR and expand the number of domains collected (Table 1). Longitudinal research is made possible by using the dates associated with specific EHR entries. Doing so allows researchers to study not only disease onset, but also disease severity and progression. Diagnostic codes warrant special consideration in EHR research. Physicians use codes to depict a patient's condition and to document indications for orders.
The location of a code in the EHR can also provide useful information.
Image and laboratory order codes indicate what the physician suspects about the patient's condition that requires validation or what the physician knows about the patient. Even though diagnostic codes provide critical information on an individual's health status, providers may not use them consistently, and the meaning of any given code may vary among providers and across time. EHR-based studies involve predominantly case series, nested case-control studies, and prospective and retrospective cohorts. Researchers can use EHR data to rapidly identify cases and assess eligibility for individual or frequency matching in nested case-control studies. EHRs capture data on an open cohort in which patients may enter or leave care at any time.
A patient can contribute person-time only if they are under observation and are at risk for the outcome of interest. Researchers may find it difficult to interpret gaps in care in the EHR. When a patient lacks data, one cannot distinguish between patients who have left care, who have been well and have not sought care, or who have missed routine visits for other reasons. This ambiguity in whether patients are under observation is relevant to the person-time documentation required for estimating incidence rates.
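The person-time problem described above can be made concrete with a minimal sketch. It assumes one simple convention that is not taken from the source: a patient is treated as out of observation after the last visit that precedes a gap in care longer than a chosen threshold. The function names, the 730-day threshold, and the input layout are all hypothetical, and prevalent cases are assumed to have been excluded beforehand.

```python
from datetime import date

def follow_up(visits, event_date=None, max_gap_days=730):
    """Return (person_years, had_event) for one patient.

    visits: encounter dates; event_date: date of first outcome, or None.
    Observation starts at the first visit and stops at the earliest of:
    the event, the last visit before a gap longer than max_gap_days,
    or the final recorded visit. The gap rule is an assumed convention.
    """
    visits = sorted(visits)
    start = end = visits[0]
    for nxt in visits[1:]:
        if (nxt - end).days > max_gap_days:
            break  # treat the patient as out of observation from `end`
        end = nxt
    # Only events that occur while the patient is under observation count.
    had_event = event_date is not None and start < event_date <= end
    if had_event:
        end = event_date  # person-time stops accruing at the event
    return (end - start).days / 365.25, had_event

def incidence_rate(patients, per=1000):
    """Events per `per` person-years over (visits, event_date) pairs."""
    total_years = 0.0
    events = 0
    for visits, event_date in patients:
        years, had_event = follow_up(visits, event_date)
        total_years += years
        events += had_event
    return per * events / total_years

cohort = [
    ([date(2010, 1, 1), date(2011, 1, 1), date(2012, 1, 1)], date(2011, 7, 1)),
    ([date(2010, 1, 1), date(2010, 6, 1), date(2015, 1, 1)], None),
]
print(incidence_rate(cohort))
```

In this sketch the second patient stops contributing person-time at the June 2010 visit because the next visit falls after a gap of more than two years. A different gap rule would change the denominator, which is why the choice should be reported alongside any rate.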
If patients enter care before an EHR has been implemented in a given system, some domains or events may not be captured and available for study. Conversely, if they exit care, EHR data will lack information on events occurring after that time.
Outcomes and exposures. EHRs can be used to define disease onset and outcomes and to determine case and control status on a selected outcome, exposure measures, and covariates. For numerous reasons, the single appearance of a diagnostic code does not necessarily indicate that a patient has a disease. The accuracy of disease definition is often improved by using ICD-9 codes and other information over time and is often better in relation to more severe disease.
Aspects of the EHR may enhance data validity. For example, alerts, commonly used in clinical decision support, can also be used to notify clinicians of input errors to support real-time data correction. Clinical text is also captured in the EHR, often in a notes section. It includes discharge summaries, treatment plans, and progress notes, which can contain information about patients that is useful for research purposes.
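One way to act on the point that a single code is weak evidence is to require repeated codes over time. The sketch below is an illustrative rule, not a validated case definition from the source. It flags a patient only when a qualifying ICD-9 code appears on at least two distinct dates within a window. The function name, the two-code threshold, and the 730-day window are assumptions; the default prefix "250" is the ICD-9 category for diabetes mellitus.

```python
from datetime import date

def meets_case_definition(records, code_prefix="250", min_codes=2, window_days=730):
    """Return True if qualifying codes appear on enough distinct dates.

    records: iterable of (encounter_date, icd9_code) pairs for one patient.
    A patient qualifies when at least `min_codes` distinct encounter dates
    carry a code starting with `code_prefix` within `window_days`.
    """
    # Distinct dates only, so repeated codes from one encounter count once.
    dates = sorted({d for d, code in records if code.startswith(code_prefix)})
    for i in range(len(dates) - min_codes + 1):
        if (dates[i + min_codes - 1] - dates[i]).days <= window_days:
            return True
    return False

patient = [
    (date(2012, 3, 1), "250.00"),
    (date(2012, 3, 1), "401.9"),
    (date(2012, 9, 15), "250.02"),
]
print(meets_case_definition(patient))
```

In practice such a rule would usually be combined with other information, such as laboratory values or medication orders, and checked against chart review.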
However, this information may be inconsistently recorded. For example, Wasserman et al.
One approach to dealing with nuanced clinical text is to use open-source natural language processing tools. These can extract text relevant to defining disease stage, severity, and progression or symptoms (6), which may not be well captured by diagnostic codes.
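To show why clinical text needs more than keyword search, here is a deliberately simplified sketch. It is not one of the natural language processing tools referred to above. It looks for a symptom term in a note and ignores mentions that follow a negation cue in the same sentence. The cue list and the function name are assumptions chosen for illustration.

```python
import re

# A tiny, assumed list of negation cues; real tools use far richer rules.
NEGATION_CUES = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)

def mentions(note, term):
    """True if `term` appears in some sentence with no negation cue before it."""
    for sentence in re.split(r"[.;\n]", note):
        match = re.search(re.escape(term), sentence, re.IGNORECASE)
        if match and not NEGATION_CUES.search(sentence[:match.start()]):
            return True
    return False

note = "Patient denies chest pain. Reports shortness of breath; no wheezing."
for symptom in ("chest pain", "shortness of breath", "wheezing"):
    print(symptom, mentions(note, symptom))
```

A plain keyword search would count all three symptoms as present in this note. Even this crude negation check separates affirmed from denied findings, although it would miss many constructions that dedicated tools handle.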
For instance, Anderson et al.
Disease etiology. Whereas disease status is often well documented in EHRs, disease etiology, including fundamental causes of disease (70) e. Some data are not retained, including, for example, residential addresses over time (only the current address is used for billing). Researchers have used health insurance status e. Although data on physical activity and other important behaviors and social risks are not routinely captured (2, 16), the Institute of Medicine has recommended that these and other domains be integrated into routine EHR data collection, including four existing i.
Researchers have applied extract, transform, and load algorithms to EHRs to assemble study populations from a variety of settings (Table 2). The most successful EHR research to date has used deidentified databases in UK and US health care systems whose patient populations receive most or all of their care within the system. Researchers initially used EHRs for comparative effectiveness and health services research, pharmacoepidemiology, and genetic epidemiology. These efforts have been summarized elsewhere (12, 49, 66, 81, 95) and are not covered in this review.
Researchers can use EHRs to form standard cohorts and to assemble groups of patients with specific diseases. DISTANCE involves 20, patients with diabetes and has addressed wide-ranging issues, including diabetes outcomes among Asians and Pacific Islanders (59), the impact of neighborhood deprivation on cardiometabolic health indicators (68), and the relationship of SES to risk of hypoglycemia (9). Researchers from two or more health systems are increasingly collaborating and assembling multisystem cohorts; the HMO Research Network has been a leader in this type of research since With Chronic Hepatitis Cohort data, Mahajan et al.
The CPRD, which gathers data from more than UK general practitioners, has data on more than 5 million active pediatric and adult patients. The parallel rise in available EHR data and in concern about obesity spurred some of the first population health research with EHRs (13, 52, 57, 64, 72, 96). Weight and height, which are used to calculate body mass index (BMI), are recorded during many clinical encounters. Not surprisingly, few studies have focused on cancer (46, 62), given the availability of cancer registries worldwide. In the following sections, we provide specific examples of EHR research and their major areas of contribution to date.
Researchers have employed large EHR data sets to reevaluate conclusions drawn from smaller studies. For example, many small studies reported positive or inconsistent associations between midlife BMI and later-life dementia.
Qizilbash et al. They found that late preterm birth compared with birth at term was associated with increased respiratory morbidity, but the association was smaller than reported in prior studies. Similarly, studies with small samples from fertility clinics had previously linked celiac disease to infertility.
Dhalwani et al. Using 12 months of EHR data on more than , patients from general practitioners (74), this team found that green space was protective in 15 of 24 disease clusters, including musculoskeletal and neurological clusters, with the strongest associations for anxiety and depression, especially among children and individuals of low SES. In a subgroup analysis, Scherrer et al. Rapsomaniki et al. The study was able to provide an adequate sample size to evaluate important subgroups. Because the health data covered the majority of the UK population, these findings had excellent external validity and were in contrast to prior studies that evaluated fewer CVDs across narrower age and blood pressure ranges.
Rare disease research can also benefit from EHR data, which help alleviate methodological constraints. Thomas et al. Using patients as their own controls, they observed a fourfold increase in risk of pediatric stroke in the first 0–6 months after chickenpox. This study identified avenues for future research on links between infections and vascular injury and their role in stroke.
EHR data sets have allowed environmental and social epidemiologists to leverage data on patients distributed across a wide range of physical, built, and social environments. Because patient addresses are routinely checked and updated at each encounter for billing and communication purposes, researchers can readily link geocoded addresses to location-specific data and use GIS to study an individual's proximity to hazards related to disease.
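The proximity linkage described above can be illustrated with a small sketch. It assumes that patient addresses and hazard sites have already been geocoded to latitude and longitude, and it uses great-circle (haversine) distance as a simple stand-in for a full GIS workflow. The function names and the coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_hazard_km(patient, hazards):
    """Distance from one geocoded patient address to the closest hazard site."""
    return min(haversine_km(*patient, *site) for site in hazards)

# Hypothetical coordinates, for illustration only.
patient_address = (40.95, -76.88)
hazard_sites = [(41.20, -76.70), (40.60, -77.30)]
print(round(nearest_hazard_km(patient_address, hazard_sites), 1))
```

Straight-line distance is only one possible exposure metric. Studies often use buffers, travel networks, or activity-weighted measures instead, and the choice depends on the hazard being studied.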
This process can be used to study negative health impacts from both direct exposure, e.
Physical environment. EHR studies have evaluated exposures to risks and resources in the physical environment e. Casey et al. They identified significantly increased odds of preterm birth in women exposed to more unconventional natural gas development activity during their pregnancies.
Built environment. Studies of the built environment have focused on land use e. Duncan et al.
Social environment. Social epidemiology's rich history of studying the influence of neighborhoods and communities on health (75) has expanded through the use of EHR data.