Introduction
In recent years, the integration of artificial intelligence (AI) into various facets of health care has promised to revolutionize clinical practices and patient care. One area experiencing significant transformation is clinical documentation, where AI-powered tools are increasingly used to automate the process of documenting patient encounters.1,2 These tools leverage advanced machine learning algorithms and natural language processing capabilities to capture and summarize conversations, thereby streamlining workflows and potentially improving the accuracy of captured information compared with clinician recall. By alleviating administrative burdens and enabling clinicians to focus on patient care, AI-powered clinical documentation tools could play a role in mitigating physician burnout and making healthcare delivery more efficient.3,4
Nuance’s Dragon Ambient eXperience (DAX) Copilot is an electronic health record (EHR)–integrated AI-enabled scribe software. It synthesizes a preliminary outpatient clinical note by “listening” to the conversation between a clinician and a patient during their visit. Starting in June 2023, Atrium Health conducted a rigorous outcomes evaluation of using DAX in primary care settings, to determine whether using DAX improves efficiency for clinicians (as measured by EHR use metrics) and financial performance for the health system. A separately reported substudy evaluated the effects on clinician-reported experience outcomes.5,6
The New England Journal of Medicine reports that ambient AI scribing does not markedly improve the efficiency of electronic health record documentation.
In this evaluation, we found no statistically significant differences in EHR-related and financial metrics between DAX users and the control group. However, exploratory results suggested that modest reductions in note time could result from using DAX at a high utilization level or deploying DAX to select clinician subgroups. Taken as a whole, these findings suggest that AI-enabled documentation’s efficiencies may translate to decreased markers of burnout for a subset of clinicians, and perhaps more broadly when the implementation of DAX achieves a higher adoption level. However, widespread implementation of DAX in its current form is unlikely to generate appreciable gains for healthcare systems looking to increase productivity.
To me, these results fly in the face of common sense. The study was limited to the use of NLP for the EHR. There are many other uses of artificial intelligence where it may be productive. These uses must also be studied.
The following tables report the results:
Characteristics of DAX Copilot Participants.
Linear Mixed Models on EHR Use Metrics and Financial Metrics.
Linear Mixed Models on EHR Use Metrics and Financial Metrics by Patient Volume.
Participants
This study enrolled 238 clinicians specializing in family medicine, internal medicine, and general pediatrics (including physicians and advanced practice providers) from outpatient clinics in North Carolina and Georgia. The intervention group was stratified into five waves between June and August 2023 based on their clinic locations. Before their accounts were activated, clinicians underwent a 1-hour training session on DAX. A control group not utilizing DAX was also recruited in five waves using two methods aimed at matching practice locations and specialties with the intervention group: (1) identification and encouragement of clinicians by service-line leaders to participate as controls; and (2) inclusion of clinicians who initially showed interest in DAX but later opted out (Fig. 1). Among all DAX users, clinicians who transferred more than 25% of their DAX notes to the EHR system were defined as active DAX users, and those who transferred more than 60% of their DAX notes were considered high DAX users.8 Clinicians were excluded from the analysis for the following reasons: they were preceptors (n=11), they were identified as DAX participants and either never opened DAX or did not open DAX after their training date (n=10), or their age was missing (n=2). The final analytic sample included 112 clinicians in the intervention group and 103 clinicians in the control group. (See Table 1.)
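The usage thresholds and exclusion counts described above can be checked with a short sketch. Only the 25% and 60% cutoffs and the participant counts come from the text. The function name `usage_flags`, the dictionary keys, and the choice to express the transfer rate as a fraction are illustrative assumptions, not part of the study.

```python
# Cutoffs taken from the text: "more than 25%" = active, "more than 60%" = high.
ACTIVE_THRESHOLD = 0.25
HIGH_THRESHOLD = 0.60


def usage_flags(transfer_rate: float) -> dict:
    """Flag a DAX user by the share of DAX notes transferred to the EHR.

    transfer_rate is a fraction between 0 and 1 (hypothetical input format).
    A high user is also an active user, because 60% exceeds 25%.
    """
    if not 0.0 <= transfer_rate <= 1.0:
        raise ValueError("transfer_rate must be a fraction between 0 and 1")
    return {
        "active": transfer_rate > ACTIVE_THRESHOLD,
        "high": transfer_rate > HIGH_THRESHOLD,
    }


# Sample-size arithmetic, using the counts reported in the text.
enrolled = 238
excluded = {"preceptor": 11, "never_opened_dax": 10, "missing_age": 2}
analytic_sample = enrolled - sum(excluded.values())

intervention, control = 112, 103
# The two reported groups should account for the whole analytic sample.
groups_match = analytic_sample == intervention + control
```

Under these assumptions, a clinician who transferred 40% of DAX notes would be flagged active but not high, and the reported exclusions leave 215 clinicians, which matches the 112 intervention and 103 control clinicians.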
Two sets of primary outcomes were assessed: EHR use metrics and financial metrics. EHR use metrics include time in EHR (EHR-Time8), work time outside of work (WOW8), time in note (Note-Time8), completed appointment rate, same-day closure rate, and note length. Financial metrics include gross revenue per visit and work relative value units (wRVUs) per visit. The index date was defined as the start date of using DAX in the intervention group and the first date of each wave for the control group. Our quantitative endpoints were measured 180 days post-index date in both groups. EHR-Time8, WOW8, and Note-Time8 used 8 hours of scheduled patient time to normalize time spent in the EHR as described by the American Medical Association (AMA).9 The rest of the metrics were analyzed using nonnormalized data.
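To illustrate the 8-hour normalization, the sketch below assumes each metric is computed as time spent divided by scheduled patient hours, then multiplied by 8. That reading of the AMA definition, the function name `normalize_to_8h`, and the example figures are assumptions for illustration. They are not the study's code or data.

```python
def normalize_to_8h(hours_spent: float, scheduled_patient_hours: float) -> float:
    """Scale time spent to a standard 8 hours of scheduled patient time.

    hours_spent: hours in the activity (e.g., total EHR time, note time, or
        EHR time outside scheduled hours) over some reporting period.
    scheduled_patient_hours: hours of scheduled patient time in that period.
    """
    if scheduled_patient_hours <= 0:
        raise ValueError("scheduled_patient_hours must be positive")
    return hours_spent / scheduled_patient_hours * 8.0


# Hypothetical clinician: 40 scheduled patient hours in the period.
ehr_time8 = normalize_to_8h(30.0, 40.0)   # 30 total EHR hours
note_time8 = normalize_to_8h(10.0, 40.0)  # 10 hours spent in notes
```

Normalizing this way lets clinicians with different schedules be compared on the same scale, which is presumably why only the three time-based metrics were normalized.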
Discussion
To our knowledge, this is the first study to investigate whether this ambient-listening AI tool improves both efficiency and financial metrics from a healthcare system standpoint. These findings have important and timely implications as healthcare systems weigh the cost of adopting new technology. Indeed, the hype and novelty of ambient-listening AI tools have outpaced the evidence to support or refute claims that these tools are transformational in terms of time savings and efficiency. Consequently, healthcare systems, which already operate on small margins in hypercompetitive environments, run the risk of overpaying and not realizing the expected benefits. This study also contributes to the evidence-based incorporation of EHR use metrics into the evaluation of emerging technology and creates a foundational standard for comparison of outcomes across future studies.
There are several possible explanations for why we did not see large improvements in either EHR time or financial metrics. For example, while overall note time decreased, clinicians may have simply repurposed that time for other EHR activity they might otherwise not have done. In terms of financial metrics, clinicians with more free time may choose to use that time for other clinical activities or to leave work sooner. While our primary outcome found modest overall time saved (as viewed through a health system lens), our analysis identified subgroups among DAX users that saved substantial time. For example, 18% of participants saw a reduction of more than 1 hour a day in the EHR. While exploratory and unadjusted, these findings raise additional questions for future research to explore which user subgroups might achieve disproportionate benefits from using DAX. Additionally, a subset of clinicians noted that the documentation tool did indeed save them time, but rather than seeing more patients, they used these time savings to sleep more, spend less time working at home, and make existing encounters more focused and personal.5 Future research will need to evaluate whether modest time savings can add patient capacity or improve the quality of care for existing patients, to support the expense of adding new technology. Additionally, in our organization, ambulatory clinicians are responsible for determining the level of service for billing encounters. Clinicians indicated that DAX captures more details from the clinician-patient interaction and gives clinicians documentary support for all the condition management they performed, often resulting in more comprehensive notes.5 With the expansion to a larger cohort of users and over a longer time frame, this greater comprehensiveness could lead to more complete billing for services rendered, but additional study will be required to track its financial impact.