Current ambient clinical intelligence technology is not capable of properly recognizing non-lexical conversational sounds (NLCS), particularly those that convey clinically relevant patient information, for EHR documentation, according to a study published in JAMIA.
Ambient clinical documentation technology uses automatic speech recognition (ASR) and natural language processing (NLP) to document patient-provider conversations in the EHR. While the technology is a promising strategy to reduce clinician burden, ensuring the accuracy of EHR documentation is key for patient safety.
Researchers investigated the impact of NLCS, such as “Mm-hm” and “Uh-uh,” on ASR performance. Patients and providers commonly use NLCS to convey information in clinical conversations. For instance, a patient may say “Mm-hm” to indicate a “yes” response to the question, “Do you have any allergies to antibiotics?”
Researchers evaluated the performance of two clinical ASR engines (Google Speech-to-Text Clinical Conversation and Amazon Transcribe Medical) for 36 primary care encounters.
The word error rate for all words was about 12 percent for both ASR engines. However, the NLCS word error rate was 40.8 percent and 57.2 percent for Google and Amazon’s engines, respectively.
The word error rate for NLCS that conveyed clinically relevant information was even higher: 94.7 percent and 98.7 percent, respectively.
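To make these figures concrete: word error rate is conventionally computed as the word-level edit (Levenshtein) distance between the reference transcript and the ASR output, divided by the number of reference words. The sketch below is an illustration of that standard metric, not the study's actual evaluation code; the example utterance is hypothetical.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions only
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions only
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution cost
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # match/substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# A dropped NLCS counts as a deletion error against the reference:
print(wer("mm-hm yes i do", "yes i do"))  # -> 0.25
```

Because NLCS are short and make up a small share of total words, an engine can post a roughly 12 percent overall WER while omitting or mangling nearly all of the NLCS, which is exactly the gap the study quantifies.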
The analysis found that, on average, patients and providers used NLCS more than 30 times per encounter.
“Many NLCS that conveyed clinically relevant information were omitted, and many were substituted with other NLCS or irrelevant words,” the study authors noted. “Such findings are important because, as documented in the literature and demonstrated by our empirical data, NLCS were frequently used by both patients and clinicians to convey important meaning.”
“Some of these NLCS were used to communicate clinically relevant information that, if not properly captured, could result in inaccuracies in clinical documentation and possibly adverse patient safety events,” they wrote.
The study authors noted that the research has several limitations. First, they evaluated the ASR engines using transcripts from only 36 primary care encounters, a relatively small sample. The study was also limited to the primary care setting, which may reduce the generalizability of the results.
Additionally, the researchers re-enacted the clinical conversations based on the original transcripts in a professional audio studio.
“Our findings may not reflect the true ASR performance when applied to audio recordings obtained from realistic clinical environments, which are susceptible to multiple dimensions of complications such as background noises, interruptions, different styles of enunciation and intonation used by patients and clinicians, and the possibility that there may be more than two speakers in the room,” they pointed out.
Source: EHR Intelligence