The digital health space refers to the integration of technology into health care services to improve the overall quality of care delivery. It encompasses a wide range of emerging technologies such as wearables, telehealth, artificial intelligence, mobile health, and electronic health records (EHRs). Digital health offers numerous benefits, including improved patient outcomes, increased access to care, reduced costs, and better communication and collaboration between patients and providers. For example, patients can now monitor vital signs such as blood pressure and glucose levels from home using wearable devices and share the data with their doctors in real time. Telehealth lets patients consult their providers remotely, without traveling to a hospital, making care more accessible, particularly in remote or rural areas. Artificial intelligence can analyze vast amounts of patient data to identify patterns, predict outcomes, and generate personalized treatment recommendations. Overall, the digital health space is rapidly evolving, and the integration of technology in health care will continue to reshape how care is delivered.

Sunday, January 15, 2023

Google Launches MedPaLM—the AI-Based Healthcare Answer System

Google Research and DeepMind have launched Med-PaLM, a large language model geared toward the medical domain.

According to Interesting Engineering, “It is meant to generate safe and helpful answers in the medical field. It combines HealthSearchQA, a new free-response dataset of medical questions sought online, with six existing open-question answering datasets covering professional medical exams, research, and consumer queries.

Med-PaLM handles both multiple-choice questions and open-ended questions posed by medical professionals and laypeople, drawing on several existing datasets: MedQA, MedMCQA, PubMedQA, LiveQA, MedicationQA, and MMLU. A new dataset of curated, frequently searched medical inquiries, HealthSearchQA, was added to round out the combined benchmark, MultiMedQA.

The HealthSearchQA dataset consists of 3,375 frequently asked consumer questions, collected using seed medical diagnoses and their related symptoms. The model was built on PaLM, a 540-billion-parameter LLM, and its instruction-tuned variant Flan-PaLM, which were evaluated against MultiMedQA.

Med-PaLM performs particularly well compared to Flan-PaLM, though it has yet to match a human medical expert’s judgment. So far, a panel of healthcare professionals has judged 92.6 percent of Med-PaLM’s responses to be on par with clinician-generated answers (92.9 percent).

This is a marked improvement: only 61.9 percent of the long-form Flan-PaLM answers were deemed to be in line with doctor assessments. Meanwhile, only 5.8 percent of Med-PaLM answers were deemed to potentially contribute to negative consequences, compared to 6.5 percent of clinician-generated answers and 29.7 percent of Flan-PaLM answers. This means that Med-PaLM replies are much safer...

This isn’t the first time Google ventured into AI-based healthcare. In May of 2019, Google joined up with medical researchers to train its deep learning AI to detect lung cancer in CT scans, performing as well as or better than trained radiologists, achieving just over 94 percent accuracy.

In May of 2021, Google rolled out a diagnostic AI for skin conditions on smartphones, which would allow every smartphone owner to have an idea of what their diagnosis might be. The app did not replace the role of a professional dermatologist, but it was a significant step forward for the field of AI healthcare.”
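To make the benchmark composition described above concrete, here is a minimal sketch of how MultiMedQA-style items might be represented. The field names and sample questions are illustrative assumptions, not the benchmark's actual schema: the key point is that the combined benchmark mixes multiple-choice items (with an answer key) and free-response consumer questions (without one).

```python
# Hypothetical sketch of MultiMedQA-style items. Field names and
# sample content are illustrative assumptions, not the real schema.

multimedqa_sample = [
    {"source": "MedQA", "format": "multiple_choice",
     "question": "Which vitamin deficiency causes scurvy?",
     "options": ["A. Vitamin A", "B. Vitamin B12", "C. Vitamin C", "D. Vitamin D"],
     "answer": "C"},
    {"source": "HealthSearchQA", "format": "long_form",
     "question": "How serious is atrial fibrillation?",
     "answer": None},  # free-response items carry no gold answer key
]

def count_by_format(items):
    """Tally how many items of each format the combined benchmark holds."""
    counts = {}
    for item in items:
        counts[item["format"]] = counts.get(item["format"], 0) + 1
    return counts

print(count_by_format(multimedqa_sample))  # {'multiple_choice': 1, 'long_form': 1}
```

Long-form items like the HealthSearchQA example are why human raters (rather than an automatic answer key) were needed to score the model's responses.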

According to a physician blog published on Medium, “While the MedPaLM model performance was impressive and certainly superior to other NLP models investigated to date, it was still inferior to clinicians, particularly in incorrect retrieval of information (16.9% for MedPaLM vs 3.6% for human clinicians), evidence of incorrect reasoning (10.1% vs 2.1%) and inappropriate/incorrect content of responses (18.7% vs. 1.4%).

Bottom line: a huge step forward, both in moving the needle toward a viable LLM that can be used for clinical knowledge and in establishing frameworks to evaluate such models.

The high bar of safety that will be expected of such models before they can be used in practice, along with the fact that we need to investigate and shed light on bias and fairness in their functioning, means more work needs to be done. However, health-data-trained LLMs, such as MedPaLM and the amusingly named GatorTron, are paving the way.”
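For context on how figures like the ones quoted above are produced: evaluation percentages of this kind come from human raters flagging individual answers, with the flagged count divided by the total. A toy sketch of that arithmetic (the raw counts below are made up purely to reproduce the quoted rates):

```python
def flagged_rate(flagged: int, total: int) -> float:
    """Percent of answers flagged by raters, to one decimal place."""
    return round(100.0 * flagged / total, 1)

# Hypothetical counts chosen only to illustrate the quoted percentages:
print(flagged_rate(58, 1000))   # 5.8  (a Med-PaLM-style harm rate)
print(flagged_rate(297, 1000))  # 29.7 (a Flan-PaLM-style harm rate)
```

The actual denominators in the study were the evaluated answer sets, not round numbers like these; the sketch only shows the shape of the calculation.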

