The application of large language models in bariatric surgery: A scoping review

Abstract

Background
Exploratory applications of large language models within the specialized field of metabolic and bariatric surgery have begun to emerge. Nevertheless, existing research remains fragmented, lacking comprehensive integration.

Objective
To conduct a scoping review of studies on the application of large language models in the field of metabolic and bariatric surgery, aiming to provide a reference for clinical practice and future research.

Methods
This scoping review adhered to the Joanna Briggs Institute methodological framework and followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines. PubMed, Web of Science, The Cochrane Library, Embase, CINAHL, CNKI, Wanfang, and VIP databases were searched for relevant studies, with the search timeframe from database inception to November 2025. The included literature was summarized and analyzed.

Results
A total of 21 English-language studies were included. Large language models (LLMs) were primarily applied in scenarios such as patient education and information consultation, clinical decision support, and professional knowledge assessment. While LLMs performed well in information-provision tasks, they showed low consistency with expert opinions in complex clinical tasks such as individualized surgical recommendations. Performance varied across different models, with GPT-4 generally demonstrating superior performance, and domain-specific models showing professional potential. Current research still faces challenges regarding information accuracy, readability, and clinical applicability.

Conclusion
Large language models hold auxiliary potential in the field of metabolic and bariatric surgery, particularly for knowledge dissemination and patient education. However, their reliability in complex clinical decision-making remains limited.
Future efforts should focus on conducting high-quality studies, advancing model specialization and standardized evaluation, and exploring safe and effective human-AI collaboration models.

Citation: Guo N, Li X, Li X, Kang C, Gong X, Ji X, et al. (2026) The application of large language models in bariatric surgery: A scoping review. PLoS One 21(6): e0350748. https://doi.org/10.1371/journal.pone.0350748
Editor: Hongyang Ma, Peking University, CHINA
Received: March 9, 2026; Accepted: May 18, 2026; Published: June 5, 2026
Copyright: © 2026 Guo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.

Introduction
The global prevalence of obesity continues to rise [1]. Obesity and its associated medical problems, such as type 2 diabetes, hypertension, and metabolic dysfunction-associated steatohepatitis (MASH), have emerged as major public health challenges that impact population health and increase socioeconomic and healthcare burdens [2,3]. Metabolic and bariatric surgery (MBS) has become an effective intervention for severe obesity, demonstrating not only significant weight loss but also improvement in metabolic parameters and a reduction in the incidence of obesity-related diseases. It is currently one of the most effective approaches for treating severe obesity and related metabolic disorders [4]. In recent years, MBS has been widely adopted in China and entered a phase of rapid development.
According to statistics from the Chinese Obesity and Metabolic Surgery Database, the total annual number of MBS procedures in China increased to approximately 37,249 in 2025 [5], reflecting the growing clinical demand and vitality of the specialty. However, alongside the sustained growth in surgical volume and the ongoing standardization of the specialty, MBS continues to face a series of clinical challenges and practical pressures. For instance, in patient selection and surgical decision-making, choosing the most appropriate surgical technique remains a complex and contentious process [6]. Furthermore, because of the complexity of bariatric surgery and its long-term outcomes, patients require clear guidance and continuous support throughout the preoperative, intraoperative, and postoperative phases. This underscores the importance of effective communication, education, and accessible resources for improving patient empowerment and clinical outcomes [7]. These challenges constrain further improvement in the precision of diagnosis and treatment and in the quality of long-term patient management, creating an urgent need for novel methods and tools.

In recent years, the rapid advancement of artificial intelligence (AI), particularly large language models (LLMs), has offered a new way to address these challenges. In the medical field, LLMs have already demonstrated significant potential in enhancing the quality of medical education, assisting clinical diagnosis, supporting decision-making, and promoting patient health management [8]. Against this backdrop, exploratory applications of LLMs within the specialized field of metabolic and bariatric surgery have begun to emerge.
Nevertheless, existing research remains fragmented: it lacks comprehensive integration across application scenarios, systematic comparison of model performance, and a synthesis of the common challenges and future directions of the technology. Therefore, this study aims to systematically review research on the application of LLMs in metabolic and bariatric surgery using a scoping review methodology.

Methods

Type of review
This study was conducted according to the Joanna Briggs Institute methodology for scoping reviews [9]. Reporting adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) (Fig 1) [10].

Identifying the research question
The specific research questions that guided this review were as follows:
(i) In which specific scenarios of bariatric surgery are LLMs applied, and how do they perform?
(ii) What are the differences in performance among different LLMs when applied in bariatric surgery?
(iii) What evidence do existing studies provide regarding the effectiveness of LLMs in bariatric surgery applications?
(iv) What are the main challenges currently faced in these applications, and what are the future directions for development?

Search strategy
A search was conducted in the electronic databases PubMed, Web of Science, The Cochrane Library, Embase, CINAHL, CNKI, Wanfang, and VIP, covering literature in both English and Chinese from database inception to November 2025. Searches used common search fields and combined subject headings with free-text keywords. References were also tracked throughout the review process. The full search strategy is provided in Table 1.
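The combination of subject headings and free-text keywords described above is commonly expressed as a Boolean query: synonyms for one concept are joined with OR, and the concepts are then joined with AND. The following is a minimal sketch of that pattern in PubMed-style syntax. The terms are hypothetical placeholders chosen for illustration; they are not the strategy reported in Table 1, and the helper names `or_block` and `build_query` are likewise assumptions.

```python
def or_block(terms):
    """Join the synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(concepts):
    """AND together one OR-block per concept."""
    return " AND ".join(or_block(terms) for terms in concepts)


# Hypothetical terms for illustration only; NOT the review's Table 1.
surgery_terms = [
    '"Bariatric Surgery"[MeSH Terms]',          # subject heading
    '"metabolic surgery"[Title/Abstract]',      # free-text keyword
    '"sleeve gastrectomy"[Title/Abstract]',
]
llm_terms = [
    '"large language model*"[Title/Abstract]',
    "ChatGPT[Title/Abstract]",
    '"generative artificial intelligence"[Title/Abstract]',
]

query = build_query([surgery_terms, llm_terms])
print(query)
```

Each database has its own field tags and truncation rules, so a query built this way would need to be adapted per database rather than reused verbatim.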
Literature inclusion and exclusion criteria
Inclusion criteria were determined according to the PCC (population, concept, context) principles [11]:
(i) Participants (P): Involving clinical practice, patient management, or medical education in the field of metabolic and bariatric surgery;
(ii) Concept (C): The core of the study involves the application of LLMs, including but not limited to their development, deployment, evaluation, or comparison. Application forms include answering patient inquiries, generating educational materials, providing clinical decision support, etc.;
(iii) Context (C): The application scenario is explicitly limited to metabolic and bariatric surgery.
Study types were limited to original research such as quantitative studies, qualitative studies, and mixed-methods studies.

Exclusion criteria:
(i) Study content not directly related to bariatric surgery or the application of LLMs;
(ii) Conference abstracts, commentaries, editorials, systematic reviews, literature reviews, case reports, study protocols, guidelines, or consensus statements;
(iii) Literature whose full text was unavailable, whose data could not be extracted, or that was not peer-reviewed;
(iv) Literature not published in Chinese or English.

Study selection
After duplicates were removed using EndNote X9 software, two researchers screened the literature, strictly following the inclusion and exclusion criteria. Titles and abstracts were reviewed first, and the full text of studies potentially meeting the inclusion criteria was then examined. Disagreements were resolved through discussion or by consulting a third party.

Results
An initial search yielded 1,130 articles. After removing duplicates, screening titles and abstracts, and excluding articles without accessible full texts, 52 articles remained. Following full-text review, 21 English-language articles were ultimately included.
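The dual-reviewer selection step and the reported flow counts can be sketched as follows. This is an illustrative sketch only: the record identifiers, the `Decision` class, and the `arbiter` callback standing in for the third party are hypothetical, and only the three counts (1,130, 52, 21) come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    record_id: str
    reviewer_a: bool  # True means "include"
    reviewer_b: bool


def reconcile(decisions, arbiter):
    """Keep records both reviewers include; send disagreements to the arbiter."""
    kept, disputed = [], []
    for d in decisions:
        if d.reviewer_a == d.reviewer_b:
            if d.reviewer_a:
                kept.append(d.record_id)
        else:
            disputed.append(d.record_id)
            if arbiter(d.record_id):
                kept.append(d.record_id)
    return kept, disputed


# Hypothetical screening decisions, for illustration only.
decisions = [
    Decision("rec1", True, True),
    Decision("rec2", False, False),
    Decision("rec3", True, False),
    Decision("rec4", False, True),
]
kept, disputed = reconcile(decisions, arbiter=lambda rid: rid == "rec3")

# Flow counts reported in the Results.
n_identified = 1130
n_full_text = 52
n_included = 21
removed_before_full_text = n_identified - n_full_text  # duplicates, title/abstract, no full text
excluded_at_full_text = n_full_text - n_included
print(kept, disputed, removed_before_full_text, excluded_at_full_text)
```

Under these counts, 1,078 records were removed before full-text review and 31 were excluded at full-text review.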
The included studies were conducted in the United States [12–20] (n = 9), China [21,22] (n = 2), Iran [23] (n = 1), Spain [24,25] (n = 2), Turkey [26,27] (n = 2), Canada [28,29] (n = 2), Germany [31] (n = 1), and Brazil [32] (n = 1), along with one multinational collaboration [30] (n = 1). The basic characteristics of the included studies are presented in Table 2.

Application scenarios of LLMs in bariatric surgery
Across the 21 included studies, the application of LLMs in bariatric surgery focused on the following scenarios. Thirteen studies [12–16,19,21,22,26–28,31,32] concentrated on patient education and information consultation, which mainly involved answering patient questions about bariatric surgery, generating patient education materials, and optimizing text readability. This is the most widely explored area of LLM application and shows particular potential for improving the accessibility of disease-related information and patient engagement. Four studies [20,24,25,30] explored clinical decision support, primarily recommendations for surgical techniques. However, consistency between LLMs and clinical guidelines or expert consensus in such tasks was low, with particularly limited performance in complex cases. Two studies [17,29] examined professional knowledge assessment, testing LLMs on simulated specialty board exam questions to evaluate their grasp of medical knowledge. These studies indicated that LLMs performed well on standardized test questions but fell short on clinical reasoning questions. One study [23] focused on medical image processing, attempting to use LLMs for bariatric surgery image recognition and generation. However, the results showed low anatomical accuracy, indicating that the models are not yet suitable for clinical or educational purposes.
One study [18] explored the development of a domain-specific LLM for bariatric surgery. The specialty model, built by fine-tuning a general-purpose model, outperformed the base model in professional text generation tasks, demonstrating the potential of vertical domain optimization.

Performance differences among different LLMs
The performance of different LLMs in bariatric surgery tasks varied considerably. Among the ChatGPT series, GPT-4 demonstrated higher informational accuracy in most studies and outperformed its predecessor GPT-3.5 in providing detailed, contextualized responses [16]. However, models in this series commonly exhibited issues such as initially low readability [15] and lagging knowledge updates [21]. The performance of other general-purpose models such as Gemini, Bard, and DeepSeek varied. Gemini demonstrated greater caution in some studies, sometimes refusing to answer sensitive medical questions directly, though this could also make its responses less complete [19]. DeepSeek slightly outperformed ChatGPT in text readability but lagged significantly behind in information quality and reliability [22]. Bard and Bing showed unstable performance across multiple studies [28,29], with particularly low appropriateness in clinical advice generation tasks [28]. Furthermore, domain-fine-tuned models such as BariatricSurgeryGPT excelled in the accuracy of professional terminology and semantic relevance [18], highlighting the potential of enhancing model specialization through vertical domain optimization. Overall, differences in model performance depend not only on underlying architecture and training data but also on task type, prompt engineering, and evaluation criteria.

Evaluation of LLM application effectiveness in bariatric surgery
In terms of accuracy, LLMs performed well in common knowledge Q&A tasks.
For instance, the ChatGPT series achieved an accuracy rate of up to 86.8% on common bariatric surgery questions [12]. However, in clinical decision support tasks requiring clinical judgment, agreement rates with expert opinions or real clinical decisions were typically below 40% [24,25,30], with particularly low matching rates for personalized surgical recommendations [20,24].

Readability remains a significant concern. Studies have pointed out that the average reading level of LLM-generated texts often exceeds the sixth- to eighth-grade level recommended by the American Medical Association, mostly falling between the ninth-grade and college levels [15,22]. Although targeted prompts can simplify texts to some extent, bringing the readability level down to grades 6–9 [15], consistently meeting patient-friendly reading standards remains challenging, and some responses suffer from structural verbosity and excessive detail [21].

Regarding empathy and patient satisfaction, a few studies have conducted evaluations. One study found that GPT-4o received significantly higher patient ratings for answer clarity, completeness, and empathy, with 64.9% of patients expressing a preference for AI-generated responses [31]. Another study also noted that ChatGPT scored high on empathy, and that patient satisfaction with and acceptability of AI responses were good [32].

In terms of information quality, ChatGPT’s response quality generally surpassed that of other models [13,22], but issues such as missing citations and lack of transparency regarding information sources remain prevalent. Domain-specific models demonstrated superior performance in the semantic relevance and professionalism of generated content [18].
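To make the grade-level comparison above concrete, the sketch below estimates a reading grade with the Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. This is an assumption for illustration: the text above does not state which readability formulas the included studies used, and the vowel-group syllable counter here is a rough heuristic rather than a validated tool. The function names and sample sentences are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: count runs of vowels; every word gets at least one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Estimate the US school grade needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def meets_target(text, max_grade=8.0):
    """Check text against the upper bound of the sixth- to eighth-grade range."""
    return flesch_kincaid_grade(text) <= max_grade


# Hypothetical sample sentences, for illustration only.
plain = "You will eat small meals. Drink water often."
dense = ("Postoperative nutritional supplementation necessitates "
         "individualized multidisciplinary evaluation.")

print(meets_target(plain), meets_target(dense))
```

Short sentences with short words score low, while a single long sentence of polysyllabic terms scores far above the target, which is the pattern the readability studies describe for unprompted LLM output.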
In terms of efficiency, LLMs demonstrated a clear advantage, with an average response generation time of only a few seconds, significantly shorter than the time required for clinicians to draft similar content [31], suggesting their potential value in improving clinical workflow efficiency.

Discussion

Performance variation and capability boundaries across application scenarios
Current research indicates a clear divergence in the efficacy of LLMs across different application scenarios within bariatric surgery. In patient education and information consultation, LLMs can draw on their natural language generation capabilities to provide patients with comprehensive and accurate answers to knowledge-related questions about surgery. Multiple studies demonstrate that their response quality has reached a relatively high level, with some models achieving accuracy rates exceeding 85% in relevant evaluations [12]. This aligns with the findings of Goudrar et al. [33], suggesting that LLMs can serve as a beneficial supplementary resource for patients seeking information on bariatric surgery and have the potential to become auxiliary clinical education tools.

However, in scenarios requiring clinical judgment, particularly personalized surgical plan recommendations and decision support, the performance of existing models is significantly constrained. Their alignment with clinical guidelines or expert consensus is generally low. Lopez-Gonzalez et al. [24] reported only 34.16% concordance between GPT-4 and hospital algorithm decisions. Several factors contributed: ChatGPT-4’s knowledge extended only to April 2023, it could access only open-access articles, and it explicitly states that it is not designed for professional medical use. Sanchez-Cordero et al. [25] found that even after contextual training with 412 scientific articles, concordance improved only from 20.0% to 25.8%.
Notably, ChatGPT tended to mirror the global “average” procedure distribution rather than tailoring recommendations to individual patients. The authors also noted that using a single center’s practice as the gold standard may introduce bias, as surgical choices vary across centers. Kahlon et al. [20] observed only fair agreement between ChatGPT-4 and bariatric surgeons, with moderate inconsistency across two runs. Surgeons can integrate nuanced, patient-specific information that AI cannot fully weigh, highlighting a key limitation. Jazi et al. [30] reported that ChatGPT-4 matched expert consensus in only 30% of complex cases and gave inconsistent answers in 40% of scenarios. The model failed to recognize critical patient-specific risk factors. This issue extends beyond bariatric surgery. A systematic review by de Menezes Torres et al. [34] of ChatGPT in oral and maxillofacial surgery highlights its limitations in handling complex clinical decisions and providing personalized recommendations for cases such as oral cancer and orthognathic surgery.

In summary, current LLMs lack individualized, context-aware reasoning; exhibit output instability; have outdated knowledge bases and limited access to full evidence; and cannot handle multi-criteria trade-offs. This underscores that while LLMs may be useful as educational tools, they are currently not reliable for autonomous surgical decision-making, especially in complex clinical contexts, highlighting the value of human expertise [26]. This performance variation delineates the current capability boundaries of LLMs: they excel at processing and generating structured medical knowledge, but they lack the comprehensive, personalized perspective needed for tasks that require integrating multi-dimensional information, understanding complex clinical contexts, and performing dynamic reasoning.
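The concordance figures discussed above can be read in two ways: as raw percent agreement, or as agreement corrected for chance. The sketch below computes both, using Cohen's kappa for the chance-corrected measure. It is an illustration under stated assumptions: the text does not specify which statistic each study used, the procedure labels and recommendation lists are hypothetical, and "fair" is read here in the conventional Landis and Koch sense of a kappa of roughly 0.21 to 0.40.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of cases in which two raters chose the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement beyond what the raters' label frequencies predict by chance."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return float("nan")  # both raters used one identical label; kappa is undefined
    return (observed - expected) / (1 - expected)


# Hypothetical recommendations for ten patients, for illustration only.
surgeon = ["SG", "RYGB", "SG", "OAGB", "RYGB", "SG", "SG", "RYGB", "SG", "OAGB"]
model = ["SG", "SG", "SG", "SG", "RYGB", "SG", "RYGB", "SG", "SG", "SG"]

print(percent_agreement(surgeon, model), cohens_kappa(surgeon, model))
```

In this made-up example the model mostly recommends the most common procedure. Raw agreement is 50%, yet kappa is close to zero, because a rater who simply follows the overall procedure distribution agrees with surgeons about that often by chance. This is one way to see why mirroring the "average" distribution does not amount to individualized recommendation.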
Potential value and integration positioning in clinical practice
LLMs demonstrate multifaceted supportive potential in bariatric surgery. First, they can significantly enhance clinical workflow efficiency, for instance by automating the generation of patient education materials and rapidly answering common consultation questions, thereby freeing healthcare professionals from repetitive informational tasks. By speeding up responses under physician supervision while maintaining accuracy, LLMs can optimize doctors’ time management and enhance patient satisfaction in bariatric care communication [31]. They can serve as supplementary resources for patient education, clinician assistance, and public health promotion [21]. Second, they facilitate the dissemination of health information. Particularly where medical resources are unevenly distributed, they can provide patients with timely and reliable basic medical knowledge, helping to bridge service gaps caused by geographical or economic disparities. A review of LLM applications in ophthalmology reported that models such as GPT-4 could assist ophthalmologists in distinguishing urgent from routine visits, improving remote ophthalmic triage in low-resource settings [35] and thereby supporting remote services for patients. Third, LLMs performed well on metabolic and bariatric surgery specialty certification practice questions, achieving a correct answer rate of 74.1% with consistent performance across knowledge categories [17], which indicates potential value in specialty medical education and exam preparation. They can serve as effective adjuncts in medical education, aiding in the training of medical students and residents.

It is crucial to clarify that at this stage, LLMs should be positioned as “assistive tools” rather than “decision-making agents.” Their core value lies in augmenting, not replacing, the professional judgment of clinical experts.
Challenges and future directions
The foremost issues are insufficient information accuracy and timeliness. The trustworthiness and safety of LLM outputs remain inadequate, with risks that include hallucinations, outdated information, inconsistency with the latest guidelines, low readability, and potential biases. This aligns with the findings of Sanker et al. [36]. Second, LLMs lack a deep understanding of complex clinical contexts and struggle to handle patient cases involving individualized factors such as multiple comorbidities or previous surgical history, limiting their utility in clinical decision support. Furthermore, model outputs often suffer from poor interpretability; the reasoning behind their conclusions is opaque, making it difficult for clinicians to trace and verify the underlying information.

These technical limitations are closely intertwined with broader ethical and safety considerations. First, data privacy and confidentiality are critical concerns, as noted in multiple studies. Patient queries often contain sensitive health information, and entering such data into public or unencrypted LLM interfaces risks unauthorized access or data breaches, especially because some models store conversations for retraining. The lack of transparency about how platforms handle personal data further exacerbates these risks. Second, algorithmic bias may lead to unfair or inaccurate recommendations; LLMs trained on non-representative datasets may generate inappropriate recommendations, thereby exacerbating healthcare disparities.
Third, liability and accountability remain unclear. The “black-box” nature of LLMs makes their reasoning process opaque, and when an LLM contributes to a clinical decision that results in an adverse event, it is unclear whether the developer, the hospital, the supervising physician, or the user should be held accountable, as current legal frameworks lack clear guidance for LLM-assisted surgical decisions. Fourth, over-reliance on LLMs may negatively affect the patient-clinician relationship, as patients might lose trust in human providers or delay necessary consultations. Finally, the most frequently cited issue is “hallucination”: LLMs tend to generate responses with high confidence and coherent structure even when the content is factually incorrect, which may lead uninformed users to adopt erroneous and potentially dangerous advice. Taken together, these risks emphasize that LLMs are not yet mature and must currently be positioned as supportive tools under human supervision, not autonomous advisors.

Conclusions
Large language models have demonstrated potential as auxiliary informational tools and communication mediators in bariatric surgery, showing particular value in enhancing information accessibility, supporting doctor-patient communication, and medical education. However, their current capabilities remain confined to structured knowledge transmission and are not yet reliable for clinical tasks requiring professional judgment, personalized decision-making, and comprehension of complex contexts. Future development should focus on the specialized optimization of models, standardization of evaluation systems, and exploration of clinical integration pathways. This will facilitate the deep integration of AI with bariatric surgery while ensuring safety, reliability, and equity.

Supporting information
S1 Checklist.
Preferred reporting items for systematic reviews and meta-analyses extension for scoping reviews (PRISMA-ScR) checklist.
https://doi.org/10.1371/journal.pone.0350748.s001 (DOCX)

References
1. CDC. Adult obesity facts. Obesity. Accessed 2026 February 11. https://www.cdc.gov/obesity/adult-obesity-facts/index.html
2. Eisenberg D, Shikora SA, Aarts E, Aminian A, Angrisani L, Cohen RV, et al. 2022 American Society for Metabolic and Bariatric Surgery (ASMBS) and International Federation for the Surgery of Obesity and Metabolic Disorders (IFSO) indications for metabolic and bariatric surgery. Obes Surg. 2023;33(1):3–14. pmid:36336720
3. Arterburn DE, Telem DA, Kushner RF, Courcoulas AP. Benefits and risks of bariatric surgery in adults: a review. JAMA. 2020;324(9):879–87. pmid:32870301
4. Schauer PR, Bhatt DL, Kirwan JP, Wolski K, Aminian A, Brethauer SA, et al. Bariatric surgery versus intensive medical therapy for diabetes - 5-year outcomes. N Engl J Med. 2017;376(7):641–51. pmid:28199805
5. Chinese Society for Metabolic and Bariatric Surgery (CSMBS), Chinese Society for Integrated Health of Metabolic and Bariatric Surgery (CSMBS IH), Chinese Obesity and Metabolic Surgery Collaborative (COMES Collaborative). Chinese obesity and metabolic surgery database: annual report 2023. 2024. https://doi.org/10.3877/cma.j.issn.2095-9605.2024.02.001
6. Topart P. Obesity surgery: which procedure should we choose and why? J Visc Surg. 2023;160(2S):S30–7. pmid:36725449
7. Scarano Pereira JP, Martinino A, Manicone F, Scarano Pereira ML, Iglesias Puzas Á, Pouwels S, et al. Bariatric surgery on social media: a cross-sectional study. Obes Res Clin Pract. 2022;16(2):158–62. pmid:35185001
8. Jeblick K, Schachtner B, Dexl J, Mittermeier A, Stüber AT, Topalis J, et al. ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports. Eur Radiol. 2024;34(5):2817–25. pmid:37794249
9. Peters MDJ, Marnie C, Tricco AC, Pollock D, Munn Z, Alexander L, et al. Updated methodological guidance for the conduct of scoping reviews. JBI Evid Synth. 2020;18(10):2119–26. pmid:33038124
10. Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73. pmid:30178033
11. Lockwood C, Dos Santos KB, Pap R. Practical guidance for knowledge synthesis: scoping review methods. Asian Nurs Res (Korean Soc Nurs Sci). 2019;13(5):287–94. pmid:31756513
12. Samaan JS, Yeo YH, Rajeev N, Hawley L, Abel S, Ng WH, et al. Assessing the accuracy of responses by the language model ChatGPT to questions regarding bariatric surgery. Obes Surg. 2023;33(6):1790–6. pmid:37106269
13. Moazzam Z, Lima HA, Endo Y, Noria S, Needleman B, Pawlik TM. A paradigm shift: online artificial intelligence platforms as an informational resource in bariatric surgery. Obes Surg. 2023;33(8):2611–4. pmid:37322244
14. Aburumman R, Al Annan K, Mrad R, Brunaldi VO, Gala K, Abu Dayyeh BK. Assessing ChatGPT vs. standard medical resources for endoscopic sleeve gastroplasty education: a medical professional evaluation study. Obes Surg. 2024;34(7):2718–24. pmid:38758515
15. Srinivasan N, Samaan JS, Rajeev ND, Kanu MU, Yeo YH, Samakar K. Large language models and bariatric surgery patient education: a comparative readability analysis of GPT-3.5, GPT-4, Bard, and online institutional resources. Surg Endosc. 2024;38(5):2522–32. pmid:38472531
16. Samaan JS, Rajeev N, Ng WH, Srinivasan N, Busam JA, Yeo YH, et al. ChatGPT as a source of information for bariatric surgery patients: a comparative analysis of accuracy and comprehensiveness between GPT-4 and GPT-3.5. Obes Surg. 2024;34(5):1987–9. pmid:38564173
17. Sanders A, Lim R, Jones D, Vosburg RW. Artificial intelligence large language model scores highly on focused practice designation in metabolic and bariatric surgery board practice questions. Surg Endosc. 2024;38(11):6678–81. pmid:39317906
18. Ozmen BB, Berber I, Dang JT, Schwarz GS, Kroh M. Development of a bariatric surgery specific artificial intelligence large language model: BariatricSurgeryGPT. Surg Innov. 2026;33(3):276–82. pmid:41260227
19. Annor E, Atarere J, Ubah N, Jolaoye O, Kunkle B, Egbo O, et al. Assessing online chat-based artificial intelligence models for weight loss recommendation appropriateness and bias in the presence of guideline incongruence. Int J Obes (Lond). 2025;49(5):896–901. pmid:39871015
20. Kahlon S, Sleet M, Sujka J, Docimo S, DuCoin C, Dimou F, et al. Evaluating the concordance of ChatGPT and physician recommendations for bariatric surgery. Can J Physiol Pharmacol. 2025;103(2):70–4. pmid:39561352
21. Leng Y, Yang Y, Liu J, Jiang J, Zhou C. Evaluating the feasibility of ChatGPT-4 as a knowledge resource in bariatric surgery: a preliminary assessment. Obes Surg. 2025;35(2):645–50. pmid:39821906
22. Guo S, Yang C-L, Lin X-P, Jiang M, Chen J, Tuo K-X, et al. Evaluating artificial intelligence-generated patient education materials for bariatric surgery: comparative analysis of response quality, reliability, and readability across ChatGPT and DeepSeek models. Obes Surg. 2025;35(11):4628–38. pmid:41014443
23. Mahjoubi M, Shahabi S, Sheikhbahaei S, Jazi AHD. Evaluating AI capabilities in bariatric surgery: a study on ChatGPT-4 and DALL·E 3’s recognition and illustration accuracy. Obes Surg. 2025;35(2):638–41. pmid:39733375
24. Lopez-Gonzalez R, Sanchez-Cordero S, Pujol-Gebellí J, Castellvi J. Evaluation of the impact of ChatGPT on the selection of surgical technique in bariatric surgery. Obes Surg. 2025;35(1):19–24. pmid:38760650
25. Sanchez-Cordero S, Lopez-Gonzalez R, Fernandez H, Pujol-Gebellí J. Training ChatGPT for surgical decisions: bariatric surgery analysis using algorithms and evidence. Obes Res Clin Pract. 2025;19(4):352–5. pmid:40817014
26. Aksoy E. The performance of artificial intelligence in one anastomosis gastric bypass surgery: comparative efficacy of ChatGPT-4.0, ChatGPT-Omni, and Gemini AI. Obes Surg. 2025;35(4):1469–75. pmid:40100615
27. Kumru Yildirim M, Bayam H, Yildiz N, Kahraman Gök F, Batar N. Evaluation of nutritional recommendations provided by the ChatGPT language model to bariatric surgery patients. Rom J Intern Med. 2025;63(2):185–7. pmid:40544494
28. Lee Y, Shin T, Tessier L, Javidan A, Jung J, Hong D, et al. Harnessing artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in generating clinician-level bariatric surgery recommendations. Surg Obes Relat Dis. 2024;20(7):603–8. pmid:38644078
29. Lee Y, Tessier L, Brar K, Malone S, Jin D, McKechnie T, et al. Performance of artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in the American Society for Metabolic and Bariatric Surgery textbook of bariatric surgery questions. Surg Obes Relat Dis. 2024;20(7):609–13. pmid:38782611
30. Jazi AHD, Mahjoubi M, Shahabi S, Alqahtani AR, Haddad A, Pazouki A, et al. Bariatric evaluation through AI: a survey of expert opinions versus ChatGPT-4 (BETA-SEOV). Obes Surg. 2023;33(12):3971–80. pmid:37889368
31. Vedder K, Blank S, Wilhelm T, Fidan D, Pachkiv I, Cao H, et al. Comparative analysis of large language model and physician-generated responses in bariatric patient inquiries: assessing the accuracy and patient satisfaction. Obes Surg. 2025;35(9):3801–9. pmid:40748576
32. Bigolin AV, Nunes Chibiaque de Lima J, Machado Grossi JV, Hartmann Rost I, Machado da Rosa M, Piccolotto Concolatto F. Could ChatGPT be a tool capable of providing qualified, empathetic, and assertive answers to patients after bariatric surgery? A comparative analysis of its versions. Obes Surg. 2025;35(11):4605–11. pmid:41085912
33. Goudrar R, Zekraoui O, Moussa I, Nguyen D-D, Bouhadana D, Li T, et al. Large language model chatbots for patient education in kidney stones: a scoping review. World J Urol. 2025;43(1):641. pmid:41160267
34. de Menezes Torres LM, de Morais EF, Fernandes Almeida DR de M, Pagotto LEC, de Santana Santos T. The impact of the large language model ChatGPT in oral and maxillofacial surgery: a systematic review. Br J Oral Maxillofac Surg. 2025;63(5):357–62. pmid:40251084
35. Cohen L, Gupta AR, Patel P, Gill GS, Bains H, Gupta S. The role of large language models in ophthalmology: a review of current applications, performance, and future directions. Cureus. 2025;17(11):e97374. pmid:41431521
36. Sanker V, Nordin EOR, Heesen P, Elfadali MA, Anwar M, Chintapalli RD, et al. Current trends and future prospects of language models and processing systems in spine surgery - a scoping review. Neurosurg Rev. 2025;48(1):633. pmid:40911114
37. Dahiya DS, Ali H, Moond V, Shah MDA, Santana C, Ali N, et al. Large language models in gastroenterology and gastrointestinal surgery: a new frontier in patient communication and education. Gastroenterology Res. 2025;18(2):39–48. pmid:40322195
