The study aimed to evaluate ChatGPT's capacity for ongoing clinical decision support by inputting all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and comparing its accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. The LLM demonstrated the highest performance in making a final diagnosis, with an accuracy of 76.9%, and the lowest performance in generating an initial differential diagnosis, with an accuracy of 60.3%. ChatGPT achieved an overall accuracy of 71.7% across all clinical vignettes, and its performance improved as more clinical information became available to it. The study also identified limitations such as possible model hallucinations and the unclear composition of ChatGPT's training dataset. This article was authored by Arya Rao, Michael Pang, John Kim, and others. We are article.tv, links in the description below.