This study analyzes the differences between medical texts written by human experts and those generated by ChatGPT, and designs machine learning workflows to effectively detect and differentiate ChatGPT-generated medical texts. The results show that human-written medical texts were more concrete and diverse and typically contained more useful information, whereas ChatGPT-generated medical texts emphasized fluency and logic and tended to use general terminology rather than information specific to the context of the question. A Bidirectional Encoder Representations from Transformers (BERT)-based model effectively detected ChatGPT-generated medical texts, achieving an F1 score exceeding 95%. The study provides a pathway toward the trustworthy and accountable use of large language models in medicine. This article was authored by Wenxiang Liao, Jinliang Lu, Haisheng Dai, and others.