ACCURACY OF CHINESE PINYIN USING GOOGLE TRANSLATE, BAIDU TRANSLATE, AND CHATGPT TRANSLATION TOOLS

Authors

  • Preeyada Aisamee, School of Sinology, Mae Fah Luang University, Chiang Rai 57100, Thailand
  • Apirak Nusitchaiyakarn, School of Sinology, Mae Fah Luang University, Chiang Rai 57100, Thailand
  • Kanchaporn Siriwat, School of Sinology, Mae Fah Luang University, Chiang Rai 57100, Thailand

Keywords:

Pinyin, Baidu Translate, Google Translate, ChatGPT, artificial intelligence

Abstract

The increasing use of AI-based translation tools in Chinese language learning has raised concerns regarding the accuracy of Hanyu Pinyin transcription, which plays a crucial role in pronunciation and literacy development. This study aims to analyze transcription errors in Chinese Pinyin generated by Google Translate, Baidu Translate, and ChatGPT, and to compare the accuracy of Pinyin transcription among these tools. The data were drawn from Hanyu Jiaocheng Textbook 1A (汉语教程第一册上; 14 lessons) and Textbook 1B (汉语教程第一册下; 10 lessons), and Pinyin accuracy was evaluated with reference to the Basic Rules of Hanyu Pinyin Orthography (GB/T 16159-2012). The findings revealed a total of 120 transcription errors: 36 errors (30%) from Google Translate, 73 errors (60.83%) from Baidu Translate, and 11 errors (9.17%) from ChatGPT. By category, 12 instances (10%) involved two- or three-syllable lexical compounds, 34 instances (28.33%) involved retroflex finals, 3 instances (2.50%) involved verb usage, 16 instances (13.33%) concerned place names, 25 instances (20.83%) involved sentence-initial capitalization, 22 instances (18.33%) involved the use of Roman letters, and 8 instances (6.67%) involved punctuation marks. The results indicate that AI-based translation tools differ substantially in both accuracy and error patterns. ChatGPT produced the most accurate transcriptions and preserved the original meaning most closely, followed by Google Translate, while Baidu Translate exhibited the highest rate of inaccuracy, particularly in Pinyin word segmentation and sentence-initial capitalization. These findings suggest that ChatGPT is currently the most suitable tool for Chinese phonetic transcription in language learning contexts and provide insights for improving AI-based translation technologies to better align with established linguistic standards.
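The percentages reported in the abstract can be reproduced from the raw counts. The following is a minimal sketch, not the authors' procedure: the counts are copied from the abstract, while the dictionary names and the helper `shares` are hypothetical names introduced here for illustration.

```python
# Error counts exactly as reported in the abstract.
errors_by_tool = {
    "Google Translate": 36,
    "Baidu Translate": 73,
    "ChatGPT": 11,
}

errors_by_category = {
    "two- or three-syllable compounds": 12,
    "retroflex finals": 34,
    "verb usage": 3,
    "place names": 16,
    "sentence-initial capitalization": 25,
    "Roman letters": 22,
    "punctuation marks": 8,
}


def shares(counts):
    """Return each count as a percentage of the total, rounded to 2 decimals.

    Hypothetical helper; the abstract does not describe how the
    percentages were computed.
    """
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


tool_shares = shares(errors_by_tool)
category_shares = shares(errors_by_category)

# Both breakdowns should account for the same 120 errors.
total_by_tool = sum(errors_by_tool.values())
total_by_category = sum(errors_by_category.values())

for name, pct in tool_shares.items():
    print(f"{name}: {errors_by_tool[name]} errors ({pct}%)")
for name, pct in category_shares.items():
    print(f"{name}: {errors_by_category[name]} instances ({pct}%)")
```

Both breakdowns sum to the same total of 120 errors, so the per-tool and per-category figures in the abstract are mutually consistent.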

References

Charoensuk, N. (2018). The impact of using Chinese phonetic transcription on Thai learners’ pronunciation. Journal of Humanities, Srinakharinwirot University, 34, 121–134. (in Thai)

Electronic Transactions Development Agency. (2021, May 22). Artificial intelligence in government services. https://www.etda.or.th/th/Useful-Resource/Knowledge-Sharing/Articles/AI-in-Government-Services.aspx (in Thai)

He, Z. (2015, July 31). Baidu Translate: Research and products. In Proceedings of the ACL 2015 Fourth Workshop on Hybrid Approaches to Translation (HyTra) (pp. 61–62). Association for Computational Linguistics. https://aclanthology.org/W15-4110.pdf (in English)

Ketphan, K. (2016). Attitudes, behaviors, and problems in using Google Translate. Journal of Liberal Arts, Ubon Ratchathani Rajabhat University, 8(2), 147–158. https://so03.tci-thaijo.org/index.php/journal-la/article/download/61858/50987/143556 (in Thai)

Khawbanpaew, K. (2022). An analysis of problems and quality assessment of Thai-Chinese translation using Baidu Translate and Google Translate. Journal of Liberal Arts, Maejo University, 10(1), 109–133. (in Thai)

Liu, X. (2010). Introduction to teaching Chinese as a foreign language. Beijing Language and Culture University Press. (in Chinese)

National Standard GB/T 16159-2012. (2012). Hanyu pinyin zheng cifa jiben guize [Basic Rules of Hanyu Pinyin Orthography]. https://www.sycm.edu.cn/display_son.aspx?DWid=141&Nid=28555&Vid=-1

Pan, Y., & Li, H. (2022). Error analysis of Pinyin and lexical translation in online machine translation tools. Chinese Language Education and Technology Journal, 9(2), 41–52.

Phisetsakunwong, B. (2015). A problem of pronouncing Chinese phonetic alphabets of Kanchanaburi Rajabhat University students [Research report]. Kanchanaburi Rajabhat University. (in Thai)

Royal Society of Thailand. (2020). Dictionary of linguistic terms (Royal Institute edition). Royal Society of Thailand. (in Thai)

Wannasinthop, S. (2011). A study of factors affecting the pronunciation of consonants, vowels, and tones in Mandarin Chinese among students majoring in Chinese and Traditional Chinese Medicine at Huachiew Chalermprakiet University. https://dric.nrct.go.th/Search/SearchDetail/249129 (in Thai)

Yang, H. (Ed.). (2006). Hanyu jiaocheng yi shang (Di san ban) [Chinese Course, Book 1A (3rd ed.)]. Beijing Language and Culture University Press. (in Chinese)

Yang, H. (Ed.). (2006). Hanyu jiaocheng yi xia (Di san ban) [Chinese Course, Book 1B (3rd ed.)]. Beijing Language and Culture University Press. (in Chinese)

Yu, L. (2024). Lexical diversity and syntactic complexity in ChatGPT translations. Foreign Language Teaching and Research, 56(2), 297–307+321. https://doi.org/10.19923/j.cnki.fltr.2024.02.005 (in Chinese)

Zhang, L. (2023). A comparative analysis of translation accuracy between Google and Baidu Translators in Chinese–English contexts. Modern Linguistics, 6(3), 74–85.

Zhong, Y., & Chulniam, N. (2023). The study of students’ reading ability in Mandarin through a blended supplementary learning arrangement together with the Mandarin phonetic reading skills exercise (Pinyin). MBU Education Journal, 11(2), 11–20. (in Thai)

Published

2026-06-18

How to Cite

Aisamee, P., Nusitchaiyakarn, A., & Siriwat, K. (2026). ACCURACY OF CHINESE PINYIN USING GOOGLE TRANSLATE, BAIDU TRANSLATE, AND CHATGPT TRANSLATION TOOLS. Journal of Sinology (วารสารจีนวิทยา), 20(1), 78–101. Retrieved from https://so16.tci-thaijo.org/index.php/JSINO/article/view/3097

Section

Research Article
