Machine Translation and Transliteration Involving Related, Low-Resource Languages
Provisional Chinese title: 涉及相關低資源語言的機器翻譯與音譯
Kunchukuttan, Anoop, Bhattacharyya, Pushpak
- Publisher: CRC
- Publication date: 2021-08-13
- List price: $6,820
- VIP price: 5% off, $6,479
- Language: English
- Pages: 200
- Binding: Hardcover - also called cloth, retail trade, or trade
- ISBN: 0367561999
- ISBN-13: 9780367561994
Overseas special-order title (must be checked out separately)
Product description
Machine Translation and Transliteration involving Related, Low-resource Languages discusses an important aspect of natural language processing that has received less attention: translation and transliteration involving related languages in a low-resource setting. This is a common real-world scenario: people living in neighbouring states, provinces or countries speak similar languages and need to communicate with each other, yet the training data available for building supporting MT systems is limited. The book discusses different characteristics of related languages with rich examples and draws connections between two problems: translation for related languages and transliteration. It shows how linguistic similarities can be used to learn MT systems for related languages from limited data, and it comprehensively discusses the use of subword-level models and multilinguality to exploit these similarities. The second part of the book explores methods for machine transliteration involving related languages based on multilingual and unsupervised approaches. Extensive experiments over a wide variety of languages establish the efficacy of these methods.
Features
- Novel methods for machine translation and transliteration between related languages, supported with experiments on a wide variety of languages.
- An overview of past literature on machine translation for related languages.
- A case study on machine translation among 10 major related languages of India, one of the most linguistically diverse countries in the world.
The book presents important concepts and methods for machine translation involving related languages. More generally, it serves as a good reference on NLP for related languages. It is intended for students, researchers and professionals interested in Machine Translation, Translation Studies, Multilingual Computing Machine and Natural Language Processing. It can be used as reference reading for courses in NLP and machine translation.
Anoop Kunchukuttan is a Senior Applied Researcher at Microsoft India. His research spans various areas of multilingual and low-resource NLP. Pushpak Bhattacharyya is a Professor at the Department of Computer Science, IIT Bombay. His research areas are Natural Language Processing, Machine Learning and AI (NLP-ML-AI). Prof. Bhattacharyya has published more than 350 research papers in various areas of NLP.
About the authors
Dr. Anoop Kunchukuttan is a Senior Applied Researcher in the machine translation team at Microsoft India, Hyderabad. He received his Ph.D. from the Indian Institute of Technology Bombay. He is broadly interested in natural language processing and machine learning. His research interests include multilingual learning, language relatedness, machine translation, machine transliteration and distributional semantics. He has also explored problems in information extraction, automated grammar correction, multiword expressions and crowdsourcing for NLP. This work has been published in top-tier Natural Language Processing (NLP) conferences and journals. He is passionate about building software and resources for NLP in Indian languages. He actively develops and maintains the Indic NLP Library and the Indic NLP Catalog, and has contributed to the development of resources such as the AI4Bharat Indic NLP Suite and the IIT Bombay parallel corpus. He is a co-organizer of the Workshop on Asian Translation and a co-founder of the AI4Bharat NLP Initiative.
Dr. Pushpak Bhattacharyya is a Professor in the Department of Computer Science and Engineering, IIT Bombay. His research areas are Natural Language Processing, Machine Learning and AI (NLP-ML-AI). Prof. Bhattacharyya has published more than 350 research papers in various areas of NLP. His textbook 'Machine Translation' sheds light on all paradigms of machine translation with abundant examples from Indian languages. Two recent monographs that he co-authored, 'Investigations in Computational Sarcasm' and 'Cognitively Inspired Natural Language Processing: An Investigation Based on Eye Tracking', describe cutting-edge research in NLP and ML. Prof. Bhattacharyya is a Fellow of the Indian National Academy of Engineering (FNAE) and an Abdul Kalam National Fellow. For sustained contribution to technology he received the Manthan Award of the Ministry of IT, the P.K. Patwardhan Award of IIT Bombay and the VNMM Award of IIT Roorkee. He is also a Distinguished Alumnus of IIT Kharagpur and a past President of the Association for Computational Linguistics.