An Analysis on Computer-Assisted Translation through Google Translator Toolkit

Source: 校园英语·上旬
  【Abstract】With the rapid development of artificial intelligence, computer-assisted translation (CAT) has become more popular than ever. This paper describes the problems and pitfalls encountered when translating on the Google Translator Toolkit platform. It also offers solutions to those problems and suggestions on how to improve CAT systems.
  【Key words】Analysis; Computer-Assisted; Translation
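  【About the author】Zhang Songsong (张嵩松), Guizhou Provincial Center for Foreign Agricultural Economic Cooperation (贵州省农业对外经济合作中心).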
  1. General Information
  This paper uses the Google Translator Toolkit platform to help translate legal documents and regulations from English to Chinese. It is widely acknowledged that legal documents and regulations in different industries or fields are quite similar in grammatical and sentence structure, so machine translation is well suited to aiding their translation (either from English to Chinese or from Chinese to English) if a translation memory and glossary of appropriate size are available. Therefore, the paper collects bilingual (Chinese and English) legal documents and regulations from the official websites of well-known companies such as Samsung, Apple and Philips. The collected text is compiled into a translation memory in which the English text amounts to 2,848 characters and the Chinese text to 4,068. In addition, some frequently used words and terms are selected to make a glossary.
  2. Translation Memory
  To serve the paper's goal of assisting the translation of legal documents and regulations, the translation memory of the CAT system contains 100 translation units, collected mainly from legal documents and regulations. The segment size of each translation unit is one sentence. The translation quality of each segment is professional because all translation units are taken from official translations, and the English and Chinese texts are fully aligned. Some examples from the translation memory are given below.
  (1)ENG:All information, documents, products and services, trademarks, logos, graphics, and images (“Materials”) provided on this site are copyrighted or trademarked and are the property of Samsung Group, Samsung Electronics and its listed subsidiaries.
  CHN:本网中所提供的所有信息、文件、产品, 以及服务、商标、logo、图形,以及图片(以上涉及内容以下简称为“资料”)都是具有版权或已经注册的商标,是三星集团、三星电子及其子公司的财产。
  In Example 1, the segment is a single sentence from the official Samsung website, and its translation is very faithful. In addition, the English and Chinese texts match almost perfectly.
  (2)ENG:In addition, you may not distribute an End User Product the purpose of which is to replay the courseware, presentations, interactive multimedia material, interactive entertainment products and the like of others.
  CHN:另外, 您不得为播放课件、演示文稿、交互式多媒体资料、交互式娱乐产品等目的而分发“最终用户产品”。
  Example 2 shows that the sentence structure of the Chinese (target language) differs from that of the English (source language) because the English sentence contains a clause. Nevertheless, the translation is professional despite the difference in grammatical structure.
  The data in the translation memory come from the official websites of first-class enterprises. Therefore, the source-language (English) and target-language (Chinese) texts are accurate, and the quality of translation can be relied on.
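  The structure described in this section can be illustrated with a minimal Python sketch: a translation memory held as a list of sentence-level translation units, with a lookup for 100% matches. The names `TranslationUnit`, `normalize`, `build_index` and `exact_match` are hypothetical and are not part of Google Translator Toolkit; the sample unit is Example 2 above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TranslationUnit:
    """One sentence-level segment pair (hypothetical structure)."""
    source: str  # English segment
    target: str  # Chinese segment


def normalize(segment: str) -> str:
    """Collapse whitespace and lower-case so trivial differences still match."""
    return " ".join(segment.split()).lower()


def build_index(units: List[TranslationUnit]) -> Dict[str, str]:
    """Map each normalized source sentence to its stored translation."""
    return {normalize(u.source): u.target for u in units}


def exact_match(index: Dict[str, str], segment: str) -> Optional[str]:
    """Return the stored translation for a 100% match, or None."""
    return index.get(normalize(segment))


# Example 2 from the translation memory described in the paper.
EXAMPLE_UNIT = TranslationUnit(
    source=(
        "In addition, you may not distribute an End User Product the purpose "
        "of which is to replay the courseware, presentations, interactive "
        "multimedia material, interactive entertainment products and the "
        "like of others."
    ),
    target="另外, 您不得为播放课件、演示文稿、交互式多媒体资料、交互式娱乐产品等目的而分发“最终用户产品”。",
)

tm_index = build_index([EXAMPLE_UNIT])
print(exact_match(tm_index, EXAMPLE_UNIT.source))
```

  Because each unit is a whole sentence, a new sentence is reused only if it matches a stored sentence in full, which is why a small sentence-level memory yields few matches on unseen legal text.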
  3. Glossary
  It is well known that legal and regulatory translation involves many terms and set translations. Hence, frequently used English and Chinese words and terms need to be collected to assist machine translation of legal documents.
  The glossary contains 104 frequently used words and terms selected from the translation memory. The quality of translation in the glossary is also professional and reliable because the glossary has the same source as the translation memory. Some examples are as follows:
  copyright laws: 版权法; trademark law: 商标法; laws of privacy and communications statutes: 通信条例; patent laws: 专利法. These terms are all names of laws, so their translations are fixed. Collecting such terms can therefore improve the efficiency and accuracy of translating legal documents and regulations.
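  As a rough illustration of how a glossary can support translation, the Python sketch below looks up fixed terms in a source segment. The glossary entries are taken from the examples above; the function `find_terms` and the sample sentence are hypothetical, and the sketch does not reflect how Google Translator Toolkit applies a glossary internally.

```python
import re
from typing import Dict, List, Tuple

# Entries taken from the glossary examples in this section.
GLOSSARY: Dict[str, str] = {
    "copyright laws": "版权法",
    "trademark law": "商标法",
    "patent laws": "专利法",
}


def find_terms(segment: str, glossary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (term, translation) pairs for glossary terms found in the segment.

    Longer terms are checked first so that multi-word terms are listed
    ahead of shorter ones.
    """
    lowered = segment.lower()
    hits = []
    for term in sorted(glossary, key=len, reverse=True):
        # Whole-word match, so a term is not found inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((term, glossary[term]))
    return hits


# Hypothetical sample sentence, not taken from the evaluation text.
sample = "This site is protected by copyright laws and patent laws."
for term, translation in find_terms(sample, GLOSSARY):
    print(term, "->", translation)
```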
  4. Evaluation of the System
  The purpose of this evaluation is to assess the translation result of the CAT system. It is clear that the performance of the CAT system is affected by the data collected in translation memory.
  To test the performance of the CAT system on legal documents, an appropriate legal text is needed for evaluation. The sample chosen is extracted from the official website of Philips (http://www.philips.com.cn/) and includes 25 translation units with a total of 801 English words. After logging in to Google Translator Toolkit, the translation memory and the glossary are uploaded; both consist of high-quality Chinese and English texts translated by professionals. The sample text is then uploaded to Google Translator Toolkit to obtain the machine translation (MT) result, which is compared with the professional translation.
  The quality of the MT output is not ideal. Only 15% of the pre-translation comes from human translation, mainly from the uploaded glossary, and 85% (709 words) comes from machine translation. In addition, only eight words are TM 100% matches, and there are no in-context matches, no ‘High fuzzy’ TM matches and no repeated text. Texts with unambiguous vocabulary and simple sentence structure and grammar often yield understandable machine translations, allowing readers to grasp the general idea of the source text. Nevertheless, texts with terminology, long and complex sentence structures and different punctuation can be translated wrongly. The paper sets the “sentence” as the translation unit, so the translation result is not good enough. If the “word” were set as the translation unit instead, the translation quality might improve considerably.
  No pre-translation drawn from professional translation can be found, because the translation memory is not large enough. The paper selects the following relatively good translations:
  (1)The source text is “Philips is a registered trademark of Philips electronics.” and the machine translation result is “飞利浦是飞利浦电子公司的注册商标。” The machine translation is good not only in wording but also in sentence structure, and readers can easily accept it.
  (2)“Please contact your local Philips business contact for further information.” is translated as “请联系您当地的飞利浦进一步信息的业务联系。” Although the translation reads unnaturally, the meaning of the source text can still be understood.
  In fact, even these relatively good translations suggested by Google Translator Toolkit are not good enough. The reasons are discussed below:
  No perfect match or even fuzzy match can be found in either the globally shared translation memory or the uploaded translation memory, because the segment size is the sentence and the sample belongs to the genre of legal documents and regulations, in which the repetition rate is nearly zero. In addition, the translation memory is not large enough to contain relevant previous translations. Moreover, legal documents and regulations involve many grammatically complex and extraordinarily long sentences and a slew of terms, so Google Translator Toolkit is unsuitable for legal translation unless it is equipped with an adequate legal translation memory and glossary. Short sentences with few terms and unambiguous words can be translated properly by the toolkit, and the resulting translation is acceptable and readable. However, if a sentence is too long and complex, the toolkit cannot translate it properly, and the output may not even be readable. For example:
  (1)The source text is “In such case, such exclusions or limitations shall be limited to the greatest extent permitted by applicable law.” The machine translation is “这种情况下,这种排除或限制,应仅限于适用法律所允许的最大程度。” and the suggested translation is “在此情况下,此类例外或限制仅限于适用法律所要求的范围。” In this case, the machine translation is readable but incomprehensible: it makes little sense in Chinese, and its meaning is far from that of the source text.
  (2)The source text is “Philips is in no way responsible for the content of any site owned by a third party that may be linked to the web site via hyperlink, whether or not such hyperlink is provided by the web site or by a third party in accordance with the terms of use.” The machine translation is “飞利浦是绝不可能通过链接的网站链接到由第三方拥有的任何网站的内容负责,不论这种超链接网站或由第三方提供,按照条款使用。” and the suggested translation is “飞利浦对通过超链接连接到本网站的任何第三方所属站点的内容概不负责,无论此类超链接是由本网站还是由第三方根据使用条款提供。” In this example, the structure and grammar of the sentence are more complex than in example (1), and the machine translation is simply a word-for-word rendering that is neither acceptable nor readable in Chinese.
  Therefore, the quality of machine translation is constrained by that of the source text. A text with proper grammar and unambiguous wording often leads to a reliable machine translation, whereas a text with slang, misspelled or ambiguous words, or complex and lengthy sentences can easily be translated incorrectly.
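  The absence of fuzzy matches discussed in this section can be illustrated with a small Python sketch. The paper does not describe how Google Translator Toolkit scores matches, so the word-level ratio from Python's standard `difflib` module is used here purely as a stand-in; the function names and the 0.75 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def similarity(a: str, b: str) -> float:
    """Word-level similarity between two sentences, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_fuzzy_match(
    segment: str, tm_sources: List[str], threshold: float = 0.75
) -> Optional[Tuple[str, float]]:
    """Return the most similar stored sentence and its score, or None
    if nothing reaches the (assumed) threshold."""
    best: Optional[Tuple[str, float]] = None
    for stored in tm_sources:
        score = similarity(segment, stored)
        if best is None or score > best[1]:
            best = (stored, score)
    if best is None or best[1] < threshold:
        return None
    return best


# A one-sentence memory holding a sentence quoted earlier in this section.
tm_sources = ["Philips is a registered trademark of Philips electronics."]

# A sentence from the evaluation sample that shares no words with the memory.
query = (
    "In such case, such exclusions or limitations shall be limited to the "
    "greatest extent permitted by applicable law."
)
print(best_fuzzy_match(query, tm_sources))
```

  With sentence-level segments and a nearly zero repetition rate, most new sentences share too few words with any stored sentence to reach a usable threshold, which is consistent with the results reported above.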
  5. Conclusion
  Legal documents and regulations are considered suitable for machine translation because of their standard, formal grammar, unambiguous language style, and set words and terminology. However, legal translation demands the highest translation quality. The translation quality in this paper is unsatisfactory for three reasons: 1. an inadequate translation memory and glossary; 2. an inappropriate segment size in the translation memory; 3. the limitations of Google Translator Toolkit.
  Therefore, to improve the CAT system, the segment size should first be reset from sentence to word, which can help machine translation become more accurate and precise, and the translation memory should be enlarged with more terms and words to offer the machine more references. Google Translator Toolkit cannot translate long and complex sentences properly; however, it does help improve translation efficiency, and its sharing system is quite useful and convenient for translators.