Machine translation is, in essence, a process of multi-level disambiguation, and prepositional phrase (PP) attachment ambiguity is one of the typical causes of structural ambiguity in it. This paper constructs a PP attachment disambiguation model based on an estimate similar to maximum likelihood estimation. The model uses the fine-grained part-of-speech subcategories of words, their semantic classes, and phrase structure information, and it also takes low-probability events into account. The model was trained on a real corpus of automotive parts text containing approximately 100,000 sentences, from which about 3,000 test examples were constructed; on these it achieved an accuracy of 93%. The disambiguation technique has been applied in a machine translation system for real controlled-language text on automotive parts, with good results.
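The abstract does not state the model's formula, so the following is only a minimal sketch of the general idea of likelihood-style estimation with a fallback for rare events, not the paper's actual model. It assumes a hypothetical representation in which each example is a (verb, noun, preposition, PP object) tuple labeled "V" (the PP attaches to the verb) or "N" (it attaches to the noun). Relative frequencies are estimated from counts, and when a specific context was never seen, the estimate backs off to a more general one. The class name, the backoff order, and the toy data are all assumptions made for illustration; the part-of-speech subcategories and semantic classes mentioned in the abstract are not modeled here.

```python
from collections import defaultdict


class BackoffPPAttacher:
    """Toy PP attachment classifier: relative-frequency estimates with backoff.

    Hypothetical illustration only; not the model described in the paper.
    """

    def __init__(self):
        # Maps a context key to counts of verb ("V") and noun ("N") attachments.
        self.counts = defaultdict(lambda: {"V": 0, "N": 0})

    @staticmethod
    def _contexts(verb, noun, prep, obj):
        # Contexts ordered from most specific to most general (assumed order).
        return [
            ("vnpo", verb, noun, prep, obj),
            ("vp", verb, prep),
            ("np", noun, prep),
            ("p", prep),
        ]

    def train(self, examples):
        # Each example is (verb, noun, prep, obj, label) with label "V" or "N".
        for verb, noun, prep, obj, label in examples:
            for ctx in self._contexts(verb, noun, prep, obj):
                self.counts[ctx][label] += 1

    def predict(self, verb, noun, prep, obj, default="N"):
        # Use the most specific context that was observed in training.
        for ctx in self._contexts(verb, noun, prep, obj):
            c = self.counts.get(ctx)
            if c is None:
                continue
            total = c["V"] + c["N"]
            if total > 0:
                p_verb = c["V"] / total  # relative-frequency estimate
                return ("V" if p_verb > 0.5 else "N"), p_verb
        # Nothing observed at any level: fall back to the default label.
        return default, 0.5


# Invented toy data in the spirit of an automotive parts corpus.
TOY_EXAMPLES = [
    ("install", "filter", "with", "wrench", "V"),
    ("install", "filter", "with", "gasket", "N"),
    ("replace", "pump", "with", "wrench", "V"),
    ("check", "hose", "of", "engine", "N"),
]

model = BackoffPPAttacher()
model.train(TOY_EXAMPLES)
label, p_verb = model.predict("replace", "belt", "with", "tool")
print(label, p_verb)
```

In this sketch the unseen tuple falls back to the verb-preposition context, which is one simple way of handling low-probability events; a real system would likely smooth or threshold the counts rather than trust a context seen only once.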