To address the context independence and data sparsity of lexicalized reordering models in machine translation, this paper proposes a reordering-table reconstruction model that predicts reordering orientation and probability from semantic content. First, a continuous distributed representation is used to obtain feature vectors for the reordering rules. Then a recurrent neural network (RNN) predicts the reordering orientation and probability for each vectorized rule. Finally, the reordering table is filtered and reconstructed, which assigns the original reordering rules a more reasonable probability distribution, improves the accuracy of the reordering information in the model, and reduces the size of the reordering table, speeding up subsequent decoding. Experimental results show that applying the reconstruction model to a Chinese-Uyghur machine translation task improves the BLEU score by 0.39.
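The final filter-and-reconstruct step can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not give the orientation set, the filtering criterion, or the network details, so everything below is an assumption made for illustration. The three orientation classes (monotone, swap, discontinuous) are the ones commonly used in lexicalized reordering, the names `reconstruct_table`, `stub_predict`, and `min_confidence` are hypothetical, and the trained RNN is replaced by a stub function that returns fixed scores.

```python
from typing import Callable, Dict, Tuple

# Assumption: the three orientation classes commonly used in lexicalized
# reordering. The abstract does not say which classes the paper uses.
ORIENTATIONS = ("monotone", "swap", "discontinuous")

Rule = Tuple[str, str]      # (source phrase, target phrase)
Dist = Dict[str, float]     # orientation -> probability or raw score


def normalize(scores: Dist) -> Dist:
    """Turn non-negative scores into a probability distribution."""
    total = sum(scores.get(o, 0.0) for o in ORIENTATIONS)
    if total <= 0.0:
        # Fall back to a uniform distribution when there is no evidence.
        return {o: 1.0 / len(ORIENTATIONS) for o in ORIENTATIONS}
    return {o: scores.get(o, 0.0) / total for o in ORIENTATIONS}


def reconstruct_table(
    table: Dict[Rule, Dist],
    predict: Callable[[Rule], Dist],
    min_confidence: float = 0.5,
) -> Dict[Rule, Dist]:
    """Rebuild a reordering table from predicted distributions.

    Hypothetical criterion: a rule is kept only if the predictor's most
    likely orientation reaches `min_confidence`; kept rules receive the
    predicted distribution in place of their original one.
    """
    rebuilt: Dict[Rule, Dist] = {}
    for rule in table:
        predicted = normalize(predict(rule))
        if max(predicted.values()) >= min_confidence:
            rebuilt[rule] = predicted
    return rebuilt


# Toy data: two placeholder rules with uninformative original distributions.
toy_table: Dict[Rule, Dist] = {
    ("src_a", "tgt_a"): normalize({}),
    ("src_b", "tgt_b"): normalize({}),
}


def stub_predict(rule: Rule) -> Dist:
    """Stand-in for the trained RNN: returns fixed, made-up scores."""
    if rule == ("src_a", "tgt_a"):
        return {"monotone": 8.0, "swap": 1.0, "discontinuous": 1.0}
    return {"monotone": 4.0, "swap": 3.0, "discontinuous": 3.0}


rebuilt = reconstruct_table(toy_table, stub_predict, min_confidence=0.5)
print(rebuilt)
```

With these made-up scores, the first rule is kept with a sharper distribution and the second is filtered out, so the rebuilt table is smaller than the original. In a real system the stub would be replaced by the trained network, and the threshold would need to be tuned on held-out data.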