Tibetan-Chinese comparable corpus extraction is foundational work for Tibetan-Chinese cross-language question answering, information retrieval, machine translation, and related research. This paper explores how to address the scarcity of Tibetan-Chinese comparable corpora, which should in turn promote knowledge sharing between languages. We propose a method for extracting Tibetan-Chinese comparable corpora. The main contributions are: (1) a Tibetan-Chinese comparable corpus extraction model based on multiple features of bilingual websites; (2) an extraction method based on entity links from naturally annotated resources. The experimental results show that our approach is effective.