
English translation: abstract of a paper on Chinese word segmentation (中文分词)

Question details
Translate the following into English:
中文分词是中文信息处理的基础.在自然语言理解、语言文字研究、中文文本自动标引、信息检索、机器翻译等领域中,中文分词具有不可替代的作用.因此,中文分词的研究至关重要.
但是,中文分词的研究水平已经远落后于与它关联的相关技术,成为制约其它技术发展的瓶颈.中文分词的研究过程中遇到了以下问题:语言学方面的困难,新词的不断出现,歧义的判别,分词的标准不统一等;计算机方面的困难,没有合理的自然语言形式模型,没有有效方式对语义进行理解以及形式化等.这些问题将会制约着中文分词的发展.
本文在综合分析现有的中文分词研究成果,重点对基于图的中文分词进行研究,提出了基于S-EK图最短路径的中文分词.研究的主要内容如下:
1.对中文分词的主要的算法进行了研究,比较和分析了常用的三种分词算法:基于字符串匹配的分词算法,基于统计的分词算法和基于知识理解的分词算法,并对它们之间的优缺点进行了总结.最后文章还给出了中文分词的评测标准及其意义.
2.重点在有向图和中文分词结合方面进行了深入研究,对N-最短路径中文分词的算法中的有向图进行改进,提出了S-EK图,并采用N-元统计模型计算出一个词在一定的语境下的概率,并对该值做了平滑处理,把最后的结果作为S-EK图的边的权值.
3.基于S-EK图的优点提出了S-EK最短路径算法.该算法在与N-最短路径算法和Dijkstra算法进行对比,实验和理论推导均证明该算法有一定的优点和价值.
关键词:中文分词;信息处理;S-EK图;最短路径;统计模型
Answer
Chinese word segmentation is the foundation of Chinese information processing. It plays an irreplaceable role in natural language understanding, linguistic research, automatic indexing of Chinese text, information retrieval, machine translation, and related fields. Research on Chinese word segmentation is therefore of great importance.
However, research on Chinese word segmentation has fallen far behind the technologies that depend on it and has become a bottleneck restricting their development. This research faces the following problems. On the linguistic side: the continual emergence of new words, the resolution of ambiguity, and the lack of a unified segmentation standard. On the computational side: the lack of a sound formal model of natural language and of an effective way to understand and formalize semantics. These problems will constrain the development of Chinese word segmentation.
Building on a comprehensive analysis of existing research on Chinese word segmentation, this paper focuses on graph-based segmentation and proposes Chinese word segmentation based on the shortest path in an S-EK graph. The main contents of the study are as follows:
1. The main Chinese word segmentation algorithms are studied. Three commonly used classes of algorithm are compared and analyzed: segmentation based on string matching, segmentation based on statistics, and segmentation based on knowledge and understanding. Their respective advantages and disadvantages are summarized. The paper also presents the evaluation criteria for Chinese word segmentation and their significance.
2. The combination of directed graphs and Chinese word segmentation is studied in depth. The directed graph used in the N-shortest-path segmentation algorithm is improved to obtain the S-EK graph. An N-gram statistical model is used to compute the probability of a word in a given context; this value is smoothed, and the result is used as the weight of the corresponding edge in the S-EK graph.
3. Building on the advantages of the S-EK graph, the S-EK shortest-path algorithm is proposed. The algorithm is compared with the N-shortest-path algorithm and Dijkstra's algorithm; both experiments and theoretical derivation show that it has certain advantages and value.
Keywords: Chinese word segmentation; information processing; S-EK graph; shortest path; statistical model
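
For readers unfamiliar with graph-based segmentation, the general idea in points 2 and 3 can be sketched in Python as follows. This is not the S-EK graph or the S-EK shortest-path algorithm, which the abstract does not define. It is a generic word graph with a plain shortest-path search, and it makes several simplifying assumptions: a unigram model stands in for the N-gram model, add-one smoothing stands in for the unspecified smoothing method, and the names build_edges, segment, counts and max_len are hypothetical.

```python
import math


def build_edges(sentence, counts, max_len=4):
    """Build a word graph: edges[i] lists (j, cost) for candidate word sentence[i:j]."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen words
    n = len(sentence)
    edges = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = sentence[i:j]
            # Keep dictionary words, plus every single character so that
            # at least one path from node 0 to node n always exists.
            if word in counts or j == i + 1:
                # Add-one smoothed unigram probability (an assumption; the
                # abstract uses an N-gram model with unspecified smoothing).
                prob = (counts.get(word, 0) + 1) / (total + vocab)
                edges[i].append((j, -math.log(prob)))
    return edges


def segment(sentence, counts, max_len=4):
    """Return the segmentation whose path through the word graph is cheapest."""
    n = len(sentence)
    edges = build_edges(sentence, counts, max_len)
    best = [math.inf] * (n + 1)  # best[k]: cheapest cost to reach position k
    back = [0] * (n + 1)         # back[k]: start of the last word ending at k
    best[0] = 0.0
    # Every edge goes from a smaller to a larger position, so a left-to-right
    # pass visits nodes in topological order and relaxes each edge once.
    for i in range(n):
        for j, cost in edges[i]:
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                back[j] = i
    words = []
    j = n
    while j > 0:
        i = back[j]
        words.append(sentence[i:j])
        j = i
    return words[::-1]


# Toy counts, invented purely for illustration.
toy_counts = {"研究": 5, "研究生": 2, "生命": 5, "命": 1, "起源": 3}
print(segment("研究生命起源", toy_counts))  # ['研究', '生命', '起源']
```

Using negative log-probabilities as edge weights turns the most probable segmentation into the shortest path. Because the word graph is acyclic, a single left-to-right pass is enough here; Dijkstra's algorithm or an N-shortest-path search, both mentioned in the abstract, would operate on the same graph.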