English translation: abstract on Chinese word segmentation

Question details
Translate into English:
中文分词是中文信息处理的基础.在自然语言理解、语言文字研究、中文文本自动标引、信息检索、机器翻译等领域中,中文分词具有不可替代的作用.因此,中文分词的研究至关重要.
但是,中文分词的研究水平已经远落后于与它关联的相关技术,成为制约其它技术发展的瓶颈.中文分词的研究过程中遇到了以下问题:语言学方面的困难,新词的不断出现,歧义的判别,分词的标准不统一等;计算机方面的困难,没有合理的自然语言形式模型,没有有效方式对语义进行理解以及形式化等.这些问题将会制约着中文分词的发展.
本文在综合分析现有的中文分词研究成果,重点对基于图的中文分词进行研究,提出了基于S-EK图最短路径的中文分词.研究的主要内容如下:
1.对中文分词的主要的算法进行了研究,比较和分析了常用的三种分词算法:基于字符串匹配的分词算法,基于统计的分词算法和基于知识理解的分词算法,并对它们之间的优缺点进行了总结.最后文章还给出了中文分词的评测标准及其意义.
2.重点在有向图和中文分词结合方面进行了深入研究,对N-最短路径中文分词的算法中的有向图进行改进,提出了S-EK图,并采用N-元统计模型计算出一个词在一定的语境下的概率,并对该值做了平滑处理,把最后的结果作为S-EK图的边的权值.
3.基于S-EK图的优点提出了S-EK最短路径算法.该算法在与N-最短路径算法和Dijkstra算法进行对比,实验和理论推导均证明该算法有一定的优点和价值.
关键词:中文分词;信息处理;S-EK图;最短路径;统计模型
Best answer
Answer and explanation
Chinese word segmentation is the foundation of Chinese information processing. It plays an irreplaceable role in natural language understanding, linguistic research, automatic indexing of Chinese text, information retrieval, machine translation, and related fields. Research on Chinese word segmentation is therefore of great importance.
However, research on Chinese word segmentation has fallen far behind the technologies that depend on it and has become a bottleneck restricting their development. The research has run into the following problems. On the linguistic side: the constant emergence of new words, the resolution of ambiguity, and the lack of a unified segmentation standard. On the computational side: the absence of a sound formal model of natural language and of an effective way to understand and formalize semantics. These problems will constrain the development of Chinese word segmentation.
Building on a comprehensive analysis of existing research on Chinese word segmentation, this paper focuses on graph-based Chinese word segmentation and proposes a segmentation method based on the shortest path in an S-EK graph. The main contents of the study are as follows:
1. The main Chinese word segmentation algorithms are studied. Three commonly used families are compared and analyzed: segmentation based on string matching, segmentation based on statistics, and segmentation based on knowledge understanding. Their respective advantages and disadvantages are summarized. The paper also presents the evaluation criteria for Chinese word segmentation and their significance.
2. The combination of directed graphs and Chinese word segmentation is studied in depth. The directed graph used in the N-shortest-path segmentation algorithm is improved, yielding the S-EK graph. An N-gram statistical model is used to compute the probability of a word in a given context, the value is smoothed, and the result is used as the weight of the corresponding edge in the S-EK graph.
3. Building on the advantages of the S-EK graph, the S-EK shortest-path algorithm is proposed. The algorithm is compared with the N-shortest-path algorithm and Dijkstra's algorithm; both experiments and theoretical derivation show that it has certain advantages and value.
Keywords: Chinese word segmentation; information processing; S-EK graph; shortest path; statistical model
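
The graph-based idea in item 2 can be illustrated with a minimal sketch. This is not the S-EK algorithm, whose details the abstract does not give. It assumes a toy dictionary `counts` with invented frequencies, and it weights edges with add-one-smoothed unigram probabilities rather than a full N-gram model. The names `segment`, `counts`, and `max_len` are hypothetical. Each position between characters is a node, each candidate word is an edge, and the segmentation is the path with the lowest total negative log-probability.

```python
import math


def segment(text, counts, max_len=4):
    """Segment text by finding the cheapest path through a word graph."""
    total = sum(counts.values())
    vocab = len(counts)

    def cost(word):
        # Add-one (Laplace) smoothing: unseen words still get a nonzero
        # probability, so their edge weight stays finite.
        p = (counts.get(word, 0) + 1) / (total + vocab + 1)
        return -math.log(p)

    n = len(text)
    best = [math.inf] * (n + 1)  # best[i]: lowest cost to reach position i
    back = [0] * (n + 1)         # back[i]: start of the last word on that path
    best[0] = 0.0

    for i in range(n):
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = text[i:j]
            # Edges are dictionary words, plus every single character as a
            # fallback so that a path to the end always exists.
            if j - i > 1 and word not in counts:
                continue
            c = best[i] + cost(word)
            if c < best[j]:
                best[j] = c
                back[j] = i

    # Walk the back-pointers from the end to recover the words.
    words = []
    j = n
    while j > 0:
        i = back[j]
        words.append(text[i:j])
        j = i
    return words[::-1]


# Toy frequencies, invented for illustration only.
counts = {"研究": 20, "研究生": 5, "生命": 15, "起源": 10, "命": 2}
print(segment("研究生命起源", counts))
```

Because every edge points forward in the sentence, the graph is acyclic, so one left-to-right pass finds the shortest path without the priority queue that Dijkstra's algorithm would need. With these toy counts the path 研究 / 生命 / 起源 is cheaper than 研究生 / 命 / 起源, which shows how edge weights resolve an ambiguity that pure string matching cannot.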