1.
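Aiming at processing large-scale real-world text, this paper decomposes syntactic parsing into two parts: word segmentation/part-of-speech (POS) tagging and phrase recognition. It first proposes an integrated segmentation/POS tagging method that introduces lexical information on top of a hidden Markov model (HMM), retaining the simplicity and speed of the HMM while effectively improving tagging accuracy. It then applies the head-driven model, a lexicalized parsing model for English, to phrase recognition, combining it with the segmentation/POS tagging model to parse Chinese. The parser was evaluated on a public test set, reaching a precision of 77.57% and a recall of 74.96%, clearly better than the only comparable work to date.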
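As a rough illustration of the HMM tagging step mentioned in this abstract, the following is a minimal sketch of plain Viterbi decoding. It is not the paper's lexicalized model: the function name `viterbi`, the two-tag set, and all probabilities are hypothetical toy values chosen only to make the example runnable.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-8):
    """Return the most probable tag sequence under a plain first-order HMM.

    All tables are hypothetical toy inputs; unseen events fall back to `floor`.
    """
    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(p if p > 0 else floor)

    # best[i][t]: log-score of the best path ending in tag t at position i.
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy model (assumed values, not taken from the paper).
TAGS = ["N", "V"]
START = {"N": 0.7, "V": 0.3}
TRANS = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
EMIT = {
    "N": {"我们": 0.5, "研究": 0.1, "语言": 0.4},
    "V": {"我们": 0.05, "研究": 0.9, "语言": 0.05},
}
```

With these toy tables, calling `viterbi(["我们", "研究", "语言"], TAGS, START, TRANS, EMIT)` tags the middle word as a verb and the other two as nouns.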
2.
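An N-gram-based automatic Chinese word segmentation system is proposed that combines segmentation with tagging and uses POS tagging to help evaluate segmentation results. First, the N best results are generated as a candidate set using a dictionary and a unigram statistical model; next, the candidates are POS-tagged with a bigram statistical model; finally, contextual "understanding" information about the text is used to select the best segmentation. Experimental results show that feedback from POS tagging effectively improves segmentation accuracy.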
3.
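A novel part-of-speech tagging model (total citations: 4; self-citations: 4; citations by others: 0)
This article is the first to propose a statistical model called the Markov family model. The model assumes that the probability of a word's occurrence depends both on the POS tag of the current word and on the preceding word, but that the preceding word and the word's POS tag are conditionally independent given the word. Suitably simplified, the Markov family model can be applied successfully to POS tagging. Experiments show that, under the same test conditions, this tagging method achieves a much higher success rate than the conventional HMM-based method. The Markov family model also has broad application prospects in other areas of natural language processing, such as word segmentation, syntactic parsing, speech recognition, and machine translation.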
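The conditional-independence assumption stated in this abstract can be worked out as follows. This is a derivation sketch from the abstract's wording only; the notation ($w_i$ for the current word, $w_{i-1}$ for the preceding word, $t_i$ for the current tag) is an assumption and may differ from the paper's.

```latex
% Assumption from the abstract: w_{i-1} and t_i are conditionally independent given w_i,
%   P(w_{i-1}, t_i \mid w_i) = P(w_{i-1} \mid w_i)\, P(t_i \mid w_i).
\begin{align*}
P(w_i \mid w_{i-1}, t_i)
  &= \frac{P(w_i)\, P(w_{i-1}, t_i \mid w_i)}{P(w_{i-1}, t_i)} \\
  &= \frac{P(w_i)\, P(w_{i-1} \mid w_i)\, P(t_i \mid w_i)}{P(w_{i-1})\, P(t_i \mid w_{i-1})} \\
  &= \frac{P(w_i \mid w_{i-1})\, P(t_i \mid w_i)}{P(t_i \mid w_{i-1})}.
\end{align*}
```

The last step uses $P(w_i)\,P(w_{i-1} \mid w_i) = P(w_{i-1})\,P(w_i \mid w_{i-1})$. Under this assumption the emission term factors into a word bigram term and two word-to-tag terms, each of which could be estimated from counts.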
4.
5.
6.
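Shi Cui 《智能计算机与应用》 (Intelligent Computer and Applications) 2014,(1):83-84,87
Fine-grained verb classification is part of POS tagging and an important task in natural language processing. Building on word segmentation and POS tagging, conditional random fields are used to classify verbs more finely. A conditional random field model is built from the linguistic context of each verb. Experimental results show that the method achieves high accuracy, with a best F-score of 98.11.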
7.
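Chinese part-of-speech tagging based on conditional random fields (total citations: 1; self-citations: 0; citations by others: 1)
In recent years conditional random fields (CRFs) have been widely applied to labeling all kinds of sequence data. When CRFs are used to model context in Chinese POS tagging, hundreds of millions of features can be generated. Based on an in-depth analysis of how these features arise, the feature template set is optimized, and CRFs are used to study the relationships among the feature template set, the number of generated features, the size of the trained model, and tagging accuracy. Experimental results show that the optimized template set achieves the best overall balance of training time, trained model size, and tagging accuracy.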
8.
9.
10.
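In Chinese word segmentation, character-tagging methods are widely used: by tagging characters, segmentation is converted into a sequence labeling problem, and the best-performing segmenters at present are tagging models based on conditional random fields (CRFs). Segmenting operational orders is the basis for automatically generating operational commands, but applying a CRF model to this task incurs very high time and space complexity. To improve efficiency, the model is analyzed and a feature subset is chosen with a feature selection algorithm, which effectively reduces the time and space cost of segmentation. CRF confidence scores are then used to post-process the segmentation results, further improving precision. Experimental results show that the feature selection algorithm and the post-processing method improve Chinese word segmentation performance.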
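The conversion between segmentation and sequence labeling described in this abstract can be sketched as follows. The abstract does not say which tag set is used; the four-tag B/M/E/S scheme (begin, middle, end, single) and the function names `words_to_tags` and `tags_to_words` are assumptions made for illustration.

```python
def words_to_tags(words):
    """Map a segmented sentence to one tag per character (assumed BMES scheme)."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # First character B, last character E, anything between M.
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags

def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words
```

A tagger such as a CRF would predict the tag sequence from the raw characters, and `tags_to_words` would then turn its output back into a segmentation.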
11.
A new approach for an efficient text analyser is proposed. A prosody generator-driven method is employed to design an efficient text analyser for Mandarin text-to-speech. A simpler structure for text analysis, a more suitable classification of linguistic features and a more efficient contribution of linguistic features to the prosody generator can be achieved. Three heuristic and theoretical methods are used to analyse and examine the capability of each linguistic feature: (1) the contribution of each linguistic feature to the prosody generator is examined experimentally; (2) the cross-influence of each linguistic feature on the prosody generator is analysed; (3) the problem of over- and under-classification of the linguistic features is inspected. Finally, these three analytic results are referenced to design an efficient text analyser. In total 35,243 Chinese characters are employed to examine the performance of our text analyser. Only 79 ms CPU time on a P4-1.4G PC is needed for word segmentation and POS tagging. Correction rates of 97.5% and 93.2% are achieved for word segmentation and POS tagging, respectively. This confirms that the performance of our text analyser is very good. Moreover, a Mandarin text-to-speech system is implemented to inspect the performance of the text analysis and the contribution to the prosody generator. More natural and fluent speech is obtained under the lower computation. The MOS of prosody of the synthesised and original speech are 4.2 and 4.8, respectively, which is reasonably good.
12.
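Cross-modal speaker tagging aims to match and mutually annotate speakers using their different biometric features, and it is widely applicable in human-computer interaction settings. To address the marked "semantic gap" between the face and voice modalities, this paper proposes a cross audio-visual speaker tagging method based on supervised joint correspondence autoencoders. First, a convolutional neural network and a deep belief network are used to extract discriminative features from face images and speech data, respectively. Then, building on the joint autoencoder model, a new supervised cross-modal neural network model is proposed in which a softmax regression model is embedded to preserve both inter-modal and intra-modal sample similarity. This is further extended into three supervised correspondence autoencoder network models that mine the latent relationships between heterogeneous audio and visual features, enabling effective cross-modal tagging between faces and voices. Experimental results show that the proposed network models tag speakers across modalities effectively and are robust to pose variation and sample diversity.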
13.
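To analyze the influence of systematic errors in a wide-field-of-view, high-spatial-resolution infrared multispectral scanner and to provide a basis for choosing a calibration scheme, imaging simulations of an airborne whiskbroom infrared scanner are carried out with a rigorous imaging model. Because the camera's projection center does not coincide with the rotation center of the stabilized platform in the scanner's whiskbroom system, the study focuses on the correlation between the camera mounting error and the POS system mounting error. Simulation experiments show that the two errors affect positioning accuracy in essentially the same way, and that they are strongly correlated, although the correlation weakens as the scan angle increases. When the scan amplitude is less than 20°, the camera mounting error can be merged into the POS system mounting error. This conclusion can serve as a reference for designing the later calibration scheme.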
14.
15.
Gatica-Perez D. Gu C. Sun M.-T. Ruiz-Correa S. 《IEEE transactions on image processing》2001,10(9):1332-1345
The relation between morphological gray-level connected operators and segmentation algorithms based on region merging/classification strategies has been pointed out several times in the literature. However, to the best of our knowledge, the formal relation between them has not been established. This paper presents the link between the two domains based on the observation that both connected operators and segmentation algorithms share a key mechanism: they simultaneously operate on images and on partitions, and therefore they can be described as operations on a joint image-partition model. As a result, we analyze both segmentation algorithms and connected operators by defining operators on complete product lattices that explicitly model gray-level and partition attributes. In the first place, starting with a complete lattice of partitions, we define the concept of the segmentation model as a mapping in a product lattice whose elements are three-tuples consisting of a partition, an image that models the partition attributes, and an image that represents the gray-level model associated to the segmentation. Then, assuming a conditional ordering relation, we show that any region merging/classification segmentation algorithm can be defined as an extensive operator in such a complete product lattice. In the second place, we propose a very similar lattice-based extended representation of gray-level functions in the context of connected operators, which highlights the mathematical analogy with segmentation algorithms but in which the ordering relation is different. We use this framework to show that every region merging/classification segmentation algorithm indeed corresponds to a connected operator. While this result provides an explanation to previous work in the area, it also opens possibilities for further analysis in the two domains. From this perspective, we additionally study some theoretical properties of a general region merging segmentation algorithm.
16.
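Application of Markov random fields in SAR image processing (total citations: 5; self-citations: 0; citations by others: 5)
Markov random fields (MRFs) describe spatial continuity well, and with a suitable neighborhood system they can model the structural features of an image. With the joint probability distribution expressed through an energy function, parameters can be estimated using optimization algorithms. Gaussian MRFs represent image texture accurately and concisely, and their linearity makes computation convenient. This paper reviews the MRF models used in SAR image processing and describes in detail two of them as applied to image restoration and segmentation.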
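For reference, the joint distribution "expressed through an energy function" is usually written in the standard Gibbs form shown below. This is the textbook formulation, given here as an assumption; the abstract does not state the exact energy terms the reviewed models use, and the symbols ($x$ for a labeling, $\mathcal{C}$ for the set of cliques of the neighborhood system, $V_c$ for clique potentials) are illustrative.

```latex
\begin{align*}
P(X = x) &= \frac{1}{Z} \exp\bigl(-U(x)\bigr), &
U(x) &= \sum_{c \in \mathcal{C}} V_c(x), &
Z &= \sum_{x'} \exp\bigl(-U(x')\bigr).
\end{align*}
```

Maximizing $P(X = x)$ is then equivalent to minimizing the energy $U(x)$, which is why optimization algorithms apply.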