Similar Literature
 Found 20 similar documents; search took 265 ms
1.
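A reliable speech endpoint detection algorithm is essential for a robust speech recognition system. To address the limited robustness of existing algorithms in noisy environments, an endpoint detection algorithm based on one-class SVM (Support Vector Machine) is proposed. Online learning and fusion of multiple features, together with a two-level decision mechanism, effectively improve the robustness of speech detection. Experiments show that the algorithm is effective under a variety of noise environments and signal-to-noise ratios, and that it clearly improves the recognition rate of a speech recognition system in noisy environments.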

2.
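To address voice activity detection in non-stationary noise environments, an adaptive voice activity detection algorithm based on online one-class SVM is proposed. The algorithm uses a one-class SVM to learn and fuse multiple features online, thereby modeling the non-stationary background noise, and adopts a two-level decision mechanism, which effectively improves the robustness of voice activity detection. Experimental results in a speech recognition system show that the algorithm is effective under a variety of noise environments and signal-to-noise ratios, and that it clearly improves the recognition rate in non-stationary noise environments.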

3.
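A speaker recognition system extracts speaker information from the speech signal to determine the speaker's identity. The whole system is built around a DSP processor, which performs both training and recognition on speech signals. LPC (linear predictive coding) and DTW (dynamic time warping) are used as the core algorithms for speaker recognition.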
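The DTW matching step named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' DSP implementation: the function names `dtw_distance` and `identify`, the use of Euclidean frame distance, and the toy one-dimensional "feature vectors" are all assumptions. In a real system each frame would be an LPC-derived feature vector.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumed)
            # Allowed moves: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def identify(utterance, templates):
    """Return the name of the enrolled template with the smallest DTW cost."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))

# Hypothetical enrolled templates (one short sequence per speaker).
templates = {
    "alice": [(0.0,), (1.0,), (2.0,)],
    "bob": [(5.0,), (5.0,), (5.0,)],
}
print(identify([(0.0,), (0.0,), (1.0,), (2.0,)], templates))
```

Because DTW lets one frame align with several frames of the other sequence, the same utterance spoken more slowly still matches its template at low cost.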

4.
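With the aid of speech enhancement, pitch (fundamental frequency) analysis, and formant analysis, a simple speaker recognition system is designed. During recognition, the average pitch frequency and the formant peak positions serve as two evaluation criteria that cross-check each other, ultimately identifying the speaker.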

5.
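Emotional speaker recognition based on prosody conversion   Total citations: 1, self-citations: 0, citations by others: 1
To address the degradation in speaker recognition performance caused by emotional variation, a new emotional speaker recognition system is proposed. First, emotion recognition is introduced as a front-end processing module to classify speech as neutral or emotional. Next, the prosody of emotional speech is modified: Gaussian normalization, Gaussian mixture models (GMM), and support vector regression (SVR) are each used to build pitch-mapping rules between emotional and neutral speech, and duration is corrected according to the average linear rate of change. Finally, the prosody-modified emotional speech is recognized. Experimental results show that the proposed system effectively improves emotional speaker recognition, with a significantly higher recognition rate than traditional methods, and that emotional speech with modified pitch and duration is closer to neutral speech.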

6.
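Speaker recognition is a special form of speech recognition: its goal is not to recognize what is said but to recognize who is speaking, that is, to extract individual characteristics from the speech signal. Vector quantization (VQ) avoids the difficult problems of speech segmentation and time normalization and, as a data compression technique, greatly reduces the amount of data the system must store. This paper proposes using complex cepstrum parameters as the recognition features, together with an improvement to VQ-based speaker recognition systems. When only a small amount of training data is available, complex cepstrum features give relatively stable recognition performance. The improved VQ method avoids the marked performance degradation that occurs when a long time elapses between training and use of the speaker recognition system, as well as the heavy computation that autocorrelation functions would require.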
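The basic VQ decision rule that this abstract builds on can be sketched as follows: each enrolled speaker has a codebook, and a test utterance is assigned to the speaker whose codebook quantizes its frames with the lowest average distortion. This is a generic illustration, not the improved method of the paper. The names `avg_distortion` and `identify_speaker`, the Euclidean distortion measure, and the toy two-dimensional codebooks are assumptions, and codebook training (for example by k-means or LBG) is omitted.

```python
import math

def avg_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(math.dist(f, c) for c in codebook) for f in frames) / len(frames)

def identify_speaker(frames, codebooks):
    """Pick the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda s: avg_distortion(frames, codebooks[s]))

# Hypothetical pre-trained codebooks, one per enrolled speaker.
codebooks = {
    "A": [(0.0, 0.0), (1.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 6.0)],
}
print(identify_speaker([(0.1, 0.0), (0.9, 1.0)], codebooks))
```

Because every frame is scored independently against the codebook, no segmentation or time alignment of the utterance is needed, which is the advantage the abstract attributes to VQ.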

7.
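Speech recognition technology has developed very rapidly in recent years and has been applied successfully in many areas. A simple small-vocabulary, isolated-word, speaker-dependent speech recognition system is simulated in a C environment. The system is highly extensible; with minor modifications, a wide variety of speech recognition systems can be designed.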

8.
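Advances in speech recognition systems. 陈英, 柯林. New developments mean that speech recognition systems may soon become an everyday tool. For decades, the prospect of computers that can recognize speech has fired the imagination of researchers and science-fiction writers. Early speech recognition systems, however, required users to speak very slowly and to pause between words. Now, speech recognition technology may be at a turning point. With...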

9.
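In a speech recognition system, a hierarchical, multi-parameter weighted-fusion detection method is additionally adopted to suit the characteristics of Mandarin Chinese speech. Guided by perceptual cues, it detects and segments distinctive features in continuous speech (consonantal, obstruent, fricative, aspirated, sonorant, continuant, nasal, vocalic, and back) as well as the acoustic landmarks of phonetic features. The algorithm takes full account of how different speakers, phonetic contexts, speaking rates, and speaking styles affect acoustic landmarks, which helps improve the accuracy and robustness of detection and segmentation.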

10.
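Voiceprint recognition is, put more descriptively, speaker recognition. It determines a speaker's identity from the waveform produced during speech and from the characteristic parameters in that waveform that reflect the person's psychological and physiological traits. This paper studies a text-dependent speaker verification system. It compares linear prediction cepstral coefficients (LPCC), which are based on the vocal tract, with Mel-frequency cepstral coefficients (MFCC), which are based on auditory characteristics, and concludes that MFCC is more robust to the environment. A hidden Markov model (HMM) is then used to simulate spoken-digit recognition in MATLAB. The experimental system reaches a recognition rate of 90%, which verifies the accuracy of HMM-based recognition.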

11.
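Because system noise is unavoidable, it is difficult to guarantee that the useful signal can be extracted when testing a system's characteristics. As an alternative to traditional signal extraction methods, this paper describes how, in the presence of background noise, a filter built on the correlation principle can extract the useful signal and obtain the system's frequency response, so that the system's characteristics can be studied.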
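The correlation principle mentioned here can be illustrated with a short sketch. It assumes the system is driven by a unit-amplitude sine at a known test frequency and that the record spans a whole number of periods. Correlating the noisy output with sine and cosine references then recovers the gain and phase at that frequency, while uncorrelated noise averages out. The function names, the simulated signal, and the noise model are all assumptions. The paper's actual filter design is not given in the abstract.

```python
import math
import random

def correlate_response(output, freq_hz, sample_rate_hz):
    """Estimate gain and phase at freq_hz by correlating with sin/cos references."""
    n = len(output)
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    # For output = A*sin(w*k + phi): in_phase -> A*cos(phi), quadrature -> A*sin(phi).
    in_phase = 2.0 / n * sum(y * math.sin(w * k) for k, y in enumerate(output))
    quadrature = 2.0 / n * sum(y * math.cos(w * k) for k, y in enumerate(output))
    return math.hypot(in_phase, quadrature), math.atan2(quadrature, in_phase)

def noisy_sine(gain, phase, freq_hz, sample_rate_hz, n, noise_std, seed=0):
    """Simulated system output: a scaled, shifted sine plus Gaussian noise."""
    rng = random.Random(seed)
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [gain * math.sin(w * k + phase) + rng.gauss(0.0, noise_std) for k in range(n)]

# Noise with twice the amplitude of the signal still yields a usable estimate.
y = noisy_sine(gain=0.5, phase=0.3, freq_hz=50.0, sample_rate_hz=1000.0,
               n=10000, noise_std=1.0)
print(correlate_response(y, 50.0, 1000.0))
```

Repeating the measurement over a sweep of test frequencies would trace out the frequency response point by point.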

12.
The performance of speaker verification systems is often compromised under real-world environments. For example, variations in handset characteristics could cause severe performance degradation. This paper presents a novel method to overcome this problem by using a non-linear handset mapper. Under this method, a mapper is constructed by training an elliptical basis function network using distorted speech features as inputs and the corresponding clean features as the desired outputs. During feature recuperation, clean features are recovered by feeding the distorted features to the feature mapper. The recovered features are then presented to a speaker model as if they were derived from clean speech. Experimental evaluations based on 258 speakers of the TIMIT and NTIMIT corpora suggest that the feature mappers improve the verification performance remarkably.

13.
The performance of speaker verification systems is often compromised under real-world environments. For example, variations in handset characteristics could cause severe performance degradation. This paper presents a novel method to overcome this problem by using a non-linear handset mapper. Under this method, a mapper is constructed by training an elliptical basis function network using distorted speech features as inputs and the corresponding clean features as the desired outputs. During feature recuperation, clean features are recovered by feeding the distorted features to the feature mapper. The recovered features are then presented to a speaker model as if they were derived from clean speech. Experimental evaluations based on 258 speakers of the TIMIT and NTIMIT corpora suggest that the feature mappers improve the verification performance remarkably.

14.
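DSP implementation of a speaker recognition system based on a probabilistic DP matching algorithm   Total citations: 1, self-citations: 0, citations by others: 1
The use of a probabilistic DP matching algorithm for speaker recognition is proposed, and a scheme for implementing an automatic speaker recognition system on the TMS320C5416 is given. The system builds its recognition feature vector set from parameters that include a new r-th order cepstral linear regression coefficient of the speech signal, and applies the proposed probabilistic DP matching algorithm to text-independent speaker recognition. Experimental results show that the system offers high recognition accuracy, fast recognition, and low system resource usage, making it an effective way to implement automatic speaker recognition.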

15.
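Speaker recognition can be regarded as a form of speech recognition. This paper studies methods for extracting MFCC parameters, discusses the vector quantization (VQ) recognition model, and designs a feasible recognition method. Verification shows that the method achieves a high recognition rate for text-dependent speaker recognition.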

16.
Phoneme-based speaker-dependent English command recognition   Total citations: 2, self-citations: 0, citations by others: 2
1 Introduction. Since the 1950s, speech recognition technologies, both speaker-dependent and speaker-independent, with small or large vocabulary, and using isolated or connected words, or continuous speech, have developed and been widely applied. Recently it has become a dominant technology for human-machine interface. Speech recognition is basically treated as a problem of pattern matching. The goal is to take one pattern, i.e., the speech signal, and classify it as a sequence of previously learned patterns, e.g., words or subword units such as phonemes [1…

17.
A shear probe works in a tough environment where the turbulence signals to be measured are very weak. The measured turbulence signals often contain a large amount of noise. Due to wide frequency band, no...

18.
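Implicature is the core of Grice's theory of meaning and comprises conventional implicature and conversational implicature. Because Grice was vague in defining conventional implicature, how to distinguish it from conversational implicature has long been disputed. Starting from the basic philosophical spirit of Grice's theory of meaning, and drawing on the main results of Chinese and international research on implicature in recent years, this article examines the nature and features of conventional and conversational implicature, distinguishes the two using identifiable features as the criterion, and explains what they have in common on the basis of the speaker's intention.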

19.
Speaker variability is an important source of speech variations which makes continuous speech recognition a difficult task. Adapting automatic speech recognition (ASR) models to the speaker variations is a well-known strategy to cope with the challenge. Almost all such techniques focus on developing adaptation solutions within the acoustic models of the ASR systems. Although variations of the acoustic features constitute an important portion of the inter-speaker variations, they do not cover variations at the phonetic level. Phonetic variations are known to form an important part of variations which are influenced by both micro-segmental and suprasegmental factors. Inter-speaker phonetic variations are influenced by the structure and anatomy of a speaker's articulatory system and also his/her speaking style, which is driven by many speaker background characteristics such as accent, gender, age, and socioeconomic and educational class. The effect of inter-speaker variations in the feature space may cause explicit phone recognition errors. These errors can be compensated later by having appropriate pronunciation variants for the lexicon entries which consider likely phone misclassifications besides pronunciation. In this paper, we introduce speaker adaptive dynamic pronunciation models, which generate different lexicons for various speaker clusters and different ranges of speech rate. The models are hybrids of speaker adapted contextual rules and dynamic generalized decision trees, which take into account word phonological structures, rate of speech, unigram probabilities and stress to generate pronunciation variants of words. Employing the set of speaker adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task results in word error rate reductions of as much as 10.1% in a speaker-dependent scenario and 7.4% in a speaker-independent scenario.

20.
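To improve the overall utilization of university laboratory equipment, a method is presented for implementing a photovoltaic (PV) cell array simulation system based on a programmable power supply. A host PC computes the current-voltage (I-V) characteristic curve of the solar cell array and controls the programmable power supply over serial communication so that its output voltage and current track that curve, allowing the programmable power supply to act as a PV array simulator. From the equivalent circuit model of a PV cell, an analytical expression for the I-V characteristic is derived in terms of the array's open-circuit voltage, short-circuit current, and the voltage and current at the maximum power point. Experiments verify the feasibility and effectiveness of the simulation system.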
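The abstract does not reproduce the derived expression. As an illustration of what a four-parameter I-V formula looks like, the sketch below uses a widely used engineering approximation, I = Isc·[1 − C1·(exp(V/(C2·Voc)) − 1)], with C2 = (Vm/Voc − 1)/ln(1 − Im/Isc) and C1 = (1 − Im/Isc)·exp(−Vm/(C2·Voc)). Whether this matches the paper's own expression is an assumption, as are the function name `pv_current` and the example panel values.

```python
import math

def pv_current(v, voc, isc, vm, im):
    """Approximate PV output current at voltage v (engineering model, assumed).

    voc: open-circuit voltage, isc: short-circuit current,
    vm, im: voltage and current at the maximum power point.
    """
    c2 = (vm / voc - 1.0) / math.log(1.0 - im / isc)
    c1 = (1.0 - im / isc) * math.exp(-vm / (c2 * voc))
    return isc * (1.0 - c1 * (math.exp(v / (c2 * voc)) - 1.0))

# Hypothetical panel parameters, chosen only for illustration.
VOC, ISC, VM, IM = 21.0, 3.0, 17.0, 2.7

# Points a host PC could send to the programmable supply as set-points.
for v in (0.0, 10.0, 17.0, 21.0):
    print(v, round(pv_current(v, VOC, ISC, VM, IM), 3))
```

The curve passes through the short-circuit point exactly and through the maximum power point and open-circuit point to within a small residual of order Isc·C1.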
