
1.

Feature Mapping and Recuperation by Using Elliptical Basis Function Networks for Robust Speaker Verification

李昕 郑宇 《上海大学学报(英文版)》2002,6(4):331-336

The performance of speaker verification systems is often compromised in real-world environments. For example, variations in handset characteristics can cause severe performance degradation. This paper presents a novel method that overcomes this problem by using a non-linear handset mapper. Under this method, a mapper is constructed by training an elliptical basis function network with distorted speech features as inputs and the corresponding clean features as the desired outputs. During feature recuperation, clean features are recovered by feeding the distorted features to the feature mapper. The recovered features are then presented to a speaker model as if they were derived from clean speech. Experimental evaluation based on 258 speakers from the TIMIT and NTIMIT corpora suggests that the feature mappers improve verification performance remarkably.

2.

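A DSP Implementation of a Speaker Recognition System Based on a Probabilistic DP Matching Algorithm  Cited by: 1 (self-citations: 0; citations by others: 1)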

周洁 赵力 邹采荣 《实验室研究与探索》2005,24(11):12-14

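This paper proposes using a probabilistic DP (dynamic programming) matching algorithm for speaker recognition and presents a scheme for implementing an automatic speaker recognition system on the TMS320C5416. The system builds its feature vector set from parameters that include a new r-th-order cepstral linear regression coefficient of the speech signal, and it applies the proposed probabilistic DP matching algorithm to text-independent speaker recognition. Experimental results show that the system offers high recognition accuracy, fast recognition, and low use of system resources, making it an effective way to implement automatic speaker recognition.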

3.

Speaker adapted dynamic lexicons containing phonetic deviations of words

Bahram Vazirnezhad Farshad Almasganj Seyed Mohammad Ahadi Ari Chanen 《浙江大学学报(A卷英文版)》2009,10(10):1461-1475

Speaker variability is an important source of speech variation and makes continuous speech recognition a difficult task. Adapting automatic speech recognition (ASR) models to speaker variations is a well-known strategy for coping with this challenge. Almost all such techniques focus on developing adaptation solutions within the acoustic models of ASR systems. Although variations in the acoustic features constitute an important portion of inter-speaker variation, they do not cover variations at the phonetic level. Phonetic variations are known to form an important part of these variations and are influenced by both micro-segmental and suprasegmental factors. Inter-speaker phonetic variations are influenced by the structure and anatomy of a speaker's articulatory system and by his or her speaking style, which is driven by many speaker background characteristics such as accent, gender, age, and socioeconomic and educational class. The effect of inter-speaker variations in the feature space may cause explicit phone recognition errors. These errors can be compensated for later by giving the lexicon entries appropriate pronunciation variants that account for likely phone misclassifications as well as pronunciation. In this paper, we introduce speaker-adaptive dynamic pronunciation models, which generate different lexicons for various speaker clusters and different ranges of speech rate. The models are hybrids of speaker-adapted contextual rules and dynamic generalized decision trees, which take into account word phonological structure, rate of speech, unigram probabilities, and stress to generate pronunciation variants of words. Employing the set of speaker-adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task yields word error rate reductions of as much as 10.1% in a speaker-dependent scenario and 7.4% in a speaker-independent scenario.

4.

DTW-Based Speaker Recognition in Noisy Environments

张飞云 张鹏 高建生 《许昌学院学报》2011,30(5):68-72

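A noise-robust speaker recognition system based on DTW is built in a VC++ environment. PLAR feature parameters are extracted, and a speech enhancer based on auditory and spectrographic characteristics serves as a preprocessor that first denoises the speech signal. Experimental results show that, even at relatively low signal-to-noise ratios, the system improves speaker recognition performance to some extent across a variety of noise environments.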
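As an illustration of the DTW matching step named in this abstract, here is a minimal sketch in Python. It is not the authors' implementation: the Euclidean local distance, the unconstrained step pattern, and the helper names `dtw_distance` and `identify` are all assumptions made for the example, and feature extraction and speech enhancement are left out.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic step pattern
    (insertion, deletion, match) with no slope constraint.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def identify(test, templates):
    """Return the name of the enrolled template closest to `test` under DTW."""
    return min(templates, key=lambda name: dtw_distance(test, templates[name]))
```

With one stored template per enrolled speaker, `identify(test_features, templates)` returns the speaker whose template has the lowest alignment cost. A practical system would typically normalize the cost by path length and constrain the warping window.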

5.

A Compression Method for LPC Coefficients

裴洪文 曾宪权 乔木 《新乡教育学院学报》2007,20(2):65-66

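This paper points out the importance of compressing LPC coefficients when the widely used linear prediction technique is applied in vocoders, speech recognition, speaker recognition, and related areas, and, under the minimum mean-square error criterion, explores compressing p LPC coefficients to q (1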

6.

Application of formant instantaneous characteristics to speech recognition and speaker identification

侯丽敏 胡晓宁 谢娟敏 《上海大学学报(英文版)》2011,15(2):123-127

This paper proposes a new phase feature derived from formant instantaneous characteristics for speech recognition (SR) and speaker identification (SI) systems. Using the Hilbert transform (HT), the formant characteristics can be represented by instantaneous frequency (IF) and instantaneous bandwidth, together called the formant instantaneous characteristics (FIC). To explore the importance of FIC in both SR and SI, this paper proposes different FIC-derived features for SR and SI systems. When these new features are combined with conventional parameters, a higher identification rate can be achieved than with Mel-frequency cepstral coefficient (MFCC) parameters alone. The experimental results show that the new features are effective characteristic parameters and can complement conventional parameters for SR and SI.

7.

Emotional Speaker Recognition Based on Prosody Transformation  Cited by: 1 (self-citations: 0; citations by others: 1)

宋鹏 赵力 邹采荣 《东南大学学报》2011,(4):357-360

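To address the degradation in speaker recognition performance caused by emotional variation, a new emotional speaker recognition system is proposed. First, emotion recognition is introduced as a front-end processing module that classifies speech as neutral or emotional. Then the prosody of the emotional speech is modified: Gaussian normalization, Gaussian mixture models (GMM), and support vector regression (SVR) are each used to build fundamental frequency (F0) mapping rules between emotional and neutral speech, and duration is modified according to the average linear rate of change. Finally, the prosody-modified emotional speech is recognized. Experimental results show that the proposed system effectively improves emotional speaker recognition performance, with a significantly higher recognition rate than traditional methods, and that emotional speech after F0 and duration modification is closer to neutral speech.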
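The abstract does not spell out its Gaussian normalization rule. One common reading is a mean and variance match: each emotional F0 value is standardized with the emotional statistics and rescaled to the neutral statistics. The sketch below assumes that reading. The function name `gaussian_normalize_f0`, the use of linear (not log) F0, and the restriction to voiced frames are assumptions for illustration.

```python
from statistics import mean, pstdev


def gaussian_normalize_f0(emotional_f0, neutral_mean, neutral_std):
    """Map voiced-frame F0 values of emotional speech toward neutral statistics.

    Assumption: f' = (f - mu_e) / sigma_e * sigma_n + mu_n, where mu_e and
    sigma_e are estimated from the emotional contour itself.
    """
    mu_e = mean(emotional_f0)
    sigma_e = pstdev(emotional_f0)
    if sigma_e == 0:
        # A flat contour carries no variance to rescale; shift it to the neutral mean.
        return [neutral_mean for _ in emotional_f0]
    return [(f - mu_e) / sigma_e * neutral_std + neutral_mean for f in emotional_f0]
```

For example, `gaussian_normalize_f0([200.0, 220.0, 240.0], 120.0, 10.0)` returns a contour whose mean is 120 Hz and whose standard deviation is 10 Hz while keeping the relative shape of the original contour.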

8.

Chinese Spoken Digit Recognition Using VQ-HMM

赵力 刘怡龙 邹采荣 吴镇扬 《东南大学学报》2000,16(1):20-23

This paper proposes a new speech recognition method that integrates a VQ distortion measure with a discrete HMM. At each state, the VQ-HMM uses a VQ distortion measure in place of the discrete output probability used by a discrete HMM. The VQ-HMM is described, and its recognition performance is compared with that of conventional HMMs in experiments on speaker-independent Chinese spoken digit recognition. The comparisons show that the new method outperforms conventional HMMs.
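The idea of scoring each state by VQ distortion in place of an output probability can be sketched as a Viterbi-style search that accumulates distortion. This is only an illustration under stated assumptions, not the paper's formulation: each state is given its own codebook, the model is left-to-right (stay or advance by one state), transition probabilities are ignored, and the names `vq_distortion` and `vq_hmm_cost` are hypothetical.

```python
def vq_distortion(frame, codebook):
    """Squared Euclidean distance from `frame` to its nearest codeword."""
    return min(sum((x - c) ** 2 for x, c in zip(frame, code)) for code in codebook)


def vq_hmm_cost(frames, codebooks):
    """Minimum accumulated VQ distortion over left-to-right state paths.

    Assumptions: the path starts in the first state and ends in the last,
    each step either stays in a state or advances by one, and all allowed
    transitions cost the same. `frames` must be non-empty.
    """
    inf = float("inf")
    n_states = len(codebooks)
    prev = [inf] * n_states
    prev[0] = vq_distortion(frames[0], codebooks[0])
    for frame in frames[1:]:
        cur = [inf] * n_states
        for s in range(n_states):
            best = prev[s] if s == 0 else min(prev[s], prev[s - 1])
            if best < inf:
                cur[s] = best + vq_distortion(frame, codebooks[s])
        prev = cur
    return prev[-1]
```

Under these assumptions, recognition would pick the word model with the lowest cost, for instance `min(models, key=lambda w: vq_hmm_cost(frames, models[w]))`, where `models` maps each digit to its list of per-state codebooks.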

9.

A Quantitative Analysis of Feature Template Settings for Chinese Part-of-Speech Tagging

郑霞《安阳师范学院学报》2013,(5):53-56

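When conditional random fields (CRFs) are used for Chinese part-of-speech tagging, selecting feature templates is a crucial step. This paper designs two groups of feature templates, selects the CTB, NCC, and PKU corpora from Bakeoff 2007, runs comparative experiments with the CRF++ 0.53 toolkit, and quantitatively analyzes the template parameters that affect part-of-speech tagging. The experiments lead to the following conclusions: (1) tagging accuracy is not proportional to feature window size, the preceding context influences the current word's part of speech more than the following context does, and the current word's part of speech is closely related to the two words immediately before and after it; (2) templates that generate many features are harder to train; (3) part-of-speech transition features have some effect on accuracy.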

10.

Whispered Speech Intelligibility Enhancement Based on Noise-Robust Features and SVM (in English)

周健 赵力 梁瑞宇 方贤勇 《东南大学学报》2012,(3):261-265

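A machine-learning-based method for enhancing the intelligibility of whispered speech is proposed. The method uses a trained two-class support vector machine to estimate a binary time-frequency mask, from which the enhanced whispered speech is synthesized. The feature vectors fed to the support vector machine, GFCCs, are extracted with an auditory periphery model and are robust to noise. In enhancement simulations, the algorithm is compared with traditional speech enhancement algorithms in terms of intelligibility improvement. Both objective evaluation and subjective listening tests show that the proposed method effectively improves the intelligibility of noisy whispered speech. Whereas spectral subtraction and the log-MMSE method fail to improve intelligibility at low signal-to-noise ratios, the proposed method remains effective there. In addition, noisy whispered speech enhanced by the proposed method is markedly more intelligible than the unenhanced speech.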