
1.

李昕郑宇江芳泽《上海大学学报(英文版)》2002,6(4)

The performance of speaker verification systems is often compromised in real-world environments. For example, variations in handset characteristics can cause severe performance degradation. This paper presents a novel method to overcome this problem by using a non-linear handset mapper. Under this method, a mapper is constructed by training an elliptical basis function network using distorted speech features as inputs and the corresponding clean features as the desired outputs. During feature recuperation, clean features are recovered by feeding the distorted features to the feature mapper. The recovered features are then presented to a speaker model as if they were derived from clean speech. Experimental evaluations based on 258 speakers of the TIMIT and NTIMIT corpora suggest that the feature mappers improve the verification performance remarkably.

2.

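A DSP implementation of a speaker recognition system based on a probabilistic DP matching algorithm (cited by: 1; self-citations: 0; citations by others: 1)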

周洁赵力邹采荣《实验室研究与探索》2005,24(11):12-14

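This paper proposes using a probabilistic DP matching algorithm for speaker recognition and presents a scheme for implementing an automatic speaker recognition system on the TMS320C5416. The system builds its feature vector set from parameters that include a new r-th order cepstral linear regression coefficient of the speech signal, and applies the proposed probabilistic DP matching algorithm to text-independent speaker recognition. Experimental results show that the system offers high recognition accuracy, fast recognition, and low use of system resources, making it an effective implementation of automatic speaker recognition.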

3.

Speaker adapted dynamic lexicons containing phonetic deviations of words

Bahram Vazirnezhad, Farshad Almasganj, Seyed Mohammad Ahadi, Ari Chanen 《浙江大学学报(A卷英文版)》2009,10(10):1461-1475

Speaker variability is an important source of speech variations which makes continuous speech recognition a difficult task. Adapting automatic speech recognition (ASR) models to the speaker variations is a well-known strategy to cope with the challenge. Almost all such techniques focus on developing adaptation solutions within the acoustic models of the ASR systems. Although variations of the acoustic features constitute an important portion of the inter-speaker variations, they do not cover variations at the phonetic level. Phonetic variations are known to form an important part of variations which are influenced by both micro-segmental and suprasegmental factors. Inter-speaker phonetic variations are influenced by the structure and anatomy of a speaker's articulatory system and also his/her speaking style, which is driven by many speaker background characteristics such as accent, gender, age, and socioeconomic and educational class. The effect of inter-speaker variations in the feature space may cause explicit phone recognition errors. These errors can be compensated later by having appropriate pronunciation variants for the lexicon entries which consider likely phone misclassifications besides pronunciation. In this paper, we introduce speaker adaptive dynamic pronunciation models, which generate different lexicons for various speaker clusters and different ranges of speech rate. The models are hybrids of speaker adapted contextual rules and dynamic generalized decision trees, which take into account word phonological structures, rate of speech, unigram probabilities and stress to generate pronunciation variants of words. Employing the set of speaker adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task results in word error rate reductions of as much as 10.1% in a speaker-dependent scenario and 7.4% in a speaker-independent scenario.

4.

DTW-based speaker recognition in noisy environments

张飞云张鹏高建生《许昌学院学报》2011,30(5):68-72

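A noise-robust speaker recognition system based on DTW is built in a VC++ environment. PLAR feature parameters are extracted, and a speech enhancer based on auditory and spectrographic characteristics serves as a preprocessor that first denoises the speech signal. Experimental results show that, even at relatively low signal-to-noise ratios, the system improves speaker recognition performance to some extent across a variety of noise environments.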
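The abstract above names DTW (dynamic time warping) as the matching step but gives no detail. As a rough illustration only, the following Python sketch shows the textbook DTW recurrence applied to template matching. The function names `dtw_distance` and `identify_speaker`, the Euclidean frame distance, and the template layout are assumptions made for this example; they are not taken from the paper, which works in VC++ with PLAR features.

```python
import math


def dtw_distance(a, b):
    """Return the minimum cumulative alignment cost between two sequences
    of equal-dimension feature vectors, using the classic DTW recurrence."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance (assumed Euclidean)
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch b: repeat frame j
                cost[i][j - 1],      # stretch a: repeat frame i
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def identify_speaker(templates, utterance):
    """Pick the enrolled speaker whose template is closest under DTW.
    `templates` is a hypothetical dict: speaker name -> feature sequence."""
    return min(templates, key=lambda name: dtw_distance(templates[name], utterance))
```

For example, `identify_speaker({"alice": seq_a, "bob": seq_b}, test_seq)` returns the name whose stored sequence warps onto `test_seq` at the lowest cost. A practical system would normally add a warping-window constraint and path-length normalization, which this sketch omits.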

5.

A method for compressing LPC coefficients

裴洪文曾宪权乔木《新乡教育学院学报》2007,20(2):65-66

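This paper points out the importance of compressing LPC coefficients when the widely used linear prediction technique is applied in vocoders, speech recognition, speaker recognition, and related areas. Under the minimum mean-square error criterion, it then examines compressing p LPC coefficients down to q (1…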

6.

Application of formant instantaneous characteristics to speech recognition and speaker identification

侯丽敏胡晓宁谢娟敏《上海大学学报(英文版)》2011,15(2):123-127

This paper proposes a new phase feature derived from the formant instantaneous characteristics for speech recognition (SR) and speaker identification (SI) systems. Using the Hilbert transform (HT), the formant characteristics can be represented by instantaneous frequency (IF) and instantaneous bandwidth, namely formant instantaneous characteristics (FIC). In order to explore the importance of FIC in both SR and SI, this paper proposes different features derived from FIC for SR and SI systems. When these new features are combined with conventional parameters, a higher identification rate can be achieved than with Mel-frequency cepstral coefficients (MFCC) alone. The experimental results show that the new features are effective characteristic parameters and can complement conventional parameters for SR and SI.

7.

Emotional speaker recognition based on prosody transformation (cited by: 1; self-citations: 0; citations by others: 1)

宋鹏赵力邹采荣《东南大学学报》2011,(4):357-360

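To address the degradation of speaker recognition performance caused by emotional variation, a new emotional speaker recognition system is proposed. First, emotion recognition is introduced as a front-end processing module to classify speech as neutral or emotional. Then the prosody of the emotional speech is modified: Gaussian normalization, Gaussian mixture models (GMM), and support vector regression (SVR) are each used to build rules that map the fundamental frequency of emotional speech to that of neutral speech, and duration is corrected according to the average linear rate of change. Finally, recognition is performed on the prosody-modified emotional speech. Experimental results show that the proposed system effectively improves emotional speaker recognition performance, with a significantly higher recognition rate than traditional methods, and that emotional speech with modified fundamental frequency and duration is closer to neutral speech.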

8.

Chinese spoken digit recognition using VQ-HMM

赵力刘怡龙邹采荣吴镇扬《东南大学学报》2000,16(1):20-23

In this paper, a new speech recognition method is proposed that integrates a VQ-distortion measure and a discrete HMM. The VQ-HMM uses a VQ-distortion measure at each state instead of the discrete output probability used by a discrete HMM. The VQ-HMM is described, and its speech recognition performance is compared with that of conventional HMMs through experiments on speaker-independent Chinese spoken digit recognition. The comparisons confirm that the new method outperforms traditional HMMs.

9.

A quantitative analysis of feature template settings for Chinese part-of-speech tagging

郑霞《安阳师范学院学报》2013,(5):53-56

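When conditional random fields (CRFs) are used for Chinese part-of-speech tagging, selecting the feature templates is a very important step. This paper designs two groups of feature templates, selects the CTB, NCC, and PKU corpora from Bakeoff 2007, runs comparative experiments with the CRF++ 0.53 toolkit, and quantitatively analyzes the template parameters that affect part-of-speech tagging. The experiments lead to the following conclusions: (1) tagging accuracy is not proportional to the size of the feature window; the preceding context influences the part of speech of the current word more than the following context does; and the part of speech of the current word is closely related to the two words immediately before and after it; (2) templates that generate many features are harder to train; (3) part-of-speech transition features have some effect on accuracy.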

10.

Whispered speech intelligibility enhancement based on noise-robust features and SVM (in English)

周健赵力梁瑞宇方贤勇《东南大学学报》2012,(3):261-265

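A machine-learning-based method for enhancing the intelligibility of whispered speech is proposed. The method uses a trained two-class support vector machine to estimate a binary time-frequency mask, from which the enhanced whispered speech is synthesized. The feature vectors fed to the support vector machine, GFCCs, are extracted using an auditory periphery model and are robust to noise. In enhancement simulations, the algorithm is compared with traditional speech enhancement algorithms in terms of intelligibility improvement. Both objective evaluation and subjective listening tests show that the proposed method effectively improves the perceived intelligibility of noisy whispered speech. Whereas spectral subtraction and the log-MMSE method fail to improve intelligibility at low signal-to-noise ratios, the proposed method remains effective there. In addition, noisy whispered speech enhanced with the proposed method is markedly more intelligible than the unenhanced speech.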

11.

Probability output of multi-class support vector machines (cited by: 1; self-citations: 0; citations by others: 1)

忻栋吴朝晖潘云鹤《浙江大学学报(A卷英文版)》2002,3(2):131-134

A novel approach to interpreting the outputs of multi-class support vector machines is proposed in this paper. Using the geometrical interpretation of the classifying hyperplane and the distance of the pattern from the hyperplane, one can calculate the posterior probability in the binary classification case. This paper focuses on the probability output in the multi-class case, where both the one-against-one and one-against-rest strategies are considered. An experiment on speaker verification showed that this method performs well.
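The abstract above does not give the formula that turns a hyperplane distance into a posterior. As a hedged illustration of the general idea only, the Python sketch below uses a commonly used logistic (Platt-style) mapping of the signed distance, then normalizes the one-against-rest scores so they sum to one. The sigmoid form, the `slope` and `offset` parameters, and both function names are assumptions for this example and may differ from the paper's geometric derivation.

```python
import math


def binary_posterior(distance, slope=1.0, offset=0.0):
    """Map a signed distance from the separating hyperplane to a value in
    (0, 1), read as P(positive class | x). Logistic form is an assumption."""
    z = slope * distance + offset
    # Numerically stable logistic function for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def one_vs_rest_posteriors(distances, slope=1.0):
    """Combine one-against-rest outputs into a normalized distribution.
    `distances` is a hypothetical dict: class label -> signed distance
    produced by that class's binary SVM."""
    scores = {label: binary_posterior(d, slope) for label, d in distances.items()}
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}
```

In practice `slope` and `offset` would be fitted on held-out data rather than fixed; a one-against-one scheme would need a pairwise coupling step instead of the simple normalization used here.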

12.

A cognitive analysis of President Obama's 2012 victory speech based on ICM theory

钟书能杨细平《嘉应学院学报》2013,(6):73-76

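As an orator, U.S. President Obama has demonstrated superb speaking skills in a series of speeches. On November 7, 2012, President Obama delivered his re-election victory speech, titled "The Best Is Yet to Come". In this outstanding speech, he won over the entire audience through skillful use of the art of language. Taking a cognitive perspective and guided mainly by ICM theory, this paper offers a cognitive interpretation of the inherent linguistic features of President Obama's 2012 re-election victory speech, aiming to provide a new theoretical perspective for the study of English speeches.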

13.

Intonation and communicative intent in mothers' speech to infants: is the melody the message? (cited by: 7; self-citations: 0; citations by others: 7)

A Fernald 《Child development》1989,60(6):1497-1510

This study explores the power of intonation to convey meaningful information about the communicative intent of the speaker in speech addressed to preverbal infants and in speech addressed to adults. Natural samples of infant- and adult-directed speech were recorded from 5 mothers of 12-month-old infants, in 5 standardized interactional contexts: Attention-bid, Approval, Prohibition, Comfort, and Game/Telephone. 25 infant-directed and 25 adult-directed vocalizations were electronically filtered to eliminate linguistic content. The content-filtered speech stimuli were presented to 80 adult subjects: 40 experienced parents and 40 students inexperienced with infants. The subjects' task was to identify the communicative intent of the speaker using only prosodic information, given a 5-alternative forced choice. Listeners were able to use intonation to identify the speaker's intent with significantly higher accuracy in infant-directed speech than in adult-directed speech. These findings suggest that the prosodic patterns of speech to infants are more informative than those of adult-adult speech, and may provide the infant with reliable cues to the communicative intent of the speaker. The interpretation of these results proposed here is that the relation of prosodic form to communicative function is made uniquely salient in the melodies of mothers' speech, and that these characteristic prosodic patterns are potentially meaningful to the preverbal infant.

14.

A discourse-function analysis of the sentence-final appended personal pronoun construction: based on the dialogue of 《家有儿女》

李颖《内江师范学院学报》2014,(5):77-81

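In verbal communication, when a speaker violates the Cooperative Principle, the hearer is compelled to go beyond the surface meaning of the utterance to grasp its implied meaning, that is, the conversational implicature. The personal pronoun resumptive construction repeats a co-referential personal pronoun even though the rational meaning of the utterance has already been fully expressed, which violates the maxim of quantity; we take this as the starting point for studying the discourse function of the construction. Corpus analysis shows that the construction is typically used in spoken dialogue, often with a rhetorical-question tone, to express some strong emotion of the speaker on top of the information in the original sentence, conveying affective information such as [+negation], [+unexpectedness], [+dissatisfaction], and [+strong subjectivity]. The sentence-final "appended" element is the linguistic form that carries the strong emotion, and constitutes "affective redundancy".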

15.

On the audience of public speeches

赵新战《渭南师范学院学报》2007,22(4):83-84,87

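The audience is a key element of public speaking: it determines the content and manner of a speech, influences and takes part in the speaking process, and is the party through whom the purpose of the speech is realized. Only by developing a proper awareness of the audience, carefully understanding how the audience receives information, grasping the audience's basic characteristics and psychology, and actively interacting with the audience on site can a speaker achieve the purpose of the speech and enhance its effect.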

16.

Software design of a speech recognition system

余尤好《闽江学院学报》2012,33(5):61-65

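MFCC feature parameters are extracted from the speech signal, and the LBG algorithm for vector quantization (VQ) is used to build the matching templates. A speaker recognition system is implemented on the MATLAB platform with a GUI, and its effectiveness is verified experimentally.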
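The abstract above mentions building VQ templates with the LBG algorithm without showing the procedure. The Python sketch below illustrates the usual split-and-refine form of LBG and how a codebook can score a test utterance by average distortion. It is an illustration under assumptions, not the paper's MATLAB code: the function names, the additive split offset `eps`, the fixed iteration count, and the requirement that `size` be a power of two are all choices made for this example, and the feature vectors stand in for MFCC frames.

```python
import math


def _nearest(codebook, vec):
    """Index of the codeword closest to vec (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(codebook[k], vec))


def _mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def lbg_codebook(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting, refining after each split
    with nearest-neighbour assignment and centroid updates.
    `size` is assumed to be a power of two."""
    codebook = [_mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into a pair offset by +eps and -eps.
        codebook = [tuple(c + s * eps for c in cw) for cw in codebook for s in (1, -1)]
        for _ in range(iters):
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[_nearest(codebook, v)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [_mean(cell) if cell else cw for cell, cw in zip(cells, codebook)]
    return codebook


def average_distortion(codebook, vectors):
    """Mean distance from each vector to its nearest codeword."""
    return sum(math.dist(codebook[_nearest(codebook, v)], v) for v in vectors) / len(vectors)
```

Under this scheme, one codebook is trained per enrolled speaker, and a test utterance is attributed to the speaker whose codebook gives the lowest `average_distortion`. A fuller implementation would stop refining when the distortion change falls below a threshold instead of running a fixed number of iterations.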

17.

Phoneme-based speaker-dependent English command recognition (cited by: 2; self-citations: 0; citations by others: 2)

贲俊万旺根余小清《上海大学学报(英文版)》2003,7(2):163-167

1 Introduction. Since the 1950s, speech recognition technologies, both speaker-dependent and speaker-independent, with small or large vocabulary, and using isolated or connected words, or continuous speech, have developed and been widely applied. Recently it has become a dominant technology for human-machine interface. Speech recognition is basically treated as a problem of pattern matching. The goal is to take one pattern, i.e., the speech signal, and classify it as a sequence of previously learned patterns, e.g., words or subword units such as phonemes [1…

18.

Extraction of novel features for emotion recognition

李翔郑宇李昕《上海大学学报(英文版)》2011,15(5):479-486

The Hilbert-Huang transform method has been widely used since its inception because of its superiority in a variety of areas. The Hilbert spectrum it yields accurately reflects the distribution of signal energy across a number of scales. In this paper, a novel feature called ECC is proposed, extracted from the Hilbert energy spectrum, which describes the distribution of the instantaneous energy. The experimental results demonstrate that ECC outperforms the traditional short-term average energy. Combining ECC with mel frequency cepstral coefficients (MFCC) delineates the distribution of energy in the time domain and frequency domain, and this feature group achieves better recognition than the combination of short-term average energy, pitch and MFCC. Further improvements of ECC are then developed. TECC is obtained by combining ECC with the Teager energy operator, and EFCC is obtained by introducing the instantaneous frequency to the energy. In the experiments, seven emotional states are selected for recognition, and the highest recognition rate of 83.57% is achieved, with the classification accuracy for boredom reaching 100%. The numerical results indicate that the proposed features ECC, TECC and EFCC can improve the performance of speech emotion recognition substantially.

19.

Developmental Differences in Infant Attention to the Spectral Properties of Infant-directed Speech

Robin Panneton Cooper, Richard N. Aslin 《Child development》1994,65(6):1663-1677

Across several independent studies, infants from a few days to 9 months of age have shown preferences for infant-directed (ID) over adult-directed (AD) speech. Moreover, 4-month-olds have been shown to prefer sine-wave analogs of the fundamental frequency of ID speech, suggesting that exaggerated pitch contours are prepotent stimuli for infants. The possibility of similar preferences by 1-month-olds was examined in a series of experiments, using a fixation-based preference procedure. Results from the first 2 experiments showed that 1-month-olds did not prefer the lower-frequency pitch characteristics of ID speech, even though 1-month-olds were able to discriminate low-pass filtered ID and AD speech. Since low-pass filtering may have distorted the fundamental frequency characteristics of ID speech, 1-month-olds were also tested with sine-wave analogs of the fundamental frequencies of the ID utterances. Infants in this third experiment also showed no preference for ID pitch contours. In the fourth experiment, 1-month-olds preferred a natural recording of ID speech over a version which preserved only its lower frequency prosodic features. From these results, it is argued that, although young infants are similar to older infants in their attraction to ID speech, their preferences depend on a wider range of acoustic features (e.g., spectral structure). It is suggested that exaggerated pitch contours which characterize ID speech may become salient communicative signals for infants through language-rich, interactive experiences with caretakers and increased perceptual acuity over the first months after birth.

20.

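An analysis of the identity-construction function of interpersonal meaning in political speeches: the case of Obama's commencement address to the U.S. Naval Academy class of 2013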

高春慧王健坤《商丘职业技术学院学报》2014,(3):84-86

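The interpersonal function is one of the three metafunctions in systemic functional linguistics. Taking the interpersonal function in Halliday's systemic functional grammar as its main theory, and working within the two subsystems of mood and modality, this paper analyzes the interpersonal meaning of Obama's commencement address to the Naval Academy class of 2013. The analysis finds that, in political speeches, speakers mostly use the declarative mood to state beliefs, express affirmation, and establish authority; in terms of modality, they mostly use modal verbs of medium and high value to persuade the audience to respond to their call. In this way, speakers use interpersonal resources to construct their self-image effectively during the speech and thereby achieve their political aims.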