This paper proposes a method for detecting the boundaries between initials and finals (shengmu and yunmu) in continuous Chinese speech, based on the energy distribution and formant structure of initials and finals. The speech is first passed through the Seneff auditory perception model to obtain an auditory spectrum. From this spectrum, feature parameters including full-band energy, low-band energy, spectral centroid, high-to-low-band energy ratio, and mid-to-high-band energy are selected to describe the energy distribution and formant structure of each class of initials and finals. Boundaries are then placed at the points where these feature parameters change sharply, and the resulting boundaries are corrected using the first-order difference of the envelope and a sample-based Kullback-Leibler distance. Experimental results show that boundary detection accuracy reaches 93.7% for speech sampled at 8 kHz, exceeds 85.3% for speech at a signal-to-noise ratio of 10 dB, and exceeds 86.7% for speech that has passed through parametric coding.
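The feature-and-change-point idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic non-negative time-frequency matrix in place of the Seneff auditory spectrum, and the band edges (`low_hz`, `mid_hz`), the z-score normalisation, the `threshold` value, and the symmetric form of the Kullback-Leibler distance are all illustrative assumptions that the abstract does not specify. The function names are hypothetical.

```python
import numpy as np


def band_features(spec, freqs, low_hz=1000.0, mid_hz=2000.0):
    """Per-frame features from a (n_frames, n_bins) non-negative spectrum.

    `spec` stands in for the auditory spectrum; `freqs` holds the centre
    frequency (Hz) of each bin. Band edges are assumed, not from the paper.
    """
    eps = 1e-12
    full = spec.sum(axis=1)                          # full-band energy
    low = spec[:, freqs < low_hz].sum(axis=1)        # low-band energy
    high = spec[:, freqs >= low_hz].sum(axis=1)      # everything above the low band
    mid_high = spec[:, freqs >= mid_hz].sum(axis=1)  # mid-to-high-band energy
    centroid = (spec * freqs).sum(axis=1) / (full + eps)  # spectral centroid
    ratio = high / (low + eps)                       # high-to-low energy ratio
    return np.stack([full, low, centroid, ratio, mid_high], axis=1)


def candidate_boundaries(features, threshold=2.0):
    """Frames where the normalised features change sharply.

    Each feature is z-scored over the utterance, the absolute frame-to-frame
    differences are summed, and local maxima at or above `threshold` are kept.
    The returned index is the first frame of the new segment.
    """
    mu = features.mean(axis=0)
    sigma = features.std(axis=0) + 1e-9
    z = (features - mu) / sigma
    change = np.abs(np.diff(z, axis=0)).sum(axis=1)  # change[i]: frame i -> i+1
    boundaries = []
    for i in range(len(change)):
        left = change[i - 1] if i > 0 else -np.inf
        right = change[i + 1] if i + 1 < len(change) else -np.inf
        if change[i] >= threshold and change[i] > left and change[i] >= right:
            boundaries.append(i + 1)
    return boundaries


def symmetric_kl(p, q):
    """Symmetric Kullback-Leibler distance between two non-negative vectors.

    Both inputs are normalised to sum to one. A refinement step could compare
    the distributions on either side of a candidate boundary with this score.
    """
    eps = 1e-12
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```

Given a spectrum whose energy moves abruptly from the low bins to the high bins, `candidate_boundaries(band_features(spec, freqs))` should return the frame at which the switch occurs. In practice the threshold would need tuning on labelled data.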