This study investigated whether adults who stutter and normal adult speakers differ in the production of stop consonants in fluent reading Chinese Putonghua speech.Voice onset time(VOT) was measured and the spectral moments at the stop burst were calculated for the stutterers(both before and after the speech therapy) and also for the nonstutterers. The statistical results showed that there were no significant differences in VOT between the nonstutterers and stutterers either prior to or after therapy,although the mean VOT of the stutterers was slightly greater than that of the nonstutterers.The results also indicated that both the obstruction place and the subsequent syllabic final exhibited an influence to a greater extent on VOT for the stutterers.In the spectral domain,the spectral mean of the stuttering participants before therapy was significantly different from that of the normal participants, whereas the group difference became insignificant after the therapy session.The smaller spectral mean for the stutterers might be interpreted as a more posterior occlusion in the oral cavity when producing alveolars and velars.In addition,productions of the stutterers scattered with a wider range in the space of spectral moments.Furthermore,the smaller main effect of syllabic finals on the mean spectral frequency of the burst suggested that the stutterers exhibited weaker anticipatory coarticulation than the nonstutterers.
A forced alignment based algorithms to detect Chinese repetitive stuttering is studied. According to the features of repetitions in Chinese stuttered speech,improvement solutions are provided based on the previous research findings.First,a multi-span looping forced alignment decoding networks is designed to detect multi-syllable repetitions in Chinese stuttered speech.Second,branch penalty factor is added in the networks to adjust decoding trend using recursive search in order to reduce the error from the complexity of the decoding networks. Finally,we re-judge the detected stutters by calculating confidence to improve the reliability of the detection result.The experimental results show that compared to previous algorithm,the proposed algorithm can improve system performance significantly,about 18%average detection error rate relatively.
近年来,非负矩阵分解(non-negative matrix factorization,NMF)被广泛应用于单通道语音分离问题。然而,标准的NMF算法假设语音的相邻帧之间是相互独立的,不能表征语音信号的时间连续性信息。为此,该文提出了一种基于NMF和因子条件随机场(factorial conditional random field,FCRF)的语音分离算法,首先将NMF和k均值聚类结合对纯净语音的频谱结构以及时间连续性进行建模,然后利用得到的模型训练FCRF模型,进而对混合语音信号进行分离。结果表明:该算法相比没有考虑语音时间连续特性的基于NMF的算法如激活集牛顿算法(active-set Newton algorithm,ASNA),在客观指标上有明显提高。