1.

The impact of background speech varying in intelligibility: effects on cognitive performance and perceived disturbance

Schlittmeier SJ Hellbrück J Thaden R Vorländer M 《Ergonomics》2008,51(5):719-736

Noise abatement in office environments often focuses on the reduction of background speech intelligibility and noise level, as attainable with frequency-specific insulation. However, only limited empirical evidence exists regarding the effects of reducing speech intelligibility on cognitive performance and subjectively perceived disturbance. Three experiments tested the impact of low background speech (35 dB(A)) of both good and poor intelligibility, in comparison to silence and highly intelligible speech not lowered in level (55 dB(A)). The disturbance impact of the latter speech condition on verbal short-term memory (n=20) and mental arithmetic (n=24) was significantly reduced during soft and poorly intelligible speech, but not during soft and highly intelligible speech. No effect of background speech on verbal-logical reasoning performance (n=28) was found. Subjective disturbance ratings, however, were consistent over all three experiments with, for example, soft and poorly intelligible speech rated as the least disturbing speech condition but still disturbing in comparison to silence. It is concluded, therefore, that a combination of objective performance tests and subjective ratings is desirable for the comprehensive evaluation of acoustic office environments and their alterations.

2.

The effect of speech and speech intelligibility on task performance

《Ergonomics》2012,55(11):1068-1091

The aim of this study was to determine the effects of three different sound environments on the performance of cognitive tasks of varying complexity. These three sound environments were ‘speech’, ‘masked speech’ and ‘continuous noise’. They corresponded to poor, acceptable and perfect acoustical privacy in an open-plan office, respectively. The speech transmission indices were 0.00, 0.30 and 0.80, respectively. The sound environments were presented at 48 dBA. The laboratory experiment on 36 subjects lasted for 4 h for each subject. Proofreading performance deteriorated in ‘speech’ (p < 0.05) compared with the other two sound environments. Reading comprehension and computer-based tasks (simple and complex reaction time, subtraction, proposition, Stroop and vigilance) remained unaffected. Subjects assessed ‘speech’ as the most disturbing, most disadvantageous and least pleasant environment (p < 0.01). ‘Continuous noise’ annoyed the least. Subjective arousal was highest in ‘masked speech’ and lowest in ‘continuous noise’ (p < 0.05). Performance in real open-plan offices could be improved by reducing speech intelligibility, e.g. by attenuating speech level and using an appropriate masking environment.

3.

Speech intelligibility and speech quality of modified loudspeaker announcements examined in a simulated aircraft cabin

Sibylle Pennig Julia Quehl Martin Wittkowski 《Ergonomics》2014,57(12):1806-1816

Acoustic modifications of loudspeaker announcements were investigated in a simulated aircraft cabin to improve passengers’ speech intelligibility and quality of communication in this specific setting. Four experiments with 278 participants in total were conducted in an acoustic laboratory using a standardised speech test and subjective rating scales. In experiments 1 and 2 the sound pressure level (SPL) of the announcements was varied (ranging from 70 to 85 dB(A)). Experiments 3 and 4 focused on frequency modification (octave bands) of the announcements. All studies used a background noise with the same SPL (74 dB(A)), but recorded at different seat positions in the aircraft cabin (front, rear). The results quantify speech intelligibility improvements with increasing signal-to-noise ratio and amplification of particular octave bands, especially the 2 kHz and the 4 kHz band. Thus, loudspeaker power in an aircraft cabin can be reduced by using appropriate filter settings in the loudspeaker system.

4.

Perceptions of performance and satisfaction after relocation to an activity-based office

Linda Rolfö Jörgen Eklund Helena Jahncke 《Ergonomics》2018,61(5):644-657

Many companies move from open-plan offices (OPO) to activity-based workplaces (ABWs). However, few studies examine the benefits and drawbacks following such a change. The aim of this study was to explore how physical conditions, office use, communication, privacy, territoriality, satisfaction and perceived performance change following a company’s relocation from an OPO to an ABW. A mixed methods approach included pre- and post-relocation questionnaires and post-relocation focus groups, individual interviews and observations. The questionnaires enabled comparisons over time (n = 34) and broader analyses based on retrospective ratings of perceived change (n = 66). Results showed that satisfaction with auditory privacy, background noise, air quality, outdoor view and aesthetics increased significantly after relocation. Negative outcomes, such as lack of communication within teams, were perceived as being due to the high people-to-workstation ratio and lack of rules. Overall satisfaction with the physical work environment increased in the ABW compared to the OPO. Perceived performance did not change significantly.

Practitioner Summary: Activity-based workplaces (ABWs) are commonly implemented although their effects on performance and well-being are unclear. This case study gives advice to stakeholders involved in office planning. Despite shortcomings with the people-to-workstation ratio and rules, employees showed improved satisfaction with auditory privacy and aesthetics in the ABW compared with the previous open-plan office.

5.

Intelligibility of synthesized voice messages in commercial truck cab noise for normal-hearing and hearing-impaired listeners

H. Boyd Morrison John G. Casali 《International Journal of Speech Technology》1997,2(1):33-44

A human factors experiment was conducted to assess the intelligibility of synthesized speech under a variety of noise conditions for both hearing-impaired and normal-hearing subjects. Modified Rhyme Test stimuli were used to determine intelligibility in four speech-to-noise (S/N) ratios (0, 5, 10, and 15 dB), and three noise types, consisting of flat-by-octaves (pink) noise, interior noise of a currently produced heavy truck, and truck cab noise with added background speech. A quiet condition was also investigated. During recording of the truck noise for the experiment, in-cab noise measurements were obtained. According to OSHA standards, these data indicated that drivers of the sampled trucks have a minimal risk for noise-induced hearing loss due to in-cab noise exposure when driving at freeway speeds because noise levels were below 80 dBA. In the intelligibility experiment, subjects with hearing loss had significantly lower intelligibility than normal-hearing subjects, both in quiet and in noise, but no interaction with noise type or S/N ratio was found. Intelligibility was significantly lower for the noise with background speech than the other noises, but the truck noise produced intelligibility equal to the pink noise. An analytical prediction of intelligibility using Articulation Index calculations exhibited a high positive correlation with the empirically obtained intelligibility data for both groups of subjects.

6.

Intelligibility enhancement of HMM-generated speech in additive noise by modifying Mel cepstral coefficients to increase the glimpse proportion

《Computer Speech and Language》2014,28(2):665-686

This paper describes speech intelligibility enhancement for Hidden Markov Model (HMM) generated synthetic speech in noise. We present a method for modifying the Mel cepstral coefficients generated by statistical parametric models that have been trained on plain speech. We update these coefficients such that the glimpse proportion – an objective measure of the intelligibility of speech in noise – increases, while keeping the speech energy fixed. An acoustic analysis reveals that the modified speech is boosted in the region 1–4 kHz, particularly for vowels, nasals and approximants. Results from listening tests employing speech-shaped noise show that the modified speech is as intelligible as a synthetic voice trained on plain speech whose duration, Mel cepstral coefficients and excitation signal parameters have been adapted to Lombard speech from the same speaker. Our proposed method does not require these additional recordings of Lombard speech. In the presence of a competing talker, both modification and adaptation of spectral coefficients give more modest gains.

7.

The effect of speech and speech intelligibility on task performance

Venetjoki N Kaarlela-Tuomaala A Keskinen E Hongisto V 《Ergonomics》2006,49(11):1068-1091

The aim of this study was to determine the effects of three different sound environments on the performance of cognitive tasks of varying complexity. These three sound environments were 'speech', 'masked speech' and 'continuous noise'. They corresponded to poor, acceptable and perfect acoustical privacy in an open-plan office, respectively. The speech transmission indices were 0.00, 0.30 and 0.80, respectively. The sound environments were presented at 48 dBA. The laboratory experiment on 36 subjects lasted for 4 h for each subject. Proofreading performance deteriorated in 'speech' (p < 0.05) compared with the other two sound environments. Reading comprehension and computer-based tasks (simple and complex reaction time, subtraction, proposition, Stroop and vigilance) remained unaffected. Subjects assessed 'speech' as the most disturbing, most disadvantageous and least pleasant environment (p < 0.01). 'Continuous noise' annoyed the least. Subjective arousal was highest in 'masked speech' and lowest in 'continuous noise' (p < 0.05). Performance in real open-plan offices could be improved by reducing speech intelligibility, e.g. by attenuating speech level and using an appropriate masking environment.

8.

Effects of white noise on Callsign Acquisition Test and Modified Rhyme Test scores

Blue-Terry M Letowski T 《Ergonomics》2011,54(2):139-145

The Callsign Acquisition Test (CAT) is a speech intelligibility test developed by the US Army Research Laboratory. The test has been used to evaluate speech transmission through various communication systems but has not yet been sufficiently standardised and validated. The aim of this study was to compare CAT and Modified Rhyme Test (MRT) performance in the presence of white noise across a range of signal-to-noise ratios (SNRs). A group of 16 normal-hearing listeners participated in the study. The speech items were presented at 65 dB(A) in the background of white noise at SNRs of -18, -15, -12, -9 and -6 dB. The results showed a strong positive association (75.14%) between the two tests, but significant differences between the CAT and MRT absolute scores in the range of investigated SNRs. Based on the data, a function to predict CAT scores based on existing MRT scores and vice versa was formulated. STATEMENT OF RELEVANCE: This work compares performance data of a common speech intelligibility test (MRT) with a new test (CAT) in the presence of white noise. The results here can be used as a part of the standardisation procedures and provide insights to the predictive capabilities of the CAT to quantify speech intelligibility communication in high-noise military environments.
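As a rough illustration of the listening conditions in this abstract, the noise level implied by each SNR can be worked out from the fixed speech level. The sketch assumes the SNR is simply the speech level minus the noise level, both with the same weighting; the abstract does not state how the SNR was defined, and the function name is hypothetical.

```python
def noise_level_for_snr(speech_level_db, snr_db):
    """Noise level implied by a speech level and an SNR.

    Assumption: SNR (dB) = speech level (dB) - noise level (dB).
    """
    return speech_level_db - snr_db


speech_level = 65.0                  # dB(A), from the abstract
snrs = [-18, -15, -12, -9, -6]       # dB, from the abstract

# Under the stated assumption, the white noise would range from 83 down to 71 dB(A).
noise_levels = [noise_level_for_snr(speech_level, snr) for snr in snrs]
print(noise_levels)
```

Under that assumption the noise is louder than the speech in every condition, which matches the high-noise setting the abstract describes.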

9.

Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise

《Computer Speech and Language》2014,28(2):648-664

This paper studies the synthesis of speech over a wide vocal effort continuum and its perception in the presence of noise. Three types of speech are recorded and studied along the continuum: breathy, normal, and Lombard speech. Corresponding synthetic voices are created by training and adapting the statistical parametric speech synthesis system GlottHMM. Natural and synthetic speech along the continuum is assessed in listening tests that evaluate the intelligibility, quality, and suitability of speech in three different realistic multichannel noise conditions: silence, moderate street noise, and extreme street noise. The evaluation results show that the synthesized voices with varying vocal effort are rated similarly to their natural counterparts both in terms of intelligibility and suitability.

10.

Improved variable step-size least mean square speech enhancement algorithm for cochlear implants

Xu Wenchao Wang Guangyan Chen Lei 《计算机应用》 (Journal of Computer Applications) 2017,37(4):1212-1216

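To address the degraded speech quality and poor adaptability of cochlear implants in strong external noise, an improved method is proposed that combines spectral subtraction with a variable step-size least mean square (LMS) adaptive filtering algorithm for denoising, and a front-end speech preprocessing system for cochlear implants is built on this method. The squared output error of the variable step-size LMS adaptive filter is used to adjust the step size, and fixed and varying step-size values are combined; this resolves the slow convergence and large steady-state error of the adaptive filtering algorithm, improves adaptability, and improves the communication quality of the speech signal. The system is built around the TMS320VC5416 and the audio codec chip TLV320AIC23B, and achieves high-speed acquisition and real-time processing of speech data through the multichannel buffered serial port (McBSP) and the serial peripheral interface (SPI). Simulation and test results show that the algorithm removes noise well: the signal-to-noise ratio improves by about 10 dB at low input SNR, and the Perceptual Evaluation of Speech Quality (PESQ) score also improves considerably, so the quality of the speech signal is effectively improved. The system performs stably and can further improve the clarity and intelligibility of speech at the cochlear implant front end.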
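The idea of driving the LMS step size from the squared error can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: it uses a well-known error-squared update (step size = alpha × previous step size + gamma × error², clamped between a minimum and a maximum), it assumes a separate noise reference input is available, and it leaves out the spectral subtraction stage. The function name and all parameter values are hypothetical.

```python
def vss_lms(primary, reference, taps=8, mu_min=1e-4, mu_max=0.02,
            alpha=0.97, gamma=0.002):
    """Adaptive noise canceller with an error-squared variable step size.

    primary:   samples of speech plus noise
    reference: samples of a noise reference correlated with the noise
    Returns the error signal, which serves as the enhanced output.
    All default parameter values are illustrative assumptions.
    """
    w = [0.0] * taps          # adaptive filter weights
    mu = mu_max               # start with the largest allowed step
    out = [0.0] * len(primary)
    for n in range(taps - 1, len(primary)):
        # Most recent reference sample first.
        x = [reference[n - k] for k in range(taps)]
        # Error = primary minus the filter's estimate of the noise.
        e = primary[n] - sum(wk * xk for wk, xk in zip(w, x))
        # Standard LMS weight update using the current step size.
        w = [wk + 2.0 * mu * e * xk for wk, xk in zip(w, x)]
        # Large errors raise the step (faster convergence); small errors
        # let it decay (lower steady-state error). Clamp to [mu_min, mu_max].
        mu = min(mu_max, max(mu_min, alpha * mu + gamma * e * e))
        out[n] = e
    return out
```

Clamping the step size is one way to combine a fixed bound with a varying value, which is loosely in the spirit of the fixed-plus-variable scheme the abstract mentions.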

11.

Combined effects of acoustic and visual distraction on cognitive performance and well-being

Liebl A Haller J Jödicke B Baumgartner H Schlittmeier S Hellbrück J 《Applied ergonomics》2012,43(2):424-434

Information work is usually performed in offices and influenced by the combined effects of acoustics, room climate, lighting and air quality. However, most of the literature focuses solely on the individual effects of physical parameters. This study (n = 32) investigates the combined effects of acoustic and visual distraction with regard to cognitive performance and well-being. Therefore, low-level background speech (40 dB(A)) of good or poor intelligibility was combined with either static or dynamic lighting. Experimental testing lasted for approx. 7 h for each participant and was conducted in mock-up offices. No interaction effects of background speech and lighting conditions with regard to cognitive performance were found. However, the results prove that even low-level background speech of high intelligibility significantly impairs short-term memory, reasoning ability and well-being. But no effect of background speech on text comprehension and sustained attention was found. Visual distraction due to dynamic lighting caused significant complaints but did not impair performance. An interaction effect of background speech and lighting conditions was found with regard to perceived performance during task processing. Participants felt that they performed better only if background speech of low intelligibility was combined with static lighting. It is shown that the effects on cognitive performance and well-being must be considered separately since these effects are rarely consistent.

12.

Digital speech tampering detection based on speech context analysis

Ding Qi Ping Xijian 《计算机应用》 (Journal of Computer Applications) 2011,31(5):1284-1287

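For digital speech tampered with by splicing, a tampering detection method based on speech context analysis is proposed. The method approaches the problem from two angles, background noise analysis and speaker state feature analysis. The speech signal is divided into speech and silent parts; time-domain and frequency-domain features are extracted from each frame of the noise-bearing silent segments, and prosodic and voice quality features are extracted from each speech segment. Change points in the features are then detected separately using the Bayesian information criterion, and the tampering detection result is obtained through a combined decision. Experimental results show that the method detects and locates speech splicing points fairly accurately.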
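Change-point detection with the Bayesian information criterion can be sketched on a single scalar feature track. This is a simplified illustration, not the paper's method: it assumes one feature value per frame, Gaussian-distributed features, and at most one change point, and it uses the common ΔBIC formulation in which one Gaussian for the whole sequence is compared against two Gaussians split at a candidate frame. The function names, the `margin` parameter and the penalty weight `lam` are hypothetical.

```python
import math


def _variance(segment):
    """Maximum-likelihood (population) variance of a list of numbers."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment) / len(segment)


def bic_change_point(x, margin=10, lam=1.0):
    """Return the index with the largest positive delta-BIC, or None.

    x:      one scalar feature value per frame
    margin: minimum number of frames on each side of a candidate split
    lam:    penalty weight (1.0 gives the standard BIC penalty)
    """
    n = len(x)
    # Splitting adds two parameters (one mean, one variance),
    # so the penalty is 0.5 * 2 * log(n) = log(n).
    penalty = lam * math.log(n)
    full = 0.5 * n * math.log(_variance(x))
    best_index, best_delta = None, 0.0
    for i in range(margin, n - margin + 1):
        left = 0.5 * i * math.log(_variance(x[:i]))
        right = 0.5 * (n - i) * math.log(_variance(x[i:]))
        delta = full - left - right - penalty
        if delta > best_delta:
            best_index, best_delta = i, delta
    return best_index
```

A positive ΔBIC means the two-segment model explains the features better than a single model even after the complexity penalty, so the frame with the largest positive value is a candidate splice point.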

13.

Key technology for speech intelligibility based on CycleGAN

Xiao Jing Liu Jiaqi Li Dengshi Zhao Lanxin Wang Qianrui 《计算机系统应用》 (Computer Systems & Applications) 2022,31(6):1-9

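Speech intelligibility enhancement is a perceptual enhancement technique for reproducing clear speech in noisy environments. Many studies enhance speech intelligibility through speaking style conversion (SSC), an approach that relies only on the Lombard effect and therefore performs poorly under strong noise interference. SSC also models the conversion of the fundamental frequency (F0) with a simple linear transformation and maps only a few dimensions of the Mel-cepstral coefficients (MCEPs). Because F0 and MCEPs are two important features of speech, modelling them adequately is essential. This paper therefore uses the continuous wavelet transform (CWT) to decompose F0 into 10 dimensions that describe speech at different time scales, enabling effective F0 conversion, and uses a 20-dimensional representation of the MCEPs for MCEP conversion. In addition, an iMetricGAN network is used to optimise speech intelligibility metrics in strong noise. Experimental results show that the proposed non-parallel speaking style conversion method based on CycleGAN with CWT and iMetricGAN (NS-CiC) significantly improves speech intelligibility in strong noise in both objective and subjective evaluations.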

14.

Computational auditory models in predicting noise reduction performance for wideband telephony applications

Nazanin Pourmand Vijay Parsa Angela Weaver 《International Journal of Speech Technology》2013,16(4):363-379

The performance of several noise reduction algorithms intended for wideband telephony was evaluated both subjectively and objectively. The chosen algorithms were based on statistical modeling, spectral subtraction, Wiener filtering, or subspace modelling principles. A customized wideband noise reduction database containing speech samples corrupted by three types of background noises at three SNR levels, along with their enhanced versions was created. The overall quality of the speech samples in the database was subsequently rated by a group of listeners with normal hearing capabilities. Comprehensive statistical analyses were performed to assess the reliability of the subjective data, and to assess the performance of noise reduction algorithms across varied noisy conditions. The subjective quality ratings were then used to investigate the performance of several auditory model-based objective quality metrics. Key results from these investigations include: (a) there was a high degree of inter- and intra-subject reliability in the subjective ratings, (b) noise reduction algorithms enhance speech quality for only a subset of the noise conditions, and (c) auditory model-based metrics perform similarly in predicting speech quality ratings, when speech quality scores pertaining to a particular noise condition were averaged.

15.

Phoneme Intelligibility of Four Text-to-Speech Products to Nonnative Speakers of English in Noise

H. S. Venkatagiri 《International Journal of Speech Technology》2005,8(4):313-321

The study investigated the segmental intelligibility of four text-to-speech (TTS) products under 0 dB and 5 dB signal-to-noise ratios in a group of native and nonnative speakers of English. Each product (AT&T Next-Gen™, Festival version 1.4.2, FlexVoice™ 2, and IBM ViaVoice™ Version 5.1) uses a different algorithm for generating speech from text. The results, which benefit developers of TTS technology as well as developers of products that utilize TTS, showed that (1) all TTS products were less intelligible to nonnative speakers of English than native speakers, (2) the “hybrid” TTS product that combined concatenative and formant synthesis methods was the least intelligible of the four products investigated, (3) the remaining three products, which used formant, concatenative diphone based LPC, and concatenative waveform synthesis methods respectively, were equally intelligible to nonnative speakers, (4) none of the four TTS products was better at resisting intelligibility loss due to noise than others, and (5) listening to currently available unrestricted TTS under high noise conditions would probably require a greater amount of cognitive resources on the part of both native and nonnative speakers of English and may be difficult when other demanding activities are concurrently performed.

16.

A gated recurrent neural network for causal speech enhancement

Li Jianghe Wang Mei 《计算机工程》 (Computer Engineering) 2022,48(11):77-82

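To improve the network's ability to model noisy speech, traditional deep-learning-based speech enhancement methods usually use non-causal network input, which introduces a fixed delay and gives the speech enhancement system poor real-time performance. A gated recurrent neural network for causal speech enhancement, CGRU, is proposed to solve the fixed-delay problem in real-time speech enhancement systems and to improve enhancement performance. To better model the correlations in the noisy speech signal, the network unit fuses the input and output of the previous time step when computing the output at the current time step. In addition, a linear gating mechanism controls information flow to mitigate overfitting during training. Because causal speech enhancement systems have strict real-time requirements, the CGRU network uses a single-gate design to reduce structural complexity and improve real-time performance. Experimental results show that CGRU outperforms traditional network structures such as GRU, SRNN and SRU on perceptual speech quality, objective speech intelligibility and segmental signal-to-noise ratio after enhancement; at an SNR of 0 dB, CGRU reaches an average perceptual speech quality of 2.4 and an average objective speech intelligibility of 0.786.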

17.

Speech Detection in Non-Stationary Noise Based on the 1/f Process

Wang Fan Zheng Fang Wu Wenhu 《计算机科学技术学报》 (Journal of Computer Science and Technology) 2002,17(1):0-0

In this paper, an effective and robust active speech detection method is proposed based on the 1/f process technique for signals in non-stationary noisy environments. The Gaussian 1/f process, a mathematical model for statistically self-similar random processes based on fractals, is selected to model the speech and the background noise. An optimal Bayesian two-class classifier is developed to discriminate them by their 1/f wavelet coefficients with Karhunen-Loeve-type properties. Multiple templates are trained for the speech signal, and the parameters of the background noise can be dynamically adapted at runtime to model the variation of both the speech and the noise. In our experiments, a 10-minute-long speech recording with different types of noise ranging from 20 dB to 5 dB is tested using this new detection method. Detection accuracy of over 90% is achieved when the average SNR is about 10 dB.

18.

Measuring the naturalness of synthetic speech

Howard C. Nusbaum Alexander L. Francis Anne S. Henly 《International Journal of Speech Technology》1995,1(1):7-19

Even the highest quality synthetic speech generated by rule sounds unlike human speech. As the intelligibility of rule-based synthetic speech improves, and the number of applications for synthetic speech increases, the naturalness of synthetic speech will become an important factor in determining its use. In order to improve this aspect of the quality of synthetic speech it is necessary to have diagnostic tests that can measure naturalness. Currently, none of the available metrics for evaluating the acceptability of synthetic speech distinguishes sufficiently between measuring overall acceptability (including naturalness) and simply measuring the ability of listeners to extract intelligible information from the signal. In this paper we propose a new methodology for measuring the naturalness of particular aspects of synthesized speech, independent of the intelligibility of the speech. Although naturalness is a multidimensional, subjective quality of speech, this methodology makes it possible to assess the separate contributions of prosodic, segmental, and source characteristics of the utterance. In two experiments, listeners reliably differentiated the naturalness of speech produced by two male talkers and two text-to-speech systems. Furthermore, they reliably differentiated between the two text-to-speech systems. The results of these experiments demonstrate that perception of naturalness is affected by information contained within the smallest part of speech, the glottal pulse, and by information contained within the prosodic structure of a syllable. These results show that this new methodology does provide a solid basis for measuring and diagnosing the naturalness of synthetic speech.

19.

The perceived rudeness of public cell phone behaviour

《Behaviour & Information Technology》2012,31(10):947-952

We report two studies comparing cell phone conversations with face-to-face conversations. The first (N = 60) compared the volume of cell phone conversations with that of face-to-face conversations in the same location and found that, controlling for gender, cell phone conversations are slightly (1.90 dB) louder. We then replicated (N = 160) a study that compared rudeness ratings that observers gave cell phone conversations with ratings of face-to-face conversations in which either one or both speakers were audible. We found that, controlling for volume, cell phone conversations were rated significantly ruder than conversations between two audible speakers. But face-to-face conversations in which only one speaker was audible were, controlling for volume, rated as ruder than cell phone conversations. Several observer characteristics (age, gender and amount of cell phone use) had no significant relationship to the observer's rating of the rudeness of the conversation.

20.

An effective online teaching method: the combination of collaborative learning with initiation and self-regulation learning with feedback

《Behaviour & Information Technology》2012,31(7):712-723

In modern business environments, work and tasks have become more complex and require more interdisciplinary skills to complete, including collaborative and computing skills for website design. However, the computing education in Taiwan can hardly be recognised as effective in developing and transforming students into competitive employees. In this regard, the author adopted collaborative learning (CL) with initiation and self-regulated learning (SRL) with feedback to develop students' collaborative skills and regular learning habits and further contribute to practical computing skills for website design. This study comprised an experiment that included 279 second-year university students from five class sections, including four experimental groups (CISF group, n = 57; CIS group, n = 53; CI group, n = 68; C group, n = 68), and a control group (T group, n = 33). The results reveal that students who received the combined treatment of online CL with initiation and SRL with feedback attained the best grades for their computing skills for website design among the five groups. The author further discusses the implications for teachers, schools and educators who plan to design practical scenarios and online learning activities for their students.