Similar literature
1.
The authors discuss immersive audio systems and the signal processing issues that pertain to the acquisition and subsequent rendering of 3D sound fields over loudspeakers. On the acquisition side, recent advances in statistical methods for achieving acoustical arrays in audio applications are reviewed. Classical array signal processing addresses two major aspects of spatial filtering, namely localization of a signal of interest, and adaptation of the spatial response of an array of sensors to achieve steering in a given direction. The achieved spatial focusing in the direction of interest makes array signal processing a necessary component in immersive sound acquisition systems. On the rendering side, 3D audio signal processing methods are described that allow rendering of virtual sources around the listener using only two loudspeakers. Finally, the authors discuss the commercial implications of audio DSP.
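
As a rough illustration of steering an array toward a given direction, the sketch below implements a fixed delay-and-sum beamformer. This is a minimal sketch under stated assumptions, not the statistical or adaptive methods the abstract reviews: the function name `delay_and_sum` is hypothetical, delays are assumed to be known whole-sample offsets, and all channels are assumed to share one sample rate.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average them.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   assumed arrival delay (in samples) of the source at each microphone.
    """
    # Only keep samples for which every delayed channel still has data.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


if __name__ == "__main__":
    source = [0, 1, 0, -1, 0, 2]
    # The same source reaches three microphones 0, 1, and 2 samples late.
    mics = [source + [0, 0], [0] + source + [0], [0, 0] + source]
    print(delay_and_sum(mics, [0, 1, 2]))
```

When the assumed delays match the true arrival delays, the channels add coherently and the source is recovered; for other directions the channels are misaligned and partially cancel.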

2.
For many years online text chat software such as ICQ has been used as a means of multiuser interaction. As higher bandwidth becomes available, multipoint videoconferencing systems such as NetMeeting or CUseeMe that allow people from different geographical locations to see and talk to each other are becoming popular. However, these videoconferencing systems lack the sense of immersion because each participant sees other participants in separate windows, and it is often difficult to tell who is talking to whom. Based on a user study, Argyle and Dean (1965) suggested that during face-to-face communication people vary physical proximity, eye contact, and other behaviors to optimize an overall level of intimacy. In addition, Tang and Isaacs (1993) found that gaze awareness is important because participants can use the visual cue to determine when another participant is paying attention to them. These studies suggest that a more immersive way of multiuser communication is to connect the users in a three-dimensional (3-D) virtual environment so that users feel that they are communicating face to face. Moving from a text-based chat room environment to a 3-D environment creates new challenges in several areas such as computer animation, signal processing, and computer vision. We introduce immersive interactive technologies used in multiuser 3-D virtual environments. We also survey existing systems.

3.
Hybrid in-band on-channel digital audio broadcasting systems deliver digital audio signals in such a way that is backward compatible with existing analog FM transmission. We present a channel error correction and detection system that is well-suited for use with audio source coders, such as the so-called perceptual audio coder (PAC), that have error concealment/mitigation capabilities. Such error mitigation is quite beneficial for high quality audio signals. The proposed system involves an outer cyclic redundancy check (CRC) code that is concatenated with an inner convolutional code. The outer CRC code is used for error detection, providing flags to trigger the error mitigation routines of the audio decoder. The inner convolutional code consists of so-called complementary punctured-pair convolutional codes, which are specifically tailored to combat the unique adjacent channel interference characteristics of the FM band. We introduce a novel decoding method based on the so-called list Viterbi algorithm (LVA). This LVA-based decoding method, which may be viewed as a type of joint or integrated error correction and detection, exploits the concatenated structure of the channel code to provide enhanced decoding performance relative to decoding methods based on the conventional Viterbi algorithm (VA). We also present results of informal listening tests and other simulations on the Gaussian channel. These results include the preferred length of the outer CRC code for 96-kb/s audio coding and demonstrate that LVA-based decoding can significantly reduce the error flag rate relative to conventional VA-based decoding, resulting in dramatically improved decoded audio quality. Finally, we propose a number of methods for screening undetected errors in the audio domain.
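
The role of the outer CRC as an error flag for the audio decoder can be sketched as follows. This is only an illustration under stated assumptions: it uses the CRC-32 from Python's `zlib` as a stand-in (the abstract does not give the CRC polynomial or length here), it omits the inner convolutional code and the LVA decoder entirely, and the "repeat the last good frame" concealment is a hypothetical placeholder for the audio decoder's own mitigation routines.

```python
import zlib


def add_crc(frame: bytes) -> bytes:
    """Append a 4-byte CRC-32 (stand-in for the outer CRC code) to one frame."""
    return frame + zlib.crc32(frame).to_bytes(4, "big")


def check_frame(received: bytes):
    """Return (payload, error_flag); the flag is True when the CRC check fails."""
    payload, tag = received[:-4], received[-4:]
    error_flag = zlib.crc32(payload).to_bytes(4, "big") != tag
    return payload, error_flag


def decode_stream(frames):
    """Pass good frames through; on a flagged frame, repeat the last good one."""
    last_good = b""
    out = []
    for received in frames:
        payload, error_flag = check_frame(received)
        if error_flag:
            out.append(last_good)  # crude concealment, assumed for illustration
        else:
            last_good = payload
            out.append(payload)
    return out


if __name__ == "__main__":
    good = add_crc(b"frame-1")
    corrupted = bytearray(add_crc(b"frame-2"))
    corrupted[0] ^= 0x01  # single bit error introduced by the channel
    print(decode_stream([good, bytes(corrupted)]))
```

In the system the abstract describes, the flag rate itself depends on the inner decoder, which is why a list-based decoder that can offer alternative candidate sequences may lower it.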

4.
We present a broad review of the optical wireless field, from early experiments through to today's high-performance systems. Emphasis is placed on understanding the benefits and limitations of optical wireless, all of which ultimately define the applications. A variety of systems are examined, each incorporating a different technological solution to suit the particular application.

5.
A temporal domain audio watermarking technique   Total citations: 4, self-citations: 0, citations by others: 4
Audio watermarking techniques can be used to embed extra information into audio signals. The goal is to hide prespecified data carrying some information into the audio stream such that it is not audible to the human ear (i.e., transparent) and is, at the same time, resistant to removal attacks (i.e., robust). In the currently known watermarking systems, the above challenges are not always adequately resolved. We present an alternative audio watermarking technique that mitigates these and other related shortcomings. The system is referred to as modified audio signal keying (MASK). In MASK, the short-time envelope of the audio signal is modified in such a way that the change is imperceptible to the human listener. The MASK system can easily be tailored for a wide range of applications. Moreover, informal experimental results show that it has good robustness and audibility behavior.

6.
Network-centric music performance: practice and experiments   Total citations: 1, self-citations: 0, citations by others: 1
Advances in information technology and the great proliferation of the Internet have changed nearly every aspect of the work and life of human beings. Despite progress in networked entertainment, many music professionals and enthusiasts are still sticking to the traditional way of carrying out rehearsals and concerts. Music performance in this way requires physical presence of the participants and has a number of inherent limitations. We introduce a novel system called network-centric music performance (NMP) that enables multiparty music performance through cyberspace. Our target is to support real-time multichannel natural audio streaming over the network, using audio compression schemes that can provide acceptable audio quality. A system like this is bandwidth-demanding and highly delay-sensitive, and requires synchronization of the audio streams. Hence, support from the underlying end systems and networks is critical. However, the current source coding mechanisms and the best-effort nature of the Internet pose many challenges to achieving the desired quality of service. We have implemented a prototype of NMP and explored end system and network influences on NMP. The work was done in a LAN environment using Linux PCs. The system enables two different application scenarios: real-time rehearsal and rehearsal on demand. Real-time multichannel audio transport and different audio compression schemes are supported. Our evaluation results based on both subjective and objective measurements show that the system provides a sufficient audio quality level for the target application in such an environment. The scalability test also revealed that the system scales well as the number of clients increases. In the future, we will extend our system to networks spanning larger distances and experiment with more realistic network conditions in the Internet.

7.
8.
9.
Virtual reality (VR) provides a revolutionary interface between man and machine. However, present display and interface peripherals limit the potential of virtual environments within many activities or scenarios. Mainstream immersive VR is centred on head mounted display (HMD) based solutions in which the user is isolated from their surrounding environment. The occlusion of real world interaction within such systems imposes unnatural social and physical constraints on the user. Media environments can be classified as one form of enhanced reality based around immersive physical spaces intensified for effective collaborative activities. Current research is directed at three forms of enhanced spaces: immersive projected displays, interactive video environments, and immersive desktop environments. While HMD and desktop VR facilitate many collaborative tasks, the synthesis of real and virtual realities within a life-size environment offers distinct advantages within other applications. This paper introduces the concepts behind media environments, reviews current research and presents applications being explored at BT Laboratories.

10.
IEEE Multimedia, 1995, 2(2): 7-9
We have efficient systems for transmitting 3-kHz audio signals from point to point, but the actual motives for communication (e.g., the desire to make new friends) are not particularly facilitated by the phone system. The same can be said for fax and e-mail. While invaluable as tools, they must often be coerced into serving an intended purpose, such as telling a joke. From this basis, and without any assumptions of technological solutions, we set about designing a system dedicated to facilitating and promoting social communication, with a particular emphasis on flexible and mobile collaboration. In some sense, our goal was to create a kind of electronic analog to the physical office that might supply some of the same encapsulation and facilitation of social interaction.

11.
On the limits of steganography   Total citations: 17, self-citations: 0, citations by others: 17
In this paper, we clarify what steganography is and what it can do. We contrast it with the related disciplines of cryptography and traffic security, present a unified terminology agreed at the first international workshop on the subject, and outline a number of approaches, many of them developed to hide encrypted copyright marks or serial numbers in digital audio or video. We then present a number of attacks, some new, on such information hiding schemes. This leads to a discussion of the formidable obstacles that lie in the way of a general theory of information hiding systems (in the sense that Shannon gave us a general theory of secrecy systems). However, theoretical considerations lead to ideas of practical value, such as the use of parity checks to amplify covertness and provide public key steganography. Finally, we show that public key information hiding systems exist, and are not necessarily constrained to the case where the warden is passive.
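
One way to picture the parity-check idea is to carry each hidden bit as the parity of several cover samples instead of a single one, so that at most one sample in the group ever needs to change. The sketch below is an assumption-laden illustration, not the scheme from the paper: the function names, the use of least significant bits, and the shared index set are all hypothetical.

```python
def lsb_parity(samples, indices):
    """XOR of the least significant bits of the selected samples."""
    parity = 0
    for i in indices:
        parity ^= samples[i] & 1
    return parity


def embed_bit(samples, indices, bit):
    """Return a copy whose parity over `indices` equals `bit`.

    At most one sample is modified, and only in its least significant bit.
    """
    out = list(samples)
    if lsb_parity(out, indices) != bit:
        out[indices[0]] ^= 1  # any index in the group would serve equally well
    return out


def extract_bit(samples, indices):
    """Recover the hidden bit; the receiver must know the same index set."""
    return lsb_parity(samples, indices)


if __name__ == "__main__":
    cover = [10, 13, 7, 22, 5]
    group = [0, 2, 4]  # assumed to be agreed between sender and receiver
    stego = embed_bit(cover, group, 1)
    print(stego, extract_bit(stego, group))
```

Because the sender is free to choose which sample in the group to alter, it can pick the one whose change is least noticeable, which is the intuition behind parity improving covertness.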

12.
The Remote Media Immersion (RMI) system blends multiple cutting-edge media technologies to create the ultimate digital media delivery platform. Its streaming media server delivers multiple high-bandwidth streams, transmission resilience and flow-control protocols ensure data integrity, and high-definition video combined with immersive audio provide the highest quality rendering.

13.
The popularity of mobile devices, such as PDAs and SmartPhones, has grown rapidly over the last couple of years. Though most users still perform searches using desktop computers, it is expected that more and more people will also search the Web while they are on the move. In addition to text-based keyword queries, mobile devices can support richer and hybrid queries such as images, audio, video, and their combinations. In this paper, we will discuss mobile search systems that support image queries and audio queries, covering typical designs for mobile visual and audio search, as well as the opportunities and challenges. Specifically, we will present an in-depth study of two real systems we have developed: product image categorization and mobile ringtone search, which use image queries and audio queries, respectively. Experimental results on real-life data demonstrate their effectiveness and efficiency.

14.
Li Qi, Xu Guoai, Tian Bin, Zhang Miao. China Communications, 2011, 8(6): 51-57
With the development of cloud-based data centers and multimedia technologies, cloud-based multimedia service systems have received more and more attention. Audio highlight detection plays an important role in the cloud-based multimedia service system. In this paper, we propose a novel unsupervised highlight detection method to extract audio highlight effects for the cloud-based multimedia service system. In the proposed method, we first extract the audio features for each audio document. A spectral clustering scheme is then used to decompose the audio document into several audio effects. Next, we introduce the TF-IDF method to label the highlight effect. We design experiments to evaluate the performance of the proposed method, and the experimental results show that our method achieves satisfactory results.
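
The TF-IDF labeling step can be sketched as below. This is a minimal sketch under stated assumptions: each audio document is assumed to have already been decomposed into a sequence of effect labels (the feature extraction and spectral clustering stages are not shown), the label names are made up, and choosing the effect with the highest TF-IDF score as the highlight is an assumption, since the abstract does not state the exact decision rule.

```python
import math
from collections import Counter


def tfidf_scores(documents):
    """Compute TF-IDF per effect label for each audio document.

    documents: list of lists of effect labels, one inner list per document.
    Returns one {effect: score} dict per document.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for labels in documents:
        doc_freq.update(set(labels))  # count each effect once per document

    scores = []
    for labels in documents:
        counts = Counter(labels)
        total = len(labels)
        scores.append({
            effect: (count / total) * math.log(n_docs / doc_freq[effect])
            for effect, count in counts.items()
        })
    return scores


def highlight_effect(score):
    """Pick the effect with the highest TF-IDF score (assumed decision rule)."""
    return max(score, key=score.get)


if __name__ == "__main__":
    docs = [
        ["speech", "speech", "applause", "speech"],
        ["speech", "music", "speech"],
        ["speech", "speech"],
    ]
    for score in tfidf_scores(docs):
        print(highlight_effect(score), score)
```

An effect that appears in every document, such as plain speech here, gets an inverse document frequency of zero, so rarer effects stand out as highlight candidates.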

15.
Turner D.A., Ross K.W. IEEE Network, 2000, 14(4): 30-37
This article examines how the Internet's e-mail infrastructure should evolve to better support continuous media (CM) e-mail, such as audio and video. We assert that the problem is not simply a matter of adding the obvious new functionality to user agents (mail readers), such as audio and video capture, but requires the adoption of a new delivery model. We do this by examining the problems that arise using current methods to deliver video messages, and we show how sender-side storage and media streaming solve these problems. Finally, we describe an implementation strategy that requires changes only to individual sender systems, but enables CM e-mail to be delivered universally to any recipient, thus providing a realistic evolutionary path that can be adopted incrementally by individual mail systems.

16.
A survey of digital audio watermarking techniques   Total citations: 44, self-citations: 1, citations by others: 43
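This paper introduces the background of digital audio watermarking, together with the audio masking phenomenon and MPEG psychoacoustic model I, both of which are widely used in audio watermarking systems. It surveys typical robust and fragile audio watermarking techniques, with robust algorithms further discussed in the time domain, the frequency domain, and the compressed domain. It analyzes attacks on digital audio watermarking systems, in particular the synchronization attack, which can defeat the vast majority of audio watermarking algorithms in the time domain at very little cost, and discusses several possible countermeasures. Finally, it summarizes the open problems and looks ahead to future developments.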

17.
This article evaluates network and server infrastructure requirements to support real-time flows associated with networked entertainment applications. These include the state information flow to update the status of the virtual environment and immersive communication flows such as voice, video, gesture, and haptics communication. The article demonstrates that scaling these applications to large geographical spreads of participants requires distribution of computation to meet the latency constraints of the applications. This latency-driven distribution of computation is essential even when there are no limitations on the availability of computational resources in one location. The article provides detailed results on distributed server architectures for two of these real-time flows, state information and immersive voice communication. It also identifies a generic set of requirements for the underlying network and server infrastructure to support these applications and proposes a new design, called switched overlay networks, for this purpose.

18.
The field of audio forensics involves many topics familiar to the general audio digital signal processing (DSP) community, such as speech recognition, talker identification, and signal quality enhancement. There is potentially much to be gained by applying modern DSP theory to problems of interest to the forensics community, and this article is written to give the DSP audience some insight into the types of problems and challenges that face practitioners in audio forensic laboratories. However, this article must also present several of the frustrations and pitfalls encountered by signal processing experts when dealing with typical forensic material due to the standards and practices of the legal system.

19.
Li Jun. Audio Engineering (电声技术), 2010, 34(6): 77-79
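E1 lines are now increasingly accepted by users as a reliable, convenient, and economical means of transmission. However, because the bandwidth of an E1 line is limited (2 Mbit/s), it becomes very difficult for users to carry audio, video, data, telephony, and clock services at the same time. Using a single device that multiplexes digital signals for several different purposes and transmits them over an E1 line both reduces equipment cost and saves channel occupancy. The relevant implementation techniques are analyzed, and the functions of a system already in operation are described.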

20.
IEEE Multimedia, 2004, 11(2): 10-13
In the early days much multimedia research focused on developing computer environments that interpret, manipulate, or generate audiovisual media in manual, semiautomatic, or automatic ways. Two major methodologies emerged, emphasizing either particular intrinsic aspects of the target media, or particular processes that users can perform on or with that media. These technological advances steadily infiltrated everyday media environments, including image editing tools (such as Photoshop; Illustrator; the GNU Image Manipulation Program, or GIMP; and Maya), audio systems (such as Cubase VST), new media authoring tools (such as Director/Shockwave, Flash, Dreamweaver, and FrontPage), and Web presentation technology (such as HTML and SMIL). The results deeply changed how we exchange information.
