Similar Documents
 20 similar documents found (search time: 46 ms)
1.
《Advanced Robotics》2013,27(3-4):293-328
This paper presents a method of controlling robot manipulators with fuzzy voice commands. Recently, there has been some research on controlling robots using information-rich fuzzy voice commands such as 'go little slowly' and learning from such commands. However, the scope of all those works was limited to basic fuzzy voice motion commands. In this paper, we introduce a method of controlling the posture of a manipulator using complex fuzzy voice commands. A complex fuzzy voice command is composed of a set of fuzzy voice joint commands. Complex fuzzy voice commands can be used for complicated maneuvering of a manipulator, while fuzzy voice joint commands affect only a single joint. Once joint commands are learned, any complex command can be learned as a combination of some or all of them, so that, using the learned complex commands, a human user can control the manipulator in a complicated manner with natural language commands. Learning of complex commands is discussed in the framework of fuzzy coach–player model. The proposed idea is demonstrated with a PA-10 redundant manipulator.  相似文献   

2.
This article proposes a method for adapting a robot's perception of fuzzy linguistic information by evaluating vocal cues. The robot's perception of fuzzy linguistic information such as "very little" depends on the environmental arrangement and the user's expectations. Therefore, the robot's perception of the corresponding environment is modified by acquiring the user's perception through vocal cues. Fuzzy linguistic information related to primitive movements is evaluated by a behavior evaluation network (BEN). A vocal cue evaluation system (VCES) is used to evaluate the vocal cues for modifying the BEN. The user's satisfaction level with the robot's movements and the user's willingness to change the robot's perception are identified from a series of vocal cues to improve the adaptation process. A scenario of cooperative rearrangement of the user's working space, carried out with a PA-10 robot manipulator, illustrates the proposed system.

3.
In this article, a fuzzy neural network (FNN)-based approach is presented for interpreting imprecise natural language (NL) commands to control a machine. The system (1) interprets fuzzy linguistic information in NL commands for machines, (2) introduces a methodology for implementing the contextual meaning of NL commands, and (3) recognizes machine-sensitive words from running utterances consisting of both in-vocabulary and out-of-vocabulary words. The system achieves these capabilities through an FNN, which is used to interpret fuzzy linguistic information; a hidden Markov model-based keyword-spotting system, which is used to identify machine-sensitive words among unrestricted user utterances; and a framework for inserting the contextual meaning of words into the knowledge base employed in the fuzzy reasoning process. The system is a complete integration that converts imprecise NL command inputs into corresponding output signals for controlling a machine. Its performance is examined by navigating a mobile robot in real time with unconstrained speech utterances. This work was presented, in part, at the Seventh International Symposium on Artificial Life and Robotics, Oita, Japan, January 16–18, 2002.

4.
Adaptive fuzzy command acquisition with reinforcement learning
This paper proposes a four-layered adaptive fuzzy command acquisition network (AFCAN) for adaptively acquiring fuzzy commands via interaction with the user or environment. It can extract the intended information from a sentence (command) given in natural language with fuzzy predicates. The intended information comprises a meaningful semantic action and the fuzzy linguistic information of that action. The proposed AFCAN has three important features. First, no restrictions whatsoever are placed on the fuzzy command input used to specify the desired information, and the network requires no acoustic, prosodic, grammatical, or syntactic structure. Second, the linguistic information of an action is learned adaptively and represented by fuzzy numbers based on α-level sets. Third, the network can learn during the course of performing the task. The AFCAN can perform offline as well as online learning. For offline learning, the mutual-information (MI) supervised learning scheme and the fuzzy backpropagation (FBP) learning scheme are employed when the training data are available in advance; the former is used to learn meaningful semantic actions and the latter to learn linguistic information. The AFCAN can also learn online, interactively, while in use for fuzzy command acquisition. For online learning, the MI-reinforcement learning scheme and the fuzzy reinforcement learning scheme are developed to learn meaningful actions and linguistic information, respectively. An experimental system is constructed to illustrate the performance and applicability of the proposed AFCAN.
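For readers unfamiliar with the α-level-set representation mentioned above, the sketch below shows one standard way to store a fuzzy number as nested intervals indexed by α and to add two such numbers level-wise by interval arithmetic. The α grid, triangular shape, and numeric values are illustrative assumptions, not the AFCAN's internal encoding.

```python
# A minimal sketch of the α-level-set representation: a fuzzy number is a
# dict of intervals, one per α level; fuzzy addition is interval addition.
alphas = [0.0, 0.25, 0.5, 0.75, 1.0]

def triangular_alpha_cuts(a, b, c):
    """α-cuts of a triangular fuzzy number (a, b, c)."""
    return {alpha: (a + alpha * (b - a), c - alpha * (c - b)) for alpha in alphas}

def add(x_cuts, y_cuts):
    """Fuzzy addition via interval arithmetic on each α-cut."""
    return {alpha: (x_cuts[alpha][0] + y_cuts[alpha][0],
                    x_cuts[alpha][1] + y_cuts[alpha][1]) for alpha in alphas}

slow = triangular_alpha_cuts(0.1, 0.3, 0.5)       # "slow" speed, m/s (toy)
a_bit_more = triangular_alpha_cuts(0.0, 0.1, 0.2)
print(add(slow, a_bit_more)[1.0])                 # core of the combined number
```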

5.
A fuzzy logic-based methodology is proposed to model the organization level of an intelligent robotic system. The user input commands to the system organizer are linguistic in nature, and the primitive events (tasks) from the task domain of the system are, in general, interpreted via fuzzy sets. Fuzzy relations are introduced to connect every event with a specific user input command. Approximate reasoning is accomplished via a modifier and the compositional rule of inference, whereas the application of the conjunction rule generates fuzzy sets whose elements are all possible (crisp) plans. The most possible plan among all those generated, i.e., the one optimal under an application-dependent criterion, is chosen and communicated to the coordination level. Off-line feedback information from the lower levels is considered a priori known and is used to update all organization-level information. An example demonstrates the applicability of the proposed algorithm to intelligent robotic systems.
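The compositional rule of inference named above has a compact computational form: given a fuzzified command A and a fuzzy relation R between commands and plans, the induced possibility of each plan is B(e) = sup_c min(A(c), R(c, e)). The sketch below applies it to a toy vocabulary of our own; the paper's actual relations, modifier, and optimality criterion are not reproduced.

```python
# A minimal sketch of sup-min composition: B(e) = max over commands c of
# min(A(c), R(c, e)); the most possible plan is then selected.
commands = ["move fast", "move slowly"]
events = ["plan_1", "plan_2", "plan_3"]

A = {"move fast": 0.8, "move slowly": 0.2}       # fuzzified user command
R = {                                             # command-event fuzzy relation
    ("move fast", "plan_1"): 0.9, ("move fast", "plan_2"): 0.4,
    ("move fast", "plan_3"): 0.1, ("move slowly", "plan_1"): 0.2,
    ("move slowly", "plan_2"): 0.6, ("move slowly", "plan_3"): 0.9,
}

B = {e: max(min(A[c], R[(c, e)]) for c in commands) for e in events}
most_possible_plan = max(B, key=B.get)            # toy criterion: max possibility
print(B, "->", most_possible_plan)
```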

6.
This article describes a multimodal command language for home robot users, and a robot system that interprets users' messages in the language through microphones, visual and tactile sensors, and control buttons. The command language comprises a set of grammar rules, a lexicon, and nonverbal events detected in hand gestures, readings of tactile sensors attached to the robots, and buttons on the controllers in the users' hands. Prototype humanoid systems that immediately execute commands in the language are also presented, along with preliminary experiments on face-to-face interactions and teleoperations. Subjects unfamiliar with the language were able to command humanoids and complete their tasks with brief documents at hand, given a short demonstration beforehand. The command understanding system, operating on PCs, responded to multimodal commands without significant delay. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.

7.
Success rates in a multimodal command language for home robot users
This article considers the success rates in a multimodal command language for home robot users. In the command language, the user specifies action types and action parameter values to direct robots in multiple modes such as speech, touch, and gesture. The success rates of commands in the language can be estimated by user evaluations in several ways. This article presents some user evaluation methods, as well as results from recent studies on command success rates. The results show that the language enables users without much training to command home robots at success rates as high as 88%–100%. It is also shown that multimodal commands combining speech and button-press actions included fewer words and were significantly more successful than single-modal spoken commands.

8.
Natural language commands are generated by intelligent human beings and, as a result, carry a great deal of information. If it is possible to learn from such commands and reuse that knowledge, the process becomes very efficient. In this paper, learning from such information-rich voice commands for controlling a robot is studied. First, the new concepts of a fuzzy coach-player system and a sub-coach are proposed for controlling robots with natural language commands. Then, the characteristics of the subjective human decision-making process are discussed, and a Probabilistic Neural Network (PNN)-based learning method is proposed to learn from such commands and reuse the acquired knowledge. Finally, the proposed concept is demonstrated and confirmed with experiments conducted using a PA-10 redundant manipulator.
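As a rough illustration of the PNN mechanism named above, the sketch below implements a Parzen-window classifier: each class's score is the average Gaussian kernel response of its training points, and the highest-scoring class wins. The two-dimensional command features and all values are placeholders of our own, not the paper's encoding.

```python
# A minimal Parzen-window PNN sketch: class score = mean Gaussian kernel
# response of that class's training points at the query point.
import numpy as np

def pnn_classify(x, train_X, train_y, sigma=0.5):
    """Pick the class whose kernel density estimate at x is largest."""
    scores = {}
    for cls in set(train_y):
        pts = train_X[np.array(train_y) == cls]
        d2 = np.sum((pts - x) ** 2, axis=1)
        scores[cls] = np.mean(np.exp(-d2 / (2 * sigma ** 2)))
    return max(scores, key=scores.get), scores

# Toy 2-D features extracted from voice commands (placeholder values).
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y = ["reach_low", "reach_low", "reach_high", "reach_high"]
print(pnn_classify(np.array([0.85, 0.75]), X, y))
```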

9.
To address the difficulty of verifying control commands during the development of the display control unit in an ARINC661 cockpit display system, a solution is designed that requires no integrated-test environment: verification can be completed solely on the computer on which the control commands are developed. The design extends the ARINC661 common kernel; on top of its basic functions of DF loading, command parsing, human-machine interaction, and scene rendering, it adds management of display devices and virtual UAs, which both guarantees consistency between the auxiliary components and the actual displayed picture and makes it easy to simulate various airborne display devices. Because the design verifies the correctness of control commands through directly visible display output, designers can quickly and conveniently test the commands they have developed, shortening the test cycle; the design shows good applicability, extensibility, and reliability.

10.
In this work we propose a new approach based on fuzzy concepts and heuristic reasoning to deal with the visual data association problem in real time, considering the particular conditions of the visual data segmented from images, and the integration of higher-level information in the tracking process such as trajectory smoothness, consistency of information, and protection against predictable interactions such as overlap/occlusion, etc. The objects’ features are estimated from the segmented images using a Bayesian formulation, and the regions assigned to update the tracks are computed through a fuzzy system to integrate all the information. The algorithm is scalable, requiring linear computing resources with respect to the complexity of scenarios, and shows competitive performance with respect to other classical methods in which the number of evaluated alternatives grows exponentially with the number of objects.
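A minimal sketch of fuzzy scoring for track-detection association in the spirit of the approach above, with rules and constants entirely of our own choosing: each candidate region gets memberships for "close to the prediction" and "smooth trajectory", aggregated with the min t-norm, and the track takes the best-scoring region.

```python
# Toy fuzzy association: score each candidate region for one track by
# combining distance and smoothness memberships with fuzzy AND (min).
import math

def mu_close(dist, scale=20.0):
    return math.exp(-(dist / scale) ** 2)           # 1 when on the prediction

def mu_smooth(turn_deg, scale=45.0):
    return max(0.0, 1.0 - abs(turn_deg) / scale)    # penalize sharp turns

def score(dist, turn_deg):
    return min(mu_close(dist), mu_smooth(turn_deg)) # min t-norm aggregation

candidates = {"region_A": (5.0, 10.0), "region_B": (12.0, 60.0)}  # (dist, turn)
best = max(candidates, key=lambda r: score(*candidates[r]))
print({r: round(score(*v), 3) for r, v in candidates.items()}, "->", best)
```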

11.
Public self-service kiosks provide key services such as ticket sales, airport check-in and general information. Such kiosks must be universally designed to be used by society at large, irrespective of the individual users’ physical and cognitive abilities, level of education and familiarity with the system. The noble goal of universal accessibility is hard to achieve. This study reports experiences with a universally designed kiosk prototype based on a multimodal intelligent user interface that adapts to the user’s physical characteristics. The user interacts with the system via a tall rectangular touch-sensitive display where the interaction area is adjusted to fit the user’s height. A digital camera is used to measure the user’s approximate reading distance from the display such that the text size can be adjusted accordingly. The user’s touch target accuracy is measured, and the target sizes are increased for users with motor difficulties. A Byzantine visualization technique is employed to exploit unused and unreachable screen real estate to provide the user with additional visual cues. The techniques explored in this study have potential for most public self-service kiosks.
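The adaptation rules described above can be captured in a few lines. The sketch below, with made-up constants, scales text with the camera-estimated reading distance (keeping the visual angle roughly constant) and enlarges touch targets as measured tap error grows; it illustrates the idea and is not the prototype's code.

```python
# Toy versions of the two adaptation rules: text size from reading
# distance, touch-target size from measured tap accuracy.
def text_size_pt(distance_cm, base_pt=16, base_cm=50):
    """Keep visual angle roughly constant: size grows with distance."""
    return round(base_pt * distance_cm / base_cm)

def target_size_px(mean_tap_error_px, base_px=48):
    """Enlarge targets for users whose taps scatter widely."""
    return base_px + 2 * max(0, mean_tap_error_px - 10)

print(text_size_pt(distance_cm=90))          # user standing farther away
print(target_size_px(mean_tap_error_px=25))  # user with motor difficulties
```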

12.
In the past few decades, researchers around the world have worked to develop intelligent wheelchairs for people with reduced mobility. In many of these projects, the structured set of commands is sensor-based. Many types of commands are available, but the final decision is made by the user. Earlier work established a behaviour-based multi-agent form of control ensuring that the user selects the best option in relation to his or her preferences or requirements. This type of command aims at "merging" the user and the machine: a kind of symbiotic relationship making the machine more amenable and the command more effective. In this contribution, the approach is based on a curve-matching procedure to provide comprehensive assistance to the user. This new agent, using a model of the most frequently used paths, assists the user during navigation by proposing the direction to be taken once the path has been recognized. This spares the user the effort of determining a new direction, which can be a major benefit in the case of severe disabilities. The approach uses particle filtering to recognize the most frequent paths according to a topological map of the environment.
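As a toy version of the path-recognition step, the sketch below runs a particle filter over path hypotheses on a topological map: each particle carries a path identifier and a position along it, is reweighted by how well the observed node matches, and is resampled. The map, motion model, and weight values are our own simplifications, not the paper's model.

```python
# Toy particle filter over path hypotheses on a topological map.
import random

paths = {"to_kitchen": ["hall", "door", "kitchen"],
         "to_desk": ["hall", "corridor", "office", "desk"]}

# Each particle is a hypothesis: [path id, current position on that path].
particles = [[p, 0] for p in paths for _ in range(50)]

def step(particles, observed_node):
    weights = []
    for part in particles:
        part[1] = min(part[1] + 1, len(paths[part[0]]) - 1)   # advance
        match = paths[part[0]][part[1]] == observed_node
        weights.append(1.0 if match else 0.05)
    total = sum(weights)
    chosen = random.choices(particles, [w / total for w in weights],
                            k=len(particles))
    return [part.copy() for part in chosen]   # copy so particles don't alias

for node in ["door", "kitchen"]:              # nodes the wheelchair passes
    particles = step(particles, node)

votes = {p: sum(1 for q in particles if q[0] == p) for p in paths}
print(max(votes, key=votes.get), votes)
```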

13.
A method for recognizing voice commands (VCs) in noise, with a probability of correct recognition above 92% at a signal-to-noise ratio of 1–6 dB provided the library of pattern voice commands is generated immediately before recognition, was presented in [1]. That method is based on transforming voice signals into a 2D image: the autocorrelation portrait (ACP). Its results become significantly worse if the library is prepared long before recognition, which is a disadvantage of the method. In this paper we describe a procedure for generating another type of voice command image, which eliminates this disadvantage to a considerable degree.
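A rough sketch of building a 2D autocorrelation image from a voice signal, in the spirit of the ACP described above: each row is the normalized short-time autocorrelation of one frame. The framing, lag count, and normalization are our assumptions, not the cited method's parameters.

```python
# Toy autocorrelation portrait: rows = frames, columns = normalized
# autocorrelation at lags 0..n_lags-1, forming a 2-D "image".
import numpy as np

def autocorr_portrait(signal, frame_len=256, hop=128, n_lags=64):
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        rows.append(ac[:n_lags] / (ac[0] + 1e-12))   # normalize by energy
    return np.array(rows)

t = np.linspace(0, 1, 8000)
toy_command = np.sin(2 * np.pi * 180 * t) * np.exp(-3 * t)  # decaying tone
print(autocorr_portrait(toy_command).shape)
```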

14.
In this paper, a new framework called fuzzy relevance feedback in interactive content-based image retrieval (CBIR) systems is introduced. The conventional binary labeling scheme in relevance feedback requires a crisp decision on the relevance of the retrieved images. However, it is inflexible, as user interpretation of visual content varies with different information needs and perceptual subjectivity. In addition, users tend to learn from the retrieval results to further refine their information requests. It is therefore inadequate to describe the user's fuzzy perception of image similarity with crisp logic. In view of this, we propose a fuzzy relevance feedback approach that enables the user to make a fuzzy judgement, integrating the user's fuzzy interpretation of visual content into the notion of relevance feedback. An efficient learning approach is proposed using a fuzzy radial basis function (FRBF) network, which is constructed from the user's feedback. The underlying network parameters are optimized with a gradient-descent training strategy on account of its computational efficiency. Experimental results on a database of 10,000 images demonstrate the effectiveness of the proposed method.
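A minimal sketch of a gradient-descent-trained radial basis function network of the general kind named above: the output weights are updated by the delta rule against fuzzy relevance targets in [0, 1]. The data, center selection, and hyperparameters are invented for illustration and omit the paper's construction of the FRBF network from user feedback.

```python
# Toy RBF network trained by gradient descent on fuzzy relevance targets.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((40, 8))                 # image feature vectors (toy)
y = rng.random(40)                      # fuzzy relevance feedback in [0, 1]

centers = X[rng.choice(len(X), 5, replace=False)]
w = np.zeros(5)
sigma, lr = 0.8, 0.1

def phi(x):
    """Gaussian RBF activations for one input vector."""
    return np.exp(-np.sum((centers - x) ** 2, axis=1) / (2 * sigma ** 2))

for _ in range(200):                    # gradient descent on squared error
    for xi, yi in zip(X, y):
        h = phi(xi)
        w += lr * (yi - w @ h) * h      # delta rule for the output weights

print("predicted relevance:", float(w @ phi(X[0])))
```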

15.
Computer games are now a part of our modern culture. However, certain categories of people are excluded from this form of entertainment and social interaction because they are unable to use the interface of the games; the reason can be deficits in motor control, vision, or hearing. By using automatic speech recognition (ASR) systems, voice-driven commands can be used to control a game, opening up the possibility for people with motor-system difficulties to be included in game communities. This paper aims to find a standard way of using voice commands in games, backed by a speech recognition system, that can be universally applied when designing inclusive games. Present speech recognition systems, however, do not support emotions, attitudes, tones, etc., which is a drawback because such expressions can be vital for gaming. Taking multiple existing genres of games into account and analyzing their voice-command requirements, a general ASR module is proposed that can serve as a common platform for designing inclusive games; a fuzzy logic controller is then proposed to enhance the system. The standard voice-driven module, based on the algorithm or the fuzzy controller, can be used to design software plug-ins or be embedded in a microchip, and can then be integrated with game engines, creating the possibility of voice-driven universal access for controlling games.

16.
Mining weighted preference browsing patterns based on fuzzy simulation
Each web page is given a semantic importance assessment by different experts; these assessments are characterized as fuzzy linguistic variables, which are then converted, through fuzzy simulation, into weights representing the importance of the pages. Moreover, simply assuming that a user's access frequency reflects the user's interest is inaccurate. Therefore, based on the proposed concepts of weighted support and preference, weighted browsing patterns preferred by users are mined from an FLaAT (Frequent Link and Access Tree) that contains the browsing information of all users. Experiments show that the algorithm is effective.
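A minimal sketch of a weighted-support computation in the spirit of the concepts above, under our own simplified definition: a pattern's support over sessions is scaled by the mean fuzzy-simulation weight of its pages. The actual FLaAT mining algorithm is not reproduced.

```python
# Toy weighted support: plain support scaled by the pattern's mean page
# weight (weights stand in for the fuzzy-simulation output).
page_weight = {"home": 0.3, "products": 0.9, "checkout": 1.0}

sessions = [
    ["home", "products", "checkout"],
    ["home", "products"],
    ["home"],
]

def weighted_support(pattern, sessions):
    """Fraction of sessions containing the pattern, scaled by page weights."""
    contains = sum(1 for s in sessions if all(p in s for p in pattern))
    w = sum(page_weight[p] for p in pattern) / len(pattern)
    return w * contains / len(sessions)

print(weighted_support(["home", "products"], sessions))
```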

17.
This paper provides an overview of a multimodal wearable computer system, SNAP&TELL. The system performs real-time gesture tracking, combined with audio-based control commands, in order to recognize objects in an environment, including outdoor landmarks. The system uses a single camera to capture images, which are then processed to perform color segmentation, fingertip shape analysis, robust tracking, and invariant object recognition, in order to quickly identify the objects encircled and SNAPped by the user’s pointing gesture. In addition, the system returns an audio narration, TELLing the user information concerning the object’s classification, historical facts, usage, etc. This system provides enabling technology for the design of intelligent assistants to support “Web-On-The-World” applications, with potential uses such as travel assistance, business advertisement, the design of smart living and working spaces, and pervasive wireless services and internet vehicles. An erratum to this article is available.

18.
When a database increases in size, retrieving the data becomes a major problem. An approach based on data visualization and visual reasoning is described. The main idea is to transform the data objects and present sample data objects in a visual space. The user can use a visual language to incrementally formulate the information retrieval request in the visual space. A prototype system is described with the following features: (1) it is built on top of the SIL-ICON visual language compiler and therefore can be customized for different application domains; (2) it supports a fuzzy icon grammar to define reasonable visual sentences; (3) it incorporates a semantic model of the database for fuzzy visual query translation; and (4) it incorporates a VisualNet, which stores the knowledge learned by the system in its interaction with the user so that the VisualReasoner can adapt its behavior.

19.
A mobile-phone voice control system based on the J2ME platform is proposed. Combining speech recognition and natural language processing, the system processes the phone user's voice input, extracts semantic information, and displays it on the handset. The system adopts a C/S architecture, with the phone as the client and a PC as the server. The client collects the voice input stream, sends it to the server, and receives and displays the semantic information sent back; the server receives the voice stream from the client, performs speech recognition and natural language processing, and returns the resulting semantic information to the client. The system can handle multiple natural-language phrasings of the same phone control command, which greatly improves usability for phone users.

20.
This study presents a user interface intentionally designed to support multimodal interaction by compensating for the weaknesses of speech relative to pen input and vice versa. The test application was email on a web pad with pen and speech input. For pen input, information was represented as easily accessible visual objects, and graphical metaphors enabled faster and easier manipulation of data. Speech input was facilitated by displaying the system speech vocabulary to the user: all commands and accessible fields with text labels could be spoken by name, and the commands and objects the user could access via speech were shown on a dynamic basis in a window. Multimodal interaction was further enhanced by a flexible object-action order, such that the user could utter or select a command with a pen followed by the object to be acted upon, or the other way round (e.g., New Message or Message New). The flexible action-object interaction design combined with voice and pen input led to eight possible action-object-modality combinations (two orders, with each of the two elements given by either speech or pen). The complexity of the multimodal interface was further reduced by making generic commands such as New applicable across corresponding objects; this simplified the menu structures by reducing the number of instances in which actions appeared, so that more content information could be made visible and consistently accessible via pen and speech input. Results of a controlled experiment indicated that the shortest task completion times among the eight input conditions occurred when speech alone was used to refer to an object followed by the action to be performed. Speech-only input with action-object order was also relatively fast. For pen-only input, the shortest task completion times occurred when an object was selected first, followed by the action to be performed. In multimodal trials in which both pen and speech were used, no significant effect was found for object-action order, suggesting the benefits of providing users with a flexible action-object interaction style in multimodal or speech-only systems.
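The flexible action-object order reduces to accepting a two-token command in either order; with each of the two tokens deliverable by speech or pen, this yields the eight combinations mentioned above. The sketch below, with a toy vocabulary of our own, shows such an order-agnostic parser.

```python
# Toy order-agnostic parser: a two-token command is accepted whichever
# order it arrives in, regardless of input modality.
ACTIONS = {"new", "delete", "open"}
OBJECTS = {"message", "folder", "contact"}

def parse(tokens):
    """Accept 'New Message' or 'Message New' alike; modality-agnostic."""
    pair = {t.lower() for t in tokens}
    action = pair & ACTIONS
    obj = pair & OBJECTS
    if len(action) == 1 and len(obj) == 1:
        return action.pop(), obj.pop()
    raise ValueError(f"unrecognized command: {tokens}")

print(parse(["New", "Message"]))   # action-object (speech or pen)
print(parse(["Message", "New"]))   # object-action
```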
