Mímir: An open-source semantic search framework for interactive information seeking and discovery
Affiliation: 1. Department of Haematology and Wellcome and MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom; 2. Haematopathology & Oncology Diagnostics Service, Department of Haematology Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom; 3. Cambridge Blood and Stem Cell Biobank, University of Cambridge, United Kingdom; 1. Ghent University–iMinds, Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium; 2. Hasso Plattner Institute, University of Potsdam, Prof.-Dr.-Helmert-Straße 2-3, 14482 Potsdam, Germany; 1. Centre d’Estudis Epidemiològics sobre les Infeccions de Transmissió Sexual i Sida de Catalunya (CEEISCAT), Departament de Salut, Generalitat de Catalunya, Badalona, Barcelona, Spain; 2. CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain; 3. Stop Sida, Barcelona, Spain; 4. Associació Antisida Lleida, Lleida, Spain; 5. Àmbit Prevenció, Barcelona, Spain; 6. Actuavallès, Sabadell, Barcelona, Spain; 7. Institute of Tropical Medicine, Department of Clinical Sciences, Antwerp, Belgium; 8. Departament de Pediatria, Obstetricia i Ginecologia i de Medicina Preventiva, Universitat Autònoma de Barcelona, Badalona, Barcelona, Spain
Abstract: Semantic search is gradually establishing itself as the next-generation search paradigm, one that better meets a wider range of information needs than traditional full-text search. At the same time, however, expanding search towards document structure and external, formal knowledge sources (e.g. Linked Open Data (LOD) resources) remains challenging, especially with respect to efficiency, usability, and scalability. This paper introduces Mímir, an open-source framework for integrated semantic search over text, document structure, linguistic annotations, and formal semantic knowledge. Mímir supports complex structural queries, as well as basic keyword search. Exploratory search and sense-making are supported through information visualisation interfaces, such as co-occurrence matrices and term clouds. There is also an interactive retrieval interface, where users can save, refine, and analyse the results of a semantic search over time. The better-studied, precision-oriented information-seeking searches are also well supported. The generic and extensible nature of the Mímir platform is demonstrated through three different, real-world applications, one of which required indexing and search over tens of millions of documents and fifty to a hundred times as many semantic annotations. Scaling up to over 150 million documents was also accomplished, via index federation and cloud-based deployment.
Keywords: Natural language processing; Semantic search; Scalable semantic search framework; Expressive semantic queries; Integrated semantic search
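
The abstract names three ideas that a small example may make more concrete: queries that combine keywords with semantic annotations, index federation, and co-occurrence matrices. The following is a minimal, illustrative sketch only and does not reflect Mímir's actual API or index format. Every name in it (`Document`, `ToyIndex`, `FederatedIndex`, `cooccurrence`) and all of the sample data are hypothetical, and a linear in-memory scan stands in for a real inverted index.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Document:
    doc_id: str
    text: str
    # Hypothetical annotation layer: (annotation type, value) pairs.
    annotations: list = field(default_factory=list)


class ToyIndex:
    """In-memory stand-in for a single index (an assumption, not Mímir's index)."""

    def __init__(self, docs):
        self.docs = list(docs)

    def search(self, keyword=None, annotation_type=None):
        """Return documents matching every constraint that was supplied."""
        hits = []
        for doc in self.docs:
            if keyword is not None and keyword.lower() not in doc.text.lower():
                continue
            if annotation_type is not None and not any(
                a_type == annotation_type for a_type, _ in doc.annotations
            ):
                continue
            hits.append(doc)
        return hits


class FederatedIndex:
    """Sends the same query to every sub-index and merges the hits."""

    def __init__(self, shards):
        self.shards = list(shards)

    def search(self, **query):
        hits = []
        for shard in self.shards:
            hits.extend(shard.search(**query))
        return sorted(hits, key=lambda d: d.doc_id)


def cooccurrence(docs, annotation_type):
    """Count how often two annotation values appear in the same document."""
    counts = Counter()
    for doc in docs:
        values = sorted({v for t, v in doc.annotations if t == annotation_type})
        counts.update(combinations(values, 2))
    return counts


# Made-up sample data spread over two sub-indexes.
shard_a = ToyIndex([
    Document("d1", "London and Paris sign a research deal",
             [("Location", "London"), ("Location", "Paris")]),
    Document("d2", "A report on research funding", []),
])
shard_b = ToyIndex([
    Document("d3", "Research links Paris, London and Berlin",
             [("Location", "Paris"), ("Location", "London"),
              ("Location", "Berlin")]),
])
index = FederatedIndex([shard_a, shard_b])

keyword_hits = index.search(keyword="research")
semantic_hits = index.search(keyword="research", annotation_type="Location")
matrix = cooccurrence(semantic_hits, "Location")
```

In this toy setup the keyword-only query matches all three documents, while the combined query drops `d2` because it carries no `Location` annotation. The `matrix` counter holds the pairwise counts that a co-occurrence view could be drawn from.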
This article is indexed by ScienceDirect and other databases.