Dissemination: Acknowledgement Search Engine and Next Generation Search Engines

1)  AckSeer is a beta search engine that automatically identifies, extracts, and indexes acknowledgments from papers, along with the entities acknowledged in those passages. Currently, AckSeer indexes acknowledgments from more than 500,000 papers in CiteSeerX. These acknowledgments contain more than 4 million acknowledged entities, approximately 2 million of them unique. Entity extraction is based on AlchemyAPI and OpenCalais. Acknowledged entities are ranked by citation. Feedback is most welcome.

2) Next Generation Search Engines: Advanced Models for Information Retrieval

© 2012; Publication Date: March 2012; 560 pages

ISBN: 978-1-4666-0330-1; EISBN: 978-1-4666-0331-8

Published by IGI Publishing, Hershey-New York, USA


Editors: Christophe Jouis, Université Paris III, France and LIP6-Université Pierre et Marie Curie, France; Ismail Biskri, Université du Québec à Trois-Rivières, Canada; Jean-Gabriel Ganascia, LIP6 and CNRS-Université Pierre et Marie Curie, France; and Magali Roux, LIP6 and CNRS-Université Pierre et Marie Curie, France



Indexing the World Wide Web: The Journey So Far

Abhishek Das, Google Inc., USA

Ankit Jain, Google Inc., USA

As the World Wide Web has grown, indexing technologies have changed and improved significantly. In this chapter, the authors describe in detail the key indexing technologies behind today’s web-scale search engines, with the aim of providing a better understanding of how web indexes are used. An overview of the infrastructure needed to support the growth of web search engines to modern scales is also given. Finally, the authors outline potential future directions for search engines, particularly in real-time and social contexts.

To obtain a copy of the entire chapter, click on the link below.



Decentralized Search and the Clustering Paradox in Large Scale Information Networks

Weimao Ke, College of Information Science and Technology, Drexel University, USA

The Web poses great challenges for information retrieval because of its size, dynamics, and heterogeneity. Centralized IR systems are becoming inefficient in the face of continued Web growth and a fully distributed architecture seems to be desirable. Without a centralized information repository and global control, a new distributed architecture can take advantage of distributed computing power and can allow a large number of systems to participate in the decision making for finding relevant information. In this chapter, the author presents a decentralized, organic view of information systems pertaining to searching in large-scale networks. The Clustering Paradox phenomenon is discussed.



Metadata for Search Engines: What can be Learned from e-Sciences?

Magali Roux, Laboratoire d’Informatique de Paris VI, France

Petabytes of data are generated by data-intensive sciences, also known as e-sciences. These data have to be searched before further analyses, including the aggregation of disparate data, can be performed to produce new knowledge. To achieve this, e-sciences have developed various strategies, mostly based on metadata, to deal with data complexity and specificities. In this chapter, Nuclear Physics, Geosciences and Biology, three seminal domains of e-sciences, are considered with regard to the strategies they have developed to search complex data. Metadata, which are data about data, play a pivotal role in most of these approaches. The structure and the organization of metadata-based retrieval approaches are discussed.



Crosslingual Access to Photo Databases

Christian Fluhr, GEOL Semantics, France

For several years, normalized vocabulary has provided an unambiguous description of photos for users’ queries. One could imagine that indexes are made by professionals who have mastered the normalized vocabulary. However, according to the author, this is an ideal view that is far from the reality of the actual indexing process. Photos are described by photographers who have no knowledge of information retrieval or of normalized vocabulary. Moreover, the descriptions do not take into account aspects such as semantic ambiguities or cross-lingual querying. In this chapter, the author presents an experiment in which all these limitations are avoided.



Fuzzy Ontologies Building Platform for Semantic Web: FOB Platform

Hanêne Ghorbel, University of Sfax, Tunisia

Afef Bahri, University of Sfax, Tunisia

Rafik Bouaziz, University of Sfax, Tunisia

To improve the quality of information retrieval systems, a great deal of research has been conducted over the last decade, resulting in the development of Semantic Web techniques. These techniques include models and languages for the description of Web resources on the one hand and ontologies for describing resources on the other hand. Although ontologies mainly consist of hierarchical descriptions of domain concepts, some domains cannot be precisely and adequately formalized in classic ontology description languages. To overcome those limitations, promising research is being conducted on fuzzy ontologies. In this chapter, the authors propose a definition for a fuzzy ontological model based on fuzzy description logic, along with a methodology for building fuzzy ontologies and platforms.




Searching and Mining with Semantic Categories

Brahim Djioua, University of Paris-Sorbonne, France

Jean-Pierre Desclés, University of Paris-Sorbonne, France

Motasem Alrahabi, University of Paris-Sorbonne, France

In this chapter, the authors present a new approach to the design of web search engines that uses semantic and discourse annotations according to certain points of view, which has the advantage of focusing on the user’s interests. The semantic and discourse annotations are provided by means of the contextual exploration method. This method describes the discursive organization of texts by using linguistic knowledge present in the textual context. This knowledge takes the form of lists of linguistic markers and contextual exploration rules for each linguistic marker. The linguistic markers and the contextual exploration rules can help to retrieve relevant information, such as causality relations, definitions of concepts, or quotations, which is difficult to capture with classical keyword-based methods.



Semantic Models in Information Retrieval

Edmond Lassalle, Orange Labs, France

Emmanuel Lassalle, Université Paris 7, France

In this chapter, the authors propose a new descriptive model for semantics dedicated to Information Retrieval. Every object is considered as a concept, and the model associates concepts with words. It analyzes every word of a document within its context and translates it into a concept, which will be the meaning of the word. The model is evaluated by classifying documents into categories using their conceptual representations.



The Use of Text Mining Techniques in Electronic Discovery for Legal Matters

Michael W. Berry, University of Tennessee, USA

Reed Esau, Catalyst Repository Systems, USA

Bruce Kiefer, Catalyst Repository Systems, USA

In this chapter, the authors discuss electronic discovery (eDiscovery), the process of collecting and analyzing electronic documents to determine their relevance to a legal matter. At first glance, the large volumes of data that need to be reviewed seem to lend themselves very well to traditional information retrieval and text mining techniques. However, the noisy and ever-changing nature of the document collections and the particularities of the domain cause existing tools to produce inconsistent results. Therefore, new tools that take these specific elements into consideration need to be developed. Starting with the history of the collection process for legal documents, the authors examine how text mining and information retrieval tools are used to deal with the collection process and propose some research directions to improve it, such as collaborative filtering and cloud computing.



Intelligent Semantic Search Engines for Opinion and Sentiment Mining

Mona Sleem-Amer, Pertimm, France

Ivan Bigorgne, Lutin, France

Stéphanie Brizard, Arisem, France

Leeley Daio Pires Dos Santos, EDF, France

Yacine El Bouhairi, Thales, France

Bénédicte Goujon, Thales, France

Stéphane Lorin, Thales, France

Claude Martineau, LIGM, France

Loïs Rigouste, Pertimm, France

Lidia Varga, LIGM, France

With the tremendous rise in popularity of social media over the last few years, enterprises are showing more and more interest in exploiting the opinions and sentiments that users express about their products and services in social media content. Indeed, such content contains precious and strategic data for product marketing and business intelligence. However, conventional search engines are inadequate for this task, as they are not designed to retrieve these particular kinds of data. Consequently, the field of opinion mining and retrieval is getting an increasing amount of attention. In this chapter, the authors present the Doxa project, a work in progress that aims to build a semantic enterprise search engine with integrated business intelligence technology and state-of-the-art opinion and sentiment extraction, analysis and querying of electronic text in French.




Human-Centred Web Search

Orland Hoeber, Memorial University of Newfoundland, Canada

In the Internet era, searching for information on the Web has become an essential part of many people’s lives. Research on information retrieval in recent years has mainly focused on addressing issues such as document indexing and document ranking, and on providing simple and quick means to search the Web, in an attempt to provide fast and high-quality results to user queries. Despite the great progress made in those areas and the success of many search engines, people still commonly have difficulty retrieving the information they are seeking, especially when they are unable to formulate an appropriate query or are overwhelmed by results. More needs to be done to include users in the search process and to assist them in crafting and refining their queries and exploring the results. This chapter discusses the state-of-the-art research in the field of human-centered Web search.



Extensions of Web Browsers Useful to Knowledge Workers

Sarah Vert, Centre Virtuel de la Connaissance sur l’Europe (CVCE), Luxembourg

In this chapter, the author illustrates the customization of the web browser from the perspective of users who work at any of the tasks of using, planning, acquiring, searching, analyzing, organizing, storing, programming, distributing, marketing, or otherwise contributing to the transformation and commerce of information. The browser and its various possible parameterizations seem to be an important factor in helping users to better accomplish their tasks. An analysis of the customization of web browsers for knowledge workers is proposed. It demonstrates that a browser offering the possibility of add-ons is an application that is highly adaptable in meeting the specific requirements of its users.



Next Generation Search Engine for the Result Clustering Technology

Lin-Chih Chen, National Dong Hwa University, Taiwan

When using search engines, users tend to input very short and thus often ambiguous queries. Therefore, correctly identifying the user’s search needs is not always an easy task. To address this issue, the next generation of search engines will assist users in dealing with large sets of results by offering various post-search tools such as result clustering, which has received a lot of attention recently. Result clustering groups search results into a hierarchical labeled tree so that users can customize their view of the search results by navigating through it. In this chapter, the author presents WSC, a high-performance result clustering system, based on a mixed clustering method and a genuine divisive hierarchical clustering algorithm to organize the labels into a hierarchical tree. The author also shows that WSC achieves better performance than current commercial and academic systems.



Using Association Rules for Query Reformulation

Ismaïl Biskri, University of Quebec at Trois-Rivieres, Canada

Louis Rompré, University of Quebec at Montreal, Canada

To express their needs, users formulate queries that often take the form of keywords submitted to an information retrieval system based on a Boolean model, a vector model, or a probabilistic model. It is often difficult for users to find keywords that express their exact needs. In many cases, users are confronted on the one hand with a lack of knowledge of the subject of their information search and on the other hand with biases that may affect the results. Thus, retrieving relevant documents in just one pass is almost impossible. The query needs to be reformulated, either by using completely different keywords or by expanding the initial query with new keywords. In this chapter, the authors present a semi-automatic method for reformulating queries based on the combination of two data mining methods: text classification and maximal association rules.
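
As a rough illustration of query expansion driven by association rules, the sketch below mines plain single-term rules (term A implies term B) from a toy document set and uses them to add keywords to a query. It is not the chapter's method: it uses ordinary support and confidence rather than maximal association rules, it omits the text classification step, and the function names, thresholds, and sample documents are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_rules(documents, min_support=2, min_confidence=0.6):
    """Mine single-term rules a -> b from documents given as lists of terms.

    Support is the number of documents containing both terms; confidence is
    that count divided by the number of documents containing a.
    """
    term_count = Counter()
    pair_count = Counter()
    for doc in documents:
        terms = set(doc)
        term_count.update(terms)
        # Ordered pairs, so both a -> b and b -> a are counted.
        pair_count.update(permutations(sorted(terms), 2))

    rules = {}
    for (a, b), n_ab in pair_count.items():
        confidence = n_ab / term_count[a]
        if n_ab >= min_support and confidence >= min_confidence:
            rules.setdefault(a, []).append((b, confidence))
    for a in rules:
        # Highest confidence first; ties broken alphabetically.
        rules[a].sort(key=lambda item: (-item[1], item[0]))
    return rules


def expand_query(query_terms, rules, max_new_terms=2):
    """Append up to max_new_terms rule consequents to the query."""
    expanded = list(query_terms)
    for term in query_terms:
        for candidate, _confidence in rules.get(term, []):
            if len(expanded) - len(query_terms) >= max_new_terms:
                return expanded
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Hypothetical toy corpus: each document is a list of keywords.
docs = [
    ["jaguar", "car", "engine"],
    ["jaguar", "car", "dealer"],
    ["jaguar", "cat", "jungle"],
    ["car", "engine", "repair"],
]
rules = mine_rules(docs)
print(expand_query(["jaguar"], rules))
```

In a semi-automatic setting, the suggested terms would be shown to the user for confirmation rather than added to the query directly.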



Question Answering

Ivan Habernal, University of West Bohemia, Czech Republic

Miloslav Konopík, University of West Bohemia, Czech Republic

Ondřej Rohlík, University of West Bohemia, Czech Republic

In order to provide a more sophisticated and satisfactory response to information needs, question answering systems aim to give one or more answers, in the form of precise and concise sentences, to a question asked by a user in natural language, instead of returning only a set of documents as a traditional information retrieval system does. Question answering systems therefore rely heavily on natural language processing techniques for syntactic and semantic analysis and for the construction of appropriate answers. This chapter presents the state of the art in the field of question answering, within which the authors cover all types of promising QA systems, techniques and approaches for the next generation of search engines, focusing mainly on systems aimed at the (semantic) web.



Finding Answers to Questions, in Text Collections or Web, in Open Domain or Specialty Domains

Brigitte Grau, LIMSI-CNRS and ENSIIE, France

This chapter is dedicated to factual question-answering in open domains and in specialty domains. When querying a database, factual questions are expected to yield short answers that give precise information. In a web environment, however, topics are not limited and knowledge is not structured, so finding answers requires analyzing texts. In this context, finding an answer to a question amounts to extracting a piece of information from a text. In this chapter, the author presents question-answering systems that extract answers from web documents in a fixed multilingual collection.



Context-Aware Mobile Search Engine

Jawad Berri, College of Computing and Information Sciences, King Saud University, Saudi Arabia

Rachid Benlamri, Lakehead University, Canada

The recent emergence of mobile handsets as a new means of information exchange has led to the need for information retrieval systems specialized for mobile users. Lately, a lot of effort has been put into the development of robust mobile search engines capable of providing attractive and practical services to mobile users, such as tools that provide directions to business locations according to the user’s location, or voice search that uses speech recognition technologies. However, the capabilities of current mobile search engines are still limited. In particular, they could be enhanced by exploiting information about the users’ current context and providing it to search engines to improve the relevance of the results. In this chapter, a context model and an architecture that promote the integration of contextual information are presented through a case study.



Spatio-Temporal Based Personalization for Mobile Search

Ourdia Bouidghaghen, IRIT-CNRS-University Paul Sabatier of Toulouse, France

Lynda Tamine, IRIT-CNRS-University Paul Sabatier of Toulouse, France

The explosion of information available on the Internet and its heterogeneity have considerably reduced the effectiveness of traditional information retrieval systems. In recent years, much research has been devoted to developing contextual information retrieval technologies. Moreover, the proliferation of new means of communication and information access, such as mobile devices, has given rise to new needs in IR. In this chapter, the authors discuss this specific issue with respect to mobile information retrieval, and then present a model of spatio-temporal personalization for mobile search that uses contextual data such as location and time to dynamically select the most appropriate profile for a given situation. Each profile contains user interests learnt from the user’s past searches. The authors also propose a novel evaluation scenario for mobile search based on diary study entries.




Studying Web Search Engines from a User Perspective: Key Concepts and Main Approaches

Stéphane Chaudiron, University of Lille 3, France

Madjid Ihadjadene, University of Paris 8, France

In this chapter, the user perspective is highlighted. Some recent challenges in search engine evolution change users’ information behavior. The authors identify four major trends in the “user-oriented approach” that focus respectively on strategies and tactics, cognitive and psychological approaches, management, and consumer and marketing approaches. However, the authors note that there is a need to better understand the dynamics and the nature of the interaction between Web searching and users. Also, other aspects such as ethics, cultural issues, growing social networks, etc. need to be considered.



Artificial Intelligence Enabled Search Engines (AIESE) and the Implications

Faruk Karaman, Gedik University, Turkey

Nowadays, search engines constitute the main means of classifying, sorting, and delivering information to users over the Internet. As time progresses, advances in Artificial Intelligence will be made and thus new artificial intelligence technologies will be developed to enhance the sophistication of the search engines. This future generation of search engines, called artificial intelligence enabled search engines, will be compelled to play an even more crucial role for information retrieval, but this will not be without any consequences. Through this chapter, the author analyzes the concept of technological singularity, discusses the direct and indirect impacts of the development of new technologies and artificial intelligence, notably regarding search engines, and proposes a four-stage evolution model of search engines.



A Framework for Evaluating the Retrieval Effectiveness of Search Engines

Dirk Lewandowski, Hamburg University of Applied Sciences, Germany

The evaluation of information retrieval systems and search engines in development or already on the market is a crucial process for the improvement of the quality of the search results. Quality measures for most evaluations consist of calculating precision and recall using a set of ad-hoc queries and assume that common users examine every result returned by a search engine in the same order they are presented. While this may be true in some contexts, it has been shown that it is not necessarily the case in Web searches, where modern Web search engines present results in various and enriched forms and where the users are typically interested only in a few highly relevant results and examine them as they see fit. Therefore, there is a need for new extended evaluation models for Web search engines. To this end, the author proposes a framework for evaluating the retrieval effectiveness of next-generation search engines.
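
For readers unfamiliar with the two measures named above, the following minimal sketch computes precision and recall over the top k results of a ranked list. It illustrates only the classical measures, not the extended framework the chapter proposes; the function name, document identifiers, and relevance judgments are hypothetical.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Return (precision, recall) over the top-k results of a ranked list.

    precision = relevant results in the top k / results in the top k
    recall    = relevant results in the top k / all relevant documents
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Hypothetical ranking returned for one query, and the judged-relevant set.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d5"}

for k in (1, 3, 5):
    p, r = precision_recall_at_k(ranked, relevant, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

Cutting the list at a small k reflects the observation that Web users typically look at only a few results, which is one reason full-list precision and recall can be misleading for Web search.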



Hardcover Price: $195.00

Online Perpetual Access Price: $295

Print + Online Perpetual Access Price: $390

Available for purchase on IGI Global’s Web site at: http://www.igi-global.com/book/next-generation-search-engines/59723

Also available through major online book retailers such as Amazon and Barnes & Noble

This book is also included in the IGI Global aggregated “InfoSci-Books” database: http://www.igi-global.com/isb.
