Classic Information Retrieval | searchivarius.org

Information retrieval (IR) is the science of searching for information in documents, searching for documents themselves, searching for metadata that describe documents, or searching within databases, whether relational stand-alone databases or hypertext networked databases such as the Internet or intranets, for text, sound, images, or data. Data retrieval, document retrieval, information retrieval, and text retrieval are commonly confused, however, and each has its own body of literature, theory, praxis, and technologies.

Approximate Fulltext String Searching

Automatic thesaurus construction


Cross Language Information Retrieval (CLIR)

Duplicate Detection

Dynamic & static pruning


Hypertext systems
Link Popularity and Citation Index, Entity linking

Indexing Techniques
Efficient Intersection of Inverted Lists, Efficiency, Inverted vs Signature Files...

IR (language) models
Temporal Models

Learning to rank
Direct Optimization of IR Metrics

Meta and Federated Search

Modelling Score Distribution

Morphology and Stemming

Proximity scoring and Phrase Indexing

Query Reformulation, Expansion, and Autocompletion


Retrieval-based Question Answering

Search Engines
Crawlers, Forward & Graph Indices, Indri & Lemur...

Shallow Semantics


User Behavior Analysis


Apache Lucene 4 (white paper)  Andrzej Białecki, Robert Muir, Grant Ingersoll
Managing Gigabytes: Compressing and Indexing Documents and Images (second edition)  Ian H. Witten, Alistair Moffat, Timothy C. Bell
Relevance: The Whole History  Stefano Mizzaro
The Fundamentals of Enterprise Search  Avi Rappoport