What is BM25?
Last updated: April 1, 2026
Key Facts
- BM25 stands for 'Best Match 25' and is based on the Probabilistic Relevance Framework
- Developed in the 1990s and became standard in many search engines and information retrieval systems
- Considers document length, term frequency, and term rarity when calculating relevance scores
- Widely used by search engines, Elasticsearch, Lucene, and other information retrieval technologies
- More sophisticated than simple term frequency matching, providing better relevance rankings for complex queries
What is BM25?
BM25 is a ranking function used in information retrieval and search engines to score how relevant each document is to a user's search query. The name stands for 'Best Match 25'; the number marks its place in a series of Best Match weighting functions developed within the Probabilistic Relevance Framework. BM25 has become one of the most widely adopted ranking algorithms in the search industry because it produces relevant rankings while remaining cheap to compute.
How BM25 Works
BM25 calculates relevance scores by analyzing multiple factors:
- Term Frequency: How often a search term appears in a document
- Inverse Document Frequency: How rare or common the search term is across all documents
- Document Length Normalization: Adjusting scores based on document length to prevent bias toward longer documents
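- Field-Specific Scoring: Weighing terms differently based on where they appear (title vs. body text). This is not part of the core BM25 formula; it comes from extensions such as BM25F or from per-field boosts applied by the search platform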
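The first three factors above can be combined in a few lines of code. The following is a minimal sketch, not a reference implementation: it assumes documents are already tokenized into lists of words, uses the IDF variant found in Lucene (with a +1 inside the logarithm), and the function name `bm25_scores` and the sample documents are made up for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: b=0 disables it, b=1 applies it fully.
        norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Inverse document frequency: rarer terms get a larger weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency with saturation controlled by k1.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores

# Hypothetical three-document corpus.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat around the yard all day".split(),
    "stock markets fell sharply today".split(),
]
scores = bm25_scores("cat mat".split(), docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {' '.join(doc)}")
```

In this toy corpus the first document matches both query terms and ranks highest, the second matches only "cat" and is longer than average so it ranks lower, and the third matches nothing and scores zero.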
Historical Development
BM25 grew out of probabilistic information retrieval research that began in the 1970s and 1980s and took its current form in the 1990s. Researchers at City University, London developed successive versions of the weighting function in their Okapi retrieval system (hence the name Okapi BM25), with BM25 representing a mature version that balanced accuracy with computational efficiency. Its success led to widespread adoption across the search and information retrieval industry.
Applications and Adoption
BM25 is the default ranking function in major search and retrieval technologies including Elasticsearch, Apache Lucene, Solr, and many other enterprise search platforms. Web search engines, internal company search systems, and research databases frequently employ BM25 or variations of the algorithm to rank documents and return relevant results to users.
Advantages Over Simpler Methods
Unlike basic keyword matching that treats all terms equally, BM25 weights rare terms more heavily than common ones. Its term-frequency component saturates, so repeating a term yields diminishing gains, which limits the payoff from keyword stuffing. Length normalization keeps long documents from ranking highly simply because they contain more words. For multi-word queries, the per-term scores are summed, so documents that match the rarer, more informative query terms rank higher.
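The limited effect of term repetition can be seen by isolating BM25's term-frequency factor. The sketch below assumes a document of exactly average length (so the length-normalization term equals 1) and the common default k1 = 1.2; the helper name `tf_component` is hypothetical.

```python
def tf_component(tf, k1=1.2):
    """BM25 term-frequency factor for a document of average length."""
    return tf * (k1 + 1) / (tf + k1)

# The factor grows quickly at first, then flattens out below k1 + 1.
for tf in (1, 2, 5, 20, 100):
    print(f"tf={tf:>3}  factor={tf_component(tf):.3f}")
```

However many times a term is repeated, this factor never reaches k1 + 1 (2.2 with the default), whereas a raw term count would keep growing without bound.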
Related Questions
Why is BM25 better than simple keyword matching?
BM25 considers term frequency, inverse document frequency, and document length normalization, providing more nuanced relevance scores that better match user intent compared to simple keyword counting.
What search engines use BM25?
BM25 is used in Elasticsearch, Apache Lucene, Solr, and many enterprise search platforms. Many major search engines and information retrieval systems employ BM25 or similar probabilistic algorithms.
Can BM25 be customized?
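Yes. BM25 has two tunable parameters: k1, which controls how quickly the contribution of a repeated term saturates, and b, which controls how strongly scores are normalized by document length (0 means no normalization, 1 means full normalization). Common defaults are k1 = 1.2 and b = 0.75, which are the defaults in Lucene and Elasticsearch. Adjusting them lets you tune relevance for different types of documents and search scenarios.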
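To get a feel for what the two parameters do, the sketch below computes the per-term weight (without the IDF factor) for a short and a long document under different settings. The function name `term_weight` and the document lengths are illustrative assumptions, not values from any particular search engine.

```python
def term_weight(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """BM25 weight of one term in one document, excluding the IDF factor."""
    norm = 1 - b + b * doc_len / avg_len
    return tf * (k1 + 1) / (tf + k1 * norm)

# Effect of b: same term count in a 50-token and a 400-token document.
for b in (0.0, 0.75, 1.0):
    short = term_weight(tf=3, doc_len=50, avg_len=100, b=b)
    long_ = term_weight(tf=3, doc_len=400, avg_len=100, b=b)
    print(f"b={b}: short={short:.3f}  long={long_:.3f}")

# Effect of k1: how much more 10 occurrences count than 1 occurrence.
for k1 in (0.5, 1.2, 2.0):
    ratio = term_weight(10, 100, 100, k1=k1) / term_weight(1, 100, 100, k1=k1)
    print(f"k1={k1}: tf=10 is worth {ratio:.2f}x tf=1")
```

With b = 0 the two documents receive the same weight; raising b widens the gap in favor of the shorter document. Raising k1 lets repeated occurrences keep adding to the score for longer before saturating.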
Sources
- Wikipedia - Okapi BM25 (CC-BY-SA-4.0)
- Elasticsearch BM25 Documentation (Copyright Elastic)