Commit a724ee7

committed
update results of BM25
1 parent b87f966 commit a724ee7

1 file changed

Lines changed: 5 additions & 1 deletion

File tree

FlagEmbedding/BGE_M3/README.md
```diff
@@ -223,9 +223,13 @@ We compare BGE-M3 with some popular methods, including BM25, openAI embedding, e
 - NarritiveQA:
 ![avatar](./imgs/nqa.jpg)
 
-- BM25
+- Comparison with BM25
 
 We utilized Pyserini to implement BM25, and the test results can be reproduced by this [script](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB/MLDR#bm25-baseline).
+We tested BM25 using two different tokenizers:
+one using Lucene Analyzer and the other using the same tokenizer as M3 (i.e., the tokenizer of xlm-roberta).
+The results indicate that BM25 remains a competitive baseline,
+especially in long document retrieval.
 
 ![avatar](./imgs/bm25.jpg)
 
```