Commit e0b614c

Merge pull request #929 from hanhainebula/master: Update MIRACL evaluation results of BGE-M3
2 parents: 1470fc6 + ce2f759

3 files changed (9 additions & 1 deletion)

FlagEmbedding/BGE_M3/README.md
@@ -23,6 +23,14 @@ Utilizing the re-ranking model (e.g., [bge-reranker](https://github.com/FlagOpen
 
 ## News:
+- 2024/7/1: **We have updated the MIRACL evaluation results of BGE-M3**. To reproduce the new results, refer to [bge-m3_miracl_2cr](https://huggingface.co/datasets/hanhainebula/bge-m3_miracl_2cr). We have also updated our [paper](https://arxiv.org/pdf/2402.03216) on arXiv.
+<details>
+<summary> Details </summary>
+
+> The previous test results were lower because we mistakenly removed passages that share an id with the query from the search results. After correcting this mistake, the overall performance of BGE-M3 on MIRACL is higher than the previous results, but the experimental conclusion remains unchanged. No other results are affected by this mistake. To reproduce the previous lower results, add the `--remove-query` parameter when searching passages with `pyserini.search.faiss` or `pyserini.search.lucene`.
+
+</details>
 - 2024/3/20: **Thanks to the Milvus team!** Now you can use hybrid retrieval with bge-m3 in Milvus: [pymilvus/examples/hello_hybrid_sparse_dense.py](https://github.com/milvus-io/pymilvus/blob/master/examples/hello_hybrid_sparse_dense.py).
 - 2024/3/8: **Thanks for the [experimental results](https://towardsdatascience.com/openai-vs-open-source-multilingual-embedding-models-e5ccb7c90f05) from @[Yannael](https://huggingface.co/Yannael). In this benchmark, BGE-M3 achieves top performance in both English and other languages, surpassing models such as OpenAI.**
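The mistaken filtering described in the details above can be illustrated with a short, self-contained sketch. Everything here is hypothetical for illustration: the ids, scores, and the helper `remove_query_from_run` are not part of Pyserini; the function simply mimics what the `--remove-query` behaviour does to a retrieval run.

```python
def remove_query_from_run(run):
    """Drop hits whose docid equals the query id (the old, incorrect behaviour).

    `run` maps each query id to a ranked list of (docid, score) pairs.
    """
    return {
        qid: [(docid, score) for docid, score in hits if docid != qid]
        for qid, hits in run.items()
    }

# Toy run: query "q1" retrieved a relevant passage that happens to share its id.
run = {"q1": [("q1", 9.1), ("d7", 8.4)], "q2": [("d3", 7.7)]}
cleaned = remove_query_from_run(run)
# "q1" loses its top-ranked hit, so its measured effectiveness drops,
# which is why the previously reported MIRACL numbers were too low.
```

Dropping a top-ranked relevant passage this way deflates rank-sensitive metrics such as nDCG@10 for the affected queries, matching the explanation given in the commit.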
@@ -213,7 +221,7 @@ We provide the evaluation script for [MKQA](https://github.com/FlagOpen/FlagEmbe
 
 ### Our results
-- Multilingual (Miracl dataset)
+- Multilingual (MIRACL dataset)
 
 ![avatar](./imgs/miracl.jpg)

FlagEmbedding/BGE_M3/imgs/bm25.jpg
Binary image changed: 61.4 KB → 126 KB
