Commit e22aaad
update README and README_zh
1 parent c5d73cd
2 files changed: 63 additions & 12 deletions

README.md

Lines changed: 5 additions & 6 deletions
@@ -46,8 +46,6 @@ FlagEmbedding focuses on retrieval-augmented LLMs, consisting of the following p
 - 7/26/2024: Release a new embedding model [bge-multilingual-gemma2](https://huggingface.co/BAAI/bge-multilingual-gemma2), a multilingual embedding model based on gemma-2-9b, which supports multiple languages and diverse downstream tasks, achieving new SOTA on multilingual benchmarks (MIRACL, MTEB-fr, and MTEB-pl). :fire:
 - 7/26/2024: Release a new lightweight reranker [bge-reranker-v2.5-gemma2-lightweight](https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight), a lightweight reranker based on gemma-2-9b, which supports token compression and layerwise lightweight operations, can still ensure good performance while saving a significant amount of resources. :fire:
 
-
-
 <details>
 <summary>More</summary>
 <!-- ### More -->
@@ -79,12 +77,13 @@ It is the first embedding model which supports all three retrieval methods, achi
 </details>
 
 ## Installation
-- Using pip:
+### Using pip:
 ```
 pip install -U FlagEmbedding
 ```
-- Install from sources:
-Clone the repository
+### Install from sources:
+
+Clone the repository and install
 ```
 git clone https://github.com/FlagOpen/FlagEmbedding.git
 cd FlagEmbedding
@@ -111,7 +110,7 @@ sentences_2 = ["I love BGE", "I love text retrieval"]
 embeddings_1 = model.encode(sentences_1)
 embeddings_2 = model.encode(sentences_2)
 ```
-Once we get the embeddings, we can compute similarity.
+Once we get the embeddings, we can compute similarity by inner product:
 ```
 similarity = embeddings_1 @ embeddings_2.T
 print(similarity)
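The changed line above says similarity is computed by inner product. This behaves like cosine similarity because BGE embeddings are unit-normalized; the following minimal NumPy sketch illustrates it with made-up 3-dimensional vectors standing in for real `model.encode(...)` output (real bge-base-en-v1.5 embeddings are 768-dimensional):

```python
import numpy as np

def normalize(rows):
    """L2-normalize each row so inner products equal cosine similarity."""
    rows = np.asarray(rows, dtype=float)
    return rows / np.linalg.norm(rows, axis=1, keepdims=True)

# Toy stand-ins for embeddings of two sentence lists (illustrative only).
embeddings_1 = normalize([[1.0, 2.0, 3.0], [2.0, 1.0, 0.0]])
embeddings_2 = normalize([[1.0, 2.0, 3.0], [0.0, 1.0, 2.0]])

# Entry (i, j) compares sentence i of the first list with sentence j of
# the second; the (0, 0) entry is exactly 1.0 because those two toy
# vectors are identical before normalization.
similarity = embeddings_1 @ embeddings_2.T
print(similarity)
```

With normalized rows, every entry of `similarity` falls in [-1, 1], which is why the README can print the raw matrix as a similarity score.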

README_zh.md

Lines changed: 58 additions & 6 deletions
@@ -1,24 +1,30 @@
 <h1 align="center">FlagEmbedding</h1>
 <p align="center">
-<a href="https://www.python.org/">
-<img alt="Build" src="https://img.shields.io/badge/Made with-Python-purple">
+<a href="https://huggingface.co/collections/BAAI/bge-66797a74476eb1f085c7446d">
+<img alt="Build" src="https://img.shields.io/badge/BGE_series-🤗-yellow">
+</a>
+<a href="https://github.com/FlagOpen/FlagEmbedding">
+<img alt="Build" src="https://img.shields.io/badge/Contribution-Welcome-blue">
 </a>
 <a href="https://github.com/FlagOpen/FlagEmbedding/blob/master/LICENSE">
 <img alt="License" src="https://img.shields.io/badge/LICENSE-MIT-green">
 </a>
 <a href="https://huggingface.co/C-MTEB">
-<img alt="License" src="https://img.shields.io/badge/C_MTEB-🤗-yellow">
+<img alt="Build" src="https://img.shields.io/badge/C_MTEB-🤗-yellow">
 </a>
-<a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding">
-<img alt="License" src="https://img.shields.io/badge/universal embedding-1.1-red">
+<a href="https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding">
+<img alt="Build" src="https://img.shields.io/badge/FlagEmbedding-1.1-red">
 </a>
 </p>
 
 <h4 align="center">
 <p>
 <a href=#更新>Updates</a> |
+<a href=#安装>Installation</a> |
+<a href=#快速开始>Quick Start</a> |
 <a href="#项目">Projects</a> |
 <a href="#模型列表">Model List</a> |
+<a href=#贡献者>Contributors</a> |
 <a href="#citation">Citation</a> |
 <a href="#license">License</a>
 <p>
@@ -39,6 +45,10 @@ FlagEmbedding focuses on retrieval-augmented LLMs, currently including the following projects:
 - 7/26/2024: Release [bge-en-icl](https://huggingface.co/BAAI/bge-en-icl), a text retrieval model with in-context learning capability: given task-relevant query-answer examples, it can encode semantically richer queries, further strengthening the semantic representation power of the embeddings. :fire:
 - 7/26/2024: Release [bge-multilingual-gemma2](https://huggingface.co/BAAI/bge-multilingual-gemma2), a multilingual embedding model based on gemma-2-9b that supports multiple languages and diverse downstream tasks, achieving the best results to date on the multilingual retrieval benchmarks MIRACL, MTEB-fr, and MTEB-pl. :fire:
 - 7/26/2024: Release the new lightweight reranker [bge-reranker-v2.5-gemma2-lightweight](https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight), a lightweight reranker based on gemma-2-9b that supports token compression and layerwise lightweight operations, maintaining good performance while saving a significant amount of resources. :fire:
+
+<details>
+<summary>More</summary>
+
 - 6/7/2024: Release [MLVU](https://github.com/JUNJIE99/MLVU), the first comprehensive evaluation benchmark designed for long-video understanding, featuring a wide range of video durations, diverse video sources, and multiple evaluation tasks tailored to long-video understanding. :fire:
 - 5/21/2024: Jointly with Jina AI, Zilliz, HuggingFace, and other organizations, release the evaluation benchmark [AIR-Bench](https://github.com/AIR-Bench/AIR-Bench) for retrieval tasks and RAG scenarios. AIR-Bench is the first to use LLMs to automatically generate evaluation data for retrieval tasks, avoiding model overfitting to test data; it requires no human annotation, so it can flexibly cover more vertical domains and languages, and it is updated regularly to meet the community's evolving evaluation needs. [Leaderboard](https://huggingface.co/spaces/AIR-Bench/leaderboard) :fire:
 - 4/30/2024: Release [Llama-3-8B-Instruct-80K-QLoRA](https://huggingface.co/namespace-Pt/Llama-3-8B-Instruct-80K-QLoRA), which effectively extends the context length of Llama-3-8B-Instruct from 8K to 80K via QLoRA training on a small amount of synthetic long-context data. See the [code](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/longllm_qlora). :fire:
@@ -59,8 +69,48 @@ FlagEmbedding focuses on retrieval-augmented LLMs, currently including the following projects:
 - 08/02/2023: :tada: :tada: Release the Chinese-English embedding model BGE (short for BAAI General Embedding), **achieving the best performance on the MTEB and C-MTEB leaderboards**
 - 08/01/2023: Release the large-scale Chinese text embedding [benchmark](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB) (**C-MTEB**), which includes 31 test tasks.
 
+</details>
 
 
+## Installation
+### Using pip:
+```
+pip install -U FlagEmbedding
+```
+### Install from sources:
+
+Clone and install FlagEmbedding:
+```
+git clone https://github.com/FlagOpen/FlagEmbedding.git
+cd FlagEmbedding
+pip install .
+```
+Install in editable mode:
+```
+pip install -e .
+```
+
+## Quick Start
+First, load a BGE embedding model:
+```
+from FlagEmbedding import FlagModel
+
+model = FlagModel('BAAI/bge-base-en-v1.5',
+query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
+use_fp16=True)
+```
+Feed the sentences to the model to get their embeddings:
+```
+sentences_1 = ["I love NLP", "I love machine learning"]
+sentences_2 = ["I love BGE", "I love text retrieval"]
+embeddings_1 = model.encode(sentences_1)
+embeddings_2 = model.encode(sentences_2)
+```
+Once we have the embeddings, we can compute similarity by inner product:
+```
+similarity = embeddings_1 @ embeddings_2.T
+print(similarity)
+```
 
 
 ## Projects
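The `query_instruction_for_retrieval` argument in the quick-start diff above deserves a note: as we understand the FlagEmbedding usage pattern, that instruction is prepended to queries before encoding (via the model's query-encoding path), while candidate passages are encoded without it. The helper below is a made-up illustration of that prepending convention, not the library's API:

```python
# Hypothetical sketch: prepend a retrieval instruction to queries only.
# The helper name `prepend_instruction` is illustrative, not part of
# FlagEmbedding; passages would be encoded without the instruction.
INSTRUCTION = "Represent this sentence for searching relevant passages:"

def prepend_instruction(queries, instruction=INSTRUCTION):
    """Return queries with the retrieval instruction prepended."""
    return [f"{instruction} {q}" for q in queries]

print(prepend_instruction(["I love NLP"])[0])
```

Encoding the instruction together with the query is what lets an asymmetric retrieval model map short queries and long passages into a comparable embedding space.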
@@ -170,7 +220,9 @@ BGE Embedding is a general embedding model. We use [retromae](https://githu
 
 
 
-## Contributors:
+## 贡献者:
+
+Many thanks to everyone in the FlagEmbedding community for their contributions; new members are welcome to join!
 
 <a href="https://github.com/FlagOpen/FlagEmbedding/graphs/contributors">
 <img src="https://contrib.rocks/image?repo=FlagOpen/FlagEmbedding" />
