BGE Embedding ICL

BGE Embedding ICL is an excellent all-around model for text embedding.
Model details
BAAI/bge-en-icl is a text embedding model that produces a one-dimensional embedding vector for a given input. It's frequently used for downstream tasks like clustering, typically in combination with a vector database.
This model is quantized to FP8 for deployment, which is supported on NVIDIA's newest GPUs (e.g. H100, H100_40GB, or L4). Quantization is optional, but leads to higher efficiency.
Input
from openai import OpenAI
import os

# Point the OpenAI client at the Baseten deployment's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url="https://model-xxxxxx.api.baseten.co/environments/production/sync/v1"
)

# Request an embedding for a single input string
embedding = client.embeddings.create(
    input="Baseten Embeddings are fast",
    model="model"
)
JSON output
{
  "data": [
    {
      "embedding": [
        0
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "BAAI/bge-en-icl",
  "object": "list",
  "usage": {
    "prompt_tokens": 512,
    "total_tokens": 512
  }
}
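The embedding vectors in the response can be fed directly into downstream tasks such as clustering or nearest-neighbor search. The sketch below is a minimal example, assuming the same client and endpoint as above; it uses numpy for cosine similarity, and the corpus and query strings are placeholders.

import os

import numpy as np
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],
    base_url="https://model-xxxxxx.api.baseten.co/environments/production/sync/v1",
)

# Embed a small placeholder corpus in one batched request
# (the endpoint accepts a list of strings as input).
corpus = [
    "Baseten serves ML models in production",
    "FP8 quantization improves inference efficiency",
    "Vector databases store embeddings for retrieval",
]
response = client.embeddings.create(input=corpus, model="model")
corpus_vectors = np.array([item.embedding for item in response.data])

# Embed a query and rank corpus entries by cosine similarity.
query = "How do I make inference more efficient?"
query_vector = np.array(
    client.embeddings.create(input=query, model="model").data[0].embedding
)

scores = corpus_vectors @ query_vector / (
    np.linalg.norm(corpus_vectors, axis=1) * np.linalg.norm(query_vector)
)
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")

Batching the corpus into a single request keeps the number of round trips to the deployment low; for larger corpora, the same vectors would typically be written to a vector database instead of held in memory.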