BGE Embedding ICL

BGE Embedding ICL is an excellent all-around model for text embedding.
Model details
BAAI/bge-en-icl is a text embedding model that produces a one-dimensional embedding vector for a given input. It's frequently used for downstream tasks like clustering, typically in combination with a vector database.
This model is quantized to FP8 for deployment, which is supported on NVIDIA's newest GPUs (e.g. H100, H100_40GB, or L4). Quantization is optional, but leads to higher efficiency.
Input
from openai import OpenAI
import os

# Point the OpenAI client at the Baseten deployment's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url="https://model-xxxxxx.api.baseten.co/environments/production/sync/v1"
)

# Request an embedding for a single input string
embedding = client.embeddings.create(
    input="Baseten Embeddings are fast",
    model="model"
)
JSON output
{
  "data": [
    {
      "embedding": [
        0
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "BAAI/bge-en-icl",
  "object": "list",
  "usage": {
    "prompt_tokens": 512,
    "total_tokens": 512
  }
}
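The embedding vectors in the response can be fed directly into downstream tasks such as clustering or nearest-neighbor search. The sketch below is a minimal example, assuming the same client and endpoint as above; it uses numpy for cosine similarity, and the corpus and query strings are placeholders.

import os

import numpy as np
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],
    base_url="https://model-xxxxxx.api.baseten.co/environments/production/sync/v1",
)

# Embed a small placeholder corpus in one batched request
# (the endpoint accepts a list of strings as input).
corpus = [
    "Baseten serves ML models in production",
    "FP8 quantization improves inference efficiency",
    "Vector databases store embeddings for retrieval",
]
response = client.embeddings.create(input=corpus, model="model")
corpus_vectors = np.array([item.embedding for item in response.data])

# Embed a query and rank corpus entries by cosine similarity.
query = "How do I make inference more efficient?"
query_vector = np.array(
    client.embeddings.create(input=query, model="model").data[0].embedding
)

scores = corpus_vectors @ query_vector / (
    np.linalg.norm(corpus_vectors, axis=1) * np.linalg.norm(query_vector)
)
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")

Batching the corpus into a single request keeps the number of round trips to the deployment low; for larger corpora, the same vectors would typically be written to a vector database instead of held in memory.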