Vertex AI Embedding
Usage - Embeddingโ
- SDK
- LiteLLM PROXY
import litellm
from litellm import embedding
litellm.vertex_project = "hardy-device-38811" # Your Project ID
litellm.vertex_location = "us-central1" # proj location
response = embedding(
model="vertex_ai/textembedding-gecko",
input=["good morning from litellm"],
)
print(response)
- Add model to config.yaml
model_list:
- model_name: snowflake-arctic-embed-m-long-1731622468876
litellm_params:
model: vertex_ai/<your-model-id>
vertex_project: "adroit-crow-413218"
vertex_location: "us-central1"
vertex_credentials: adroit-crow-413218-a956eef1a2a8.json
litellm_settings:
drop_params: True
- Start Proxy
$ litellm --config /path/to/config.yaml
- Make Request using OpenAI Python SDK, Langchain Python SDK
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
response = client.embeddings.create(
model="snowflake-arctic-embed-m-long-1731622468876",
input = ["good morning from litellm", "this is another item"],
)
print(response)
Supported Embedding Modelsโ
All models listed here are supported
| Model Name | Function Call |
|---|---|
| text-embedding-004 | embedding(model="vertex_ai/text-embedding-004", input) |
| text-multilingual-embedding-002 | embedding(model="vertex_ai/text-multilingual-embedding-002", input) |
| textembedding-gecko | embedding(model="vertex_ai/textembedding-gecko", input) |
| textembedding-gecko-multilingual | embedding(model="vertex_ai/textembedding-gecko-multilingual", input) |
| textembedding-gecko-multilingual@001 | embedding(model="vertex_ai/textembedding-gecko-multilingual@001", input) |
| textembedding-gecko@001 | embedding(model="vertex_ai/textembedding-gecko@001", input) |
| textembedding-gecko@003 | embedding(model="vertex_ai/textembedding-gecko@003", input) |
| text-embedding-preview-0409 | embedding(model="vertex_ai/text-embedding-preview-0409", input) |
| text-multilingual-embedding-preview-0409 | embedding(model="vertex_ai/text-multilingual-embedding-preview-0409", input) |
| Fine-tuned OR Custom Embedding models | embedding(model="vertex_ai/<your-model-id>", input) |
Supported OpenAI (Unified) Paramsโ
| param | type | vertex equivalent |
|---|---|---|
input | string or List[string] | instances |
dimensions | int | output_dimensionality |
input_type | Literal["RETRIEVAL_QUERY","RETRIEVAL_DOCUMENT", "SEMANTIC_SIMILARITY", "CLASSIFICATION", "CLUSTERING", "QUESTION_ANSWERING", "FACT_VERIFICATION"] | task_type |
Usage with OpenAI (Unified) Paramsโ
- SDK
- LiteLLM PROXY
response = litellm.embedding(
model="vertex_ai/text-embedding-004",
input=["good morning from litellm", "gm"]
input_type = "RETRIEVAL_DOCUMENT",
dimensions=1,
)
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
response = client.embeddings.create(
model="text-embedding-004",
input = ["good morning from litellm", "gm"],
dimensions=1,
extra_body = {
"input_type": "RETRIEVAL_QUERY",
}
)
print(response)
Supported Vertex Specific Paramsโ
| param | type |
|---|---|
auto_truncate | bool |
task_type | Literal["RETRIEVAL_QUERY","RETRIEVAL_DOCUMENT", "SEMANTIC_SIMILARITY", "CLASSIFICATION", "CLUSTERING", "QUESTION_ANSWERING", "FACT_VERIFICATION"] |
title | str |
Usage with Vertex Specific Params (Use task_type and title)โ
You can pass any vertex specific params to the embedding model. Just pass them to the embedding function like this:
Relevant Vertex AI doc with all embedding params
- SDK
- LiteLLM PROXY
response = litellm.embedding(
model="vertex_ai/text-embedding-004",
input=["good morning from litellm", "gm"]
task_type = "RETRIEVAL_DOCUMENT",
title = "test",
dimensions=1,
auto_truncate=True,
)
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
response = client.embeddings.create(
model="text-embedding-004",
input = ["good morning from litellm", "gm"],
dimensions=1,
extra_body = {
"task_type": "RETRIEVAL_QUERY",
"auto_truncate": True,
"title": "test",
}
)
print(response)
BGE Embeddingsโ
Use BGE (Baidu General Embedding) models deployed on Vertex AI.
Usageโ
- SDK
- LiteLLM PROXY
Using BGE on Vertex AI
import litellm
response = litellm.embedding(
model="vertex_ai/bge/<your-endpoint-id>",
input=["Hello", "World"],
vertex_project="your-project-id",
vertex_location="your-location"
)
print(response)
- Add model to config.yaml
config.yaml
model_list:
- model_name: bge-embedding
litellm_params:
model: vertex_ai/bge/<your-endpoint-id>
vertex_project: "your-project-id"
vertex_location: "us-central1"
vertex_credentials: your-credentials.json
litellm_settings:
drop_params: True
- Start Proxy
$ litellm --config /path/to/config.yaml
- Make Request using OpenAI Python SDK
Making requests to BGE
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
response = client.embeddings.create(
model="bge-embedding",
input=["good morning from litellm", "this is another item"]
)
print(response)
Using a Private Service Connect (PSC) endpoint
config.yaml (PSC)
model_list:
- model_name: bge-small-en-v1.5
litellm_params:
model: vertex_ai/bge/1234567890
api_base: http://10.96.32.8 # Your PSC IP
vertex_project: my-project-id #optional
vertex_location: us-central1 #optional
Multi-Modal Embeddingsโ
Known Limitations:
- Only supports 1 image / video / image per request
- Only supports GCS or base64 encoded images / videos
Usageโ
- SDK
- LiteLLM PROXY (Unified Endpoint)
- LiteLLM PROXY (Vertex SDK)
Using GCS Images
response = await litellm.aembedding(
model="vertex_ai/multimodalembedding@001",
input="gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png" # will be sent as a gcs image
)
Using base 64 encoded images
response = await litellm.aembedding(
model="vertex_ai/multimodalembedding@001",
input="data:image/jpeg;base64,..." # will be sent as a base64 encoded image
)
- Add model to config.yaml
model_list:
- model_name: multimodalembedding@001
litellm_params:
model: vertex_ai/multimodalembedding@001
vertex_project: "adroit-crow-413218"
vertex_location: "us-central1"
vertex_credentials: adroit-crow-413218-a956eef1a2a8.json
litellm_settings:
drop_params: True
- Start Proxy
$ litellm --config /path/to/config.yaml
- Make Request use OpenAI Python SDK, Langchain Python SDK
- OpenAI SDK
- Langchain
Requests with GCS Image / Video URI
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# # request sent to model set on litellm proxy, `litellm --model`
response = client.embeddings.create(
model="multimodalembedding@001",
input = "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png",
)
print(response)
Requests with base64 encoded images
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# # request sent to model set on litellm proxy, `litellm --model`
response = client.embeddings.create(
model="multimodalembedding@001",
input = "data:image/jpeg;base64,...",
)
print(response)
Requests with GCS Image / Video URI
from langchain_openai import OpenAIEmbeddings
embeddings_models = "multimodalembedding@001"
embeddings = OpenAIEmbeddings(
model="multimodalembedding@001",
base_url="http://0.0.0.0:4000",
api_key="sk-1234", # type: ignore
)
query_result = embeddings.embed_query(
"gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"
)
print(query_result)
Requests with base64 encoded images
from langchain_openai import OpenAIEmbeddings
embeddings_models = "multimodalembedding@001"
embeddings = OpenAIEmbeddings(
model="multimodalembedding@001",
base_url="http://0.0.0.0:4000",
api_key="sk-1234", # type: ignore
)
query_result = embeddings.embed_query(
"data:image/jpeg;base64,..."
)
print(query_result)
- Add model to config.yaml
default_vertex_config:
vertex_project: "adroit-crow-413218"
vertex_location: "us-central1"
vertex_credentials: adroit-crow-413218-a956eef1a2a8.json
- Start Proxy
$ litellm --config /path/to/config.yaml
- Make Request use OpenAI Python SDK
import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel, Video
from vertexai.vision_models import VideoSegmentConfig
from google.auth.credentials import Credentials
LITELLM_PROXY_API_KEY = "sk-1234"
LITELLM_PROXY_BASE = "http://0.0.0.0:4000/vertex-ai"
import datetime
class CredentialsWrapper(Credentials):
def __init__(self, token=None):
super().__init__()
self.token = token
self.expiry = None # or set to a future date if needed
def refresh(self, request):
pass
def apply(self, headers, token=None):
headers['Authorization'] = f'Bearer {self.token}'
@property
def expired(self):
return False # Always consider the token as non-expired
@property
def valid(self):
return True # Always consider the credentials as valid
credentials = CredentialsWrapper(token=LITELLM_PROXY_API_KEY)
vertexai.init(
project="adroit-crow-413218",
location="us-central1",
api_endpoint=LITELLM_PROXY_BASE,
credentials = credentials,
api_transport="rest",
)
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
image = Image.load_from_file(
"gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"
)
embeddings = model.get_embeddings(
image=image,
contextual_text="Colosseum",
dimension=1408,
)
print(f"Image Embedding: {embeddings.image_embedding}")
print(f"Text Embedding: {embeddings.text_embedding}")
Text + Image + Video Embeddingsโ
- SDK
- LiteLLM PROXY (Unified Endpoint)
Text + Image
response = await litellm.aembedding(
model="vertex_ai/multimodalembedding@001",
input=["hey", "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"] # will be sent as a gcs image
)
Text + Video
response = await litellm.aembedding(
model="vertex_ai/multimodalembedding@001",
input=["hey", "gs://my-bucket/embeddings/supermarket-video.mp4"] # will be sent as a gcs image
)
Image + Video
response = await litellm.aembedding(
model="vertex_ai/multimodalembedding@001",
input=["gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png", "gs://my-bucket/embeddings/supermarket-video.mp4"] # will be sent as a gcs image
)
- Add model to config.yaml
model_list:
- model_name: multimodalembedding@001
litellm_params:
model: vertex_ai/multimodalembedding@001
vertex_project: "adroit-crow-413218"
vertex_location: "us-central1"
vertex_credentials: adroit-crow-413218-a956eef1a2a8.json
litellm_settings:
drop_params: True
- Start Proxy
$ litellm --config /path/to/config.yaml
- Make Request use OpenAI Python SDK, Langchain Python SDK
Text + Image
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# # request sent to model set on litellm proxy, `litellm --model`
response = client.embeddings.create(
model="multimodalembedding@001",
input = ["hey", "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"],
)
print(response)
Text + Video
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# # request sent to model set on litellm proxy, `litellm --model`
response = client.embeddings.create(
model="multimodalembedding@001",
input = ["hey", "gs://my-bucket/embeddings/supermarket-video.mp4"],
)
print(response)
Image + Video
import openai
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# # request sent to model set on litellm proxy, `litellm --model`
response = client.embeddings.create(
model="multimodalembedding@001",
input = ["gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png", "gs://my-bucket/embeddings/supermarket-video.mp4"],
)
print(response)