Vertex AI Embedding

Usage - Embedding

SDK
LiteLLM PROXY

import litellm
from litellm import embedding
litellm.vertex_project = "hardy-device-38811" # Your Project ID
litellm.vertex_location = "us-central1"  # proj location

response = embedding(
    model="vertex_ai/textembedding-gecko",
    input=["good morning from litellm"],
)
print(response)

Add model to config.yaml

model_list:
  - model_name: snowflake-arctic-embed-m-long-1731622468876
    litellm_params:
      model: vertex_ai/<your-model-id>
      vertex_project: "adroit-crow-413218"
      vertex_location: "us-central1"
      vertex_credentials: adroit-crow-413218-a956eef1a2a8.json 

litellm_settings:
  drop_params: True

Start Proxy

$ litellm --config /path/to/config.yaml

Make Request using OpenAI Python SDK, Langchain Python SDK

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="snowflake-arctic-embed-m-long-1731622468876", 
    input = ["good morning from litellm", "this is another item"],
)

print(response)
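
Printing the response shows the whole object. If only the vectors are needed, they can be pulled out of `response.data`. The following is a minimal sketch, assuming the response follows the OpenAI embeddings shape (each item carries `index` and `embedding`); `extract_vectors` and `fake_response` are hypothetical names used for illustration, and the stand-in object replaces a real proxy call.

```python
from types import SimpleNamespace


def extract_vectors(response) -> list:
    """Return one embedding vector per input, sorted by input position."""
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]


# Stand-in with the same shape as an OpenAI SDK embeddings response (assumption).
fake_response = SimpleNamespace(
    data=[
        SimpleNamespace(index=1, embedding=[0.3, 0.4]),
        SimpleNamespace(index=0, embedding=[0.1, 0.2]),
    ]
)

print(extract_vectors(fake_response))
# [[0.1, 0.2], [0.3, 0.4]]
```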

Supported Embedding Models

All models listed here are supported

Model Name	Function Call
text-embedding-004	`embedding(model="vertex_ai/text-embedding-004", input)`
text-multilingual-embedding-002	`embedding(model="vertex_ai/text-multilingual-embedding-002", input)`
textembedding-gecko	`embedding(model="vertex_ai/textembedding-gecko", input)`
textembedding-gecko-multilingual	`embedding(model="vertex_ai/textembedding-gecko-multilingual", input)`
textembedding-gecko-multilingual@001	`embedding(model="vertex_ai/textembedding-gecko-multilingual@001", input)`
textembedding-gecko@001	`embedding(model="vertex_ai/textembedding-gecko@001", input)`
textembedding-gecko@003	`embedding(model="vertex_ai/textembedding-gecko@003", input)`
text-embedding-preview-0409	`embedding(model="vertex_ai/text-embedding-preview-0409", input)`
text-multilingual-embedding-preview-0409	`embedding(model="vertex_ai/text-multilingual-embedding-preview-0409", input)`
Fine-tuned OR Custom Embedding models	`embedding(model="vertex_ai/<your-model-id>", input)`

Supported OpenAI (Unified) Params

param	type	vertex equivalent
`input`	string or List[string]	`instances`
`dimensions`	int	`output_dimensionality`
`input_type`	Literal["RETRIEVAL_QUERY","RETRIEVAL_DOCUMENT", "SEMANTIC_SIMILARITY", "CLASSIFICATION", "CLUSTERING", "QUESTION_ANSWERING", "FACT_VERIFICATION"]	`task_type`
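
The renaming described in the table above can be pictured as a simple key lookup. This is only an illustration of the table, not LiteLLM's internal code; `OPENAI_TO_VERTEX` and `to_vertex_params` are hypothetical names.

```python
# Illustrative only: mirrors the table above, not LiteLLM's implementation.
OPENAI_TO_VERTEX = {
    "input": "instances",
    "dimensions": "output_dimensionality",
    "input_type": "task_type",
}


def to_vertex_params(params: dict) -> dict:
    """Rename unified (OpenAI-style) keys to their Vertex AI equivalents.

    Keys that do not appear in the table are passed through unchanged.
    """
    return {OPENAI_TO_VERTEX.get(key, key): value for key, value in params.items()}


print(to_vertex_params({"dimensions": 256, "input_type": "RETRIEVAL_QUERY"}))
# {'output_dimensionality': 256, 'task_type': 'RETRIEVAL_QUERY'}
```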

Usage with OpenAI (Unified) Params

SDK
LiteLLM PROXY

import litellm

response = litellm.embedding(
    model="vertex_ai/text-embedding-004",
    input=["good morning from litellm", "gm"],
    input_type="RETRIEVAL_DOCUMENT",  # mapped to Vertex `task_type`
    dimensions=1,  # mapped to Vertex `output_dimensionality`
)

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="text-embedding-004", 
    input = ["good morning from litellm", "gm"],
    dimensions=1,
    extra_body = {
        "input_type": "RETRIEVAL_QUERY",
    }
)

print(response)

Supported Vertex Specific Params

param	type
`auto_truncate`	bool
`task_type`	Literal["RETRIEVAL_QUERY","RETRIEVAL_DOCUMENT", "SEMANTIC_SIMILARITY", "CLASSIFICATION", "CLUSTERING", "QUESTION_ANSWERING", "FACT_VERIFICATION"]
`title`	str

Usage with Vertex Specific Params (Use `task_type` and `title`)

You can pass any Vertex-specific param directly to the embedding function, as in the examples below. See the relevant Vertex AI doc for all embedding params.

SDK
LiteLLM PROXY

import litellm

response = litellm.embedding(
    model="vertex_ai/text-embedding-004",
    input=["good morning from litellm", "gm"],
    task_type="RETRIEVAL_DOCUMENT",
    title="test",
    dimensions=1,
    auto_truncate=True,
)

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="text-embedding-004", 
    input = ["good morning from litellm", "gm"],
    dimensions=1,
    extra_body = {
        "task_type": "RETRIEVAL_QUERY",
        "auto_truncate": True,
        "title": "test",
    }
)

print(response)

BGE Embeddings

Use BGE (BAAI General Embedding) models deployed on Vertex AI.

Usage

SDK
LiteLLM PROXY

Using BGE on Vertex AI
import litellm

response = litellm.embedding(
    model="vertex_ai/bge/<your-endpoint-id>",
    input=["Hello", "World"],
    vertex_project="your-project-id",
    vertex_location="your-location"
)

print(response)

Add model to config.yaml

config.yaml
model_list:
  - model_name: bge-embedding
    litellm_params:
      model: vertex_ai/bge/<your-endpoint-id>
      vertex_project: "your-project-id"
      vertex_location: "us-central1"
      vertex_credentials: your-credentials.json

litellm_settings:
  drop_params: True

Start Proxy

$ litellm --config /path/to/config.yaml

Make Request using OpenAI Python SDK

Making requests to BGE
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="bge-embedding",
    input=["good morning from litellm", "this is another item"]
)

print(response)

Using a Private Service Connect (PSC) endpoint

config.yaml (PSC)
model_list:
  - model_name: bge-small-en-v1.5
    litellm_params:
      model: vertex_ai/bge/1234567890 
      api_base: http://10.96.32.8  # Your PSC IP
      vertex_project: my-project-id  #optional
      vertex_location: us-central1 #optional

Multi-Modal Embeddings

Known Limitations:

Only supports 1 image / video per request
Only supports GCS or base64 encoded images / videos
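
Because only GCS URIs and base64 data are accepted, a local file has to be encoded before it is sent. A minimal sketch of one way to do that, assuming the `data:<mime>;base64,<payload>` form used in the examples on this page; `to_data_uri` and `is_supported_media_input` are hypothetical helpers, not part of LiteLLM.

```python
import base64
import mimetypes


def to_data_uri(path: str) -> str:
    """Encode a local image or video file as a base64 data URI."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Cannot infer a MIME type from {path!r}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def is_supported_media_input(value: str) -> bool:
    """True for the two media forms listed above: GCS URIs and base64 data URIs."""
    is_gcs = value.startswith("gs://")
    is_base64 = value.startswith("data:") and ";base64," in value
    return is_gcs or is_base64


# A plain HTTPS URL is neither form, so it would need to be downloaded and encoded first.
print(is_supported_media_input("gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"))
print(is_supported_media_input("https://example.com/landmark1.png"))
```

The string returned by `to_data_uri` can then be passed as `input` in the same way as the base64 examples on this page.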

Usage

SDK
LiteLLM PROXY (Unified Endpoint)
LiteLLM PROXY (Vertex SDK)

Using GCS Images

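import litellm

# `aembedding` is async: call it from inside an `async def` function (or a notebook cell).
response = await litellm.aembedding(
    model="vertex_ai/multimodalembedding@001",
    input="gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png",  # will be sent as a GCS image
)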

Using base 64 encoded images

response = await litellm.aembedding(
    model="vertex_ai/multimodalembedding@001",
    input="data:image/jpeg;base64,..." # will be sent as a base64 encoded image
)

Add model to config.yaml

model_list:
  - model_name: multimodalembedding@001
    litellm_params:
      model: vertex_ai/multimodalembedding@001
      vertex_project: "adroit-crow-413218"
      vertex_location: "us-central1"
      vertex_credentials: adroit-crow-413218-a956eef1a2a8.json 

litellm_settings:
  drop_params: True

Start Proxy

$ litellm --config /path/to/config.yaml

Make Request using OpenAI Python SDK, Langchain Python SDK

OpenAI SDK
Langchain

Requests with GCS Image / Video URI

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# request sent to the model configured on the LiteLLM proxy
response = client.embeddings.create(
    model="multimodalembedding@001", 
    input = "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png",
)

print(response)

Requests with base64 encoded images

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# request sent to the model configured on the LiteLLM proxy
response = client.embeddings.create(
    model="multimodalembedding@001", 
    input = "data:image/jpeg;base64,...",
)

print(response)

Requests with GCS Image / Video URI

from langchain_openai import OpenAIEmbeddings

embeddings_model = "multimodalembedding@001"

embeddings = OpenAIEmbeddings(
    model=embeddings_model,
    base_url="http://0.0.0.0:4000",
    api_key="sk-1234",  # type: ignore
)


query_result = embeddings.embed_query(
    "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"
)
print(query_result)

Requests with base64 encoded images

from langchain_openai import OpenAIEmbeddings

embeddings_model = "multimodalembedding@001"

embeddings = OpenAIEmbeddings(
    model=embeddings_model,
    base_url="http://0.0.0.0:4000",
    api_key="sk-1234",  # type: ignore
)


query_result = embeddings.embed_query(
    "data:image/jpeg;base64,..."
)
print(query_result)

Add model to config.yaml

default_vertex_config:
  vertex_project: "adroit-crow-413218"
  vertex_location: "us-central1"
  vertex_credentials: adroit-crow-413218-a956eef1a2a8.json 

Start Proxy

$ litellm --config /path/to/config.yaml

Make Request using Vertex AI Python SDK

import vertexai

from vertexai.vision_models import Image, MultiModalEmbeddingModel, Video
from vertexai.vision_models import VideoSegmentConfig
from google.auth.credentials import Credentials


LITELLM_PROXY_API_KEY = "sk-1234"
LITELLM_PROXY_BASE = "http://0.0.0.0:4000/vertex-ai"

class CredentialsWrapper(Credentials):
    def __init__(self, token=None):
        super().__init__()
        self.token = token
        self.expiry = None  # or set to a future date if needed
        
    def refresh(self, request):
        pass
    
    def apply(self, headers, token=None):
        headers['Authorization'] = f'Bearer {self.token}'

    @property
    def expired(self):
        return False  # Always consider the token as non-expired

    @property
    def valid(self):
        return True  # Always consider the credentials as valid

credentials = CredentialsWrapper(token=LITELLM_PROXY_API_KEY)

vertexai.init(
    project="adroit-crow-413218",
    location="us-central1",
    api_endpoint=LITELLM_PROXY_BASE,
    credentials = credentials,
    api_transport="rest",
   
)

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
image = Image.load_from_file(
    "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"
)

embeddings = model.get_embeddings(
    image=image,
    contextual_text="Colosseum",
    dimension=1408,
)
print(f"Image Embedding: {embeddings.image_embedding}")
print(f"Text Embedding: {embeddings.text_embedding}")

Text + Image + Video Embeddings

SDK
LiteLLM PROXY (Unified Endpoint)

Text + Image

response = await litellm.aembedding(
    model="vertex_ai/multimodalembedding@001",
    input=["hey", "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"] # will be sent as a gcs image
)

Text + Video

response = await litellm.aembedding(
    model="vertex_ai/multimodalembedding@001",
    input=["hey", "gs://my-bucket/embeddings/supermarket-video.mp4"] # will be sent as text plus a gcs video
)

Image + Video

response = await litellm.aembedding(
    model="vertex_ai/multimodalembedding@001",
    input=["gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png", "gs://my-bucket/embeddings/supermarket-video.mp4"] # will be sent as a gcs image plus a gcs video
)

Add model to config.yaml

model_list:
  - model_name: multimodalembedding@001
    litellm_params:
      model: vertex_ai/multimodalembedding@001
      vertex_project: "adroit-crow-413218"
      vertex_location: "us-central1"
      vertex_credentials: adroit-crow-413218-a956eef1a2a8.json 

litellm_settings:
  drop_params: True

Start Proxy

$ litellm --config /path/to/config.yaml

Make Request using OpenAI Python SDK

Text + Image

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# request sent to the model configured on the LiteLLM proxy
response = client.embeddings.create(
    model="multimodalembedding@001", 
    input = ["hey", "gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png"],
)

print(response)

Text + Video

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# request sent to the model configured on the LiteLLM proxy
response = client.embeddings.create(
    model="multimodalembedding@001", 
    input = ["hey", "gs://my-bucket/embeddings/supermarket-video.mp4"],
)

print(response)

Image + Video

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# request sent to the model configured on the LiteLLM proxy
response = client.embeddings.create(
    model="multimodalembedding@001", 
    input = ["gs://cloud-samples-data/vertex-ai/llm/prompts/landmark1.png", "gs://my-bucket/embeddings/supermarket-video.mp4"],
)

print(response)
