v1.63.14-stable
These are the changes since v1.63.11-stable.
This release brings:
- LLM translation improvements (MCP support and Bedrock application inference profiles)
- Perf improvements for usage-based routing
- Streaming guardrail support via websockets
- Azure OpenAI client perf fix (from previous release)
 
Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.63.14-stable.patch1
```
Demo Instance

Here's a demo instance to test changes:
- Instance: https://demo.litellm.ai/
- Login credentials:
  - Username: admin
  - Password: sk-1234
 
 
New Models / Updated Models
- Azure gpt-4o - fixed pricing to latest global pricing - PR
- O1-Pro - added pricing + model information - PR
- Azure AI - Mistral Small 3.1 pricing added - PR
- Azure - gpt-4.5-preview pricing added - PR
 
LLM Translation
- New LLM Features
  - Bedrock: support Bedrock application inference profiles Docs
    - Infer AWS region from the Bedrock application profile ID (arn:aws:bedrock:us-east-1:...)
  - Ollama - support calling via /v1/completions Get Started
  - Bedrock - support the us.deepseek.r1-v1:0 model name Docs
  - OpenRouter - OPENROUTER_API_BASE env var support Docs
  - Azure - add audio model parameter support - Docs
  - OpenAI - PDF file support Docs
  - OpenAI - o1-pro Responses API streaming support Docs
  - [BETA] MCP - use MCP tools with the LiteLLM SDK Docs
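The region-inference feature above can be pictured with a minimal sketch. This is an illustration of the idea, not LiteLLM's actual implementation; it assumes the standard application inference profile ARN layout:

```python
def infer_region_from_profile_arn(arn: str) -> str:
    """Pull the AWS region out of a Bedrock application inference profile ARN.

    ARN layout: arn:aws:bedrock:<region>:<account-id>:application-inference-profile/<id>
    """
    parts = arn.split(":")
    if len(parts) < 4 or parts[2] != "bedrock":
        raise ValueError(f"not a Bedrock ARN: {arn}")
    return parts[3]

print(infer_region_from_profile_arn(
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile"
))  # → us-east-1
```

This lets a request carry only the profile ARN, with the region derived rather than configured separately.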
 
- Bug Fixes
  - Voyage: prompt token tracking fix on embeddings - PR
  - SageMaker - fix 'Too little data for declared Content-Length' error - PR
  - OpenAI-compatible models - fix issue when calling OpenAI-compatible models with custom_llm_provider set - PR
  - VertexAI - embedding outputDimensionality support - PR
  - Anthropic - return consistent JSON response format on streaming/non-streaming - PR
 
Spend Tracking Improvements
- litellm_proxy/ - support reading the LiteLLM response cost header from the proxy when using the client SDK
- Reset Budget Job - fix budget reset error on keys/teams/users PR
- Streaming - prevent the final chunk with usage from being ignored (impacted Bedrock streaming + cost tracking) PR
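Reading the response cost back on the client side amounts to parsing one response header. A minimal sketch, assuming the cost is returned in a header named x-litellm-response-cost (the helper name is illustrative):

```python
from typing import Optional

def response_cost_from_headers(headers: dict) -> Optional[float]:
    # Read the per-request cost the proxy attaches to the response
    # (header name assumed here to be x-litellm-response-cost).
    raw = headers.get("x-litellm-response-cost")
    return float(raw) if raw is not None else None
```

A client SDK can call this on each response's headers to track spend without a separate accounting request.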
 
UI
- Users Page
  - Feature: Control default internal user settings PR
- Icons
  - Feature: Replace external "artificialanalysis.ai" icons with local SVGs PR
- Sign In/Sign Out
  - Fix: Default login when the default_user_id user does not exist in the DB PR
 
Logging Integrations
- Support post-call guardrails for streaming responses Get Started
- Arize Get Started
  - Fix invalid package import PR
  - Migrate to using StandardLoggingPayload for metadata, ensuring spans land successfully PR
  - Fix logging to log just the LLM I/O PR
  - Dynamic API key/space param support Get Started
- StandardLoggingPayload - log litellm_model_name in the payload, so you can tell which model name was sent to the API provider Get Started
- Prompt Management - allow building a custom prompt management integration Get Started
 
Performance / Reliability Improvements
- Redis Caching - add a 5s default timeout, preventing a hanging Redis connection from impacting LLM calls PR
- Allow disabling all spend updates/writes to the DB - patch to disable all spend updates to the DB with a flag PR
- Azure OpenAI - correctly re-use the Azure OpenAI client, fixing a perf issue from the previous stable release PR
- Azure OpenAI - use litellm.ssl_verify on Azure/OpenAI clients PR
- Usage-based routing - wildcard model support Get Started
- Usage-based routing - support batch-writing increments to Redis - reduces latency to match 'simple-shuffle' PR
- Router - show the reason for model cooldown in the 'no healthy deployments available' error PR
- Caching - add a max value limit (1MB) per item in the in-memory cache - prevents OOM errors when large image URLs are sent through the proxy PR
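The 1MB cap on in-memory cache items can be sketched as follows. Names here are illustrative, not LiteLLM internals; the point is that oversized values are rejected at write time rather than stored:

```python
MAX_ITEM_BYTES = 1024 * 1024  # 1MB cap per cached item

class BoundedValueCache:
    """In-memory cache that refuses oversized values, so one huge
    payload (e.g. a large image URL body) can't trigger an OOM."""

    def __init__(self) -> None:
        self._store: dict = {}

    def set(self, key: str, value: bytes) -> bool:
        if len(value) > MAX_ITEM_BYTES:
            return False  # skip caching instead of ballooning memory
        self._store[key] = value
        return True

    def get(self, key: str):
        return self._store.get(key)
```

Skipping the cache for a single oversized item costs one cache miss; storing it unchecked risks the whole process.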
 
General Improvements
- Passthrough Endpoints - support returning api-base in pass-through endpoint response headers Docs
- SSL - support reading the SSL security level from an env var - allows users to specify lower security settings Get Started
- Credentials - only poll the Credentials table when STORE_MODEL_IN_DB is True PR
- Image URL Handling - new architecture doc on image URL handling Docs
- OpenAI - bump to pip install "openai==1.68.2" PR
- Gunicorn - security fix - bump to gunicorn==23.0.0 PR
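The STORE_MODEL_IN_DB gate above boils down to reading a boolean env var. A minimal sketch of that check (the exact parsing in LiteLLM may differ):

```python
import os

def store_model_in_db() -> bool:
    # Only poll the Credentials table when STORE_MODEL_IN_DB is truthy.
    return os.getenv("STORE_MODEL_IN_DB", "").strip().lower() in ("true", "1", "yes")
```

With the flag unset or false, the proxy skips the periodic Credentials-table polling entirely.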
 
