Sameer Kankute authored
* fix(opentelemetry): JSON-serialize dict metadata fields for OTEL span attributes (#27451) (#27455)

Squash-merged by litellm-agent from Anai-Guo's PR.

* feat(dashscope): add embeddings and reranks(qwen3-rerank) support via OpenAI-compatible endpoint (#27508)

Squash-merged by litellm-agent from yimao's PR.

* fix(vertex_ai/gemini): raise BadRequestError when image_url or url fi… (#24550)

Squash-merged by litellm-agent from krisxia0506's PR.

* fix(vertex_ai): raise error on mid-stream 429/error chunks instead of silently swallowing (#23711)

Squash-merged by litellm-agent from krisxia0506's PR.

* fix: raise BadRequestError for file content blocks missing 'file' sub… (#24503)

Squash-merged by litellm-agent from krisxia0506's PR.

* Fix Gemini MIME detection for extensionless GCS URIs (#27278)

Squash-merged by litellm-agent from krisxia0506's PR.

* fix(vertex_ai/partner_models): drop unused vertexai SDK gate from count_tokens (closes #28084) (#28107)

Squash-merged by litellm-agent from voidborne-d's PR.

* feat(chart): add support for autoscaling behavior in HPA (#27990)

Squash-merged by litellm-agent from FabrizioCafolla's PR.

* feat(proxy): add blocked flag to models for pause/resume from the UI (#27927)

Squash-merged by litellm-agent from Cyberfilo's PR.

* fix: pass socket timeouts to Redis cluster clients (#27920)

Squash-merged by litellm-agent from tomdee's PR.

* Fix/cache token (#28009)

Squash-merged by litellm-agent from escon1004's PR.

* fix(deepseek): forward reasoning_content in multi-turn thinking mode conversations (#28080)

Squash-merged by litellm-agent from Divyansh8321's PR.

* fix(guardrails): return HTTP 400 instead of 500 for blocked requests (#27617)

* fix: reset org and tag budgets (#27326)

* reset org budgets

* reset tag budgets

---------

Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>

* fix(ui): omit allowed_routes from key edit save when unchanged (#27553)

* fix(ui): omit allowed_routes from key edit save when unchanged

When a team admin opens Edit Settings on a key with key_type=AI APIs and saves without changing anything, the UI re-sends the existing allowed_routes value, which the backend's _check_allowed_routes_caller_permission gate rejects for non-proxy-admins (LIT-2681).

Strip allowed_routes from the patch in handleSubmit when it deep-equals the original keyData.allowed_routes. The backend treats absence as "leave alone," so no-op saves now succeed for non-admins. Admins explicitly editing the field still send the new value.

* fix(ui): order-insensitive allowed_routes diff + cover null-original case

Address Greptile review:
- Switch the "is allowed_routes unchanged" check to a Set-based comparison so a server-side reorder of the array doesn't register as a user edit and re-trigger LIT-2681.
- Add two regression tests: (1) keyData.allowed_routes is null and the form is untouched: patch should strip the field; (2) server returned routes in a different order than the user originally entered: patch should still recognize the value as unchanged.

* chore(ui): strip ticket refs and tighten comments in key edit fix

- Remove internal-tracker references from in-code comments
- Tighten the WHY comment in handleSubmit to two lines
- Drop redundant test-block comments; test names already describe the case

* fix(ui): annotate Set<string> generic in allowed_routes diff to fix tsc

* fix(guardrails): return HTTP 400 instead of 500 for guardrail-blocked requests

GuardrailRaisedException and BlockedPiiEntityError both lacked a status_code attribute. When these exceptions reached the proxy exception handler (getattr(e, 'status_code', 500)), the fallback defaulted to HTTP 500, making intentional guardrail blocks indistinguishable from server errors and causing unnecessary client retries.

Changes:
- Add status_code=400 (keyword-only) to GuardrailRaisedException
- Add status_code=400 (keyword-only) to BlockedPiiEntityError
- Update _is_guardrail_intervention() to recognize both exceptions so downstream loggers record 'guardrail_intervened' instead of 'guardrail_failed_to_respond'
- Add 6 unit tests for default/custom status codes and getattr pattern
- Strengthen existing blocked-action test with status_code assertion

Fixes #24348

---------

Co-authored-by: Michael-RZ-Berri <michael@berri.ai>
Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>

* fix(router/proxy): address Greptile P1+P2 review comments on PR #28161

- router: raise ServiceUnavailableError (503) instead of RouterRateLimitErrorBasic (429) when a specifically-addressed deployment is administratively blocked; 429 misleads retry-enabled clients into spinning forever against a paused model
- proxy_server: compute get_fully_blocked_model_names() once before both branches in model_list() instead of duplicating the call in each branch
- deepseek: upgrade silent debug log to warning when injecting placeholder reasoning_content so callers are clearly notified of degraded multi-turn quality
- tests: update two blocked-deployment assertions to expect ServiceUnavailableError

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: address bug detection findings (cache token order, mutable defaults)

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix: address bugs in async pass-through, anthropic cache token detection, rerank tests

- async_get_available_deployment_for_pass_through: enforce blocked check on specific deployments
- cost_calculator: detect anthropic-style usage by attribute presence (not truthiness) to avoid mixing OpenAI cached_tokens into anthropic normalization when read=0
- dashscope rerank tests: pass request to httpx.Response constructions for consistency

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix code qa

* fix(vertex_ai/gemini): strip MIME parameters from GCS contentType

GCS object metadata's contentType field can include parameters such as 'text/html; charset=utf-8'. Strip them in _apply_gemini_mime_type_aliases so downstream get_file_extension_from_mime_type sees a bare MIME type.

Co-authored-by: Yassin Kortam <yassin@berri.ai>

* fix(vertex_ai/gemini): clarify mime-type error message string concatenation

Co-authored-by: Yassin Kortam <yassin@berri.ai>

---------

Co-authored-by: Tai An <antai12232931@outlook.com>
Co-authored-by: Vincent <yimao1231@gmail.com>
Co-authored-by: Kris Xia <xiajiayi0506@gmail.com>
Co-authored-by: d 🔹 <liusway405@gmail.com>
Co-authored-by: Fabrizio Cafolla <developer@fabriziocafolla.com>
Co-authored-by: Filippo Menghi <113345637+Cyberfilo@users.noreply.github.com>
Co-authored-by: Tom Denham <tom@tomdee.co.uk>
Co-authored-by: escon1004 <70471150+escon1004@users.noreply.github.com>
Co-authored-by: Divyansh Singhal <97736786+Divyansh8321@users.noreply.github.com>
Co-authored-by: robin-fiddler <robin@fiddler.ai>
Co-authored-by: Michael-RZ-Berri <michael@berri.ai>
Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Yassin Kortam <yassin@berri.ai>
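
Illustration for the guardrails fix (#27617): the message describes exceptions that lacked a status_code attribute falling through getattr(e, 'status_code', 500) to HTTP 500. The sketch below reproduces that lookup pattern with hypothetical stand-ins. GuardrailBlockedError and resolve_status_code are names invented for this example and are not LiteLLM's actual classes or handler.

```python
class GuardrailBlockedError(Exception):
    """Hypothetical stand-in for a guardrail exception (not LiteLLM's class)."""

    def __init__(self, message: str, *, status_code: int = 400) -> None:
        super().__init__(message)
        # Keyword-only, defaulting to 400 so a block reads as a client error.
        self.status_code = status_code


def resolve_status_code(exc: Exception) -> int:
    """Hypothetical helper using the getattr pattern quoted in the commit."""
    # Exceptions without a status_code attribute fall back to 500.
    return getattr(exc, "status_code", 500)


blocked = GuardrailBlockedError("request blocked by guardrail")
custom = GuardrailBlockedError("forbidden content", status_code=403)
plain = ValueError("no status_code attribute here")

blocked_code = resolve_status_code(blocked)
custom_code = resolve_status_code(custom)
plain_code = resolve_status_code(plain)
```

With the attribute present, the handler pattern yields the exception's own code; without it, the 500 fallback applies, which is the behavior the fix set out to avoid for intentional blocks.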
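
Illustration for the cost_calculator note (attribute presence, not truthiness): a minimal sketch of why a truthiness check misclassifies a usage object whose cache-read count is 0. The field name cache_read_input_tokens and both helper functions are assumptions made for this example; the commit message does not show the real detection code.

```python
from types import SimpleNamespace


def looks_anthropic_by_truthiness(usage: object) -> bool:
    # Buggy variant: a legitimate value of 0 is falsy, so detection fails.
    return bool(getattr(usage, "cache_read_input_tokens", None))


def looks_anthropic_by_presence(usage: object) -> bool:
    # Presence check: the attribute existing is what identifies the style.
    return hasattr(usage, "cache_read_input_tokens")


# Hypothetical usage objects: one with a zero cache read, one without the field.
anthropic_zero_read = SimpleNamespace(cache_read_input_tokens=0)
openai_style = SimpleNamespace(cached_tokens=12)

truthy_result = looks_anthropic_by_truthiness(anthropic_zero_read)
presence_result = looks_anthropic_by_presence(anthropic_zero_read)
openai_result = looks_anthropic_by_presence(openai_style)
```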
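
Illustration for the GCS contentType fix: a minimal sketch of stripping MIME parameters so that downstream code sees a bare type. strip_mime_parameters is a hypothetical helper written for this example; the actual logic lives in _apply_gemini_mime_type_aliases and may differ.

```python
def strip_mime_parameters(content_type: str) -> str:
    """Return the bare MIME type, dropping parameters such as charset."""
    # Everything after the first ';' is a parameter list; discard it.
    return content_type.split(";", 1)[0].strip()


with_charset = strip_mime_parameters("text/html; charset=utf-8")
already_bare = strip_mime_parameters("application/pdf")
```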