    Litellm oss staging (#28161) · 36c494fd
    Sameer Kankute authored
    
    
    * fix(opentelemetry): JSON-serialize dict metadata fields for OTEL span attributes (#27451) (#27455)
    
    Squash-merged by litellm-agent from Anai-Guo's PR.
    
    * feat(dashscope): add embeddings and reranks(qwen3-rerank) support via OpenAI-compatible endpoint (#27508)
    
    Squash-merged by litellm-agent from yimao's PR.
    
    * fix(vertex_ai/gemini): raise BadRequestError when image_url or url fi… (#24550)
    
    Squash-merged by litellm-agent from krisxia0506's PR.
    
    * fix(vertex_ai): raise error on mid-stream 429/error chunks instead of silently swallowing (#23711)
    
    Squash-merged by litellm-agent from krisxia0506's PR.
    
    * fix: raise BadRequestError for file content blocks missing 'file' sub… (#24503)
    
    Squash-merged by litellm-agent from krisxia0506's PR.
    
    * Fix Gemini MIME detection for extensionless GCS URIs (#27278)
    
    Squash-merged by litellm-agent from krisxia0506's PR.
    
    * fix(vertex_ai/partner_models): drop unused vertexai SDK gate from count_tokens (closes #28084) (#28107)
    
    Squash-merged by litellm-agent from voidborne-d's PR.
    
    * feat(chart): add support for autoscaling behavior in HPA (#27990)
    
    Squash-merged by litellm-agent from FabrizioCafolla's PR.
    
    * feat(proxy): add blocked flag to models for pause/resume from the UI (#27927)
    
    Squash-merged by litellm-agent from Cyberfilo's PR.
    
    * fix: pass socket timeouts to Redis cluster clients (#27920)
    
    Squash-merged by litellm-agent from tomdee's PR.
    
    * Fix/cache token (#28009)
    
    Squash-merged by litellm-agent from escon1004's PR.
    
    * fix(deepseek): forward reasoning_content in multi-turn thinking mode conversations (#28080)
    
    Squash-merged by litellm-agent from Divyansh8321's PR.
    
    * fix(guardrails): return HTTP 400 instead of 500 for blocked requests (#27617)
    
    * fix: reset org and tag budgets (#27326)
    
    * reset org budgets
    
    * reset tag budgets
    
    ---------
    
    Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>
    
    * fix(ui): omit allowed_routes from key edit save when unchanged (#27553)
    
    * fix(ui): omit allowed_routes from key edit save when unchanged
    
    When a team admin opens Edit Settings on a key with key_type=AI APIs and
    saves without changing anything, the UI re-sends the existing allowed_routes
    value, which the backend's _check_allowed_routes_caller_permission gate
    rejects for non-proxy-admins (LIT-2681).
    
    Strip allowed_routes from the patch in handleSubmit when it deep-equals the
    original keyData.allowed_routes. The backend treats absence as "leave alone,"
    so no-op saves now succeed for non-admins. Admins explicitly editing the
    field still send the new value.
    
    * fix(ui): order-insensitive allowed_routes diff + cover null-original case
    
    Address Greptile review:
    
    - Switch the "is allowed_routes unchanged" check to a Set-based comparison so
      a server-side reorder of the array doesn't register as a user edit and
      re-trigger LIT-2681.
    - Add two regression tests: (1) keyData.allowed_routes is null and the form
      is untouched — patch should strip the field; (2) server returned routes in
      a different order than the user originally entered — patch should still
      recognize the value as unchanged.
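
The order-insensitive check described above can be sketched as follows. The UI code itself is TypeScript; this is a Python rendering of the same idea, and `routes_unchanged` and `build_patch` are hypothetical names, not functions from the codebase. It assumes an untouched form submits either a null or an empty list when the original value was null.

```python
from typing import Any, Dict, List, Optional


def routes_unchanged(
    original: Optional[List[str]], submitted: Optional[List[str]]
) -> bool:
    """Compare two route lists as sets, treating None as an empty list."""
    return set(original or []) == set(submitted or [])


def build_patch(
    original_routes: Optional[List[str]], form_values: Dict[str, Any]
) -> Dict[str, Any]:
    """Return the payload to send, omitting allowed_routes when unchanged."""
    patch = dict(form_values)  # copy so the caller's dict is not mutated
    if "allowed_routes" in patch and routes_unchanged(
        original_routes, patch["allowed_routes"]
    ):
        # Absence means "leave alone" on the backend, per the commit message.
        del patch["allowed_routes"]
    return patch
```

A set comparison also ignores duplicate entries, which is usually acceptable for a list of route names but is a deliberate simplification here.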
    
    * chore(ui): strip ticket refs and tighten comments in key edit fix
    
    - Remove internal-tracker references from in-code comments
    - Tighten the WHY comment in handleSubmit to two lines
    - Drop redundant test-block comments — test names already describe the case
    
    * fix(ui): annotate Set<string> generic in allowed_routes diff to fix tsc
    
    * fix(guardrails): return HTTP 400 instead of 500 for guardrail-blocked requests
    
    GuardrailRaisedException and BlockedPiiEntityError both lacked a
    status_code attribute.  When these exceptions reached the proxy
    exception handler (getattr(e, 'status_code', 500)), the fallback
    defaulted to HTTP 500 — making intentional guardrail blocks
    indistinguishable from server errors and causing unnecessary client
    retries.
    
    Changes:
    - Add status_code=400 (keyword-only) to GuardrailRaisedException
    - Add status_code=400 (keyword-only) to BlockedPiiEntityError
    - Update _is_guardrail_intervention() to recognize both exceptions
      so downstream loggers record 'guardrail_intervened' instead of
      'guardrail_failed_to_respond'
    - Add 6 unit tests for default/custom status codes and getattr pattern
    - Strengthen existing blocked-action test with status_code assertion
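
A minimal sketch of why the missing attribute produced a 500, using the `getattr(e, 'status_code', 500)` pattern quoted above. The exception classes here are hypothetical stand-ins, not the real `GuardrailRaisedException` or `BlockedPiiEntityError`.

```python
class SketchLegacyGuardrailError(Exception):
    """Stand-in for the pre-fix exceptions: no status_code attribute."""


class SketchGuardrailError(Exception):
    """Stand-in for the post-fix exceptions: keyword-only status_code."""

    def __init__(self, message: str = "", *, status_code: int = 400) -> None:
        super().__init__(message)
        self.status_code = status_code


def resolve_status_code(exc: Exception) -> int:
    """Mirror the handler's fallback: use 500 when no status_code is set."""
    return getattr(exc, "status_code", 500)
```

With this shape, a deliberate block resolves to 400 by default, a caller can still pass another code such as 403, and only exceptions without the attribute fall through to 500.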
    
    Fixes #24348
    
    ---------
    
    Co-authored-by: Michael-RZ-Berri <michael@berri.ai>
    Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>
    Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
    Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>
    
    * fix(router/proxy): address Greptile P1+P2 review comments on PR #28161
    
    - router: raise ServiceUnavailableError (503) instead of RouterRateLimitErrorBasic (429)
      when a specifically-addressed deployment is administratively blocked; 429 misleads
      retry-enabled clients into spinning forever against a paused model
    - proxy_server: compute get_fully_blocked_model_names() once before both branches in
      model_list() instead of duplicating the call in each branch
    - deepseek: upgrade silent debug log to warning when injecting placeholder
      reasoning_content so callers are clearly notified of degraded multi-turn quality
    - tests: update two blocked-deployment assertions to expect ServiceUnavailableError
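
The router change in the first bullet can be illustrated roughly as below. This is a sketch under assumptions: `SketchDeployment`, `SketchServiceUnavailableError`, and `get_specific_deployment` are hypothetical names, and the real router's data model and lookup logic are not shown in this commit message.

```python
from dataclasses import dataclass
from typing import List


class SketchServiceUnavailableError(Exception):
    """Stand-in for a 503-style error raised for a paused deployment."""

    status_code = 503


@dataclass
class SketchDeployment:
    deployment_id: str
    blocked: bool = False


def get_specific_deployment(
    deployments: List[SketchDeployment], deployment_id: str
) -> SketchDeployment:
    """Return the addressed deployment, or raise 503 if it is blocked."""
    for deployment in deployments:
        if deployment.deployment_id == deployment_id:
            if deployment.blocked:
                # 503 rather than 429: the model is paused, not rate limited.
                raise SketchServiceUnavailableError(
                    f"deployment {deployment_id} is administratively blocked"
                )
            return deployment
    raise KeyError(deployment_id)
```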
    
    Co-authored-by: Cursor <cursoragent@cursor.com>
    
    * fix: address bug detection findings (cache token order, mutable defaults)
    
    Co-authored-by: Yassin Kortam <yassin@berri.ai>
    
    * fix: address bugs in async pass-through, anthropic cache token detection, rerank tests
    
    - async_get_available_deployment_for_pass_through: enforce blocked check on specific deployments
    - cost_calculator: detect anthropic-style usage by attribute presence (not truthiness) to avoid mixing OpenAI cached_tokens into anthropic normalization when read=0
    - dashscope rerank tests: pass request to httpx.Response constructions for consistency
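
The presence-versus-truthiness distinction in the cost_calculator bullet can be sketched like this. The attribute name `cache_read_input_tokens` is an assumption based on Anthropic's usage field naming; the actual check in cost_calculator may look at different attributes.

```python
from types import SimpleNamespace


def is_anthropic_style_by_truthiness(usage: object) -> bool:
    """Truthiness check: misclassifies usage whose cache read count is 0."""
    return bool(getattr(usage, "cache_read_input_tokens", None))


def is_anthropic_style_by_presence(usage: object) -> bool:
    """Presence check: a count of 0 still identifies the usage style."""
    return hasattr(usage, "cache_read_input_tokens")


# An Anthropic-style usage object with no cache reads on this request.
zero_read_usage = SimpleNamespace(cache_read_input_tokens=0)
# An OpenAI-style usage object that lacks the attribute entirely.
openai_usage = SimpleNamespace(cached_tokens=12)
```

With the truthiness check, `zero_read_usage` is treated as not Anthropic-style and could fall through to the OpenAI path, which is the mix-up the bullet describes.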
    
    Co-authored-by: Yassin Kortam <yassin@berri.ai>
    
    * fix code qa
    
    * fix(vertex_ai/gemini): strip MIME parameters from GCS contentType
    
    GCS object metadata's contentType field can include parameters such as
    'text/html; charset=utf-8'. Strip them in _apply_gemini_mime_type_aliases
    so downstream get_file_extension_from_mime_type sees a bare MIME type.
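
A minimal sketch of the parameter stripping, assuming a simple split on the first semicolon. `strip_mime_parameters` is a hypothetical helper; the actual logic lives inside `_apply_gemini_mime_type_aliases` and may differ.

```python
def strip_mime_parameters(content_type: str) -> str:
    """Drop parameters such as '; charset=utf-8' and surrounding spaces."""
    return content_type.split(";", 1)[0].strip()
```

For example, `strip_mime_parameters("text/html; charset=utf-8")` yields `"text/html"`, while a bare value such as `"image/png"` passes through unchanged.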
    
    Co-authored-by: Yassin Kortam <yassin@berri.ai>
    
    * fix(vertex_ai/gemini): clarify mime-type error message string concatenation
    
    Co-authored-by: Yassin Kortam <yassin@berri.ai>
    
    ---------
    
    Co-authored-by: Tai An <antai12232931@outlook.com>
    Co-authored-by: Vincent <yimao1231@gmail.com>
    Co-authored-by: Kris Xia <xiajiayi0506@gmail.com>
    Co-authored-by: d 🔹 <liusway405@gmail.com>
    Co-authored-by: Fabrizio Cafolla <developer@fabriziocafolla.com>
    Co-authored-by: Filippo Menghi <113345637+Cyberfilo@users.noreply.github.com>
    Co-authored-by: Tom Denham <tom@tomdee.co.uk>
    Co-authored-by: escon1004 <70471150+escon1004@users.noreply.github.com>
    Co-authored-by: Divyansh Singhal <97736786+Divyansh8321@users.noreply.github.com>
    Co-authored-by: robin-fiddler <robin@fiddler.ai>
    Co-authored-by: Michael-RZ-Berri <michael@berri.ai>
    Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>
    Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
    Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>
    Co-authored-by: Cursor <cursoragent@cursor.com>
    Co-authored-by: Yassin Kortam <yassin@berri.ai>