feat(sdk): add optional AWS role assumption (Role ARN) for Bedrock/Sagemaker #4800

Open
Koushik-Salammagari wants to merge 3 commits into Agenta-AI:main from Koushik-Salammagari:feat/bedrock-assume-role

Conversation

@Koushik-Salammagari


Summary

Adds optional AWS IAM role assumption for Bedrock and Sagemaker providers. When a
provider secret carries an aws_role_arn, the long-lived keys are used only to sign a
single sts:AssumeRole call, and the short-lived session credentials it returns are what
get forwarded to LiteLLM. This lets users connect Bedrock/Sagemaker with a role to assume
instead of (or on top of) static keys.

SDK (sdks/python/agenta/sdk/engines/running/handlers.py)

  • New _resolve_aws_role_arn(...): if a role ARN is present (either aws_role_arn or
    AWS_ROLE_ARN), it builds an STS client from the base credentials, assumes the role,
    and replaces the credential set with the temporary session credentials. The role ARN is
    consumed and never forwarded to LiteLLM as an unknown kwarg. Region resolves from the
    usual aliases (aws_region_name / aws_region / AWS_REGION / aws_default_region),
    defaulting to us-east-1.
  • It composes with _normalize_aws_provider_settings, so resolution stays request-scoped
    and never mutates os.environ. Both the auto_ai_critique_v0 and
    _run_prompt_llm_config_with_retry call sites are wired through it.
  • boto3 is promoted from the dev group to a runtime dependency (re-locked: sdks/python,
    api, services).
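
The resolver flow described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name `resolve_aws_role_arn_sketch`, the `sts_factory` parameter, and the `agenta-session` session name are assumptions made for the example, and it assumes the base credentials already use the lower-case `aws_*` keys. Only `boto3.client("sts", ...)` and `assume_role(RoleArn=..., RoleSessionName=...)` are real boto3 calls.

```python
from typing import Any, Callable, Dict

ROLE_ARN_KEYS = ("aws_role_arn", "AWS_ROLE_ARN")
REGION_KEYS = ("aws_region_name", "aws_region", "AWS_REGION", "aws_default_region")


def _default_sts_factory(**kwargs: Any) -> Any:
    # Imported lazily so a missing boto3 only fails when a role is actually used.
    try:
        import boto3
    except ImportError as exc:
        raise RuntimeError("boto3 is required to assume an AWS role") from exc
    return boto3.client("sts", **kwargs)


def resolve_aws_role_arn_sketch(
    settings: Dict[str, Any],
    sts_factory: Callable[..., Any] = _default_sts_factory,  # hypothetical test seam
    session_name: str = "agenta-session",  # hypothetical session name
) -> Dict[str, Any]:
    resolved = dict(settings)  # work on a copy; the caller's dict is untouched

    # Consume every role ARN alias so none is forwarded as an unknown kwarg.
    role_arn = None
    for key in ROLE_ARN_KEYS:
        value = resolved.pop(key, None)
        role_arn = role_arn or value
    if not role_arn:
        return resolved  # no-op path: nothing to assume

    # First non-empty region alias wins, otherwise fall back to us-east-1.
    region = next((resolved[k] for k in REGION_KEYS if resolved.get(k)), "us-east-1")

    # The long-lived keys sign exactly one sts:AssumeRole call.
    sts = sts_factory(
        aws_access_key_id=resolved.get("aws_access_key_id"),
        aws_secret_access_key=resolved.get("aws_secret_access_key"),
        aws_session_token=resolved.get("aws_session_token"),
        region_name=region,
    )
    creds = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)["Credentials"]

    # Replace the static keys with the short-lived session credentials.
    resolved["aws_access_key_id"] = creds["AccessKeyId"]
    resolved["aws_secret_access_key"] = creds["SecretAccessKey"]
    resolved["aws_session_token"] = creds["SessionToken"]
    return resolved
```

Injecting the STS client factory is one way to keep such a function testable without live AWS access; a unit test can pass a fake factory whose `assume_role` returns canned credentials.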

Frontend

  • Adds an optional Role ARN field to the Configure Provider drawer for Bedrock and
    Sagemaker, wiring roleArn <-> aws_role_arn through the LlmProvider type and the
    secret transforms.
  • Fixes a drawer bug where fields declared required: false were forced required once a
    known provider was selected.

Relationship to existing work

Testing

Verified locally

  • uv run pytest oss/tests/pytest/unit/: 628 passed (includes the 15 new tests below).
  • ruff format + ruff check — clean on changed Python files.
  • pnpm exec prettier --check and eslint — clean on the four changed frontend files.
  • tsc --noEmit on @agenta/oss: no new type errors introduced (diffed against the
    base; the pre-existing test-infra errors are unchanged).
  • uv sync --locked --dry-run (uv 0.11.14, the CI version) — passes for sdks/python,
    api, and services.

Added tests

sdks/python/oss/tests/pytest/unit/test_resolve_aws_role_arn.py — 15 unit tests covering
the no-op path, both ARN casings triggering assume_role, base credentials signing the STS
client, region defaulting/resolution, temp credentials replacing static keys, the role ARN
being dropped, input dict immutability, a clear error when boto3 is missing, and
composition with _normalize_aws_provider_settings.

Demo

Backend regression suite for the new resolver (boto3 STS mocked):

Role ARN resolver unit tests passing

QA follow-up

  • Frontend: the Role ARN field is implemented (renders for Bedrock/Sagemaker, optional,
    no *), but I could not capture a live UI recording locally — please verify the field
    renders and the form submits with and without a value during review.
  • Live AWS: assuming a real role on an ECS/Lambda deployment and confirming a Bedrock
    call succeeds with the temporary credentials still needs QA (no live AWS access locally).

Checklist

  • Summary describes what changed and why
  • Relevant tests pass locally
  • Relevant linting and formatting pass locally
  • Demo included (backend regression run); frontend/live-AWS QA noted above
  • I have signed the CLA

Related: #4244, #4797, #4396

Per-request AWS credentials were installed by mutating process-global
os.environ around an awaited mockllm.acompletion(...) call. Because the
event loop can switch coroutines while those variables are set, overlapping
requests in the same worker could observe each other's credentials.
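
The interleaving described above can be reproduced with a small, simplified stand-in. This is a sketch only: `DEMO_AWS_ACCESS_KEY_ID` and `call_with_env_credentials` are hypothetical names, and `asyncio.sleep` stands in for the awaited LLM call.

```python
import asyncio
import os
from typing import List, Optional


async def call_with_env_credentials(key: str, delay: float) -> Optional[str]:
    # Anti-pattern: install per-request credentials in process-global state.
    os.environ["DEMO_AWS_ACCESS_KEY_ID"] = key
    try:
        await asyncio.sleep(delay)  # the event loop may run other requests here
        # Whatever reads the environment now sees the most recent writer.
        return os.environ.get("DEMO_AWS_ACCESS_KEY_ID")
    finally:
        os.environ.pop("DEMO_AWS_ACCESS_KEY_ID", None)


async def main() -> List[Optional[str]]:
    # Request A starts first but finishes after request B has overwritten the key.
    return await asyncio.gather(
        call_with_env_credentials("KEY_A", 0.01),
        call_with_env_credentials("KEY_B", 0.05),
    )

# asyncio.run(main()) returns ['KEY_B', None]:
# request A observes B's key, and B finds the variable already cleared by A.
```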

The resolved credentials are already forwarded to LiteLLM as request-scoped
aws_* kwargs via **provider_settings, so the env-mutation context manager is
redundant and is the source of the leak. Normalize the AWS credential and
region aliases (env-style uppercase, aws_region vs aws_region_name) into
LiteLLM's canonical aws_* params so credentials stay request-scoped without
touching os.environ, and remove user_aws_credentials_from, ENV_KEYS_TO_CLEAR,
ENV_KEYS_FROM_USER, and _coerce_credentials.
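
As a rough illustration of the alias normalization, the sketch below folds env-style and alternate keys into the canonical `aws_*` kwargs. The function name and the exact alias table are assumptions for the example; the real `_normalize_aws_provider_settings` may accept a different set of aliases.

```python
from typing import Any, Dict

# Canonical LiteLLM kwarg -> accepted aliases, in priority order (assumed set).
_AWS_ALIASES = {
    "aws_access_key_id": ("aws_access_key_id", "AWS_ACCESS_KEY_ID"),
    "aws_secret_access_key": ("aws_secret_access_key", "AWS_SECRET_ACCESS_KEY"),
    "aws_session_token": ("aws_session_token", "AWS_SESSION_TOKEN"),
    "aws_region_name": ("aws_region_name", "aws_region", "AWS_REGION", "aws_default_region"),
}


def normalize_aws_settings_sketch(settings: Dict[str, Any]) -> Dict[str, Any]:
    normalized = dict(settings)  # copy, so the input and os.environ stay untouched
    for canonical, aliases in _AWS_ALIASES.items():
        value = None
        for alias in aliases:
            # Remove every alias; keep the first non-empty value encountered.
            candidate = normalized.pop(alias, None)
            if value is None and candidate:
                value = candidate
        if value is not None:
            normalized[canonical] = value
    return normalized
```

The result can then be passed as request-scoped keyword arguments (for example via `**provider_settings`), so no process-global state is involved.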

Closes Agenta-AI#4244

Adds optional IAM role assumption for AWS providers. When a Bedrock or
Sagemaker secret carries an `aws_role_arn`, the long-lived keys sign a single
`sts:AssumeRole` call and the short-lived session credentials it returns are
what reach LiteLLM.

SDK:
- `_resolve_aws_role_arn` exchanges the role ARN for temporary STS credentials
  and composes with `_normalize_aws_provider_settings`, so resolution stays
  request-scoped and never mutates `os.environ` (builds on the Agenta-AI#4244 fix). The
  role ARN is consumed and never forwarded to LiteLLM as an unknown kwarg.
- Promote `boto3` from the dev group to a runtime dependency; relock.

Frontend:
- Add an optional "Role ARN" field to the Configure Provider drawer for
  Bedrock and Sagemaker, wired `roleArn` <-> `aws_role_arn` through the
  `LlmProvider` type and the secret transforms.
- Fix the drawer forcing `required: false` fields to required when a known
  provider is selected.
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 23, 2026
@vercel

vercel Bot commented Jun 23, 2026


@Koushik-Salammagari is attempting to deploy a commit to the agenta projects Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot Bot added enhancement New feature or request SDK labels Jun 23, 2026
@coderabbitai

coderabbitai Bot commented Jun 23, 2026


Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 449adcfa-3e7c-4b44-86e2-382a58d7f356

📥 Commits

Reviewing files that changed from the base of the PR and between 62354e6 and f91a269.

📒 Files selected for processing (2)
  • sdks/python/agenta/sdk/engines/running/handlers.py
  • sdks/python/oss/tests/pytest/unit/test_resolve_aws_role_arn.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • sdks/python/agenta/sdk/engines/running/handlers.py
  • sdks/python/oss/tests/pytest/unit/test_resolve_aws_role_arn.py

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Added optional Role ARN configuration (roleArn) for Bedrock and SageMaker, enabling AWS role assumption prior to requests.
  • Bug Fixes

    • Improved AWS credential handling: credentials are now isolated per request, normalized into the expected provider format, and no longer mutate global environment variables.
  • Tests

    • Updated and expanded unit coverage for AWS credential alias normalization, role resolution behavior, and environment isolation.
  • Chores

    • Promoted boto3 to a runtime dependency.

Walkthrough

Adds AWS IAM role ARN support for Bedrock/SageMaker providers. The Python SDK replaces the env-var-based user_aws_credentials_from context manager with request-scoped STS role resolution and AWS credential alias normalization. boto3 is promoted to a runtime dependency. The web UI adds an optional roleArn field to provider configuration.

Changes

AWS Role ARN Support for Bedrock/SageMaker

| Layer | File(s) | Summary |
| --- | --- | --- |
| LlmProvider type, PROVIDER_FIELDS constant, boto3 dependency | web/packages/agenta-shared/src/types/llmProvider.ts, web/oss/src/components/ModelRegistry/Drawers/ConfigureProviderDrawer/assets/constants.ts, sdks/python/pyproject.toml | Adds optional roleArn?: string to LlmProvider, inserts an optional roleArn entry in PROVIDER_FIELDS scoped to bedrock/sagemaker, and moves boto3>=1,<2 from dev to runtime dependencies. |
| Web transforms and provider drawer form | web/packages/agenta-entities/src/secret/core/transforms.ts, web/oss/src/components/ModelRegistry/Drawers/ConfigureProviderDrawer/assets/ConfigureProviderDrawerContent.tsx | transformSecret maps wire aws_role_arn extra to LlmProvider.roleArn; transformCustomProviderPayloadData emits it back. The drawer form adds roleArn: "" to initialValues and generalizes the isRequired check to treat any required === false field as optional. |
| AWS credential helpers and mockllm cleanup | sdks/python/agenta/sdk/engines/running/handlers.py, sdks/python/agenta/sdk/litellm/mockllm.py | Introduces _normalize_aws_provider_settings and _resolve_aws_role_arn for request-scoped AWS credentials, removes the old _coerce_credentials helper, and deletes the user_aws_credentials_from environment-variable context manager from mockllm.py. |
| Call-site integration in handlers.py | sdks/python/agenta/sdk/engines/running/handlers.py | Updates auto_ai_critique_v0 and _run_prompt_llm_config_with_retry to pass normalized, resolved AWS kwargs directly into mockllm.acompletion, replacing the previous user_aws_credentials_from wrapping. |
| Tests: resolver suite and updated fixtures | sdks/python/oss/tests/pytest/unit/test_resolve_aws_role_arn.py, sdks/python/oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py, sdks/python/oss/tests/pytest/unit/test_chat_v0_inputs.py, sdks/python/oss/tests/pytest/unit/test_prompt_template_extensions.py | Adds a resolver test suite covering no-op paths, ARN casing, STS client signing, region aliases, output shape, immutability, missing boto3, and normalize composition. Updates critique tests for alias normalization and environ isolation, and removes stale user_aws_credentials_from monkeypatches from chat and retry tests. |

Sequence Diagram(s)

sequenceDiagram
    participant handlers as handlers.py
    participant resolver as _resolve_aws_role_arn
    participant sts as boto3 STS
    participant normalizer as _normalize_aws_provider_settings
    participant litellm as mockllm.acompletion

    handlers->>resolver: provider_settings (may include aws_role_arn)
    alt aws_role_arn present
        resolver->>sts: assume_role(RoleArn, RoleSessionName)
        sts-->>resolver: temporary AccessKeyId / SecretAccessKey / SessionToken
        resolver-->>handlers: settings with temporary credentials, role keys dropped
    else no role ARN
        resolver-->>handlers: settings unchanged
    end
    handlers->>normalizer: resolved settings
    normalizer-->>handlers: canonical aws_access_key_id / aws_secret_access_key / aws_session_token / aws_region_name kwargs
    handlers->>litellm: acompletion(**canonical_aws_kwargs)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.04%, which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: optional AWS role assumption for Bedrock and Sagemaker. |
| Description check | ✅ Passed | The description accurately matches the implemented SDK and frontend changes around AWS role assumption and provider UI support. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sdks/python/oss/tests/pytest/unit/test_resolve_aws_role_arn.py (1)

172-192: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Add a regression test for blank aws_role_arn / AWS_ROLE_ARN.

This suite misses the empty-string role ARN edge case, which is where alias leakage can occur. Pinning that behavior will prevent regressions.

💡 Proposed test
+def test_blank_role_arn_is_not_forwarded():
+    settings = {
+        "aws_role_arn": "",
+        "aws_access_key_id": "BASE_KEY",
+        "aws_secret_access_key": "BASE_SECRET",
+    }
+    normalized = _normalize_aws_provider_settings(_resolve_aws_role_arn(settings))
+    assert "aws_role_arn" not in normalized
+    assert "AWS_ROLE_ARN" not in normalized

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1c456586-f2cd-44db-90cd-8aa422984d99

📥 Commits

Reviewing files that changed from the base of the PR and between 8b7e319 and 62354e6.

⛔ Files ignored due to path filters (3)
  • api/uv.lock is excluded by !**/*.lock
  • sdks/python/uv.lock is excluded by !**/*.lock
  • services/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • sdks/python/agenta/sdk/engines/running/handlers.py
  • sdks/python/agenta/sdk/litellm/mockllm.py
  • sdks/python/oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py
  • sdks/python/oss/tests/pytest/unit/test_chat_v0_inputs.py
  • sdks/python/oss/tests/pytest/unit/test_prompt_template_extensions.py
  • sdks/python/oss/tests/pytest/unit/test_resolve_aws_role_arn.py
  • sdks/python/pyproject.toml
  • web/oss/src/components/ModelRegistry/Drawers/ConfigureProviderDrawer/assets/ConfigureProviderDrawerContent.tsx
  • web/oss/src/components/ModelRegistry/Drawers/ConfigureProviderDrawer/assets/constants.ts
  • web/packages/agenta-entities/src/secret/core/transforms.ts
  • web/packages/agenta-shared/src/types/llmProvider.ts
💤 Files with no reviewable changes (3)
  • sdks/python/oss/tests/pytest/unit/test_chat_v0_inputs.py
  • sdks/python/oss/tests/pytest/unit/test_prompt_template_extensions.py
  • sdks/python/agenta/sdk/litellm/mockllm.py

Comment thread sdks/python/agenta/sdk/engines/running/handlers.py
Comment thread sdks/python/agenta/sdk/engines/running/handlers.py
@bekossy bekossy requested a review from jp-agenta June 23, 2026 10:55
Address CodeRabbit review on Agenta-AI#4800:
- Drop empty aws_role_arn/AWS_ROLE_ARN aliases before the early return so a
  blank UI field never leaks to LiteLLM as an unknown kwarg.
- Resolve the role ARN via asyncio.to_thread at both async call sites so the
  blocking boto3 sts:AssumeRole call cannot stall the event loop.
- Add regression tests for blank role-ARN values (both casings, '' and None).
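
The `asyncio.to_thread` point can be illustrated with a small sketch. `blocking_resolve` is a hypothetical stand-in for a synchronous resolver (here `time.sleep` simulates the blocking `sts:AssumeRole` round trip), and the heartbeat task exists only to show that the event loop keeps running while the call is in flight. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time
from typing import Any, Dict, Tuple


def blocking_resolve(settings: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical synchronous resolver; the sleep simulates a blocking STS call.
    time.sleep(0.05)
    # Drop both role ARN aliases so they never reach the downstream call.
    return {k: v for k, v in settings.items() if k not in ("aws_role_arn", "AWS_ROLE_ARN")}


async def handle_request(settings: Dict[str, Any]) -> Dict[str, Any]:
    # Run the blocking call in a worker thread instead of on the event loop.
    return await asyncio.to_thread(blocking_resolve, settings)


async def main() -> Tuple[Dict[str, Any], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the loop gets to run while the resolver is busy.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    result = await handle_request({"aws_role_arn": "", "aws_region_name": "us-east-1"})
    hb.cancel()
    return result, ticks
```

If `blocking_resolve` were called directly inside the coroutine, the heartbeat would not tick at all during the call; with `asyncio.to_thread` it continues to advance.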
