SQL Tool — Design Doc
Read-only, tenant-isolated SQL access for AgentCore agents. Lives at
app/api/agent/— this doc is the entry point.
| Status | Shipped on PR #8322 (ONRAMP-4761) |
| Audience | Engineers reading, modifying, or reviewing SQL-tool code |
| What it is | read_via_sql + describe_schema tools wired into portal + onramp_chat agents |
| Tenant model | Vendor + customer-account, with portal vs. vendor-user split |
| Threat-model primer | ~/.claude/plans/sql-tool-tenant-isolation-primer.md |
TL;DR
The SQL tool lets agents answer open-ended data questions (e.g. "how many tasks did I complete last week?") by writing SQL against the OnRamp database. Cross-tenant reads are physically prevented by four independent defenses; any single layer would suffice on its own.
flowchart LR
A[Agent writes SQL] --> B[1. Validators<br/>reject dangerous shapes]
B --> C[2. Parse + rewrite<br/>inject vendor/customer scope]
C --> D[3. Postgres role<br/>SELECT-only on allowlisted tables]
D --> E[4. RLS policies<br/>scope rows by session GUC]
E --> F[(Result)]
style B fill:#fff4e6,color:#000
style C fill:#e6f3ff,color:#000
style D fill:#e6ffe6,color:#000
style E fill:#ffe6f3,color:#000Locality wins: exposure decisions live on the column line via typed markers (Public[T], Confidential[T], Internal[T], VendorOwner[T], CustomerOwner[T]). PR diff = security-policy diff.
Default-deny enforced by CI: untagged column on an exposed table fails test_every_column_on_exposed_table_is_classified — unmergeable without a marker, an info-dict, or an explicit signoff entry.
How a query flows end-to-end
sequenceDiagram
autonumber
participant Agent as Agent (portal/copilot)
participant Tool as read_via_sql tool
participant Flask as Flask /api/internal/agent/sql-read
participant Val as validators.py
participant Rew as parser+scope+cap injectors
participant DB as Postgres (or_readonly_user)
Agent->>Tool: SQL string
Tool->>Flask: POST {vendor_id, user_id, is_customer_user, sql}
Flask->>Flask: Cross-check User row<br/>(vendor, is_customer_user, mapping)
Flask->>Val: validate_sql(sql)
Val-->>Flask: accept | reject(reason)
Flask->>Rew: inject_scope + inject_caps
Rew-->>Flask: rewritten SQL
Flask->>DB: SET LOCAL app.vendor_id=N<br/>SET LOCAL app.user_id=N<br/>...
Flask->>DB: execute(rewritten SQL)
Note over DB: RLS restrictive policies<br/>filter rows by GUC
DB-->>Flask: rows (≤1000, ≤1MB, ≤5s)
Flask-->>Tool: {scope_status, data, meta}
Tool-->>Agent: result dict
Note over Flask: emit_audit() on every path,<br/>including rejectionsWhat each step physically prevents (click to expand)
| # | Step | Blocks |
|---|---|---|
| 3 | _resolve_caller_context cross-checks User.vendor + is_customer_user against payload, plus or_customer_user_account_mapping for portal users | Agent passing a foreign vendor_id / user_id |
| 4 | validate_sql — regex + AST | INSERT/UPDATE/DELETE/DDL, multi-statement, pg_* / lo_* functions, SELECT *, recursive CTE depth >3, polymorphic assoc_object_* columns |
| 5 | inject_scope (sqlglot build_scope walk) | Cross-vendor reads via subquery, CTE, UNION arm, EXISTS, derived table — every scope gets vendor = caller_vendor_id injected |
| 5 | inject_caps | Result blow-up: forces LIMIT ≤1000. Also decodes hashid string literals and rejects cross-vendor matches |
| 7 | or_readonly_user Postgres role + statement_timeout=5s | Any write attempt, pg_*/lo_* calls that slipped past validators |
| 9 | RLS restrictive policies referencing app.vendor_id GUC | Anything else — last-resort row-level filter at DB layer |
| 10 | emit_audit(record) | Silent failures: every accept and every reject is logged (sql_tool.query) |
Security model — the four layers, in detail
flowchart TB
subgraph L1[Layer 1 — Identity]
I1[POST payload<br/>vendor_id, user_id, is_customer_user]
I2[DB cross-check<br/>User.vendor, User.is_customer_user]
I3[Mapping cross-check<br/>or_customer_user_account_mapping]
I1 --> I2 --> I3
end
subgraph L2[Layer 2 — Parse + Rewrite]
P1[validators.py<br/>shape rejection]
P2[parser.py<br/>sqlglot AST]
P3[scope_injector.py<br/>vendor/customer/role/soft-delete WHERE]
P4[cap_injector.py<br/>LIMIT 1000 + hashid check]
P1 --> P2 --> P3 --> P4
end
subgraph L3[Layer 3 — DB role]
D1[or_readonly_user<br/>LOGIN, NOCREATEDB, NOCREATEROLE]
D2[GRANT SELECT only<br/>on allowlisted tables]
D3[statement_timeout = 5000ms]
D1 --> D2 --> D3
end
subgraph L4[Layer 4 — RLS]
R1[ALTER TABLE ... ENABLE ROW LEVEL SECURITY]
R2[Permissive FOR ALL TO public<br/>web app unaffected]
R3[Restrictive FOR SELECT TO or_readonly<br/>vendor = current_setting'app.vendor_id']
R1 --> R2 --> R3
end
L1 --> L2 --> L3 --> L4Layer 1 — Caller identity (no HMAC; here is why that's OK)
The endpoint sits at /api/internal/agent/sql-read and skips HMAC. The PR description records that decision; it relies on:
- VPC isolation —
/api/internal/*not externally exposed. - Cross-checking the User row in
_resolve_caller_context(sql_read_controller.py:49–125):User.vendor == payload.vendor_idorPermissionErrorUser.is_customer_user == payload.is_customer_userorPermissionError
- Customer-mapping cross-check for portal users —
or_customer_user_account_mappingrow must exist and be active. - Layers 2–4 are independent of any caller assertion.
Even if an attacker reaches the Flask network and forges identity, the rewriter and RLS still enforce row scope from the same authenticated payload.
Layer 2 — Parse + rewrite (sqlglot, fail-closed)
The rewriter chain is three files:
| File | Job | Fail-closed behavior |
|---|---|---|
validators.py | Reject shapes that should never reach the AST: multi-statement, INSERT/UPDATE/DELETE/DDL, pg_* / lo_* calls, SELECT *, polymorphic columns, RECURSIVE CTE depth > 3 | Returns ValidationResult.reject(reason) — caller emits audit + 4xx |
scope_injector.py | For every exp.Table source in every sqlglot scope (CTE, UNION arm, derived-table, EXISTS body): inject vendor = :caller_vendor_id, deleted_at IS NULL, customer_id = :caller_customer_id (portal only), self-only / project-membership filters | Raises ScopeInjectionError if the table has no resolvable vendor column and is not tagged global_lookup or rls_scoped |
cap_injector.py | Inject LIMIT 1000. Scan original SQL for hashid-shaped string literals; if any decode to an id whose vendor row ≠ caller, reject | Raises CapInjectionError on hashid_cross_vendor |
The scope walk is built on sqlglot.optimizer.scope.build_scope so subqueries, CTEs, UNION arms, and EXISTS bodies are all walked by sqlglot — not by hand. Every real exp.Table source in every scope gets predicates appended; CTE / derived-table aliases appear as Scope objects in scope.sources and are skipped (their inner scope handles it).
# Example: agent submits this
SELECT id, project_name FROM or_projects WHERE health = 'red'
# Rewriter emits this
SELECT id, project_name
FROM or_projects
WHERE (health = 'red')
AND "or_projects"."vendor" = 17
AND "or_projects"."deleted_at" IS NULL
LIMIT 1000The original WHERE is wrapped in () to preserve OR precedence — see _inject_where_predicates line 233 (CRITICAL #1 paren wrap).
Layer 3 — Postgres read-only role
Two principals provisioned by migration b6519fcbc296:
or_readonly— group role, NOLOGIN, NOCREATEDB, NOCREATEROLE. HoldsGRANT CONNECT+GRANT USAGE ON SCHEMA public+ table-levelGRANT SELECTon each allowlisted table. Carriesstatement_timeout = 5000msat role level.or_readonly_user— login user, LOGIN, NOCREATEDB, NOCREATEROLE. Member ofor_readonly, inherits all grants. Authenticates viaONRAMP_READONLY_DATABASE_PASSWORDenv var.
Provisioning per tenant: bun devtools/set-readonly-db-env-vars.ts writes the password into Duplo for each environment.
Even a perfect rewriter bug that emits
DROP TABLEwould fail at this layer:or_readonly_usercannot execute write SQL.
Layer 4 — Row-Level Security
Migration e09d5464d3fe applies two policies per table from app/api/agent/rls_policies/:
-- Permissive: web-app role unaffected (postgres bypasses RLS by default,
-- but this is belt-and-suspenders for managed-Postgres surprises).
CREATE POLICY "<table>_app_full" ON "<table>"
FOR ALL TO public USING (true) WITH CHECK (true);
-- Restrictive: SQL-tool role sees only rows matching the GUC.
CREATE POLICY "<table>_readonly_scope" ON "<table>"
AS RESTRICTIVE FOR SELECT TO or_readonly
USING (vendor = current_setting('app.vendor_id', true)::int);The , true in current_setting returns NULL when the GUC is unset. vendor = NULL is FALSE in SQL, so an unset GUC means zero rows — fail-closed.
SqlReadService._execute_query runs SET LOCAL app.vendor_id = :v (and app.user_id, app.is_customer_user, app.customer_id) inside the transaction before execute(sql_rewritten). SET LOCAL is transaction-scoped, so values can't leak across pool checkouts.
Tables currently RLS-protected (app/api/agent/rls_policies/__init__.py):
or_comment, or_modules, or_object_status, or_projects, or_resources,
or_tasks, or_vendors, or_project_user_mappings, or_customer_user_account_mapping, userDrift gate: test_rls_policy_drift.py reads pg_policies and asserts equality with the registry.
ORM-declarative allowlist (the column markers)
Schema inventory is the ORM (alembic-gated to the DB). Exposure policy lives on the column line:
class ORProject(ORStandardModel, PortalAccessible):
project_name: Mapped[Public[str]] = mapped_column(String, nullable=False)
health_score: Mapped[Internal[float]] = mapped_column(Float) # explicit deny
internal_notes: Mapped[Confidential[str]] = mapped_column(Text) # vendor-only, no portal
vendor: Mapped[VendorOwner[int]] = mapped_column(ForeignKey(...)) # override-only
account: Mapped[CustomerOwner[int]] = mapped_column(ForeignKey(...)) # override-only
__sql_tool_policy__ = {
"role_filters": {
"CONTRIBUTOR": "project_member_only",
"API_USER": "denied",
},
}| Marker | Use | Visible to vendor user | Visible to portal user |
|---|---|---|---|
Public[T] | Most exposed columns | ✅ | ✅ |
Confidential[T] | Sensitive but business-relevant | ✅ | ❌ |
Internal[T] | Passwords, secrets, internal hashes | ❌ | ❌ |
VendorOwner[T] | Override when FK detection ambiguous | (tenancy column) | (tenancy column) |
CustomerOwner[T] | Override when FK detection ambiguous | (tenancy column) | (tenancy column) |
Auto-classified (no marker required): id, uuid, created_at, modified_at, archived_at, deleted_at, created_by, modified_by, archived_by, deleted_by — see AUTO_EXPOSED_COLUMN_NAMES.
Migration form (legacy mapped_column without Mapped[T] annotations):
some_col = mapped_column(Text, info={"sql_tool": {"exposed": True, "pii": True}})Both forms feed the same registry — the loader merges them (orm_allowlist._collect_exposed_columns).
How the boot loader builds the registry
flowchart TB
Start([Flask boot]) --> Walk[_all_orm_subclasses<br/>walk ORBaseModel + ORStandardModel]
Walk --> Each{For each class}
Each --> Hints[_resolve_class_hints<br/>walk MRO with per-class tolerance]
Hints --> Auto[Auto-classify by name<br/>id, uuid, created_at, ...]
Auto --> Markers[Extract _SqlMeta from<br/>Mapped/Annotated/PEP-695 alias]
Markers --> Info[Merge column.info'sql_tool' overrides]
Info --> OptIn{Has explicit marker<br/>OR PortalAccessible<br/>OR __sql_tool_policy__?}
OptIn -->|No| Skip[Skip — not in registry]
OptIn -->|Yes| Resolve[resolve_vendor_owner<br/>resolve_customer_owner]
Resolve --> Fail{Resolution failed?}
Fail -->|Yes & not global/rls_scoped| Boom[RuntimeError at boot<br/>fail-safe Edge 2/3]
Fail -->|No| Entry[Add to registry]
Entry --> EachThe registry is @lru_cache(maxsize=1) — built once per worker. Tests clear via collect_allowlist.cache_clear().
Why MRO walked manually (orm_allowlist.py:65–102): a single get_type_hints(cls) call evaluates every annotation on every ancestor in one ForwardRef pass. One unresolvable name on an unrelated parent class drops every hint — including the marker annotations we depend on. The MRO walk tolerates per-class failures and falls back to per-annotation eval.
Owner resolution order
Vendor owner (owner_resolution.resolve_vendor_owner):
- Explicit
Mapped[VendorOwner[T]]marker — single match wins, multi-marker raises. __sql_tool_policy__["owner_via"]override.- FK to
or_vendors.id— single match wins, multi-FK raises (Edge 1). - Naming convention fallback:
vendor_id,vendor,tenant_id. None— caller (loader) decides whether to fail.
Customer owner (resolve_customer_owner):
- Explicit
Mapped[CustomerOwner[T]]marker. __sql_tool_policy__["customer_owner_via"].- FK to
account.id. - Naming convention fallback:
customer_id,customer,account. - Bridge resolution — walk relationships annotated with
info={"sql_tool": {"resolves_customer_owner_via": "<col>"}}(e.g.User → CustomerUserAccountMapping → account). None.
Five fail-safe edges (boot will refuse to start if any fire)
flowchart LR
E1[Edge 1<br/>multi-FK ambiguity] --> Boot([RuntimeError at Flask boot])
E2[Edge 2<br/>Public column on table<br/>without resolvable vendor owner] --> Boot
E3[Edge 3<br/>PortalAccessible without<br/>resolvable customer owner] --> Boot
E4[Edge 4<br/>untagged column on<br/>exposed table — CI gate] --> CI([pytest fails])
E5[Edge 5<br/>RLS drift between<br/>registry and pg_policies] --> CI
style Boot fill:#ffe6e6,color:#000
style CI fill:#fff4e6,color:#000Edges 1–3 raise at boot. Edges 4–5 fail in CI. None can be silently bypassed.
Footguns + areas of improvement
Honest list. Some are documented trade-offs; others are real gaps that should land as follow-ups.
🔴 Gap — AUTO_DENIED_COLUMN_PATTERNS is defined but never enforced
sql_tool_conventions.py:41 defines patterns for columns that look sensitive (*password*, *token*, *ssn*, etc.) and exports a matches_auto_deny_pattern helper. Nothing in the rewriter or the CI gate calls it. Tests cover the pattern matcher in isolation, but no production code path inspects column names against it.
A new column named payment_token marked Mapped[Public[str]] would pass the classification gate today.
Fix: wire matches_auto_deny_pattern into test_every_column_on_exposed_table_is_classified to require an entry in tests/snapshots/sql_tool_sensitive_overrides.json whenever a Public column matches a deny pattern. The signoff JSON file already exists for that purpose.
🔴 Gap — RLS-registry drift not enforced by structure
RLS_POLICIES in app/api/agent/rls_policies/__init__.py is the source of truth; migration e09d5464d3fe reads it at upgrade time. Changing a policy's predicate in the registry alone is invisible to already-migrated databases — old migrations don't replay. test_rls_policy_drift.py catches drift on a fresh DB but not on long-running ones.
Convention (currently undocumented in CONTRIBUTING): adding or changing a policy requires a new alembic migration calling apply_policy() for the changed entry.
Fix: add a version int to each PolicyDefinition. Drift test asserts pg_policies matches the current version. Adding to the registry without bumping version + writing a migration fails CI. Manageable now (10 tables); dangerous past ~25.
🔴 Gap — semantic ambiguity in column names
Cross-tenant leaks are physically prevented by the four layers. The actual long-term failure mode is answers that are technically scoped but semantically wrong — agent picks task_is_internal thinking it means "ops-internal" when it gates customer visibility, etc.
Three levers under-used today:
agent_descriptionper column viainfo={"sql_tool": {"agent_description": "..."}}— flows intodescribe_schema. Used on exactly one column (milestone_nameinORModule).example_queriesper table — supported inallowlist.describebut empty everywhere.- Semantic views. A
vw_customer_visible_tasksthat codifies the joint meaning oftask_is_internal+is_customer_visibleonce. Agents stop reasoning about flags. Separate plan.
Adding more columns without addressing this scales rope, not power.
🔴 Gap — critical mapping tables not yet exposed
or_task_mappings (module ↔ task ordering), or_module_mappings (playbook ↔ module ordering), and or_task_dependencies (task dependency graph) are not in the allowlist or RLS_POLICIES. Agent can read or_tasks but cannot reconstruct task order or module composition — most "what's the next task" / "what's blocked by what" questions unanswerable via SQL. All three have direct vendor FK; RLS template identical to or_tasks.
🔴 Gap — account table has no RLS policy
account is in the _ALLOWLISTED_TABLES SELECT-grant list (migration b6519fcbc296) and the agent can SELECT from it, but it is not in RLS_POLICIES in app/api/agent/rls_policies/__init__.py. The rewriter is the only defense for cross-vendor account reads.
If a future SQL shape slips past the scope injector, RLS would catch it on every other RLS-protected table — but not on account.
Fix: add app/api/agent/rls_policies/account.py and register it.
🟡 Footgun — over-permissive table flags have no signoff gate
__sql_tool_policy__["global_lookup"] = True and "rls_scoped": True both short-circuit predicate injection in scope_injector.py:144–146. A developer marking a table global_lookup=True by mistake silently drops scope. Default-deny only catches missing markers, not over-permissive ones.
Fix: add a snapshot test listing tables with global_lookup or rls_scoped set, requiring reviewer signoff to add new entries (mirrors the sql_tool_sensitive_overrides.json pattern).
🟡 Footgun — hashid validator only probes two tables
cap_injector._reject_if_cross_vendor_hashid decodes hashid-shaped string literals and looks them up in or_projects then or_tasks. A cross-vendor hashid for or_resources, or_modules, etc. is not caught at this layer. (The scope injector still rewrites the WHERE so cross-vendor data wouldn't be returned — but the early hashid check is incomplete.)
Fix: extend the lookup to every table with a vendor FK, or — better — drop the hashid pre-check entirely and rely on scope injection + RLS. The pre-check is best-effort already.
🟡 Footgun — hashid string-literal scan misses pg-string forms
_validate_hashids regex is r"'([^']*)'". It misses:
- E-strings (
E'...') - Dollar-quoted strings (
$$...$$) - Escaped quotes (
'O''Brien') - Comments (
/* ... */)
Same caveat — scope injection is the real defense, this is opportunistic. Worth either tightening or removing.
🟡 Footgun — _lookup_customer_id returns LIMIT 1 from mapping table
sql_read_controller._lookup_customer_id falls back to the lowest-id mapping row when customer_id isn't passed. A portal user mapped to multiple accounts gets a stable but possibly wrong default. Fine in steady state (most portal users map to one account), but a quiet failure mode if that ever changes.
Fix: require customer_id in the payload for portal users — never silently pick.
🟡 Footgun — column.info["sql_tool"] shallow-merges over markers
orm_allowlist._collect_exposed_columns lines 170–171 dict-merges info_override on top of marker-derived flags. info={"sql_tool": {"exposed": False}} on a column also marked Public[T] would silently disable exposure. Probably intentional as an escape hatch — but no warning logged.
Fix: log a warning when info-dict overrides a marker flag with a different value.
🟢 Dead code — rejection_snippet is unused
agent_surfaces.rejection_snippet is exported but never called. The rewriter raises ScopeInjectionError with raw message strings instead. Either wire it into the error path so the agent gets a uniform refusal, or delete it.
🟢 Limitation — polymorphic columns blanket-rejected
validators._POLYMORPHIC_COLUMNS = {"assoc_object_type_id", "assoc_object_id"} rejects any query touching them. Documented as v1; v2 would resolve the type id at parse time and rewrite the join. Affects or_comment and similar polymorphic tables. Documented in app/api/agent/rls_policies/or_comment.py.
🟢 Operational — registry cache lives until worker restart
collect_allowlist and load_registry are @lru_cache. Adding a new ORM model needs a worker restart to land in the SQL-tool registry. Fine in production; surprising in dev. make dev already restarts on Python changes.
🟢 Operational — no HMAC on the internal endpoint
Documented decision in PR description. The four defense layers below it still hold; HMAC would only add a "trusted-caller" signal. Future hardening idea: HKDF-derived per-endpoint keys from a single master secret (single env var, cryptographic isolation per endpoint) — currently tracked as TBD in the PR body.
Adding a new column or table — the playbook
flowchart TD
Start([New column on a SQL-tool-exposed table]) --> Q1{Is it sensitive?}
Q1 -->|Yes — secrets, passwords| Internal[Mapped Internal'T']
Q1 -->|Yes — vendor-internal| Confidential[Mapped Confidential'T']
Q1 -->|No — agent should query| Public[Mapped Public'T']
Q1 -->|It's a tenancy FK| OwnerCheck{FK auto-detect ambiguous?}
OwnerCheck -->|Yes| Override[Mapped VendorOwner/CustomerOwner'T']
OwnerCheck -->|No| Skip[No marker — auto-detected]
Internal --> Done([CI passes])
Confidential --> Done
Public --> Done
Override --> Done
Skip --> Done
Start --> Q2{Auto-classified name?<br/>id/uuid/created_at/...}
Q2 -->|Yes| Done
Start --> Q3{New table?}
Q3 -->|Yes — portal-readable| Mixin[Add PortalAccessible mixin]
Q3 -->|Yes — vendor-only| Policy[Add __sql_tool_policy__ dict]
Mixin --> RLS{Tenant join through bridge?}
Policy --> RLS
RLS -->|Yes| RLSAdd[Add app/api/agent/rls_policies/your_table.py]
RLS -->|No| Done
RLSAdd --> DoneQuick checklist for a new exposed table:
- Inherit
ORStandardModel(orORBaseModel) — gives you the auto-classified columns for free. - Add
PortalAccessiblemixin if portal callers should see it. - Add
__sql_tool_policy__ = {...}for role filters, customer scoping, project membership. - Tag every non-auto-classified column with a marker (or
info={"sql_tool": {...}}). - If the table has no direct vendor FK and isn't
global_lookup, add an RLS policy inapp/api/agent/rls_policies/. - Run
pytest app/api/agent/tests -q --tb=short --no-header— the classification gate, snapshot test, and RLS drift test will tell you what's missing. - Snapshot bump (
tests/snapshots/sql_tool_allowlist.json) — regenerate, review the diff.
Test surfaces
| Test | What it locks |
|---|---|
test_sql_tool_classification_gate.py | Default-deny. Every column on every exposed table must be classified or signoff-listed |
test_sql_tool_skeleton.py | Marker extraction, both forms (typed + info-dict), auto-deny patterns, conventions |
test_sql_rewriter.py | 50+ cases against scope_injector, validators, cap_injector — cross-vendor isolation, CTE/UNION/EXISTS coverage |
test_sql_read_service.py | End-to-end orchestration including audit emission on every path |
test_sql_read_controller.py | Identity cross-check, customer mapping, role resolution |
test_rls_policy_drift.py | pg_policies matches RLS_POLICIES registry (CI-skip without docker) |
test_rls_enforcement.py | RLS actually rejects cross-vendor reads at the DB layer (CI-skip without docker) |
CloudWatch Insights queries for on-call live in sql_audit.py docstring.
File map (one-glance)
app/api/agent/
├── sql_tool_types.py # 5 markers (Public/Confidential/Internal/VendorOwner/CustomerOwner)
├── sql_tool_mixins.py # PortalAccessible
├── sql_tool_conventions.py # AUTO_EXPOSED_COLUMN_NAMES + AUTO_DENIED_COLUMN_PATTERNS + SIGNOFF_FILE
│
├── controllers/
│ └── sql_read_controller.py # POST endpoint + identity cross-check
├── services/
│ ├── sql_read_service.py # 3-layer orchestration + audit
│ ├── sql_audit.py # AuditRecord + emit_audit
│ └── sql_rewriter/
│ ├── orm_allowlist.py # ORM walk → registry (boot loader)
│ ├── owner_resolution.py # FK auto-detection + bridge walk
│ ├── allowlist.py # registry → AllowlistRegistry per caller type
│ ├── agent_surfaces.py # describe_schema + orientation card
│ ├── parser.py # sqlglot wrapper
│ ├── validators.py # shape rejection (regex + AST)
│ ├── scope_injector.py # vendor/customer/role/soft-delete WHERE injection
│ └── cap_injector.py # LIMIT 1000 + hashid cross-vendor check
├── rls_policies/ # one file per RLS-protected table + helpers + registry
└── tests/
├── test_sql_tool_classification_gate.py
├── test_sql_tool_skeleton.py
├── test_sql_rewriter.py
├── test_sql_read_service.py
├── test_sql_read_controller.py
├── test_rls_policy_drift.py
└── test_rls_enforcement.py
migrations/versions/
├── 2026_04_29_b6519fcbc296_add_or_readonly_role.py
└── 2026_04_29_e09d5464d3fe_add_sql_tool_rls_policies.py
onramp-agents/src/shared/tools/
└── sql_read.py # read_via_sql + describe_schema Strands tool factoriesOperational env vars
ONRAMP_READONLY_DATABASE_PASSWORD # required — login user password
ONRAMP_READONLY_DATABASE_HOST # optional — falls back to ONRAMP_DATABASE_HOST
ONRAMP_DATABASE_PORT # shared with primary
ONRAMP_DATABASE_NAME # shared with primaryPer-tenant provisioning: bun devtools/set-readonly-db-env-vars.ts.
Where this plugs into agents
| Agent | Tool factory called from |
|---|---|
| Portal copilot | onramp-agents/src/agents/portal/... builds tools from PortalAgentState (carries is_customer_user=True, customer_id) |
| OnRamp chat / vendor copilot | builds from ChatAgentState (carries is_customer_user=False) |
Both call the same create_read_via_sql_tool(state) and create_describe_schema_tool(state) factories in shared/tools/sql_read.py. Identity comes from trusted state — never from agent input.
Related plans (reading order)
~/.claude/plans/sql-tool-tenant-isolation-primer.md— threat model, invariants, OnRamp-specific pitfalls~/.claude/plans/sql-tool-orm-refactor-chunk1.md— design that landed: markers, mixin, conventions, fail-safe edges~/.claude/plans/convert-it-now-piped-catmull.md— chunk 1.5: convert info-dict form to typed markers~/.claude/plans/sql-tool-security-review-fixes.md— chunk 2 security findings~/.claude/plans/sql-tool-fixup-rls.md— RLS migration consolidation~/.claude/plans/sql-tool-drop-hmac.md— explicit decision record on no-HMAC posture