feat(BA-2938): Migrate Session data to RBAC database #9636
Add alembic migration to migrate Session entities to the new RBAC system:

- Add SESSION entity-type permissions to all role+scope combinations
- Skip domain member roles (scope too broad)
- Member roles get READ operation only
- Owner/admin roles get all operations
- Create AUTO edges from User scope → Session (via user_uuid)
- Create AUTO edges from Project scope → Session (via group_id)
- Use keyset pagination for scalability with large session tables
- Support both upgrade and downgrade operations

This migration follows the new RBAC pattern where permissions table includes role_id, scope_type, scope_id directly (no permission_groups). The association_scopes_entities table uses relation_type='auto' to mark automatically managed scope-entity relationships.

Unlike VFolder, Session has no invitation/sharing mechanism, making the migration simpler with only entity-type permissions and AUTO edges.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pull request overview
Migrates Session entities into the RBAC system by adding entity-type permissions and creating AUTO scope→entity associations via an Alembic migration.
Changes:
- Added an Alembic migration to backfill SESSION permissions for existing role+scope combinations.
- Added AUTO association edges from user/project scopes to Session records using batched pagination.
- Added a changelog entry describing the migration.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/ai/backend/manager/models/alembic/versions/30c8308738ee_migrate_session_data_to_rbac.py | Implements the RBAC data migration for Session permissions and scope associations. |
| changes/9636.feature.md | Documents the feature/migration at a high level. |
src/ai/backend/manager/models/alembic/versions/30c8308738ee_migrate_session_data_to_rbac.py
values_list.append(
    f"('{row.role_id}', '{row.scope_type}', '{row.scope_id}', "
    f"'{EntityType.SESSION.value}', '{operation.value}')"
)
This builds SQL by interpolating database values into the statement text. Even if these fields are expected to be UUID/enums, this pattern is brittle (quoting/escaping issues) and makes it easier for unexpected data to break the migration. Prefer parameterized inserts (e.g., executemany with bind params) or a set-based INSERT ... SELECT that avoids string-building.
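A minimal sketch of the parameterized alternative the comment describes. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so it runs anywhere; the table definition and the sample values (including the `read`/`update` operation names) are assumptions for illustration, not the migration's actual schema.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL. Column names follow the snippet
# above, but this table definition is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE permissions (
        role_id TEXT, scope_type TEXT, scope_id TEXT,
        entity_type TEXT, operation TEXT,
        UNIQUE (role_id, scope_type, scope_id, entity_type, operation)
    )
    """
)

rows = [
    # A value containing a quote would break a statement built with f-strings.
    {"role_id": "r1", "scope_type": "user", "scope_id": "o'brien"},
    {"role_id": "r2", "scope_type": "project", "scope_id": "p1"},
]
# One parameter dict per (row, operation) pair; operation names are hypothetical.
params = [
    {**row, "entity_type": "session", "operation": op}
    for row in rows
    for op in ("read", "update")
]

# Values travel as bound parameters, so no manual quoting or escaping is needed.
conn.executemany(
    """
    INSERT INTO permissions (role_id, scope_type, scope_id, entity_type, operation)
    VALUES (:role_id, :scope_type, :scope_id, :entity_type, :operation)
    ON CONFLICT DO NOTHING
    """,
    params,
)

inserted = conn.execute("SELECT COUNT(*) FROM permissions").fetchone()[0]
quoted = conn.execute(
    "SELECT COUNT(*) FROM permissions WHERE scope_id = ?", ("o'brien",)
).fetchone()[0]
```

The same shape carries over to SQLAlchemy, where `sa.text()` accepts `:name` placeholders and the values are passed separately to `execute()`.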
src/ai/backend/manager/models/alembic/versions/30c8308738ee_migrate_session_data_to_rbac.py
last_id = "00000000-0000-0000-0000-000000000000"
while True:
    query = sa.text("""
        SELECT id::text AS id, user_uuid::text AS user_uuid
        FROM sessions
        WHERE id::text > :last_id
        ORDER BY id
        LIMIT :limit
    """)
Reasons why I don't use OFFSET here:
- https://www.postgresql.org/docs/current/queries-limit.html#QUERIES-LIMIT
- https://docs.gitlab.com/development/database/keyset_pagination/
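Using OFFSET to migrate a large table is inefficient: the rows skipped by OFFSET still have to be computed by the server on every batch, so later batches get progressively slower.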
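For readers unfamiliar with the technique, keyset pagination can be sketched as follows. This is an illustration under assumptions, not the migration code: it uses `sqlite3` with integer ids in place of PostgreSQL UUIDs, and `iter_batches` is a hypothetical helper name.

```python
import sqlite3
from typing import Iterator


def iter_batches(conn: sqlite3.Connection, batch_size: int) -> Iterator[list]:
    """Yield batches of (id, user_uuid) rows using keyset pagination."""
    last_id = 0  # sentinel smaller than every real id
    while True:
        # Seek past the last seen key instead of skipping rows with OFFSET,
        # so each batch is an index range scan starting at the cursor.
        rows = conn.execute(
            "SELECT id, user_uuid FROM sessions WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # the cursor advances to the last id of the batch


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, user_uuid TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [(i, f"user-{i % 3}") for i in range(1, 11)],
)

batches = list(iter_batches(conn, batch_size=4))
batch_sizes = [len(batch) for batch in batches]
seen_ids = [row[0] for batch in batches for row in batch]
```

The filter and the `ORDER BY` must use the same column and the same comparison semantics; otherwise rows can be skipped or repeated between batches.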
…tent ordering

Replace text-based UUID comparison with UUID type comparison to avoid potential ordering discrepancies between lexicographic (text) and binary (UUID) sort orders.

Changes:
- Import UUID from uuid module
- Initialize last_id as UUID object instead of string
- Remove ::text casting in WHERE and SELECT clauses
- Use native UUID comparison (id > :last_id) instead of text comparison

This ensures that filtering and ordering use the same sort order, preventing potential data skips or duplicates during batch processing.

Addresses Copilot review comment about UUID/text ordering mismatch.
…LECT

Replace application-side OFFSET pagination with a single set-based INSERT ... SELECT query for better performance and consistency.

Changes:
- Remove while loop with OFFSET pagination
- Use CTE (WITH clause) to derive role+scope combinations
- Use UNION ALL to combine member and owner operations
- Use unnest() to expand operation arrays inline
- Single database round-trip instead of multiple batches

Benefits:
- A single pass over the source rows, instead of OFFSET batches that each rescan the rows they skip
- No risk of row skips/duplicates from concurrent changes
- Simpler code without manual batching logic
- Better query plan from database optimizer

Addresses Copilot review comment about OFFSET inefficiency.
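A rough sketch of what such a set-based insert can look like. Several things here are assumptions: the `role_scopes` table and its `is_member` flag are hypothetical stand-ins for however the migration derives role+scope combinations, the operation names are invented, and because SQLite has no `unnest()` the operation lists are expanded by joining against inline `VALUES` CTEs. `INSERT OR IGNORE` plays the role that `ON CONFLICT DO NOTHING` would play in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table for role+scope combinations; the real schema may differ.
conn.executescript(
    """
    CREATE TABLE role_scopes (
        role_id TEXT, scope_type TEXT, scope_id TEXT, is_member INTEGER
    );
    CREATE TABLE permissions (
        role_id TEXT, scope_type TEXT, scope_id TEXT,
        entity_type TEXT, operation TEXT,
        UNIQUE (role_id, scope_type, scope_id, entity_type, operation)
    );
    INSERT INTO role_scopes VALUES
        ('owner-1',  'user',    'u1', 0),
        ('member-1', 'project', 'p1', 1),
        ('member-2', 'domain',  'd1', 1);
    """
)

# One statement: member roles get READ only, owner roles get every operation,
# and domain member roles are skipped, as the PR description states.
conn.execute(
    """
    WITH member_ops(operation) AS (VALUES ('read')),
         owner_ops(operation) AS (
             VALUES ('create'), ('read'), ('update'), ('delete')
         )
    INSERT OR IGNORE INTO permissions
        (role_id, scope_type, scope_id, entity_type, operation)
    SELECT rs.role_id, rs.scope_type, rs.scope_id, 'session', m.operation
    FROM role_scopes AS rs CROSS JOIN member_ops AS m
    WHERE rs.is_member = 1 AND rs.scope_type <> 'domain'
    UNION ALL
    SELECT rs.role_id, rs.scope_type, rs.scope_id, 'session', o.operation
    FROM role_scopes AS rs CROSS JOIN owner_ops AS o
    WHERE rs.is_member = 0
    """
)

total = conn.execute("SELECT COUNT(*) FROM permissions").fetchone()[0]
member_rows = conn.execute(
    "SELECT role_id, operation FROM permissions WHERE role_id LIKE 'member-%'"
).fetchall()
```

Because the whole expansion happens inside the database, there is no application-side cursor to keep consistent between batches.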
…d queries

Replace string interpolation in INSERT and DELETE statements with parameterized queries to improve security and maintainability.

Changes for INSERT:
- Build values_list with dicts instead of string formatting
- Use parameterized query with :named parameters
- Execute individual inserts in loop (safe for ON CONFLICT)

Changes for DELETE:
- Replace 'SELECT + string join' pattern with subquery
- Use DELETE ... WHERE id IN (SELECT ... LIMIT N)
- Check rowcount instead of empty result set
- Keep all parameters bound safely

Benefits:
- Eliminates SQL injection risks from malformed UUIDs
- Prevents quoting/escaping issues
- Avoids oversized query strings with large batches
- More maintainable and readable code

Addresses Copilot review comments about SQL injection and DELETE pattern.
Rename constant to use more accurate terminology. 'SUFFIX' better describes the pattern matching with str.endswith() than 'POSTFIX'. Addresses Copilot review comment about naming convention.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
# Process User scope edges
last_id = UUID("00000000-0000-0000-0000-000000000000")
while True:
    query = sa.text("""
        SELECT id, user_uuid
        FROM sessions
        WHERE id > :last_id
        ORDER BY id
        LIMIT :limit
    """)
    rows = db_conn.execute(query, {"last_id": last_id, "limit": BATCH_SIZE}).all()
    if not rows:
        break

    last_id = rows[-1].id

    # Bulk insert using parameterized query
    values_list = [
        {
            "scope_type": "user",
            "scope_id": str(row.user_uuid),
            "entity_type": entity_type,
            "entity_id": str(row.id),
            "relation_type": relation_type,
        }
        for row in rows
    ]

    if values_list:
        insert_query = sa.text("""
            INSERT INTO association_scopes_entities (scope_type, scope_id, entity_type, entity_id, relation_type)
            VALUES (:scope_type, :scope_id, :entity_type, :entity_id, :relation_type)
            ON CONFLICT (scope_type, scope_id, entity_id) DO NOTHING
        """)
        for values in values_list:
            db_conn.execute(insert_query, values)

# Process Project scope edges
last_id = UUID("00000000-0000-0000-0000-000000000000")
while True:
    query = sa.text("""
        SELECT id, group_id
        FROM sessions
        WHERE id > :last_id
        ORDER BY id
        LIMIT :limit
    """)
    rows = db_conn.execute(query, {"last_id": last_id, "limit": BATCH_SIZE}).all()
    if not rows:
        break

    last_id = rows[-1].id

    # Bulk insert using parameterized query
    values_list = [
        {
            "scope_type": "project",
            "scope_id": str(row.group_id),
            "entity_type": entity_type,
            "entity_id": str(row.id),
            "relation_type": relation_type,
        }
        for row in rows
    ]

    if values_list:
        insert_query = sa.text("""
            INSERT INTO association_scopes_entities (scope_type, scope_id, entity_type, entity_id, relation_type)
            VALUES (:scope_type, :scope_id, :entity_type, :entity_id, :relation_type)
            ON CONFLICT (scope_type, scope_id, entity_id) DO NOTHING
        """)
        for values in values_list:
            db_conn.execute(insert_query, values)
The project-scope association block largely duplicates the user-scope block above. Refactoring the shared keyset-pagination + insert logic into a helper would reduce duplication and the risk of fixing a bug in only one path.
def _paginate_and_associate(scope_type: str, scope_id_column: str) -> None:
    last_id = UUID("00000000-0000-0000-0000-000000000000")
    insert_query = sa.text("""
        INSERT INTO association_scopes_entities (scope_type, scope_id, entity_type, entity_id, relation_type)
        VALUES (:scope_type, :scope_id, :entity_type, :entity_id, :relation_type)
        ON CONFLICT (scope_type, scope_id, entity_id) DO NOTHING
    """)
    while True:
        query = sa.text(f"""
            SELECT id, {scope_id_column} AS scope_id
            FROM sessions
            WHERE id > :last_id
            ORDER BY id
            LIMIT :limit
        """)
        rows = db_conn.execute(
            query,
            {"last_id": last_id, "limit": BATCH_SIZE},
        ).all()
        if not rows:
            break
        last_id = rows[-1].id
        # Bulk insert using parameterized query
        values_list = [
            {
                "scope_type": scope_type,
                "scope_id": str(row.scope_id),
                "entity_type": entity_type,
                "entity_id": str(row.id),
                "relation_type": relation_type,
            }
            for row in rows
        ]
        if values_list:
            for values in values_list:
                db_conn.execute(insert_query, values)

# Process User scope edges
_paginate_and_associate("user", "user_uuid")
# Process Project scope edges
_paginate_and_associate("project", "group_id")
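The suggested helper interpolates `scope_id_column` into the query text, which is necessary because column names cannot be passed as bind parameters. One way to keep that safe is to resolve the column from a fixed mapping, so only known identifiers ever reach the SQL string. The names `ALLOWED_SCOPE_COLUMNS` and `build_select` below are hypothetical, and this is only a sketch of the idea.

```python
# Fixed mapping from scope type to the sessions column that holds its id.
# Only these identifiers can ever be interpolated into the query text.
ALLOWED_SCOPE_COLUMNS = {"user": "user_uuid", "project": "group_id"}


def build_select(scope_type: str) -> str:
    """Return the keyset-pagination SELECT for a known scope type."""
    try:
        column = ALLOWED_SCOPE_COLUMNS[scope_type]
    except KeyError:
        raise ValueError(f"unsupported scope type: {scope_type!r}") from None
    # Values (:last_id, :limit) stay as bind parameters; only the
    # allow-listed column name is formatted into the statement.
    return (
        f"SELECT id, {column} AS scope_id FROM sessions "
        "WHERE id > :last_id ORDER BY id LIMIT :limit"
    )


user_sql = build_select("user")
project_sql = build_select("project")
```

In the migration both call sites pass literals, so the risk is small; the mapping mainly protects future callers.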
for values in values_list:
    db_conn.execute(insert_query, values)
This inserts association rows one-by-one inside the batch (for values in values_list: db_conn.execute(...)). For large sessions tables this causes many DB round-trips and can make the migration impractically slow. Prefer a single executemany call (passing the full values_list to execute) or a set-based insert per batch.
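A sketch of the executemany form, again with `sqlite3` standing in for the real connection and with made-up sample rows. In SQLAlchemy the equivalent is passing the whole list of parameter dicts to a single `Connection.execute()` call, which dispatches to the driver's executemany.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table layout is an assumption modeled on the columns in the snippets above.
conn.execute(
    """
    CREATE TABLE association_scopes_entities (
        scope_type TEXT, scope_id TEXT, entity_type TEXT,
        entity_id TEXT, relation_type TEXT,
        UNIQUE (scope_type, scope_id, entity_id)
    )
    """
)

insert_sql = """
    INSERT INTO association_scopes_entities
        (scope_type, scope_id, entity_type, entity_id, relation_type)
    VALUES (:scope_type, :scope_id, :entity_type, :entity_id, :relation_type)
    ON CONFLICT (scope_type, scope_id, entity_id) DO NOTHING
"""

# One parameter dict per session row in the batch (sample data).
values_list = [
    {
        "scope_type": "user",
        "scope_id": f"user-{i % 2}",
        "entity_type": "session",
        "entity_id": f"session-{i}",
        "relation_type": "auto",
    }
    for i in range(5)
]

# A single call covers the whole batch instead of one execute() per row.
conn.executemany(insert_sql, values_list)
# Re-running the batch is harmless: conflicting rows are ignored.
conn.executemany(insert_sql, values_list)

total = conn.execute(
    "SELECT COUNT(*) FROM association_scopes_entities"
).fetchone()[0]
```

How much this helps depends on the driver, since some batch executemany calls more aggressively than others.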
for values in values_list:
    db_conn.execute(insert_query, values)
This batch insert also executes one INSERT per row. Please switch to executemany (single execute call with a list of param dicts) or a set-based INSERT ... SELECT to avoid O(n) database calls.
delete_query = sa.text("""
    DELETE FROM association_scopes_entities
    WHERE id IN (
        SELECT id FROM association_scopes_entities
        WHERE entity_type = :entity_type
The migration comment says it removes SESSION "AUTO" edges, but this DELETE filters only by entity_type. Either add a relation_type = 'auto' filter or update the wording; current behavior removes all session associations regardless of relation_type.
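A sketch of the first option, a batched delete that also filters on `relation_type`. It runs against `sqlite3` with sample rows; the `ref` relation type is a hypothetical value used only to show that non-AUTO rows survive the downgrade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE association_scopes_entities ("
    "id INTEGER PRIMARY KEY, entity_type TEXT, relation_type TEXT)"
)
conn.executemany(
    "INSERT INTO association_scopes_entities (entity_type, relation_type) "
    "VALUES (?, ?)",
    # 5 session AUTO edges, 2 session edges of another (hypothetical) type,
    # and 1 AUTO edge that belongs to a different entity type.
    [("session", "auto")] * 5 + [("session", "ref")] * 2 + [("vfolder", "auto")],
)

delete_sql = """
    DELETE FROM association_scopes_entities
    WHERE id IN (
        SELECT id FROM association_scopes_entities
        WHERE entity_type = :entity_type
          AND relation_type = :relation_type
        LIMIT :limit
    )
"""

deleted = 0
while True:
    cur = conn.execute(
        delete_sql,
        {"entity_type": "session", "relation_type": "auto", "limit": 2},
    )
    if cur.rowcount == 0:  # nothing left that matches both filters
        break
    deleted += cur.rowcount

remaining = conn.execute(
    "SELECT entity_type, relation_type FROM association_scopes_entities ORDER BY id"
).fetchall()
```

Only the rows the upgrade created are removed, which keeps the downgrade symmetric with the upgrade.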
# Precompute operation lists
member_ops = [op.value for op in OperationType.member_operations()]
owner_ops = [op.value for op in OperationType.owner_operations()]
OperationType.member_operations() / owner_operations() return sets; building lists directly from them can produce nondeterministic ordering. It’s safer to sort the operation values to keep the migration deterministic and easier to reason about.
# Precompute operation lists (sorted for deterministic ordering)
member_ops = sorted(op.value for op in OperationType.member_operations())
owner_ops = sorted(op.value for op in OperationType.owner_operations())
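To illustrate the suggestion in isolation, here is a small self-contained sketch. The `OperationType` enum below is a hypothetical stand-in with invented members; the real class lives in the manager codebase and may define different operations.

```python
from enum import Enum


class OperationType(Enum):
    """Hypothetical stand-in for the project's OperationType enum."""

    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"

    @classmethod
    def member_operations(cls) -> set:
        return {cls.READ}

    @classmethod
    def owner_operations(cls) -> set:
        return set(cls)


# Iterating a set of enum members has no guaranteed order, and string hash
# randomization can change it between interpreter runs. Sorting the values
# fixes the order of the generated parameters.
member_ops = sorted(op.value for op in OperationType.member_operations())
owner_ops = sorted(op.value for op in OperationType.owner_operations())
```

The inserted rows are the same either way; sorting only makes the generated statements reproducible, which helps when comparing logs across runs.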
Summary
This migration brings Session entities into the RBAC system, enabling fine-grained access control. Unlike VFolder, Session has no invitation/sharing mechanism, making the migration simpler with only entity-type permissions and AUTO edges.
Test plan
- alembic upgrade head
- alembic downgrade -1

Resolves BA-2938