feat: add data profile extension #214

Open
aymanyaq wants to merge 1 commit into awslabs:main from aymanyaq:feat/data-profile-extension

@aymanyaq commented Apr 24, 2026

Summary

Adds a new optional extension to the AI-DLC rules. It enhances Reverse Engineering with real data profiling and carries data accuracy enforcement through Functional Design, Code Generation, and Build & Test for brownfield, data-driven applications.

For full context and discussion, please see the RFC: #218

Changes

  • Data Profile (4 rules, DATA-PROFILE-001 through DATA-PROFILE-004)

Key Capabilities:

  • Data Source Profiling (001): During Reverse Engineering, profiles every data source the application touches. It extracts exact categorical values, key patterns, numeric ranges, and storage type variance (mixed-type detection for schema-less stores), and audits shared data dependencies (signatures, fragilities, safe usage patterns). Uses a 3-tier accessibility model (local files → code-inferable → user-reported).
  • Functional Design Data Alignment (002): Requires Functional Design to read the Data Profile and use exact data values when designing filters, selectors, and business rules. Values must never be abbreviated or assumed.
  • Code Generation Data Accuracy (003): Enforces that all hardcoded filter values, selector options, and data constants in generated code match the Data Profile exactly. Mandates cross-referencing every hardcoded string against the profile and preferring dynamic values over hardcoded ones where possible.
  • Build & Test Data Validation (004): During Build & Test, validates that source functions handling mixed-type attributes (flagged in Storage Type Variance) handle ALL documented types defensively
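To make the capabilities above concrete, here is a minimal Python sketch of what profiling and cross-referencing could look like. It is illustrative only and is not taken from the extension's rule files: the sample records, the `profile_records` and `unknown_literals` helpers, and the profile layout are all hypothetical. It also counts `None` as its own type when flagging mixed-type attributes, which is an assumption.

```python
from collections import defaultdict

# Hypothetical sample records standing in for rows read from a schema-less store.
records = [
    {"status": "In Progress", "priority": 2, "amount": 10.5},
    {"status": "Done", "priority": "3", "amount": 7},
    {"status": "In Progress", "priority": 1, "amount": None},
]


def profile_records(rows):
    """Collect exact values, numeric ranges, and observed types per attribute.

    Assumes attribute values are hashable scalars (str, int, float, None).
    """
    values = defaultdict(set)
    types = defaultdict(set)
    for row in rows:
        for key, value in row.items():
            types[key].add(type(value).__name__)  # None is recorded as "NoneType"
            if value is not None:
                values[key].add(value)

    profile = {}
    for key in types:
        numeric = [
            v for v in values[key]
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        profile[key] = {
            "types": sorted(types[key]),
            "mixed_types": len(types[key]) > 1,
            "categorical_values": sorted(v for v in values[key] if isinstance(v, str)),
            "numeric_range": (min(numeric), max(numeric)) if numeric else None,
        }
    return profile


def unknown_literals(profile, attribute, literals):
    """Return hardcoded literals that do not appear verbatim in the profile."""
    known = set(profile[attribute]["categorical_values"])
    return [lit for lit in literals if lit not in known]


profile = profile_records(records)
```

With these sample records, `priority` is flagged as mixed-type because it is stored as both `int` and `str`, and `unknown_literals(profile, "status", ["In Progress", "in-progress"])` reports `"in-progress"` as a value that was guessed rather than profiled.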

Core Design Principles:

  • Agents must never guess data values — always use exact values from the profile
  • The profile is a required input for ALL downstream construction stages
  • Storage type variance detection prevents the most dangerous class of runtime failure in schema-less stores
  • Shared dependency auditing prevents fragile calling patterns in brownfield code
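As an illustration of the storage type variance principle, the following Python sketch shows one way a reader function could handle every documented type of a mixed-type attribute, plus a small check in the spirit of DATA-PROFILE-004. The attribute name `priority`, the helpers `read_priority` and `failing_types`, and the `DOCUMENTED_SAMPLES` mapping are hypothetical and do not come from the extension itself.

```python
def read_priority(raw):
    """Coerce a mixed-type 'priority' value to int, or None when it cannot be read."""
    if raw is None or isinstance(raw, bool):
        return None
    if isinstance(raw, int):
        return raw
    if isinstance(raw, float):
        # Accept whole-number floats only; reject values such as 2.5.
        return int(raw) if raw.is_integer() else None
    if isinstance(raw, str):
        try:
            return int(raw.strip())
        except ValueError:
            return None
    return None


# Hypothetical: one sample value per type recorded for 'priority' in the profile.
DOCUMENTED_SAMPLES = {"int": 2, "str": "3", "NoneType": None}


def failing_types(handler, samples):
    """Return the documented type names for which the handler raises."""
    failures = []
    for type_name, sample in samples.items():
        try:
            handler(sample)
        except Exception:
            failures.append(type_name)
    return failures
```

A test could assert that `failing_types(read_priority, DOCUMENTED_SAMPLES)` is empty. A naive handler such as the built-in `int` fails on the `NoneType` sample, which is the kind of gap this validation step is meant to surface.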

User experience

This extension is opt-in. When a user is in the Requirements Analysis phase, they will be prompted with an opt-in question to enable data profiling during Reverse Engineering.

Checklist

  • I have reviewed the contributing guidelines
  • I have performed a self-review of this change
  • Changes have been tested
  • Changes are documented

Test Plan

  • Load the extension during the Requirements Analysis phase and verify the opt-in prompt appears.
  • All markdown files have passed markdownlint-cli2.

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

@scottschreckengaust
Member

@aymanyaq, thank you for your interest in adding extensions. Please consider posting a detailed request for comments (RFC) with extensive background and details for us to consider including in the official extensions. We will work on some better guidance for external contributors. Thank you for your patience.
