JS/TS update prompts to not have connection string, add logger requirements, RestError updates by KarishmaGhiya · Pull Request #643 · ronniegeraghty/hyoka

KarishmaGhiya · 2026-05-06T21:40:32Z

Summary

Improves JS/TS evaluation prompts to enforce three critical Azure SDK patterns that were causing high failure rates in hyoka evaluations. These changes were validated across 3 evaluation runs (Run 4 → Run 5 → Run 6) with measurable improvements at each stage.

Changes

1. Remove connection strings from prompts — use `DefaultAzureCredential` (all prompts)

Replaced connection string authentication with @azure/identity credential-based auth in prompts that were still using connection strings (app-configuration, cosmos-db, event-hubs, service-bus, key-vault)
Updated evaluation criteria to expect credential-based client construction

2. Fix Event Hubs skill example — use `DefaultAzureCredential` (SKILL.md)

Root cause: Section 10 of the generator skill showed connectionString usage while Sections 3-4 said "never use connection strings" — a direct contradiction
Fixed to use DefaultAzureCredential with fully qualified namespace
Updated both producer and consumer examples
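
As a rough illustration of the input the fixed examples rely on, the sketch below resolves a fully qualified namespace and event hub name and refuses anything that looks like a connection string. The helper name `resolveEventHubsTarget` and the `EVENTHUB_NAMESPACE` / `EVENTHUB_NAME` keys are assumptions made for this sketch; they are not taken from SKILL.md.

```typescript
// Hypothetical helper: the names and environment keys are illustrative only.
interface EventHubsTarget {
  fullyQualifiedNamespace: string; // host name such as "<namespace>.servicebus.windows.net"
  eventHubName: string;
}

function resolveEventHubsTarget(env: { [key: string]: string | undefined }): EventHubsTarget {
  const namespace = (env["EVENTHUB_NAMESPACE"] || "").trim();
  const eventHubName = (env["EVENTHUB_NAME"] || "").trim();
  if (!namespace || !eventHubName) {
    throw new Error("EVENTHUB_NAMESPACE and EVENTHUB_NAME must both be set");
  }
  // A connection string embeds a shared key; reject it instead of using it.
  if (/(^|;)\s*(Endpoint|SharedAccessKeyName|SharedAccessKey)\s*=/i.test(namespace)) {
    throw new Error("Expected a fully qualified namespace, got a connection string");
  }
  // Accept a bare host name only: no scheme, path, or port.
  if (!/^[a-z0-9-]+(\.[a-z0-9-]+)+$/i.test(namespace)) {
    throw new Error("EVENTHUB_NAMESPACE must be a host name");
  }
  return { fullyQualifiedNamespace: namespace, eventHubName };
}
```

With `@azure/event-hubs` and `@azure/identity` installed, the resolved values would be passed together with a `DefaultAzureCredential` instance to the `EventHubProducerClient` and `EventHubConsumerClient` constructors in place of a connection string.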

3. Add `@azure/logger` diagnostic logging requirement (all 14 prompts)

Root cause: @azure/logger had an 87% failure rate across ALL configs (including skills). The skill teaches it (Section 1), but it's an "additive" pattern — code works without it, so the model ignores skill instructions
Fix: Added explicit prompt-level instruction: Enable SDK diagnostic logging using @azure/logger with a configurable log level
Result (Run 5): 87% fail → 0% fail (complete fix)
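
One way to read "configurable log level" is sketched below: a raw setting is validated against the four levels that `@azure/logger` accepts, with a fallback for anything else. The function name `resolveLogLevel` and the `"warning"` default are assumptions for this sketch; the prompts do not prescribe them.

```typescript
// The four level names accepted by setLogLevel in @azure/logger.
type AzureLogLevel = "verbose" | "info" | "warning" | "error";

const KNOWN_LEVELS: AzureLogLevel[] = ["verbose", "info", "warning", "error"];

// Hypothetical helper: normalizes a raw setting (for example an environment
// variable value) and falls back when it is missing or unrecognized.
function resolveLogLevel(raw: string | undefined, fallback: AzureLogLevel = "warning"): AzureLogLevel {
  const candidate = (raw || "").trim().toLowerCase();
  for (const level of KNOWN_LEVELS) {
    if (level === candidate) {
      return level;
    }
  }
  return fallback;
}
```

In generated code the result would then be handed to `setLogLevel`, imported from `@azure/logger`, before any client is constructed.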

4. Add `RestError` exception handling requirement (12 prompts)

Root cause: RestError from @azure/core-rest-pipeline had a 68% failure rate. Same pattern as logger — the skill teaches it (Section 2) but the model ignores additive requirements
Fix: Added explicit prompt-level instruction: Handle errors using RestError from @azure/core-rest-pipeline with statusCode checks
Result (Run 6): 68% fail → 0% fail (complete fix)
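
The pattern the instruction asks for can be sketched as follows. To keep the sketch runnable without installed packages, it uses a structural stand-in (`RestErrorLike`, `isRestErrorLike`) rather than the real class; in generated code the check would use `RestError` imported from `@azure/core-rest-pipeline`. The names `valueOnNotFound` and `fakeRestError` are hypothetical.

```typescript
// Stand-in for the parts of RestError used here. Assumption: real code imports
// RestError from @azure/core-rest-pipeline and checks `error instanceof RestError`.
interface RestErrorLike {
  name: string;
  statusCode?: number;
}

function isRestErrorLike(e: unknown): e is RestErrorLike {
  return typeof e === "object" && e !== null && (e as { name?: unknown }).name === "RestError";
}

// Meant to be called from a catch block: a 404 becomes the fallback value,
// every other failure is rethrown unchanged.
function valueOnNotFound<T>(error: unknown, fallback: T): T {
  if (isRestErrorLike(error) && error.statusCode === 404) {
    return fallback;
  }
  throw error;
}

// Demo-only helper that builds an error shaped like a RestError.
function fakeRestError(message: string, statusCode: number): Error & RestErrorLike {
  const err = new Error(message) as Error & RestErrorLike;
  err.name = "RestError";
  err.statusCode = statusCode;
  return err;
}
```

The point of checking `statusCode` rather than catching everything is that a missing resource (404) can be handled gracefully while authorization or throttling failures still surface to the caller.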

Evaluation Results

| Metric | Run 4 (before) | Run 5 (+logger fix) | Run 6 (+RestError fix) |
| --- | --- | --- | --- |
| Overall pass rate | 84.0% | 86.5% | 94.4% |
| @azure/logger | 13% pass | 100% pass | 100% pass |
| RestError | 32% pass | 32% pass | 100% pass |
| Best config | 87.3% | 89.9% | 96.3% |

Key Insight

Instruction authority hierarchy for LLMs:

Prompt task instructions (highest priority) — model reliably follows these
Skill directives — followed for structural patterns (auth), ignored for additive patterns (logging, error handling)
Retrieved docs (MCP) — informational, not directive
Training data prior (lowest) — default behavior

For "additive" SDK patterns (code works fine without them), prompt-level instruction is required — skills alone are insufficient.

Files Changed

14 JS/TS prompt files under prompts/ — credential, logger, and RestError updates
skills/generator/js-ts-azure-patterns/SKILL.md — Event Hubs example fixed to use DAC

…stead of connection string

Section 10 contradicted Sections 3 and 4 by showing the connectionString pattern. Updated to use a fully qualified namespace + DefaultAzureCredential, consistent with the skill's own guidance to never use connection strings.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…rompts

@azure/logger was the #1 failing criterion (87% fail rate across all configs, including skills). Root cause: the skill instructs 'always set up SDK logging' but the model prioritizes the specific task in the prompt over general skill instructions. Only blob-storage-manager passed because it explicitly asked for 'SDK logging at a configurable level'. Added 'Enable SDK diagnostic logging using @azure/logger' to the Prompt section of all 13 other JS/TS prompts to make the requirement explicit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add explicit instruction to handle errors using RestError from @azure/core-rest-pipeline with statusCode checks to all 12 JS/TS prompts that were missing it. Same fix pattern as the @azure/logger fix: additive SDK patterns need prompt-level instruction to be reliably generated.

Analysis from Run 5: RestError failed at 68% overall (54/79 criteria). Even with skills (which teach RestError in Section 2), failure rate was 55-63%. Prompts that already mentioned RestError in their Prompt section (app-configuration, storage-crud) had 0% failure rate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

samvaity · 2026-05-11T23:33:11Z

-Show required npm packages (@azure/event-hubs and
-@azure/eventhubs-checkpointstore-blob) and proper async/await patterns.
+Enable SDK diagnostic logging using `@azure/logger` with a configurable log level.
+Handle errors using `RestError` from `@azure/core-rest-pipeline` with `statusCode` checks.


I don't think this should be explicitly mentioned in the prompt. Doesn't it introduce a bias?

Our evaluation criteria are based on the usage of these libraries. Unless the agent is explicitly told to add logging and handle errors, it almost NEVER does so, even though both are part of the best practices skill. Please see the findings in points 3 and 4 of the description above.

samvaity · 2026-05-11T23:33:46Z


-Show required npm packages (@azure/event-hubs and
-@azure/eventhubs-checkpointstore-blob) and proper async/await patterns.
+Enable SDK diagnostic logging using `@azure/logger` with a configurable log level.


This too. If the aim of the PR is to make the prompts generic, it should just say "Enable SDK diagnostic logging".

samvaity · 2026-05-11T23:34:29Z


-Use DefaultAzureCredential for authentication. Show required npm packages
-and include proper error handling with try/catch.
+Use a credential from `@azure/identity` for authentication. Enable SDK diagnostic


Why are we introducing bias by naming specific packages if we don't want to mention specific credential types?

samvaity · 2026-05-11T23:35:10Z

 The project needs:

- A **secret provider class** that retrieves secrets from Key Vault by name, with graceful handling when a secret doesn't exist (return a default value instead of crashing). It should also be able to retrieve a specific version of a secret (not just the latest), and inspect a secret's expiry date so the caller can tell if a secret is about to expire.
+- A **secret provider class** that retrieves secrets from Key Vault by name, with graceful handling when a secret doesn't exist (return a default value instead of crashing) — use `RestError` from `@azure/core-rest-pipeline` with `statusCode` checks (e.g., 404) to detect not-found vs other failures. It should also be able to retrieve a specific version of a secret (not just the latest), and inspect a secret's expiry date so the caller can tell if a secret is about to expire.


This changes the prompt from a general scope to one biased toward SDK terminology.

KarishmaGhiya · 2026-05-11T23:52:42Z

@samvaity Unless the prompt specifically tells the LLM to add logging or parse RestError, it doesn't do so based on the best practices skill alone. It only does what is needed to make the code work. At least, this is the gap I found in JS. The LLM already reads the best practices skill, but it takes no action on that knowledge unless specifically told to add logging and error handling.

KarishmaGhiya · 2026-05-12T00:57:09Z

Evaluation Results: SDK-Specific vs Generic Prompt Wording

We ran two experiments to determine how prompt wording affects whether the model uses the correct Azure SDK patterns for error handling (RestError) and logging (@azure/logger).

Experiment Setup

56 evaluations per run (14 JS/TS prompts × 4 sonnet configs: baseline, baseline-skills, azure-mcp, azure-mcp-skills)
Run 6 — SDK-specific wording in prompts (e.g., "Handle errors using RestError from @azure/core-rest-pipeline with statusCode checks", "Enable SDK diagnostic logging using @azure/logger with a configurable log level")
Run 7 — Generic wording (e.g., "Handle errors by parsing the HTTP status code", "Enable SDK diagnostic logging with a configurable log level")
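
The 56-evaluation count is the cross product of prompts and configs, sketched below. The prompt labels are copied from the Run 7 breakdown table in this comment and the config names from the setup line above; the `Evaluation` type and `buildMatrix` function are hypothetical and do not describe how hyoka schedules runs.

```typescript
// Prompt labels as they appear in the Run 7 RestError breakdown.
const promptLabels = [
  "event-hubs", "app-configuration", "cosmos-db", "service-bus",
  "identity-service-principal", "storage-blob-manager", "encrypted-uploader",
  "key-vault-crud", "resource-manager", "key-vault-secret-config",
  "storage-account-mgmt", "identity-default-credential",
  "identity-managed-identity", "storage-crud",
] as const;

const configNames = ["baseline", "baseline-skills", "azure-mcp", "azure-mcp-skills"] as const;

interface Evaluation {
  prompt: string;
  config: string;
}

// Hypothetical helper: pairs every prompt with every config.
function buildMatrix(prompts: readonly string[], configs: readonly string[]): Evaluation[] {
  const evaluations: Evaluation[] = [];
  for (const prompt of prompts) {
    for (const config of configs) {
      evaluations.push({ prompt, config });
    }
  }
  return evaluations;
}

const evaluationMatrix = buildMatrix(promptLabels, configNames);
console.log(`${evaluationMatrix.length} evaluations per run`);
```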

Results

| Metric | Run 6 (SDK-specific) | Run 7 (generic) | Delta |
| --- | --- | --- | --- |
| Overall pass rate | 94.4% | 89.4% | -5.0% |
| RestError criteria | 100% (80/80) | 62.5% (50/80) | -37.5% |
| Logger criteria | ~100% | 92.1% (58/63) | -7.9% |

RestError Breakdown (Run 7 — generic wording)

| Prompt | Pass Rate |
| --- | --- |
| event-hubs | 0% (0/4) |
| app-configuration | 25% (2/8) |
| cosmos-db | 25% (1/4) |
| service-bus | 25% (1/4) |
| identity-service-principal | 50% (2/4) |
| storage-blob-manager | 50% (2/4) |
| encrypted-uploader | 64% (9/14) |
| key-vault-crud | 67% (4/6) |
| resource-manager | 75% (3/4) |
| key-vault-secret-config | 75% (6/8) |
| storage-account-mgmt | 100% (4/4) |
| identity-default-credential | 100% (4/4) |
| identity-managed-identity | 100% (4/4) |
| storage-crud | 100% (8/8) |

Config-Level RestError (Run 7)

| Config | Pass Rate |
| --- | --- |
| baseline | 36.4% (8/22) |
| azure-mcp | 55.0% (11/20) |
| azure-mcp-skills | 73.7% (14/19) |
| baseline-skills | 89.5% (17/19) |
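
As a sanity check on the arithmetic, the per-config counts can be summed and compared with the overall RestError row. The counts below are copied from the tables in this comment; the `Tally` type and helper names are assumptions for this sketch.

```typescript
interface Tally {
  passed: number;
  total: number;
}

// Run 7 RestError counts per config, as reported above.
const run7RestErrorByConfig: Array<{ config: string } & Tally> = [
  { config: "baseline", passed: 8, total: 22 },
  { config: "azure-mcp", passed: 11, total: 20 },
  { config: "azure-mcp-skills", passed: 14, total: 19 },
  { config: "baseline-skills", passed: 17, total: 19 },
];

// Pass rate as a percentage.
function passRate(tally: Tally): number {
  return (tally.passed / tally.total) * 100;
}

// Sums passed and total counts across tallies.
function combine(tallies: Tally[]): Tally {
  return tallies.reduce(
    (sum, t) => ({ passed: sum.passed + t.passed, total: sum.total + t.total }),
    { passed: 0, total: 0 },
  );
}

const run7Overall = combine(run7RestErrorByConfig);
const run6Overall: Tally = { passed: 80, total: 80 };
// Difference in percentage points between Run 7 and Run 6.
const deltaPoints = passRate(run7Overall) - passRate(run6Overall);

for (const row of run7RestErrorByConfig) {
  console.log(`${row.config}: ${passRate(row).toFixed(1)}%`);
}
console.log(`overall: ${passRate(run7Overall).toFixed(1)}% (${deltaPoints.toFixed(1)} points vs Run 6)`);
```

The config-level counts add up to 50/80, matching the 62.5% overall RestError figure, and the difference from Run 6 is 37.5 percentage points.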

Key Takeaway

Generic wording ("handle errors by parsing the HTTP status code") is too vague: the model doesn't reliably translate it into the correct SDK pattern (RestError import + statusCode check). The SDK-specific wording produced a 100% pass rate for RestError criteria and roughly 100% for logger criteria.

This confirms the instruction authority hierarchy: for additive SDK patterns (logging, error handling), the model needs explicit SDK construct names in the prompt. Skills help (baseline-skills scores higher than baseline) but aren't sufficient alone — the gap between skills-only (89.5%) and prompt-specific (100%) is significant.

Recommendation (OPEN TO DISCUSSION)

Keep SDK-specific terminology in the prompt sections. The prompts are language-specific (JS/TS) already, so mentioning RestError and @azure/logger is consistent with the level of specificity expected. The evaluation criteria sections already use SDK-specific terms — the prompt section should match.

KarishmaGhiya and others added 4 commits May 6, 2026 13:47

update prompts

c257335

KarishmaGhiya marked this pull request as ready for review May 11, 2026 22:10

KarishmaGhiya changed the title ~~J/TS update prompts to not have connection string~~ JS/TS update prompts to not have connection string, add logger requirements, RestError updates May 11, 2026

ronniegeraghty approved these changes May 11, 2026

samvaity reviewed May 11, 2026

KarishmaGhiya wants to merge 4 commits into ronniegeraghty:main from KarishmaGhiya:prompt-updates
