Description
Change

```csharp
return new ChatClient(options.ChatCompletionModelId, options.ApiKey);
```

to

```csharp
var azureClient = new AzureOpenAIClient(
    new Uri(options.Endpoint),
    new AzureKeyCredential(options.ApiKey));
return azureClient.GetChatClient(options.DeploymentName);
```

Future: Managed Identities
```csharp
var azureClient = new AzureOpenAIClient(
    new Uri(opts.Endpoint),
    new DefaultAzureCredential());
```

```csharp
public class AzureOpenAIOptions
{
    public string Endpoint { get; set; }       // https://my-aoai.openai.azure.com
    public string ApiKey { get; set; }         // key from Azure
    public string DeploymentName { get; set; } // your model deployment
}
```

```json
"AzureOpenAI": {
  "Endpoint": "https://my-aoai.openai.azure.com/",
  "ApiKey": "YOUR_KEY",
  "DeploymentName": "gpt-4o"
}
```

Platform-hub IaC (other repo)
IaC - westus3
RG: can-hub-ai-plat-wus3-100-rg
AOAI: hub-ai-plat-wus3-100-aoai
VNET
RG: can-hub-network-plat-wus3-100-rg
Vnet: hub-network-plat-wus3-100-vnet
Snet: hub-network-hub-ai-100-snet
NSG: hub-network-hub-ai-100-snet-nsg
A. Tenant and resource setup (½ day)
Ensure AOAI access and create an Azure OpenAI resource in your target region(s). Then create two chat model deployments: gpt-4o and gpt-4o-mini (the names you choose are deployment names, not base model IDs). [datacamp.com], [devopscube.com]
Assign initial TPM/RPM per deployment and request increases if needed. Quota is per subscription, per region, per model; you allocate TPM to each deployment. Start small, monitor, then increase. [learn.microsoft.com], [github.com]
Enable Private Endpoint (production) and keep public access in dev only. If you must expose externally, front with APIM + App Gateway (WAF). [techcommun...rosoft.com], [dellenny.com]
B. SDK & code change (½ day)
If you use OpenAI’s SDK today, you’ll replace the client construction, switch from the base model name to your deployment name, and add the api-version for Azure. In .NET, prefer Azure.AI.OpenAI with DefaultAzureCredential (managed identities later). [learn.microsoft.com]
Before (OpenAI .NET):
```csharp
using OpenAI;

var client = new OpenAIClient(Environment.GetEnvironmentVariable("OPENAI_API_KEY"));
// var chat = client.GetChatClient("gpt-4o"); // example
```
After (Azure OpenAI .NET):
```csharp
using Azure.AI.OpenAI;
using Azure.Identity;

// Endpoint: https://<your-resource>.openai.azure.com
var azure = new AzureOpenAIClient(
    new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
    new DefaultAzureCredential());

// NOTE: use your deployment name, e.g., "diag-gpt-4o"
var chat = azure.GetChatClient("diag-gpt-4o");
```
Key differences: endpoint, deployment name (not base model), and Entra ID auth over raw keys. [learn.microsoft.com]
C. Config & ops (½–1 day, can parallelize)
App configuration: put AZURE_OPENAI_ENDPOINT, AOAI_DEPLOYMENT_PRIMARY, AOAI_DEPLOYMENT_FALLBACK, and AOAI_API_VERSION in Key Vault / configuration. [learn.microsoft.com]
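As a sketch of the binding side (the class, section name, and property names below are assumptions — align them with your actual appsettings/Key Vault layout):

```csharp
// Sketch only: strongly-typed options for the settings named above.
// Property comments map to the configuration keys listed in this section.
public sealed class AoaiOptions
{
    public string Endpoint { get; set; } = "";           // AZURE_OPENAI_ENDPOINT
    public string DeploymentPrimary { get; set; } = "";  // AOAI_DEPLOYMENT_PRIMARY
    public string DeploymentFallback { get; set; } = ""; // AOAI_DEPLOYMENT_FALLBACK
    public string ApiVersion { get; set; } = "";         // AOAI_API_VERSION
}

// In Program.cs (section name "AzureOpenAI" is an assumption):
// builder.Services.Configure<AoaiOptions>(
//     builder.Configuration.GetSection("AzureOpenAI"));
```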
Observability: wire usage/latency/error telemetry; watch 429s and TPM saturation. Azure quota is TPM/RPM throttled by deployment. [learn.microsoft.com]
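A hedged sketch of that wiring, assuming the 2.x SDK (failures surface as System.ClientModel.ClientResultException) and a hypothetical Metrics sink:

```csharp
using System.ClientModel;
using System.Diagnostics;
using OpenAI.Chat;

// Sketch: latency + throttle telemetry around one chat call.
// "Metrics" is a hypothetical sink; substitute your telemetry library.
static async Task<string> CompleteWithTelemetryAsync(ChatClient chat, string prompt)
{
    var sw = Stopwatch.StartNew();
    try
    {
        ChatCompletion completion = await chat.CompleteChatAsync(prompt);
        return completion.Content[0].Text;
    }
    catch (ClientResultException ex) when (ex.Status == 429)
    {
        Metrics.Increment("aoai.throttled"); // sustained 429s = TPM saturation
        throw;
    }
    finally
    {
        Metrics.Record("aoai.latency_ms", sw.ElapsedMilliseconds);
    }
}
```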
Content safety: start with default Azure filters; if you hit false positives, later tune policy via RAI policy APIs or layer Azure AI Content Safety independently. Expect slightly higher latency than direct OpenAI due to safety passes. [deepwiki.com], [learn.microsoft.com], [pondhouse-data.com]
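If you need to tell filter rejections apart from other failures: Azure returns HTTP 400 with a content_filter error code when a prompt is blocked. The message check below is a simplification (inspect ex.GetRawResponse() in production):

```csharp
using System.ClientModel;
using OpenAI.Chat;

// Sketch: classify a content-filter rejection vs. other errors.
static async Task<string?> TryCompleteAsync(ChatClient chat, string prompt)
{
    try
    {
        ChatCompletion completion = await chat.CompleteChatAsync(prompt);
        return completion.Content[0].Text;
    }
    catch (ClientResultException ex)
        when (ex.Status == 400 && ex.Message.Contains("content_filter"))
    {
        return null; // prompt blocked by the filter; surface a friendly message
    }
}
```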
D. Networking & identity hardening (1 day, can be phased)
Private Link, APIM internal VNET, App Gateway (WAF) if exposing beyond the VNET; move to managed identity for all server workloads. This removes API keys from app configuration and keeps traffic on the private network. [techcommun...rosoft.com], [argonsys.com]
E. Capacity planning & fallback (¼ day)
Allocate TPM for two deployments per model (active + warm standby) and set your agent orchestrator to fallback from 4o → 4o‑mini on 429/timeout. Plan quota increases after first week. [learn.microsoft.com]
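As a sketch of that fallback path (the deployment names and 30-second budget are assumptions; 429 detection assumes the 2.x SDK's ClientResultException):

```csharp
using System.ClientModel;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Sketch: try the primary deployment, fall back to the cheaper one on 429/timeout.
// Deployment names are examples; take them from configuration in practice.
static async Task<ChatCompletion> CompleteWithFallbackAsync(
    AzureOpenAIClient azure, string prompt,
    string primary = "gpt-4o", string fallback = "gpt-4o-mini")
{
    using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30)); // assumed budget
    try
    {
        return await azure.GetChatClient(primary)
            .CompleteChatAsync(new ChatMessage[] { prompt }, cancellationToken: cts.Token);
    }
    catch (Exception ex) when (
        ex is ClientResultException { Status: 429 } || ex is OperationCanceledException)
    {
        return await azure.GetChatClient(fallback)
            .CompleteChatAsync(new ChatMessage[] { prompt });
    }
}
```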