feat(openfeature): emit server-side EVP flagevaluation#11639
feat(openfeature): emit server-side EVP flagevaluation#11639leoromanovsky wants to merge 27 commits into
Conversation
|
🎯 Code Coverage (details) 🔗 Commit SHA: 6b7aa42 | Docs | Datadog PR Page | Give us feedback! |
🟢 Java Benchmark SLOs — All performance SLOs passed
PR vs. master results
Commit: Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion. |
|
Hi! 👋 Thanks for your pull request! 🎉 To help us review it, please make sure to:
If you need help, please check our contributing guidelines. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3d4244f8ae
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
83ac4c4 to
81ed6f1
Compare
2a3f817 to
90ed6cb
Compare
| private int spansWritten; | ||
|
|
||
| public LLMObsSpanMapper() { | ||
| this(5 << 20); |
There was a problem hiding this comment.
This wasn't precise enough; the limit is measured in even megabytes. I can revert the change, though, and leave it as-is for the LLMObs product. I noticed this same constant (even 5M) in go tracer, so maybe it means something special here.
| /** | ||
| * Default SDK-side target for uncompressed EVP request bodies. Writers may split batches at or | ||
| * below this size to keep Agent proxy requests comfortably bounded. | ||
| */ | ||
| public static final int PAYLOAD_SIZE_LIMIT_BYTES = 5 * 1024 * 1024; |
There was a problem hiding this comment.
The declared limit in the agent is 10MB, but I'm leaving it at 5 here to not change tracer behavior too much.
| new AtomicReference<>(InitializationState.NOT_STARTED); | ||
| private final FlagEvalMetrics flagEvalMetrics; | ||
| private final FlagEvalHook flagEvalHook; | ||
| private final FlagEvalMetricsHook flagEvalMetricsHook; |
There was a problem hiding this comment.
While it's a bit outside the scope of this PR, we decided to rename these hooks globally to clarify the distinction between OTel metrics and EVP-based ones.
| .addString("variationType", flag.variationType.name()) | ||
| .addString("allocationKey", allocation.key); | ||
| .addString("allocationKey", allocation.key) | ||
| .addLong("dd.eval.timestamp_ms", evalTimestampMs); |
There was a problem hiding this comment.
Wondering if I should adjust this to be consistent.
| final boolean evalCountsEnabled = | ||
| config | ||
| .configProvider() | ||
| .getBoolean(FeatureFlaggingConfig.FLAGGING_EVALUATION_COUNTS_ENABLED, true); | ||
| if (evalCountsEnabled) { | ||
| final FlagEvaluationWriterImpl evalWriter = new FlagEvaluationWriterImpl(sco, config); | ||
| evalWriter.start(); | ||
| FLAG_EVAL_WRITER = evalWriter; | ||
| LOGGER.debug("Flag evaluation EVP writer started"); | ||
| } else { |
There was a problem hiding this comment.
We only start the writer on the provider launch. Not to self, check earlier in the PR whether we enqueue when FLAGGING_EVALUATION_COUNTS_ENABLED is disabled. These just take up memory without being processed.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6b7aa4273d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } else if (value.isInstant()) { | ||
| return value.asInstant(); |
There was a problem hiding this comment.
Normalize instant context values before serializing
When an evaluation context contains an OpenFeature instant attribute, this preserves a java.time.Instant in the attrs map. The flagevaluation payload is later encoded with a plain Moshi instance in FlagEvaluationPayloads, which has no Instant adapter, so toJson throws during flush; because FlagEvaluationWriterImpl only clears aggregates after successful payload encoding, that poisoned bucket remains and every later flush aborts, blocking all flagevaluation telemetry until restart. Convert instants to a JSON scalar (or drop them) before aggregation/serialization.
Useful? React with 👍 / 👎.
🟢 NOTE TO REVIEWERS
I've chosen to keep this PR on the larger side in terms of "lines of code" as a test. The commits are deliberately layered in a narrative style. Each one is equivalent to what we would normally do with a "stacked" PR, but this preserves the overall view of the feature. Please review one commit at a time or all together.
If this review mechanism is not satisfactory, please let me know!
Motivation
Customers need consistent server-side feature-flag evaluation visibility across supported runtimes so rollout behavior can be correlated with application behavior in APM and Event Platform. This Java contribution adds that server-side
flagevaluationsignal for Java OpenFeature evaluations while preserving the existing OTelfeature_flag.evaluationspath and the existing exposure telemetry path.High Priority Changes and Decisions
These are the design points I would want reviewed most closely.
/api/v2/flagevaluation, but builds it under the proxy prefix discovered from the Agent. In current staging dogfooding that resolves to/evp_proxy/v4/api/v2/flagevaluation.FlagEvaluationsRequestwith top-levelcontextand aflagEvaluationsarray; this does not use a separate/batchedflagevaluationsroute.reasonis intentionally not a hidden aggregate key because it is not serialized to the worker contract.targeting_keyis the single identity field for the event. The hook removes duplicatetargetingKeyfromcontext.evaluationso the same identity is not encoded twice.first_evaluationandlast_evaluationbounds, while the payloadtimestampis the flush time.Other Changes
flagevaluationpath behindDD_FLAGGING_EVALUATION_COUNTS_ENABLEDwhile leaving the existing OTelfeature_flag.evaluationshook in place.Commit Guide
LOC is rename-aware
git diff-tree -M --numstatfor each commit against its parent.571938b6c74949ed7ee44aa06476b7ac126fb44a7a43658da64ec77cae036fa7ade3e4986a9af6bef1c37aba16115a407b76cc3b21c0c231d7cb5116FlagEvaluationsRequestpayloads, split oversized bodies, degrade oversized rows, and count drops.e43f93acd017d77854205f88cadeb2597e97d10b58df936165acc6e2194096df04f1beedb264e7d890ed6cb556c73cafdb465b1456b3fbValidation Evidence
Local Test Gates
origin/master(ac29db2316)::communication:test:products:feature-flagging:feature-flagging-api:test:products:feature-flagging:feature-flagging-agent:test:products:feature-flagging:feature-flagging-lib:test:products:feature-flagging:feature-flagging-lib:jmhClassesBackendApiFactoryTest,DDAgentFeaturesDiscoveryTestDDEvaluatorTest,ProviderTest,FlagEvalLoggingHookTestFeatureFlaggingSystemTestFeatureFlagEvpPublisherTest,FlagEvaluationAggregatorTest,FlagEvaluationPayloadsTest,FlagEvaluationWriterImplTest./gradlew :products:feature-flagging:feature-flagging-lib:test :products:feature-flagging:feature-flagging-lib:jacocoTestReport :products:feature-flagging:feature-flagging-lib:jacocoTestCoverageVerification./gradlew spotlessApply./gradlew spotlessCheck./gradlew :products:feature-flagging:feature-flagging-lib:spotlessCheck./gradlew :communication:forbiddenApis :dd-trace-core:forbiddenApis :internal-api:forbiddenApis :telemetry:forbiddenApis :products:feature-flagging:feature-flagging-api:forbiddenApis :products:feature-flagging:feature-flagging-agent:forbiddenApis :products:feature-flagging:feature-flagging-lib:forbiddenApisgit diff --checkpassed.Dogfooding App
ffe-dogfoodingJava artifacts from this localdd-trace-javastack withscripts/prepare-local-java.sh.dd-openfeatureanddd-java-agentartifacts plus the real backend EVP path.PROVIDER_READY.ffe-dogfooding-string-flagthrough the Java dogfooding app 15 times total: 5 evaluations for each targeting key:java-restack4-20260702T042247Z-alphajava-restack4-20260702T042247Z-bravojava-restack4-20260702T042247Z-charlievariant_1, allocationallocation-override-392dd7c149f8, servicejava, and evaluation reasonSTATIC.http://datadog-agent:8126/evp_proxy/v4/api/v2/flagevaluation, both returning202.Staging End-To-End
eventplatform.system.track(TRACK => 'flagevaluation')returned 3 aggregated rows for the exact targeting keys above.flag.key=ffe-dogfooding-string-flagvariant.key=variant_1allocation.key=allocation-override-392dd7c149f8evaluation_count=5System Tests
6b7aa4273d:TEST_LIBRARY=java ./run.sh +v FEATURE_FLAGGING_AND_EXPERIMENTATION tests/ffe/test_flag_eval_evp.py8 passed in 80.08s(Library: java@1.64.0-SNAPSHOT+6b7aa4273d,Weblog variant: spring-boot).