Telemetry integration for Rialto Server Manager#478
Conversation
Summary: Integrating telemetry T2 markers into Rialto Type: Fix Test Plan: UT/CT, Fullstack Jira: RDKEMW-16516
|
Pull request must be merged with a description containing the required fields, Summary: If there is no jira releated to this change, please put 'Jira: NO-JIRA'. Description can be changed by editing the top comment on your pull request and making a new commit. |
There was a problem hiding this comment.
Pull request overview
Integrates Telemetry 2.0 (T2) markers into Rialto Server Manager and Rialto media client/server error paths, and adds a lightweight telemetry wrapper + build toggles to enable/disable telemetry at build time.
Changes:
- Adds
RialtoTelemetry.hwrapper macros and wires telemetry support intologging/CMakeLists.txt(stubbed when disabled). - Emits telemetry events across serverManager and media client/server failure paths (IPC failures, disconnects, factory creation exceptions, etc.).
- Updates unit test expectations to account for additional
ErrorText()calls.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unittests/media/client/ipc/ipcModuleBase/IpcModuleBase.cpp | Updates gMock expectations to match extra ErrorText() usage. |
| serverManager/ipc/source/Client.cpp | Adds telemetry event on session server disconnection/recovery path. |
| serverManager/common/source/SessionServerAppManager.cpp | Adds telemetry events for session server connection/state/config failures. |
| serverManager/common/source/HealthcheckService.cpp | Adds telemetry event when recovery triggers after failed pings. |
| media/server/service/source/main.cpp | Initializes telemetry in the Rialto Server process. |
| media/server/main/source/MediaKeysCapabilities.cpp | Adds telemetry events when capability factory creation fails. |
| media/server/ipc/source/MediaKeysModuleService.cpp | Adds telemetry events when module service factory creation fails. |
| media/server/ipc/source/MediaKeysCapabilitiesModuleService.cpp | Adds telemetry events when capabilities module creation fails. |
| media/client/main/source/MediaPipelineCapabilities.cpp | Adds telemetry events on media pipeline capabilities factory creation failures. |
| media/client/main/source/MediaKeysCapabilities.cpp | Adds telemetry events on media keys capabilities creation failures. |
| media/client/main/source/MediaKeys.cpp | Adds telemetry events on media keys creation failures. |
| media/client/main/source/ClientController.cpp | Adds telemetry events when shared memory init fails. |
| media/client/ipc/source/WebAudioPlayerIpc.cpp | Adds telemetry events for IPC failures across WebAudioPlayer operations. |
| media/client/ipc/source/MediaPipelineIpc.cpp | Adds telemetry events for IPC failures across MediaPipeline operations. |
| media/client/ipc/source/MediaPipelineCapabilitiesIpc.cpp | Adds telemetry events for IPC failures in capability queries. |
| media/client/ipc/source/MediaKeysIpc.cpp | Adds telemetry events for IPC failures and error-status conversions. |
| media/client/ipc/source/MediaKeysCapabilitiesIpc.cpp | Adds telemetry events for IPC failures in key capability queries. |
| media/client/ipc/source/IpcClient.cpp | Adds telemetry event on unexpected IPC disconnect. |
| media/client/ipc/source/ControlIpc.cpp | Adds telemetry events for control IPC failures (shared memory, register, ping ack). |
| media/client/common/include/RialtoClientLogging.h | Adds <cstdio> include (supports snprintf usage in client code). |
| logging/include/RialtoTelemetry.h | Introduces telemetry wrapper macros (init/uninit/string/int/float). |
| logging/include/RialtoLogging.h | Includes telemetry wrapper from the core logging header. |
| logging/CMakeLists.txt | Adds telemetry build toggles and links/includes for telemetry dependencies. |
| cmake/Findtelemetry.cmake | Adds a CMake find-module for telemetry headers/libs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), | ||
| "Failed to create the media keys capabilities module service factory, reason: %s", e.what()); | ||
| TELEMETRY_EVENT_STRING("Rialto Client - MediaKeysCapabilitiesModuleService", telemetryBuff); |
There was a problem hiding this comment.
snprintf is used here but this file doesn’t include <cstdio>/<stdio.h>, so this may fail to compile on toolchains that don’t implicitly declare it. Also, this is server-side code but the telemetry marker string starts with "Rialto Client - ..."; please use a marker prefix consistent with the server component.
| RIALTO_CLIENT_LOG_ERROR("Could not initalise the shared memory"); | ||
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Could not initalise the shared memory"); |
There was a problem hiding this comment.
The new telemetry string repeats the misspelling "initalise". Since telemetry strings become part of long-lived dashboards/alerts, please correct it to "initialise" (and ideally keep it consistent with the log message, if you decide to fix that too).
| RIALTO_CLIENT_LOG_ERROR("Could not initalise the shared memory"); | |
| char telemetryBuff[128] = {0}; | |
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Could not initalise the shared memory"); | |
| RIALTO_CLIENT_LOG_ERROR("Could not initialise the shared memory"); | |
| char telemetryBuff[128] = {0}; | |
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Could not initialise the shared memory"); |
| RIALTO_CLIENT_LOG_ERROR("failed to get mute due to '%s'", ipcController->ErrorText().c_str()); | ||
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to get mute due to '%s'", |
There was a problem hiding this comment.
In getTextTrackIdentifier(...), the new telemetry message says "Failed to get mute due to ...", which doesn’t match the API and will create confusing telemetry. Please update the telemetry string to reflect the actual operation (and consider fixing the existing log text as well for consistency).
| RIALTO_CLIENT_LOG_ERROR("failed to get mute due to '%s'", ipcController->ErrorText().c_str()); | |
| char telemetryBuff[128] = {0}; | |
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to get mute due to '%s'", | |
| RIALTO_CLIENT_LOG_ERROR("failed to get text track identifier due to '%s'", | |
| ipcController->ErrorText().c_str()); | |
| char telemetryBuff[128] = {0}; | |
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to get text track identifier due to '%s'", |
| TELEMETRY_EVENT_STRING("Rialto Client - ControllIpc", telemetryBuff); | ||
|
|
There was a problem hiding this comment.
Telemetry marker string uses "ControllIpc" (extra 'l'), which looks like a typo and will create a separate marker from the intended component name. Rename the marker to match the actual module/class name ("ControlIpc") so telemetry aggregation stays consistent.
| RIALTO_CLIENT_LOG_ERROR("failed to ack due to '%s'", ipcController->ErrorText().c_str()); | ||
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to ack due to '%s'", ipcController->ErrorText().c_str()); | ||
| TELEMETRY_EVENT_STRING("Rialto Client - ControllIpc", telemetryBuff); |
There was a problem hiding this comment.
Telemetry marker string uses "ControllIpc" (extra 'l'), which looks like a typo and will create a separate marker from the intended component name. Rename the marker to match the actual module/class name ("ControlIpc") so telemetry aggregation stays consistent.
| TELEMETRY_EVENT_STRING("Rialto Client - ControllIpc", telemetryBuff); | |
| TELEMETRY_EVENT_STRING("Rialto Client - ControlIpc", telemetryBuff); |
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to get the shared memory due to '%s'", | ||
| ipcController->ErrorText().c_str()); | ||
| TELEMETRY_EVENT_STRING("Rialto Client - ControllIpc", telemetryBuff); |
There was a problem hiding this comment.
Telemetry marker string uses "ControllIpc" (extra 'l'), which looks like a typo and will create a separate marker from the intended component name. Rename the marker to match the actual module/class name ("ControlIpc") so telemetry aggregation stays consistent.
| TELEMETRY_EVENT_STRING("Rialto Client - ControllIpc", telemetryBuff); | |
| TELEMETRY_EVENT_STRING("Rialto Client - ControlIpc", telemetryBuff); |
| target_compile_definitions(RialtoLogging PUBLIC RIALTO_TELEMETRY_SUPPORT=1) | ||
| target_include_directories(RialtoLogging PUBLIC ${TELEMETRY_INCLUDE_DIRS}) | ||
| target_link_libraries(RialtoLogging PUBLIC ${TELEMETRY_LIBRARIES}) |
There was a problem hiding this comment.
In the EthanLog-enabled branch, telemetry compile definitions/includes/libs are being applied to RialtoLogging again (lines 69-71) instead of RialtoEthanLog. This means telemetry can be unintentionally disabled for targets linking to RialtoEthanLog, and the block is also redundant with the earlier telemetry configuration. Apply telemetry settings to RialtoEthanLog (or factor the telemetry setup into shared logic) and remove the duplicate configuration.
| target_compile_definitions(RialtoLogging PUBLIC RIALTO_TELEMETRY_SUPPORT=1) | |
| target_include_directories(RialtoLogging PUBLIC ${TELEMETRY_INCLUDE_DIRS}) | |
| target_link_libraries(RialtoLogging PUBLIC ${TELEMETRY_LIBRARIES}) | |
| target_compile_definitions(RialtoEthanLog PUBLIC RIALTO_TELEMETRY_SUPPORT=1) | |
| target_include_directories(RialtoEthanLog PUBLIC ${TELEMETRY_INCLUDE_DIRS}) | |
| target_link_libraries(RialtoEthanLog PUBLIC ${TELEMETRY_LIBRARIES}) |
| RIALTO_CLIENT_LOG_ERROR("failed to stop due to '%s'", ipcController->ErrorText().c_str()); | ||
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to stop due to '%s'", ipcController->ErrorText().c_str()); |
There was a problem hiding this comment.
In haveData(...), the new telemetry message says "Failed to stop due to ...", which doesn’t match the operation being performed. This looks like a copy/paste error and will make telemetry misleading; please update the telemetry string to describe the correct failing action (and consider aligning the existing log message too).
| RIALTO_CLIENT_LOG_ERROR("failed to stop due to '%s'", ipcController->ErrorText().c_str()); | |
| char telemetryBuff[128] = {0}; | |
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to stop due to '%s'", ipcController->ErrorText().c_str()); | |
| RIALTO_CLIENT_LOG_ERROR("failed to have data due to '%s'", ipcController->ErrorText().c_str()); | |
| char telemetryBuff[128] = {0}; | |
| snprintf(telemetryBuff, sizeof(telemetryBuff), "Failed to have data due to '%s'", | |
| ipcController->ErrorText().c_str()); |
| # Telemetry support | ||
| if(CMAKE_TELEMETRY_2_0_REQUIRED) | ||
| message(STATUS "Telemetry 2.0 support enabled") | ||
| target_compile_definitions(RialtoLogging PUBLIC RIALTO_TELEMETRY_SUPPORT=1) | ||
| target_include_directories(RialtoLogging PUBLIC ${TELEMETRY_INCLUDE_DIRS}) | ||
| target_link_libraries(RialtoLogging PUBLIC ${TELEMETRY_LIBRARIES}) | ||
| else() | ||
| message(STATUS "Telemetry support disabled (stub mode)") | ||
| endif() |
There was a problem hiding this comment.
Telemetry support is gated on CMAKE_TELEMETRY_2_0_REQUIRED, but this CMakeLists doesn’t call find_package(telemetry) (or otherwise populate TELEMETRY_INCLUDE_DIRS / TELEMETRY_LIBRARIES). As a result, enabling telemetry will likely fail to find headers/libs or silently link nothing. Consider invoking find_package(telemetry REQUIRED) inside the telemetry-enabled block (using the new cmake/Findtelemetry.cmake) and validating the variables before adding include dirs/link libs.
| char telemetryBuff[128] = {0}; | ||
| snprintf(telemetryBuff, sizeof(telemetryBuff), | ||
| "Failed to create the media keys capabilities factory, reason: %s", e.what()); | ||
| TELEMETRY_EVENT_STRING("Rialto Client - MediaKeysCapabilities", telemetryBuff); | ||
| } |
There was a problem hiding this comment.
snprintf is used here but this file doesn’t include <cstdio>/<stdio.h>, so snprintf may be undeclared and fail compilation on stricter toolchains. Also, this is server-side code (RIALTO_SERVER_LOG_*), but the telemetry marker string says "Rialto Client - ..." which looks inconsistent and could mis-categorize metrics.
Summary: Integrating telemetry T2 markers into Rialto Server Manager
Type: Fix
Test Plan: UT/CT, Fullstack
Jira: RDKEMW-16519
d966ba4 to
de1dd07
Compare
Summary: Integrating telemetry T2 markers into Rialto Server manager
Type: Fix
Test Plan: UT/CT, Fullstack
Jira: RDKEMW-16519