Given the complexity of the Perl program, a slight memory leak or increase is expected and has been observed. However, minimizing this is a primary objective.
On Linux: Work in Progress
On BSD: TBD
Xymon-rrd Issue: A previously identified issue has been resolved (Which version?).
Plan
Step #1: Detect/Identify Sources of Problems (WIP)
Valgrind: Initial testing on Xymon rrd suggests no leaks, although Xymon does not free memory as expected when stopped.
Additional Tools:
Devel::Cycle: Tests for circular references (no output means no reference cycles were found).
Stale Forks: When forks terminate, memory is not properly released, causing leaks.
Memory Duplication with Forks: Forks replicate the main process and are not memory-optimized. For example, with 10 children and 1 master process, memory usage becomes about 11 times that of the main process, whereas copy-on-write would be expected to keep most pages shared.
Currently using a pool of 10 processes (why this limitation?). Increasing the number could potentially accommodate more requests.
Implemented a Boss-Worker model with worker processes being re-forked upon unexpected termination.
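The re-forking behaviour of the Boss-Worker model can be sketched as follows. This is a minimal illustration in Python, not Devmon's Perl code: `spawn`, `run_pool`, `worker_fn` and the `max_reforks` cap are hypothetical names introduced here, and real workers would be long-lived pollers rather than short functions. The boss reaps every child with `os.wait()`, which also avoids leaving stale (zombie) forks behind, and re-forks a slot only when its worker ended abnormally.

```python
import os

def spawn(slot, attempt, worker_fn):
    """Fork one worker for the given slot; return its PID to the boss."""
    pid = os.fork()
    if pid == 0:
        # Child process: run the work and exit without returning to the boss loop.
        try:
            code = worker_fn(slot, attempt)
        except BaseException:
            code = 1  # any uncaught error counts as an unexpected termination
        os._exit(code)
    return pid

def run_pool(n_workers, worker_fn, max_reforks=5):
    """Boss loop: start n_workers children and re-fork any that end abnormally.

    Returns the number of re-forks performed. max_reforks is an assumed
    safety cap so that a permanently failing worker cannot loop forever.
    """
    live = {}  # pid -> (slot, attempt)
    for slot in range(n_workers):
        live[spawn(slot, 0, worker_fn)] = (slot, 0)

    reforks = 0
    while live:
        pid, status = os.wait()           # reap the next child that exits
        entry = live.pop(pid, None)
        if entry is None:
            continue                      # not one of our workers
        slot, attempt = entry
        clean = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        if not clean and reforks < max_reforks:
            reforks += 1
            live[spawn(slot, attempt + 1, worker_fn)] = (slot, attempt + 1)
    return reforks
```

A worker here signals a normal end by returning 0. Any other exit code, or death by signal, is treated as unexpected and triggers a re-fork for the same slot.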
Step #2: Fix/Apply Corrections (WIP)
Prevent Fork Termination: The SNMPWalk engine was rewritten to manage retries (completed in v0.25.03 for the SNMP_Session and NET_SNMP_C libraries).
Partial Retries: The engine continues polling using base primitives (e.g., getbulk, getnext, get) until all data is gathered. If a timeout occurs (max 4 seconds per primitive, configurable), the device is reinjected into the pool to complete pending OIDs. If the global timeout is reached, retries are abandoned, and partial results are shown. This ensures polling completes without unnecessary delays or prematurely killing forks.
Dynamic Timeout & Retry Optimization: The retry mechanism dynamically adjusts based on network conditions and device responsiveness, improving reliability and efficiency.
Memory Duplication with Forks: A comprehensive review of the forking process is in progress (WIP).
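The partial-retry behaviour described above can be illustrated with a small sketch. It is written in Python for brevity and is not the SNMPWalk engine itself: `poll_device`, `fetch` and `clock` are hypothetical names, and the 30-second global timeout is an assumed value (only the 4-second per-primitive limit comes from the description). Each OID that times out stays pending and is retried in a later round. Once the global deadline passes, whatever has been collected is returned as a partial result.

```python
import time

def poll_device(oids, fetch, per_call_timeout=4.0, global_timeout=30.0,
                clock=time.monotonic):
    """Poll every OID, retrying timed-out ones until a global deadline.

    fetch(oid, timeout) stands in for one SNMP primitive (get, getnext,
    getbulk); it returns a value or raises TimeoutError.
    Returns (results, still_pending).
    """
    deadline = clock() + global_timeout
    results = {}
    pending = list(oids)

    while pending and clock() < deadline:
        still_pending = []
        for oid in pending:
            if clock() >= deadline:
                # Global timeout reached: give up on the rest, keep partial data.
                still_pending.append(oid)
                continue
            # Never let one primitive run past the global deadline.
            budget = min(per_call_timeout, deadline - clock())
            try:
                results[oid] = fetch(oid, budget)
            except TimeoutError:
                # "Reinject": keep the OID for the next round instead of failing.
                still_pending.append(oid)
        pending = still_pending

    return results, pending
```

The `clock` parameter exists only so the deadline logic can be exercised without real waiting. In normal use the default monotonic clock would be left in place.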
Step #3: Control Results (WIP)
Avoid Fork Termination: The SNMPWalk engine rewrite for retry management is complete.
Unexpected Process Termination: Significant improvement observed, stable over several months.
Memory Increase: Some memory growth persists, though it is smaller than before. It is unclear whether it comes from Devmon itself or from an SNMP polling issue.
The issue seems sporadic and likely arises during network outages, which points to incomplete SNMP polling.
It is still unclear how to interpret the memory data from Xymon and how to pinpoint the responsible process.
RRD remains under suspicion. The next step is to remove all graphs for further investigation. However, the introduction of RRD-cache in Xymon 4.4 may postpone this troubleshooting.
Number of Forks
Description: The number of forks is the number of SNMP polling processes running concurrently.
Default: 10 forks.
Increased Forks (>10): Experimental. Increasing the number appears to cause process blockages (exact cause TBD).
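To make the link between fork count and memory concrete, here is a rough back-of-the-envelope model in Python. It is illustrative only: `pool_memory_mb` and `shared_fraction` are hypothetical names, and the 100 MB master size is an assumed figure, not a measurement from Devmon. With no effective copy-on-write sharing, 10 forks plus the master reproduce the 11x usage noted earlier. One commonly cited reason sharing degrades in Perl is that reference-count updates write to otherwise read-only pages, though that has not been confirmed as the cause here.

```python
def pool_memory_mb(master_rss_mb, n_forks, shared_fraction):
    """Estimate total resident memory for one master plus n_forks children.

    shared_fraction is the share of each child's pages still shared with
    the master through copy-on-write: 0.0 means fully duplicated,
    1.0 means fully shared.
    """
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be between 0 and 1")
    private_per_child = master_rss_mb * (1.0 - shared_fraction)
    return master_rss_mb + n_forks * private_per_child

# Assumed 100 MB master, default pool of 10 forks.
print(pool_memory_mb(100, 10, 0.0))   # no sharing at all
print(pool_memory_mb(100, 10, 0.75))  # three quarters of pages still shared
```

Under this model every additional fork costs one full copy of the master when nothing is shared, so raising the fork count above the default scales memory linearly.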
Stability and Memory Leak / Increase
Plan
Step #1: Detect/Identify Sources of Problems (WIP)
Step #2: Fix/Apply Corrections (WIP)
Step #3: Control Results (WIP)
Current Status
Step #1: Problem Sources Identified
Step #2: Implement Corrections
Step #3: Control (WIP)
Number of Forks