
MSSQL Exporter - Negative Serial Number Errors #729

@jmbecker

Description


Describe the bug
The MSSQL exporter runs without issue after being set up, then starts throwing "TLS Handshake failed" errors some time later.

To Reproduce
Steps to reproduce the behavior:

  1. Set up prometheus.exporter.mssql with a configuration like the following in Grafana Alloy:
     prometheus.exporter.mssql "standard_exporter" {
       connection_string = "sqlserver://user:pass@sql-server:1433?encrypt=true&trustservercertificate=true"
       query_config = local.file.mssql_standard.content
     }
  2. Metrics begin populating as expected.
  3. Wait several hours (up to 24).
  4. See invalid metric errors stating that a TLS handshake error occurred:
ts=2025-04-21T14:22:09.549504734Z level=error msg="Invalid metric description." component_path=/ component_id=prometheus.exporter.mssql.standard_exporter err="[mssqlintegration,collector=mssql_standard,query=mssql_user_errors_total] TLS Handshake failed: tls: failed to parse certificate from server: x509: negative serial number"

Expected behavior
Once the MSSQL exporter has been set up, it should continue to work without interruption as long as the SQL Server instance being scraped is up and ready to receive queries. Since I'm trusting the server certificate (trustservercertificate=true), I expect TLS to keep working.

Configuration
We're using the exporter via Grafana Alloy inside a Kubernetes cluster. We deploy the Grafana Alloy collector with the flag --feature.community-components.enabled. We're using other Prometheus-style components without issue.
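For reference, the flag is passed on the collector's command line; in our Kubernetes manifests the container invocation amounts to roughly the following (the config path is illustrative):

    alloy run --feature.community-components.enabled /etc/alloy/config.alloy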

Additional context

  • Both the standard exporter we have set up AND a custom exporter are hitting the same issue.
  • This seems to stem from a change in behavior in Go 1.23+: crypto/x509 now rejects certificates with negative serial numbers. This issue in the Go repository mentions that it will probably not be fixed. The workaround would be to add a godebug flag like so: godebug x509negativeserial=1 (see the sketch after this list).
  • The crypto/x509 package calls out this change in behavior here.
  • Interestingly, scaling the MSSQL container down and back up, followed by changing the SQL scrape configuration, appears to let the scrape work again (until it stops working once more).
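To make the failure mode concrete, here is a minimal, self-contained Go sketch (demo code, not part of Alloy or the exporter) that flips a freshly generated certificate's serial number negative and hands it to x509.ParseCertificate; on Go 1.23+ this reproduces the same "x509: negative serial number" error seen in the log above:

    package main

    import (
        "bytes"
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "crypto/x509"
        "crypto/x509/pkix"
        "fmt"
        "math/big"
        "time"
    )

    func main() {
        key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
        if err != nil {
            panic(err)
        }
        tmpl := &x509.Certificate{
            SerialNumber: big.NewInt(0x7F), // distinctive one-byte positive serial
            Subject:      pkix.Name{CommonName: "negative-serial-demo"},
            NotBefore:    time.Now(),
            NotAfter:     time.Now().Add(time.Hour),
        }
        der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
        if err != nil {
            panic(err)
        }
        // Flip the serial's single content byte from 0x7F to 0xFF, turning the
        // DER INTEGER negative (-1). ParseCertificate does not verify the
        // signature, so the mangled bytes still exercise the parser. Assumption:
        // the first "02 01 7F" in the DER is the serial field, which holds here
        // because the serial is the second field of the TBSCertificate.
        i := bytes.Index(der, []byte{0x02, 0x01, 0x7F})
        if i < 0 {
            panic("serial bytes not found")
        }
        der[i+2] = 0xFF
        _, err = x509.ParseCertificate(der)
        fmt.Println(err) // Go 1.23+: "x509: negative serial number"
    }

Running this with GODEBUG=x509negativeserial=1 in the environment makes the parse succeed again, which matches the workaround above: since Alloy ships as a prebuilt binary, end users can't add a go.mod godebug directive, so setting that environment variable on the Alloy container is the practical form of the workaround. The longer-term fix is to reissue the SQL Server certificate with a positive serial number, as RFC 5280 requires.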
