
datasets download virus genome taxon panics with segmentation fault for all taxa, starting around 2025-11-24 9am UTC #539

@corneliusroemer

Describe the bug
Since around 9am UTC today (2025-11-24), all datasets download virus genome taxon requests fail with:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x18 pc=0x1026d67f0]

At Loculus, we mirror certain taxa every 2 hours, so we can pin the start of the failure down to a 2-hour window:

https://github.com/loculus-project/loculus/actions/workflows/datasets-mirror-priority-1.yml

To Reproduce

datasets download virus genome taxon 11234 --debug

Expected behavior
No panic

Logs

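The issue seems to be related to the taxonomy service. The panic is raised in ResolveTaxons.go (GetMetadata, line 143), after the POST to /datasets/v2/taxonomy is sent and with no response logged for it.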
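
To check what the taxonomy service returns independently of the CLI, the second request from the debug log can be rebuilt by hand. The following is a minimal sketch in Go using only the standard library. The URL and JSON payload are copied from the log in this issue; the function names (taxonomyPayload, newTaxonomyRequest) are hypothetical and not part of the datasets CLI. The sketch only builds the request; passing it to http.DefaultClient.Do would actually send it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Endpoint taken from the debug log (Host + request path).
const taxonomyURL = "https://api.ncbi.nlm.nih.gov/datasets/v2/taxonomy"

// taxonomyQuery mirrors the JSON body seen in the log. The struct itself
// is a hypothetical helper, not a type from the datasets CLI.
type taxonomyQuery struct {
	ReturnedContent string   `json:"returned_content"`
	Taxons          []string `json:"taxons"`
}

// taxonomyPayload encodes the request body for a single taxon.
func taxonomyPayload(taxon string) ([]byte, error) {
	return json.Marshal(taxonomyQuery{
		ReturnedContent: "METADATA",
		Taxons:          []string{taxon},
	})
}

// newTaxonomyRequest builds (but does not send) the POST request.
func newTaxonomyRequest(taxon string) (*http.Request, error) {
	payload, err := taxonomyPayload(taxon)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, taxonomyURL, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/json")
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	payload, err := taxonomyPayload("11234")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	req, err := newTaxonomyRequest("11234")
	if err != nil {
		fmt.Println("request error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(payload))
}
```

Sending this request and inspecting the status code and body would show whether the service answers with an error or an unexpected shape for taxon 11234.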

$ datasets download virus genome taxon 11234 --filename 11234.zip --debug
2025/11/24 13:26:24
POST /datasets/v2/taxonomy/taxon_suggest HTTP/1.1
Host: api.ncbi.nlm.nih.gov
User-Agent: OpenAPI-Generator/1.0.0/go
Content-Length: 128
Accept: application/json
Content-Type: application/json
Ncbi-Phid: 71E4B0D51DB8E7C53AA5BB51
X-Datasets-Client: datasets-cli
X-Datasets-Client-Arch: arm64
X-Datasets-Client-Cmd: download virus genome taxon 11234 --filename 11234.zip --debug
X-Datasets-Client-Os: darwin
X-Datasets-Client-Version: 18.9.0
Accept-Encoding: gzip

{"exact_match":true,"tax_rank_filter":"higher_taxon","taxon_query":"11234","taxon_resource_filter":"TAXON_RESOURCE_FILTER_ALL"}

2025/11/24 13:26:24
HTTP/2.0 200 OK
Access-Control-Expose-Headers: X-RateLimit-Limit,X-RateLimit-Remaining
Content-Security-Policy: upgrade-insecure-requests
Content-Type: application/json
Date: Mon, 24 Nov 2025 12:26:23 GMT
Grpc-Metadata-Via: h2 linkerd
Ncbi-Phid: 71E4B0D51DB8E7C53AA5BB51.1.1.1
Server: Finatra
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Datasets-Version: 18.9.1
X-Ratelimit-Limit: 5
X-Ratelimit-Remaining: 4
X-Ua-Compatible: IE=Edge
X-Xss-Protection: 1; mode=block


2025/11/24 13:26:24
POST /datasets/v2/taxonomy HTTP/1.1
Host: api.ncbi.nlm.nih.gov
User-Agent: OpenAPI-Generator/1.0.0/go
Content-Length: 51
Accept: application/json
Content-Type: application/json
Ncbi-Phid: 71E4B0D51DB8E7C53AA5BB51
X-Datasets-Client: datasets-cli
X-Datasets-Client-Arch: arm64
X-Datasets-Client-Cmd: download virus genome taxon 11234 --filename 11234.zip --debug
X-Datasets-Client-Os: darwin
X-Datasets-Client-Version: 18.9.0
Accept-Encoding: gzip

{"returned_content":"METADATA","taxons":["11234"]}

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x18 pc=0x1026d67f0]

goroutine 1 [running]:
datasets_cli/v2/datasets.(*taxonAutosuggestApi).GetMetadata(0x14000049aa0, {0x16dc95bbd, 0x5}, {0x1026fa873, 0x8})
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:143 +0x150
datasets_cli/v2/datasets.(*taxonAutosuggestApi).CheckLineage(0x16dc95bbd?, {0x16dc95bbd, 0x5}, 0x27ff)
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:148 +0x34
datasets_cli/v2/datasets.(*taxonAutosuggestApi).GetOrganisms(0x14000049aa0, {0x16dc95bbd?, 0x0?}, 0x1, {0x102705507, 0x19}, {0x1026f924a, 0x5}, 0xa, {0x14000049b64, ...})
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:415 +0x114
datasets_cli/v2/datasets.RetrieveTaxIdsForTaxons(0x1400022ef08, {0x14000113040, 0x1, 0x16dc95bbd?}, 0x1, {0x102705507, 0x19}, {0x1026f924a, 0x5}, {0x14000049b64, ...})
	apps/public/Datasets/v2/datasets/ResolveTaxons.go:112 +0xf4
datasets_cli/v2/datasets.createDownloadVirusGenomeTaxonCmd.func1(0x1400022ef08, {0x14000131200?, 0x1?, 0x4?})
	apps/public/Datasets/v2/datasets/DownloadVirusGenomeTaxon.go:43 +0x14c
github.com/spf13/cobra.(*Command).execute(0x1400022ef08, {0x140001311c0, 0x4, 0x4})
	external/gazelle~~go_deps~com_github_spf13_cobra/command.go:985 +0x834
github.com/spf13/cobra.(*Command).ExecuteC(0x102dcdbe0)
	external/gazelle~~go_deps~com_github_spf13_cobra/command.go:1117 +0x344
github.com/spf13/cobra.(*Command).Execute(...)
	external/gazelle~~go_deps~com_github_spf13_cobra/command.go:1041
datasets_cli/v2/datasets.Execute()
	apps/public/Datasets/v2/datasets/root.go:422 +0x24
main.main()
	apps/public/Datasets/v2/cmd/datasets/main.go:10 +0x1c
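
The trace points at a nil pointer dereference inside GetMetadata, and the small fault address (addr=0x18) is consistent with reading a field through a nil struct pointer. The CLI source is not shown in this issue, so the following is only a sketch of that general failure mode and of a guard that would turn it into an error. All types and functions here (taxonomyMatch, taxonomyNode, rankUnsafe, rankSafe) are hypothetical stand-ins, not the actual datasets CLI code.

```go
package main

import (
	"errors"
	"fmt"
)

// taxonomyNode and taxonomyMatch are hypothetical stand-ins for whatever
// the taxonomy response decodes into; the real types are not in this issue.
type taxonomyNode struct {
	TaxID int
	Rank  string
}

type taxonomyMatch struct {
	Taxonomy *taxonomyNode // nil if the service omits or changes this part
}

var errNoTaxonomy = errors.New("taxonomy response has no taxonomy node")

// rankUnsafe dereferences the nested pointer without checking it,
// so it panics when Taxonomy is nil.
func rankUnsafe(m *taxonomyMatch) string {
	return m.Taxonomy.Rank
}

// rankSafe checks both pointers and reports an error instead of panicking.
func rankSafe(m *taxonomyMatch) (string, error) {
	if m == nil || m.Taxonomy == nil {
		return "", errNoTaxonomy
	}
	return m.Taxonomy.Rank, nil
}

// panics reports whether calling f triggers a panic.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	empty := &taxonomyMatch{} // simulates a response missing the node

	fmt.Println("unsafe access panics:", panics(func() { _ = rankUnsafe(empty) }))

	if _, err := rankSafe(empty); err != nil {
		fmt.Println("safe access returns error:", err)
	}
}
```

With a guard of this kind, an unexpected or empty response from the service would surface as an error message rather than a SIGSEGV.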
