machine/hyperV: move ssh mounts to after the ready check#28871

Open
Luap99 wants to merge 1 commit into podman-container-tools:main from Luap99:hyperv-flake

Luap99 commented Jun 5, 2026

Member

We are seeing frequent flakes in hyperV machine tests. The machine start fails with an ssh handshake failure:

ssh: handshake failed: read tcp 127.0.0.1:56425->127.0.0.1:56377:
wsarecv: An existing connection was forcibly closed by the remote host.

Normally the ssh probe runs in conductVMReadinessCheck(), which has a retry mechanism. However, the hyperV mount code already used ssh in PostStartNetworking(), so a handshake failure there aborted the start before the readiness check was ever reached.

PostStartNetworking() seems like the wrong place to mount anyway, so move the mounts to MountVolumesToVM() instead. That function already runs after the ready check, so ssh should be working by then.

Does this PR introduce a user-facing change?

Fixed a possible race condition when starting hyperV machines.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>

Luap99 added the "No New Tests" label (Allow PR to proceed without adding regression tests) Jun 5, 2026

Luap99 commented Jun 5, 2026

Member Author

cc @baude @mheon

Luap99 commented Jun 5, 2026

Member Author
2026-06-05T18:20:50.6580288Z podman machine set
2026-06-05T18:20:50.6580919Z D:/a/podman/podman/pkg/machine/e2e/set_test.go:15
2026-06-05T18:20:50.6582140Z   set machine cpus, disk, memory
2026-06-05T18:20:50.6583082Z   D:/a/podman/podman/pkg/machine/e2e/set_test.go:31
2026-06-05T18:20:50.6584226Z   > Enter [BeforeEach] TOP-LEVEL - D:/a/podman/podman/pkg/machine/e2e/machine_test.go:218 @ 06/05/26 18:20:50.657
2026-06-05T18:20:50.6594902Z   < Exit [BeforeEach] TOP-LEVEL - D:/a/podman/podman/pkg/machine/e2e/machine_test.go:218 @ 06/05/26 18:20:50.658 (1ms)
2026-06-05T18:20:50.6596410Z   > Enter [It] set machine cpus, disk, memory - D:/a/podman/podman/pkg/machine/e2e/set_test.go:31 @ 06/05/26 18:20:50.658
2026-06-05T18:20:50.6598982Z   D:\a\podman\podman\bin\windows\podman.exe machine init --disk-size 11 --image C:\Users\RUNNER~1\AppData\Local\Temp\podman-machine.x86_64.hyperv.vhdx 9d5b1a93fca1
2026-06-05T18:21:23.3820837Z   Machine init complete
2026-06-05T18:21:23.3821609Z   To start your machine run:
2026-06-05T18:21:23.3821892Z 
2026-06-05T18:21:23.3837796Z   	podman machine start 9d5b1a93fca1
2026-06-05T18:21:23.3838255Z 
2026-06-05T18:21:23.3910517Z   D:\a\podman\podman\bin\windows\podman.exe machine set --memory 524288 9d5b1a93fca1
2026-06-05T18:21:23.5157753Z   Error: requested amount of memory (524288 MB) greater than total system memory (16378 MB)
2026-06-05T18:21:23.5309499Z   D:\a\podman\podman\bin\windows\podman.exe machine set --cpus 2 --disk-size 102 --memory 4096 9d5b1a93fca1
2026-06-05T18:21:25.2914933Z   D:\a\podman\podman\bin\windows\podman.exe machine set --cpus 2 --disk-size 5 --memory 4096 9d5b1a93fca1
2026-06-05T18:21:25.4184815Z   Error: new disk size must be larger than 102 GB
2026-06-05T18:21:25.4303190Z   D:\a\podman\podman\bin\windows\podman.exe machine start 9d5b1a93fca1
2026-06-05T18:21:25.5487241Z   Starting machine "9d5b1a93fca1"
2026-06-05T18:22:41.9670457Z 
2026-06-05T18:22:41.9673097Z   This machine is currently configured in rootless mode. If your containers
2026-06-05T18:22:41.9674251Z   require root permissions (e.g. ports < 1024), or if you run into compatibility
2026-06-05T18:22:41.9676692Z   issues with non-podman clients, you can switch using the following command:
2026-06-05T18:22:41.9677303Z 
2026-06-05T18:22:41.9678584Z   	podman machine set --rootful 9d5b1a93fca1
2026-06-05T18:22:41.9678941Z 
2026-06-05T18:23:13.2504756Z   Error: machine did not transition into running state: ssh error: ssh: handshake failed: read tcp 127.0.0.1:63178->127.0.0.1:63088: wsarecv: An existing connection was forcibly closed by the remote host.
2026-06-05T18:23:13.2746605Z   [FAILED] Expected
2026-06-05T18:23:13.2747395Z       <int>: 125
2026-06-05T18:23:13.2748714Z   to match exit code:
2026-06-05T18:23:13.2749102Z       <int>: 0
2026-06-05T18:23:13.2897930Z   In [It] at: D:/a/podman/podman/pkg/machine/e2e/set_test.go:57 @ 06/05/26 18:23:13.273

Hmm, that does not seem to work; it still fails, though now only after the longer retry timeout. So it seems more likely that ssh is not coming up at all, or that there is a gvproxy error.

mheon commented Jun 5, 2026

Contributor

Even if it's not fixing the issue, still LGTM for me. This is a much cleaner way of doing things and should be less race-prone.

Labels

machine, No New Tests (Allow PR to proceed without adding regression tests)
