Actions Runner fails on IPv6 only host #3138

Open
alsutton opened this issue Feb 8, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@alsutton

alsutton commented Feb 8, 2024

Describe the bug
On an IPv6-only host, a self-hosted Actions runner fails to register itself as online because a network request to a hostname that resolves only to an IPv4 address fails.

Context: Many hosting companies are now charging for IPv4 addresses (most recently AWS), so having IPv6-only self-hosted runners is desirable from an operational cost perspective.

To Reproduce

  1. Configure a self-hosted Actions runner on a dual-stack (IPv4/IPv6) host.
  2. Reconfigure the host to be IPv6-only.
  3. Try to start the runner.
  4. The runner fails to register and reports an error in the logs (below).

Expected behavior
I expected the runner to start and register as online.

Runner Version and Platform

Version of your runner?
2.312.0

OS of the machine running the runner? OSX/Windows/Linux/...
Linux (Debian 12)

What's not working?

The start-up and registration flow connects to a host (pipelinesghubeus4.actions.githubusercontent.com) that is reachable only over IPv4.

nslookup for the host returns no IPv6 address:

$ nslookup pipelinesghubeus4.actions.githubusercontent.com
Server:  UnKnown
Address:  fdc9:993e:6df:0:3e9e:c7ff:fe92:2ee8

Non-authoritative answer:
Name:    pipelinesghubeus4.actions.githubusercontent.com
Address:  20.232.252.48
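
The same check can be scripted. The following is a minimal Python sketch, assuming only the standard-library `socket` module; the helper names `address_families` and `has_ipv6_address` are hypothetical and are not part of the runner. It reports which address families a hostname resolves to, which is what decides whether an IPv6-only host can reach it at all.

```python
import socket


def address_families(addrinfo):
    """Return the family labels ('IPv4', 'IPv6') present in getaddrinfo() results."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    return {labels[entry[0]] for entry in addrinfo if entry[0] in labels}


def has_ipv6_address(host, port=443):
    """Hypothetical helper: True if the resolver returns at least one IPv6 address."""
    try:
        info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name did not resolve at all.
        return False
    return "IPv6" in address_families(info)


if __name__ == "__main__":
    # Hostname taken from the log in this issue; the result depends on live DNS.
    host = "pipelinesghubeus4.actions.githubusercontent.com"
    print(host, "has an IPv6 address:", has_ipv6_address(host))
```

On a network with DNS64, the resolver may synthesize IPv6 addresses, so a positive result there does not by itself show that the service is natively dual-stack.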

Job Log Output

Runner_20240208-075819-utc.log, potential token redacted:

[2024-02-08 07:59:02Z INFO GitHubActionsService] Starting operation Location.GetConnectionData
[2024-02-08 07:59:02Z WARN GitHubActionsService] Attempt 1 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 11.01 seconds.
[2024-02-08 07:59:11Z WARN GitHubActionsService] Attempt 2 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 12.97 seconds.
[2024-02-08 07:59:12Z WARN GitHubActionsService] Attempt 2 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 12.91 seconds.
[2024-02-08 07:59:13Z WARN GitHubActionsService] Attempt 2 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 12.829 seconds.
[2024-02-08 07:59:24Z WARN GitHubActionsService] Attempt 3 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 17.574 seconds.
[2024-02-08 07:59:25Z WARN GitHubActionsService] Attempt 3 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 18.057 seconds.
[2024-02-08 07:59:25Z WARN GitHubActionsService] Attempt 3 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The operation will be retried in 18.337 seconds.
[2024-02-08 07:59:41Z ERR  GitHubActionsService] Attempt 4 of GET request to https://pipelinesghubeus4.actions.githubusercontent.com/{{REDACTED}}/_apis/connectionData?connectOptions=0&lastChangeId=1500696579&lastChangeId64=1500696579 failed (Socket Error: NetworkUnreachable). The maximum number of attempts has been reached.
[2024-02-08 07:59:41Z INFO GitHubActionsService] Finished operation Location.GetConnectionData
[2024-02-08 07:59:41Z INFO RunnerServer] Catch exception during connect. 3 attempt left.
[2024-02-08 07:59:41Z ERR  RunnerServer] System.Net.Http.HttpRequestException: Network is unreachable (pipelinesghubeus4.actions.githubusercontent.com:443)
 ---> System.Net.Sockets.SocketException (101): Network is unreachable
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at GitHub.Services.Common.VssHttpRetryMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.Location.Client.LocationHttpClient.GetConnectionDataAsync(ConnectOptions connectOptions, Int64 lastChangeId, CancellationToken cancellationToken, Object userState)
   at GitHub.Services.WebApi.Location.VssServerDataProvider.GetConnectionDataAsync(ConnectOptions connectOptions, Int32 lastChangeId, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.Location.VssServerDataProvider.ConnectAsync(ConnectOptions connectOptions, CancellationToken cancellationToken)
   at GitHub.Runner.Common.RunnerService.EstablishVssConnection(Uri serverUrl, VssCredentials credentials, TimeSpan timeout)
[2024-02-08 07:59:41Z ERR  RunnerServer] #####################################################
[2024-02-08 07:59:41Z ERR  RunnerServer] System.Net.Sockets.SocketException (101): Network is unreachable
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
[2024-02-08 07:59:41Z INFO GitHubActionsService] Starting operation Location.GetConnectionData
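
For anyone triaging similar logs, the retry lines above can be summarized per host. This is a rough Python sketch, assuming the log line layout shown in this issue; the function name `summarize_failures` and the regular expression are illustrative assumptions, not anything the runner provides.

```python
import re
from collections import Counter

# Matches lines such as:
# [2024-02-08 07:59:02Z WARN GitHubActionsService] Attempt 1 of GET request to
# https://<host>/<path> failed (Socket Error: NetworkUnreachable). ...
FAILED_ATTEMPT = re.compile(
    r"\[(?P<ts>[\d\- :]+Z) (?P<level>\w+)\s+(?P<source>\w+)\] "
    r"Attempt (?P<attempt>\d+) of GET request to https://(?P<host>[^/\s]+)/\S* "
    r"failed \(Socket Error: (?P<error>\w+)\)"
)


def summarize_failures(lines):
    """Count failed request attempts, keyed by (host, socket error)."""
    counts = Counter()
    for line in lines:
        match = FAILED_ATTEMPT.search(line)
        if match:
            counts[(match["host"], match["error"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < Runner_<timestamp>-utc.log
    for (host, error), count in summarize_failures(sys.stdin).items():
        print(f"{host}: {count} attempts failed with {error}")
```

If every failure for a host is NetworkUnreachable and the host has no IPv6 address, the cause is most likely missing IPv4 connectivity rather than a firewall or TLS problem.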

Runner and Worker's Diagnostic Logs

Logs given above

@alsutton alsutton added the bug Something isn't working label Feb 8, 2024
@alsutton
Author

alsutton commented Feb 9, 2024

Possible duplicate of #2840. Is there any news on progress?

@DracoBlue

I am interested in this, too.

alpeb added a commit to linkerd/linkerd2-proxy-init that referenced this issue Mar 28, 2024
## Flags Changes

This adds the proxy-init flag `--iptables-mode` (with possible values `legacy` and `nft`), which supersedes `--firewall-bin-path` and `--firewall-save-bin-path` (which still remain supported).
Also the `--ipv6` flag has been added (default `true`).

After the set of rules run via iptables are processed, if `--ipv6` is true (which is the default), the same set of rules will be run via ip6tables.

Analogous changes were applied to linkerd-cni as well.

## Backwards-Compatibility

This is backwards-compatible with older control planes and upcoming control planes.
If `--ipv6` is not passed (and thus defaults to true), this doesn't impact operation even if the cluster doesn't support IPv6; the ip6tables rules are applied but they're innocuous.
On the other hand, if there's no kernel support for IPv6 (which is the case for GitHub runners*), the ip6tables command will fail, but we just log the failure and do not fail the linkerd-init container (nor the `add` command for linkerd-cni). This avoids having to explicitly set `--ipv6=false`, but it can be set if the user is aware of such limitations and wants to get rid of the errors.
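
The log-and-continue behaviour described above can be sketched as follows. This is a Python illustration of the control flow only, not linkerd's actual (Go) implementation; the function name `apply_rules`, the injectable `run` parameter, and the choice to let IPv4 failures propagate are assumptions made for the example.

```python
import logging
import subprocess

log = logging.getLogger("proxy-init-sketch")


def apply_rules(rules, ipv6=True, run=subprocess.run):
    """Apply each rule via iptables, then (optionally) the same rules via ip6tables.

    Assumption for this sketch: iptables failures propagate to the caller,
    while ip6tables failures are logged and tolerated.
    Returns True if every requested command succeeded.
    """
    for rule in rules:
        run(["iptables", *rule], check=True)
    if not ipv6:
        return True
    try:
        for rule in rules:
            run(["ip6tables", *rule], check=True)
    except (subprocess.CalledProcessError, FileNotFoundError) as exc:
        # No kernel support for IPv6 (or no ip6tables binary): log and carry on.
        log.warning("ip6tables failed, continuing without IPv6 rules: %s", exc)
        return False
    return True
```

Passing `ipv6=False` corresponds to setting `--ipv6=false`: the ip6tables pass is skipped entirely, so no error is logged.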

## Testing Improvements

The cni-plugin-integration workflow has been simplified by using a matrix strategy, and enhanced by parameterizing the iptables-mode config.

## Linkerd IPv6 Support

This allows routing IPv6 traffic to the proxy, but is just the first step towards IPv6/dual-stack support. Control plane and proxy changes will come up next.

## (*) GitHub Runners IPv6 Support

Even though `modinfo` signals support for IPv6, `ip6tables` commands throw modprobe errors. Indeed, according to actions/runner-images#668 support is not there yet.
Also, according to actions/runner#3138 there are issues with hosted runners as well, but that might not affect us if we still expose an IPv4 interface to interact with GitHub. Something to take into account when we get to IPv6 integration testing.
undergroundwires added a commit to undergroundwires/privacy.sexy that referenced this issue Mar 29, 2024
This commit introduces the `force-ipv4` GitHub action to address
connectivity issues caused by the lack of IPv6 support in GitHub
runners. Details:
- actions/runner#3138
- actions/runner-images#668

This change solves connection problems when Node's `fetch` API fails due
to `UND_ERR_CONNECT_TIMEOUT` errors. Details:
- actions/runner-images#9540
- actions/runner#3213

This action disables IPv6 at the system level, ensuring all outgoing
requests use IPv4. This resolves connectivity issues when running
external URL checks and Docker build checks.

This solution is a temporary workaround until GitHub runners support
IPv6 or the Node `fetch` API has a working solution such as Happy
Eyeballs.
Details:
- nodejs/node#41625
- nodejs/undici#1531
@undergroundwires

Here's my workaround (open-source and documented), which I hope can help you too:

After days of research and trial/error, this is how I got this working:

  1. Create a script called force-ipv4.sh that configures the system to prefer IPv4 over IPv6, and call it to configure the machine. Finding a reliable cross-platform solution was not easy; I went with Cloudflare WARP for DNS resolution along with some system configuration.
  2. To make the script easy to use in GitHub workflows, create a GitHub action called force-ipv4 and call it on GitHub runners.
  3. This fixes the IPv6 request issues, so you can run, for example, the fetch API from Node.

Related commit introducing this fix: undergroundwires/privacy.sexy@52fadcd

undergroundwires added a commit to undergroundwires/privacy.sexy that referenced this issue Mar 30, 2024
This commit upgrades Node.js version to v20.x in CI/CD environment.

Previously used Node 18.x is moving towards end-of-life, with a planned
date of 2025-04-30. In contrast, Node 20.x has been offering long-term
support (LTS) since 2023-10-24. This makes Node 20.x a stable and
recommended version for production environments.

This commit also configures `actions/setup-node` with the
`check-latest` flag to always use the latest Node 20.x version, keeping
CI/CD setup up-to-date with minimal maintenance.
Details:
- actions/setup-node#165
- actions/setup-node#160

Using Node 20.x in CI/CD environments provides better compatibility with
Electron v29.0 which moves to Node 20.x.
Details:
- electron/electron#40343

This upgrade improves network connection handling in CI/CD pipelines
(where issues occur due to GitHub runners not supporting IPv6).
Details:
- actions/runner#3138
- actions/runner-images#668
- actions/runner#3213
- actions/runner-images#9540

Node 20.x adopts the Happy Eyeballs algorithm for improved IPv6
connectivity.
- nodejs/node#40702
- nodejs/node#41625
- nodejs/node#44731

This mitigates issues like `UND_ERR_CONNECT_TIMEOUT` and localhost DNS
resolution in CI/CD environments:
Details:
- nodejs/node#40537
- actions/runner#3213
- actions/runner-images#9540

Node 20 includes `net.setDefaultAutoSelectFamily`, a function added in
Node 19.4.0 that sets the process-wide default, enabling better IPv4
fallback, especially in environments with limited or problematic IPv6
support.
Details:
- nodejs/node#45777

Node 20.x defaults to the new `autoSelectFamily`, improving network
connection reliability in GitHub runners lacking full IPv6 support.
Details:
- nodejs/node#46790
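
The commit above concerns Node, but the idea behind Happy Eyeballs-style family selection is language-neutral: candidate addresses are ordered so that families alternate, which means one broken family cannot stall the whole connection attempt. Below is a small Python sketch of that ordering step only; the function name `interleave_by_family` is hypothetical, the addresses are documentation examples, and real implementations also stagger the attempts in time, which is not shown here.

```python
import ipaddress
from itertools import zip_longest


def interleave_by_family(addresses, prefer_ipv6=True):
    """Order candidate addresses so that IPv6 and IPv4 entries alternate.

    The relative order within each family is preserved.
    """
    v6 = [a for a in addresses if ipaddress.ip_address(a).version == 6]
    v4 = [a for a in addresses if ipaddress.ip_address(a).version == 4]
    first, second = (v6, v4) if prefer_ipv6 else (v4, v6)
    ordered = []
    for a, b in zip_longest(first, second):
        if a is not None:
            ordered.append(a)
        if b is not None:
            ordered.append(b)
    return ordered


if __name__ == "__main__":
    candidates = ["192.0.2.1", "2001:db8::1", "192.0.2.2"]
    print(interleave_by_family(candidates))
```

With this ordering, a client on a host with broken IPv6 tries an IPv4 address second instead of exhausting every IPv6 candidate first.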
@KeithYeh

We faced the same issue when migrating our self-hosted runners to IPv6-only VPC on AWS.
BTW, their CloudFormationInit endpoint does not support IPv6 either.

@esantora

The runner needs a configurable switch to use IPv4 only in dual-stack systems until GitHub actually supports IPv6.

@touhidurrr

Still facing this issue.

@guidograzioli

I had been experiencing this problem only rarely, but in the last few weeks it has started to happen ten times a day.

@basvandijk

Note that our CI has to run tests that require IPv6, so we have to patch the runner (dfinity@3b9c15c) to not create a local docker network and instead use --network host (essentially using the pod's network namespace, which has IPv6 configured).

Ideally this is just a runner's configuration option so we don't have to maintain our own runner fork.

@marko-k0

marko-k0 commented Dec 13, 2024

> Note that our CI has to run tests that require IPv6 so we have to patch the runner (dfinity@3b9c15c) to not create a local docker network and instead use --network host (essentially using pod's network namespace that has IPv6 configured):
>
> Ideally this is just a runner's configuration option so we don't have to maintain our own runner fork.

Correct. The changes proposed in #3622 would allow us to use the --network option.

9 participants