Describe the bug
We are using self-hosted runners deployed on EKS, and we recently started experiencing repeated crashes of our runners. This doesn't seem to be related to any change other than the previously long-running containers having been restarted.
Our Dockerfile:
FROM debian:bookworm
ARG RUNNER_VERSION
ENV GITHUB_PERSONAL_TOKEN=""
RUN apt-get update && // ... install lots of background dependencies and packages
USER github
WORKDIR /home/github
RUN curl -O -L https://github.com/actions/runner/releases/download/v$RUNNER_VERSION/actions-runner-linux-x64-$RUNNER_VERSION.tar.gz && \
tar xzf ./actions-runner-linux-x64-$RUNNER_VERSION.tar.gz && \
sudo ./bin/installdependencies.sh
COPY --chown=github:github entrypoint.sh ./entrypoint.sh
RUN sudo chmod u+x ./entrypoint.sh
ENTRYPOINT ["/home/github/entrypoint.sh"]
To Reproduce
After running for some time, the runner container crashes and restarts. The logs typically look like this:
Requesting registration URL at 'https://api.github.com/orgs/acme/actions/runners/registration-token'
--------------------------------------------------------------------------------
| ____ _ _ _ _ _ _ _ _ |
| / ___(_) |_| | | |_ _| |__ / \ ___| |_(_) ___ _ __ ___ |
| | | _| | __| |_| | | | | '_ \ / _ \ / __| __| |/ _ \| '_ \/ __| |
| | |_| | | |_| _ | |_| | |_) | / ___ \ (__| |_| | (_) | | | \__ \ |
| \____|_|\__|_| |_|\__,_|_.__/ /_/ \_\___|\__|_|\___/|_| |_|___/ |
| |
| Self-hosted runner registration |
| |
--------------------------------------------------------------------------------
# Authentication
√ Connected to GitHub
# Runner Registration
A runner exists with the same name
√ Successfully replaced the runner
√ Runner connection is good
# Runner settings
√ Settings Saved.
√ Connected to GitHub
A session for this runner already exists.
2024-12-14 22:04:06Z: Runner connect error: The actions runner runner-5969d445d5-7vx8v already has an active session.. Retrying until reconnected.
√ Connected to GitHub
A session for this runner already exists.
√ Connected to GitHub
A session for this runner already exists.
√ Connected to GitHub
A session for this runner already exists.
√ Connected to GitHub
A session for this runner already exists.
√ Connected to GitHub
A session for this runner already exists.
√ Connected to GitHub
A session for this runner already exists.
√ Connected to GitHub
A session for this runner already exists.
Stop retry on SessionConflictException after retried for 240 seconds.
Failed to create session. The actions runner runner-5969d445d5-7vx8v already has an active session.
Runner listener exit with Session Conflict error, stop the service, no retry needed.
Exiting runner...
Expected behavior
Runner should not crashloop.
Runner Version and Platform
Runner version 2.320.0, running in a debian:bookworm-based Docker image on EKS version v1.30.6-eks-7f9249a.
Runner and Worker's Diagnostic Logs
We don't have diagnostic logs of crashed runners because this info is not persisted. If relevant, we could persist it for debugging.
As far as I can see in running containers, nothing besides INFO logs appears.
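If persisting the diagnostics turns out to be useful, one possible approach is to copy the runner's `_diag` directory to a mounted volume before the container exits. The following is only a minimal sketch: the function name `persist_diag_logs`, its two directory arguments, and the `/mnt/runner-diag` mount path in the comment are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: persist_diag_logs and both directory arguments are hypothetical.

# Copy <runner_dir>/_diag into a timestamped folder under <dest_root> and
# print the folder path. Returns 1 if there is no _diag directory to copy.
persist_diag_logs() {
  local runner_dir="$1" dest_root="$2" host dest
  [ -d "${runner_dir}/_diag" ] || return 1
  host="$(hostname 2>/dev/null || uname -n)"
  dest="${dest_root}/${host}-$(date -u +%Y%m%dT%H%M%SZ)"
  mkdir -p "${dest}" \
    && cp -R "${runner_dir}/_diag/." "${dest}/" \
    && printf '%s\n' "${dest}"
}

# Possible use in entrypoint.sh (the mount path is an assumption):
#   trap 'persist_diag_logs /home/github /mnt/runner-diag' EXIT
```

The `EXIT` trap only helps when the entrypoint shell itself gets a chance to exit; a container killed with SIGKILL would still lose its logs.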
Some unofficial hints to improve reliability of your container.
# Don't use a runner name that persists across restarts,
# e.g. don't use the hostname alone; append a UUID.
# A crashed runner doesn't close its session.
# The session stays blocked until the Actions service cleans it up.
# This takes more than 5 minutes; an instant container restart with the same runner name might reset the timer!
# I assume `--replace` doesn't kill a stale session for a given runner name.
./config.sh \
--name "$(hostname)-$(uuidgen)" \
--token "${RUNNER_TOKEN}" \
--url https://github.com/acme \
--unattended \
--replace
remove() {
# You need to regenerate RUNNER_TOKEN if it is about 1 hour old, or you get "permission denied".
# At least the configure command has a `--pat` option that lets the runner make the REST API call itself,
# but caching this token saves rate-limited API calls if it is reused for removal within 1 hour.
./config.sh remove --unattended --token "${RUNNER_TOKEN}"
}
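Putting the hints above together, an entrypoint could look roughly like the sketch below: register under a unique name, run the listener in the background so the shell can react to SIGTERM, and deregister with a freshly fetched token. This is an untested outline rather than a drop-in script. `RUNNER_DIR`, `CONFIG_SH`, `GITHUB_ORG`, `RUNNER_PID`, `STOPPING`, and all function names are hypothetical; `GITHUB_PERSONAL_TOKEN` is the variable from the Dockerfile in the report. The `registration-token` URL is the one shown in the log, and `remove-token` is, as far as I know, its counterpart in the GitHub REST API. Whether `run.sh` forwards SIGTERM to the listener process is not verified here.

```shell
#!/usr/bin/env bash
# Sketch only. RUNNER_DIR, CONFIG_SH, GITHUB_ORG and every function name are
# hypothetical; GITHUB_PERSONAL_TOKEN is the variable from the Dockerfile.

RUNNER_DIR="${RUNNER_DIR:-/home/github}"
CONFIG_SH="${CONFIG_SH:-${RUNNER_DIR}/config.sh}"

# A name that changes on every container start, so a restart cannot collide
# with a session the service still holds for the previous name.
unique_runner_name() {
  local host suffix
  host="$(hostname 2>/dev/null || uname -n)"
  if command -v uuidgen >/dev/null 2>&1; then
    suffix="$(uuidgen)"
  else
    suffix="$(cat /proc/sys/kernel/random/uuid)"  # Linux-only fallback
  fi
  printf '%s-%s\n' "${host}" "${suffix}"
}

# Read the "token" field from a JSON body on stdin (avoids a jq dependency).
token_from_json() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Ask the API for a short-lived token; $1 is "registration" or "remove".
fetch_runner_token() {
  curl -fsSL -X POST \
    -H "Authorization: Bearer ${GITHUB_PERSONAL_TOKEN}" \
    -H "Accept: application/vnd.github+json" \
    "https://api.github.com/orgs/${GITHUB_ORG}/actions/runners/$1-token" \
    | token_from_json
}

# Deregister with a fresh token, since a cached one may have expired.
# The config.sh arguments mirror the remove() snippet above.
deregister_runner() {
  local token
  token="$(fetch_runner_token remove)"
  [ -n "${token}" ] || { echo "could not fetch a removal token" >&2; return 1; }
  "${CONFIG_SH}" remove --unattended --token "${token}"
}

# Register, run the listener, and deregister however the listener ends.
run_runner() {
  local token status=0
  token="$(fetch_runner_token registration)"
  "${CONFIG_SH}" --name "$(unique_runner_name)" --token "${token}" \
    --url "https://github.com/${GITHUB_ORG}" --unattended --replace || return 1

  # Background the listener: a foreground child would delay the trap until
  # it exits, while `wait` is interrupted as soon as a trapped signal arrives.
  STOPPING=""
  "${RUNNER_DIR}/run.sh" &
  RUNNER_PID=$!
  trap 'STOPPING=1; kill -TERM "${RUNNER_PID}" 2>/dev/null' TERM INT
  wait "${RUNNER_PID}" || status=$?
  if [ -n "${STOPPING}" ]; then
    # The first wait was interrupted by the signal; wait again for the real exit.
    status=0
    wait "${RUNNER_PID}" || status=$?
  fi
  trap - TERM INT
  deregister_runner
  return "${status}"
}
```

With these functions in place, the last line of `entrypoint.sh` would simply be `run_runner`. Deregistering on every exit path is a design choice here: it trades one extra API call per shutdown for not leaving a named runner behind.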