Running with gitlab-runner 16.0.1 (79704081)
on #SERVER#
Preparing the "docker" executor
Using Docker executor with image python:3.10.12 ...
Pulling docker image python:3.10.12 ...
Using docker image sha256:23e11cf6844c334b2970fd265fb09cfe88ec250e1e80db7db973d69d757bdac4 for python:3.10.12 with digest docker.io/python@sha256:60ec661aff9aa0ec90bc10ceeab55d6d04ce7b384157d227917f3b49f2ddb32e ...
Preparing environment
Running on #RUNNER# via #SERVER#...
Getting source from Git repository 00:03
Fetching changes with git depth set to 50...
Initialized empty Git repository in #BUILD_GITDIR#
Created fresh repository.
Checking out #HASH# as detached HEAD (ref is test-build-change)...
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:01
Using docker image sha256:23e11cf6844c334b2970fd265fb09cfe88ec250e1e80db7db973d69d757bdac4 for python:3.10.12 with digest docker.io/python@sha256:60ec661aff9aa0ec90bc10ceeab55d6d04ce7b384157d227917f3b49f2ddb32e ...
shell not found
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: exit code 1
That works fine if we pin it back to python:3.10.11.
A further bit of discovery: the new Docker images are built from a Debian 12 (bookworm) base image rather than the previous Debian 11 (bullseye) image, presumably because there was a high-severity OpenSSL vulnerability (CVE-2023-2650).
python:3.10.11 image on Docker Hub
python:3.10.12 image on Docker Hub
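For reference, one way to check which Debian release a given tag is currently based on, assuming Docker is available locally (these tags move over time, so the output may differ from what is described above):
# Print the base distribution each image reports
docker run --rm python:3.10.11 cat /etc/os-release | grep PRETTY_NAME
docker run --rm python:3.10.12 cat /etc/os-release | grep PRETTY_NAME
At the time of this thread, those reported Debian 11 (bullseye) and Debian 12 (bookworm) respectively.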
Is it possible that the change in the underlying OS base image could have also changed the shell configuration/availability for these images, such that it’s not holding hands with the Gitlab runner correctly anymore?
Could be.
I've experienced something similar to what you are seeing once with a Windows runner, when a wrong shell command was defined in the runner configuration.
AFAIK:
Every docker image provides one or more shells (terminals) that can be used by a runner to execute the script defined in .gitlab-ci.yml. This is a prerequisite for any script to run in the job, and I believe this might be the reason why your script part is not executing. E.g. if I use ubuntu:latest, it provides a "/bin/sh" shell and a "/bin/bash" shell → this means the Runner has to use one of those shells as well.
GitLab Runner supports different shells, depending on the platform - Types of shells supported by GitLab Runner | GitLab. The shell can be configured in the runner's config.toml file. Normally the default works, but this is where things can get mismatched.
I might be wrong as well, but this could be something to check.
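As a concrete illustration of the point above, here is a rough way to list which shells an image actually ships, assuming local Docker access (this is not the runner's own detection logic, which isn't shown in this thread):
# Show the shell binaries inside the image the failing job uses
docker run --rm python:3.10.12 ls -l /bin/sh /bin/bash /bin/dash
# Compare against an image that works with the runner
docker run --rm python:3.10.11 ls -l /bin/sh /bin/bash /bin/dash
As later posts in this thread show, both images do contain these shells, so the question becomes why the runner fails to detect them.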
Are you using your own GitLab runners or shared runners from gitlab.com? If you have your own runners, can you please share your config.toml with us?
P.S. Have you tried adding this to your config file?
Confirming essentially what @DrCuriosity wrote above – the images that fail here were rebuilt from bullseye to bookworm, but in some cases the release number was not bumped. Several workarounds are below, including using 3.10.11 if you previously relied on 3.10.
It’s not clear to me what the source problem is. A similar problem occurred many years ago and is referenced in "shell not found" when trying to use Ubuntu or Fedora image (#27614) · Issues · GitLab.org / gitlab-runner · GitLab, but that issue is still open! Some suggest a newer version of Docker fixes the problem. However, I don’t think that’s the right answer.
I suspect that gitlab-runner does actually have a problem, perhaps by relying on bash instead of using purely POSIX shell scripts. But I could not reproduce the problem by running a container directly with the same inputs. The source code of gitlab-runner is quite convoluted; even with debugging, I could not ascertain what is really going on.
I also cannot understand why there is a difference between Debian 11 and 12. In analyzing diffs across the exported containers, I could not understand why the third workaround (see below) would have the effect it does (the checks are sketched after the list):
On both exported filesystems, /bin/sh points to /bin/dash
On both exported filesystems, /bin/dash and /bin/bash are real executables, each about the same size as its counterpart on the other image.
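Roughly how those two checks can be reproduced against the public images (a sketch, not the poster's exact procedure, and it assumes local Docker access rather than exported filesystems):
# Where does /bin/sh point in each image?
docker run --rm python:3.10.11 readlink -f /bin/sh
docker run --rm python:3.10.12 readlink -f /bin/sh
# Are dash and bash real executables of comparable size in both?
docker run --rm python:3.10.11 ls -l /bin/dash /bin/bash
docker run --rm python:3.10.12 ls -l /bin/dash /bin/bash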
Perhaps gitlab-runner is invoking a scriptlet or the container in some way that its --debug mode does not indicate.
OK, taking a step back:
python:3.10.12 is seen with bullseye in digest python@sha256:aa79a3d35cb9787452dad51e17e4b6e06822a1a601f8b4ac4ddf74f0babcbfd5. There are no problems with this image.
However, the same version of Python, under the same minor version number, was released under bookworm with the digest python@sha256:a8462db480ec3a74499a297b1f8e074944283407b7a417f22f20d8e2e1619782. This image will fail without workarounds.
Workarounds
Use the digest of the last working image, as suggested above (see the sketch after this list for one way to find it).
Find the most recent minor version number that still works: For 3.10, it’s 3.10.11. And pray some idiot doesn’t rebuild and re-push that image.
Use the fugly hack suggested on the GitLab issue tracker:
image:
  name: python:3.10
  entrypoint: [ '/bin/bash', '-c', 'ln -snf /bin/bash /bin/sh && /bin/bash -c $0' ]
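For workaround 1 above: the digest of a known-good image can be taken from a previously successful job log (the runner prints it, as in the log at the top of this thread), or resolved from a tag that still points at a working build. A rough sketch, assuming local Docker access:
# Resolve what a tag currently points to and print its pinnable digest
docker pull python:3.10.11
docker image inspect --format '{{index .RepoDigests 0}}' python:3.10.11
# The printed python@sha256:... reference can then be used as the job's image
# instead of the moving tag.
In .gitlab-ci.yml, that python@sha256:... reference goes in the image: line in place of the tag.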
Are you using your own GitLab runners or shared runners from gitlab.com? If you have your own runners, can you please share your config.toml with us?
I’m working with a GitLab instance internal to an institution: Community Edition v16.0.2, and the runner is currently gitlab-runner 16.0.1 (79704081). The runner configuration is locked down and not available to me. I’ll see if I can find the right person to make aware of this thread, though.
I’m having the same issues. Running gitlab-runner v. 16.0.2.
My config.toml:
concurrent = 1
check_interval = 0
shutdown_timeout = 0

[session_server]
session_timeout = 1800

[[runners]]
name = "*************************************************"
url = "https://gitlab.com/"
id = 22901457
token = "*********************************"
token_obtained_at = 2023-04-24T16:18:40Z
token_expires_at = 0001-01-01T00:00:00Z
executor = "docker"

[runners.docker]
tls_verify = false
image = "alpine:latest"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
As mentioned earlier, this appears to be a problem with any image built on bookworm. Looking at the projects in my GitLab instance, we use different containers for different tasks. Our composer images are built on alpine and run fine. However, the latest versions of node, python, and php use bookworm, and those all give the "shell not found" error. If I change the job that uses node:latest to node:18-alpine, it works.
I tried specifying an entrypoint for the image but got the following error:
install_npm_dependencies:
  image:
    name: "node:latest"
    entrypoint: ["/bin/bash"] # also tried /bin/sh, /usr/bin/bash, and /usr/bin/sh
/usr/bin/sh: /usr/bin/sh: cannot execute binary file
When the entrypoint is set to /usr/bin/sh, I get a message saying that it can't open the file. When I run the container locally with either /bin/bash or /usr/bin/sh, it works.
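For reference, a local test along those lines might look like the following (a sketch, not the poster's exact commands); if these succeed on a workstation but the job still fails, the problem is likely on the runner host rather than in the image:
# Try the same entrypoints directly, outside the runner
docker run --rm --entrypoint /bin/bash node:latest -c 'echo bash ok'
docker run --rm --entrypoint /usr/bin/sh node:latest -c 'echo sh ok'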
I can confirm that this
entrypoint: [ '/bin/bash', '-c', 'ln -snf /bin/bash /bin/sh && /bin/bash -c $0' ]
works. Apparently you have to override the entrypoint. I've also set shell = "bash" in the [[runners]] section of config.toml, if that matters.
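For anyone wondering what that entrypoint override actually does, here is the same trick broken out with comments (an annotated sketch of the mechanism, not an explanation of the runner's internals):
# entrypoint: [ '/bin/bash', '-c', 'ln -snf /bin/bash /bin/sh && /bin/bash -c $0' ]
#   ln -snf /bin/bash /bin/sh   -- repoint the /bin/sh symlink at bash
#   /bin/bash -c $0             -- re-run whatever command Docker appended after the
#                                  entrypoint; with bash -c, that first extra argument
#                                  shows up as $0 inside the quoted string
# The $0 behaviour is plain bash semantics and can be seen without Docker:
bash -c 'echo "the appended argument becomes \$0: $0"' 'echo hello'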
In my case gitlab-runner’s shell detection script was failing to stat the available shell executables due to an incompatibility between the container and the host, thus returning failure for every check and giving up with the “shell not found” error.
This sometimes happens when running bleeding edge images on older hosts, but typically it's more obvious and often presents itself as a filesystem permissions error or some other system call failure. Essentially, the binaries/libraries in the container are using new/modified system calls that dockerd/containerd's seccomp layer doesn't understand yet. Updating the host kernel and container runtime tends to fix this.
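A rough way to test that theory on the affected runner host (a diagnostic sketch only; disabling seccomp is not something to leave enabled for real jobs):
# Run a trivial command in the problem image with the default seccomp profile...
docker run --rm python:3.10.12 /bin/sh -c 'echo default profile ok'
# ...and again with seccomp filtering disabled
docker run --rm --security-opt seccomp=unconfined python:3.10.12 /bin/sh -c 'echo unconfined ok'
# A difference in behaviour between the two points at the host's container runtime
# being too old for the image; check the versions that would need updating:
docker version --format 'server: {{.Server.Version}}'
uname -r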
Thanks @rpetti!
We faced the same kind of issue while using an Oracle Linux 9 build image on an older VM. Akash came across your comment, and your insights have been helpful to us at OpenText.