Attention
This PEP is a historical document. The up-to-date, canonical spec,
Externally Managed Environments, is maintained on the PyPA specs page.
See the PyPA specification update process for how to propose changes.
A long-standing practical problem for Python users has been conflicts
between OS package managers and Python-specific package management
tools like pip. These conflicts include both Python-level API
incompatibilities and conflicts over file ownership.
Historically, Python-specific package management tools have defaulted
to installing packages into an implicit global context. With the
standardization and popularity of virtual environments, a better
solution for most (but not all) use cases is to use Python-specific
package management tools only within a virtual environment.
This PEP proposes a mechanism for a Python installation to communicate
to tools like pip that its global package installation context is
managed by some means external to Python, such as an OS package
manager. It specifies that Python-specific package management tools
should neither install packages into nor remove packages from the
interpreter’s global context, by default, and should instead guide
the end user towards using a virtual environment.
It also standardizes an interpretation of the sysconfig schemes so
that, if a Python-specific package manager is about to install a
package in an interpreter-wide context, it can do so in a manner that
will avoid conflicting with the external package manager and reduce
the risk of breaking software shipped by the external package manager.
A few terms used in this PEP have multiple meanings in the contexts
that it spans. For clarity, this PEP uses the following terms in
specific ways:
- distro
Short for “distribution,” a collection of various sorts of
software, ideally designed to work properly together, including
(in contexts relevant to this document) the Python interpreter
itself, software written in Python, and software written in other
languages. That is, this is the sense used in phrases such as
“Linux distro” or “Berkeley Software Distribution.”
A distro can be an operating system (OS) of its own, such as
Debian, Fedora, or FreeBSD. It can also be an overlay distribution
that installs on top of an existing OS, such as Homebrew or
MacPorts.
This document uses the short term “distro,” because the term
“distribution” has another meaning in Python packaging contexts: a
source or binary distribution package of a single piece of Python
language software, that is, in the sense of
setuptools.dist.Distribution or “sdist”. To avoid confusion,
this document does not use the plain term “distribution” at all.
In the Python packaging sense, it uses the full phrase
“distribution package” or just “package” (see below).
The provider of a distro - the team or company that collects and
publishes the software and makes any needed modifications - is its
distributor.
- package
A unit of software that can be installed and used within Python.
That is, this refers to what Python-specific packaging tools tend
to call a “distribution package” or simply a “distribution”;
the colloquial abbreviation “package” is used in the sense of the
Python Package Index.
This document does not use “package” in the sense of an importable
name that contains Python modules, though in many cases, a
distribution package consists of a single importable package of
the same name.
This document generally does not use the term “package” to refer
to units of installation by a distro’s package manager (such as
.deb or .rpm files). When needed, it uses phrasing such as
“a distro’s package.” (Again, in many cases, a Python package is
shipped inside a distro’s package named something like python-
plus the Python package name.)
- Python-specific package manager
A tool for installing, upgrading, and/or removing Python packages
in a manner that conforms to Python packaging standards (such as
PEP 376 and PEP 427). The most popular Python-specific package
manager is pip; other examples include the old Easy Install command
as well as direct usage of a setup.py command.
(Conda is a bit of a special case, as the conda command can install
much more than just Python packages, making it more like a distro
package manager in some senses. Since the conda command generally
only operates on Conda-created environments, most of the concerns in
this document do not apply to conda when acting as a Python-specific
package manager.)
- distro package manager
A tool for installing, upgrading, and/or removing a distro’s
packages in an installed instance of that distro, which is capable
of installing Python packages as well as non-Python packages, and
therefore generally has its own database of installed software
unrelated to PEP 376. Examples include apt, dpkg, dnf, rpm,
pacman, and brew. The salient feature is that if
a package was installed by a distro package manager, removing or
upgrading it in a way that would satisfy a Python-specific package
manager will generally leave a distro package manager in an
inconsistent state.
This document also uses phrases like “external package manager” or
“system’s package manager” to refer to a distro package manager in
certain contexts.
- shadow
To shadow an installed Python package is to cause some other
package to be preferred for imports without removing any files
from the shadowed package. This requires multiple entries on
sys.path: if package A 2.0 installs module a.py in one
sys.path entry, and package A 1.0 installs module a.py in
a later sys.path entry, then import a returns the module
from the former, and we say that A 2.0 shadows A 1.0.
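As a minimal illustration of shadowing (a sketch, not part of the
specification), the following creates two temporary directories that
stand in for two sys.path entries, each holding its own a.py. The
directory names and the VERSION attribute are invented for this
example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing() -> str:
    """Return the VERSION of whichever ``a`` module wins the import."""
    with tempfile.TemporaryDirectory() as tmp:
        earlier = Path(tmp, "earlier")  # stands in for the earlier sys.path entry
        later = Path(tmp, "later")      # stands in for a later sys.path entry
        earlier.mkdir()
        later.mkdir()
        # "Package A 2.0" lives in the earlier entry, "A 1.0" in the later one.
        (earlier / "a.py").write_text('VERSION = "2.0"\n')
        (later / "a.py").write_text('VERSION = "1.0"\n')

        saved_path = list(sys.path)
        sys.path[:0] = [str(earlier), str(later)]
        try:
            importlib.invalidate_caches()
            sys.modules.pop("a", None)
            module = importlib.import_module("a")
            return module.VERSION
        finally:
            # Undo the temporary changes so the demo leaves no trace.
            sys.path[:] = saved_path
            sys.modules.pop("a", None)


print(demo_shadowing())  # prints 2.0: the earlier entry shadows the later one
```

Both copies of a.py remain on disk for the whole import; nothing is
deleted, and the earlier entry simply wins.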
Thanks to Python’s immense popularity, software distros (by which we
mean Linux and other OS distros as well as overlay distros like
Homebrew and MacPorts) generally ship Python for two purposes: as a
software package to be used in its own right by end users, and as a
language dependency for other software in the distro.
For example, Fedora and Debian (and their downstream distros, as well
as many others) ship a /usr/bin/python3 binary which provides the
python3 command available to end users as well as the
#!/usr/bin/python3 shebang for Python-language software included
in the distro. Because there are no official binary releases of Python
for Linux/UNIX, almost all Python end users on these OSes use the
Python interpreter built and shipped with their distro.
The python3 executable available to the users of the distro and
the python3 executable available as a dependency for other
software in the distro are typically the same binary. This means that
if an end user installs a Python package using a tool like pip
outside the context of a virtual environment, that package is visible
to Python-language software shipped by the distro. If the
newly-installed package (or one of its dependencies) is a newer,
backwards-incompatible version of a package that was installed through
the distro, it may break software shipped by the distro.
This may pose a critical problem for the integrity of distros, which
often have package-management tools that are themselves written in
Python. For example, it’s possible to unintentionally break Fedora’s
dnf command with a pip install command, making it hard to
recover.
This applies both to system-wide installs (sudo pip install) and
to user home directory installs (pip install --user), since
packages in either location show up on the sys.path of
/usr/bin/python3.
There is a worse problem with system-wide installs: if you attempt to
recover from this situation with sudo pip uninstall, you may end
up removing packages that are shipped by the system’s package manager.
In fact, this can even happen if you simply upgrade a package - pip
will try to remove the old version of the package, as shipped by the
OS. At this point it may not be possible to recover the system to a
consistent state using just the software remaining on the system.
Over the past many years, a consensus has emerged that the best way to
install Python libraries or applications (when not using a distro’s
package) is to use a virtual environment. This approach was
popularized by the PyPA virtualenv project, and a simple version of
that approach is now available in the Python standard library as
venv. Installing a Python package into a virtualenv prevents it
from being visible to the unqualified /usr/bin/python3 interpreter
and prevents breaking system software.
In some cases, however, it’s useful and intentional to install a
Python package from outside of the distro that influences the behavior
of distro-shipped commands. This is common in the case of software
like Sphinx or Ansible which have a mechanism for writing
Python-language extensions. A user may want to use their distro’s
version of the base software (for reasons of paid support or security
updates) but install a small extension from PyPI, and they’d want that
extension to be importable by the software in their base system.
While this continues to carry the risk of installing a newer version
of a dependency than the operating system expects or otherwise
negatively affecting the behavior of an application, it does not need
to carry the risk of removing files from the operating system. A tool
like pip should be able to install packages in some directory on the
default sys.path, if specifically requested, without deleting
files owned by the system’s package manager.
Therefore, this PEP proposes two things.
First, it proposes a way for distributors of a Python interpreter to
mark that interpreter as having its packages managed by means external
to Python, such that Python-specific tools like pip should not
change the installed packages in the interpreter’s global sys.path
in any way (add, upgrade/downgrade, or remove) unless specifically
overridden. It also provides a means for the distributor to indicate
how to use a virtual environment as an alternative.
This is an opt-in mechanism: by default, the Python interpreter
compiled from upstream sources will not be so marked, and so running
pip install with a self-compiled interpreter, or with a distro
that has not explicitly marked its interpreter, will work as it always
has worked.
Second, it sets the rule that when installing packages to an
interpreter’s global context (either to an unmarked interpreter, or if
overriding the marking), Python-specific package managers should
modify or delete files only within the directories of the sysconfig
scheme in which they would create files. This permits a distributor
of a Python interpreter to set up two directories, one for its own
managed packages, and one for unmanaged packages installed by the end
user, and ensure that installing unmanaged packages will not delete
(or overwrite) files owned by the external package manager.
As described in detail in the next section, the first behavior change
involves creating a marker file named EXTERNALLY-MANAGED, whose
presence indicates that non-virtual-environment package installations
are managed by some means external to Python, such as a distro’s
package manager. This file is specified to live in the stdlib
directory in the default sysconfig scheme, which marks the
interpreter / installation as a whole, not a particular location on
sys.path. The reason for this is that, as identified above, there
are two related problems that risk breaking an externally-managed
Python: you can install an incompatible new version of a package
system-wide (e.g., with sudo pip install), and you can install one
in your user account alone, but in a location that is on the standard
Python command’s sys.path (e.g., with pip install --user). If
the marker file were in the system-wide site-packages directory,
it would not clearly apply to the second case. The Alternatives
section has further discussion of possible locations.
The second behavior change takes advantage of the existing
sysconfig setup in distros that have already encountered this
class of problem, and specifically addresses the problem of a
Python-specific package manager deleting or overwriting files that are
owned by an external package manager.
The changed behavior in this PEP is intended to “do the right thing”
for as many use cases as possible. In this section, we consider the
changes specified by this PEP for several representative use cases /
contexts. Specifically, we ask about the two behaviors that could be
changed by this PEP:
- Will a Python-specific installer tool like pip install permit
  installations by default, after implementation of this PEP?
- If you do run such a tool, should it be willing to delete packages
  shipped by the external (non-Python-specific) package manager for
  that context, such as a distro package manager?
(For simplicity, this section discusses pip as the Python-specific
installer tool, though the analysis should apply equally to any other
Python-specific package management tool.)
This table summarizes the use cases discussed in detail below:
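+------+-----------------------------+-----------------------------+----------------------------------+
| Case | Description                 | pip install permitted       | Deleting externally-installed    |
|      |                             |                             | packages permitted               |
+======+=============================+=============================+==================================+
| 1    | Unpatched CPython           | Currently yes; stays yes    | Currently yes; stays yes         |
+------+-----------------------------+-----------------------------+----------------------------------+
| 2    | Distro /usr/bin/python3     | Currently yes; becomes no   | Currently yes (except on         |
|      |                             | (assuming the distro adds   | Debian); becomes no              |
|      |                             | a marker file)              |                                  |
+------+-----------------------------+-----------------------------+----------------------------------+
| 3    | Distro Python in venv       | Currently yes; stays yes    | There are no                     |
|      |                             |                             | externally-installed packages    |
+------+-----------------------------+-----------------------------+----------------------------------+
| 4    | Distro Python in venv       | Currently yes; stays yes    | Currently no; stays no           |
|      | with --system-site-packages |                             |                                  |
+------+-----------------------------+-----------------------------+----------------------------------+
| 5    | Distro Python in Docker     | Currently yes; becomes no   | Currently yes; becomes no        |
|      |                             | (assuming the distro adds   |                                  |
|      |                             | a marker file)              |                                  |
+------+-----------------------------+-----------------------------+----------------------------------+
| 6    | Conda environment           | Currently yes; stays yes    | Currently yes; stays yes         |
+------+-----------------------------+-----------------------------+----------------------------------+
| 7    | Dev-facing distro           | Currently yes; becomes no   | Currently often yes; becomes no  |
|      |                             | (assuming they add a        | (assuming they configure         |
|      |                             | marker file)                | sysconfig as needed)             |
+------+-----------------------------+-----------------------------+----------------------------------+
| 8    | Distro building packages    | Currently yes; can stay yes | Currently yes; becomes no        |
+------+-----------------------------+-----------------------------+----------------------------------+
| 9    | PYTHONHOME copied from      | Currently yes; becomes no   | Currently yes; becomes no        |
|      | a distro Python stdlib      |                             |                                  |
+------+-----------------------------+-----------------------------+----------------------------------+
| 10   | PYTHONHOME copied from      | Currently yes; stays yes    | Currently yes; stays yes         |
|      | upstream Python stdlib      |                             |                                  |
+------+-----------------------------+-----------------------------+----------------------------------+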
In more detail, the use cases above are:
1. A standard unpatched CPython, without any special configuration of
   or patches to sysconfig and without a marker file. This PEP
   does not change its behavior.
   Such a CPython should (regardless of this PEP) not be installed in
   a way that overlaps any distro-installed Python on the same system.
   For instance, on an OS that ships Python in /usr/bin, you
   should not install a custom CPython built with
   ./configure --prefix=/usr, or it will overwrite some files from
   the distro and the distro will eventually overwrite some files from
   your installation. Instead, your installation should be in a
   separate directory (perhaps /usr/local, /opt, or your home
   directory).
   Therefore, we can assume that such a CPython has its own stdlib
   directory and its own sysconfig schemes that do not overlap any
   distro-installed Python. So any OS-installed packages are not
   visible or relevant here.
   If there is a concept of “externally-installed” packages in this
   case, it’s something outside the OS and generally managed by
   whoever built and installed this CPython. Because the installer
   chose not to add a marker file or modify sysconfig schemes,
   they’re choosing the current behavior, and pip install can
   remove any packages available in this CPython.
2. A distro’s /usr/bin/python3, either when running pip install
   as root or pip install --user, following our
   Recommendations for distros.
   These recommendations include shipping a marker file in the
   stdlib directory, to prevent pip install by default, and
   placing distro-shipped packages in a location other than the
   default sysconfig scheme, so that pip as root does not
   write to that location.
   Many distros (including Debian, Fedora, and their derivatives) are
   already doing the latter.
   On Debian and derivatives, pip install does not currently
   delete distro-installed packages, because Debian carries a
   patch to pip to prevent this. So, for those distros, this PEP is
   not a behavior change; it simply standardizes that behavior in a
   way that is no longer Debian-specific and can be included in
   upstream pip.
   (We have seen user reports of externally-installed packages being
   deleted on Debian or a derivative. We suspect this is because the
   user has previously run sudo pip install --upgrade pip and
   therefore now has a version of /usr/bin/pip without the Debian
   patch; standardizing this behavior in upstream package installers
   would address this problem.)
3. A distro Python when used inside a virtual environment (either from
   venv or virtualenv).
   Inside a virtual environment, all packages are owned by that
   environment. Even when pip, setuptools, etc. are installed
   into the environment, they are and should be managed by tools
   specific to that environment; they are not system-managed.
4. A distro Python when used inside a virtual environment with
   --system-site-packages. This is like the previous case, but
   worth calling out explicitly, because anything on the global
   sys.path is visible.
   Currently, the answer to “Will pip delete externally-installed
   packages” is no, because pip has a special case for running in a
   virtual environment and attempting to delete packages outside it.
   After this PEP, the answer remains no, but the reasoning becomes
   more general: system site packages will be outside any of the
   sysconfig schemes used for package management in the
   environment.
5. A distro Python when used in a single-application container image
   (e.g., a Docker container). In this use case, the risk of breaking
   system software is lower, since generally only a single application
   runs in the container, and the impact is lower, since you can
   rebuild the container and you don’t have to struggle to recover a
   running machine. There is also a large number of existing
   Dockerfiles with an unqualified RUN pip install ... statement,
   etc., and it would be good not to break those. So, builders of base
   container images may want to ensure that the marker file is not
   present, even if the underlying OS ships one by default.
   There is a small behavior change: currently, pip run as root
   will delete externally-installed packages, but after this PEP it
   will not. We don’t propose a way to override this. However, since
   the base image is generally minimal, there shouldn’t be much of a
   use case for simply uninstalling packages (especially without using
   the distro’s own tools). The common case is when pip wants to
   upgrade a package, which previously would have deleted the old
   version (except on Debian). After this change, the old version will
   still be on disk, but pip will still shadow externally-installed
   packages, and we believe this to be sufficient for this not to be a
   breaking change in practice - a Python import statement will
   still get you the newly-installed package.
   If it becomes necessary to have a way to do this, we suggest that
   the distro should document a way for the installer tool to access
   the sysconfig scheme used by the distro itself. See the
   Recommendations for distros section for more discussion.
   It is the view of the authors of this PEP that it’s still a good
   idea to use virtual environments with distro-installed Python
   interpreters, even in single-application container images. Even
   though they run a single application, that application may run
   commands from the OS that are implemented in Python, and if you’ve
   installed or upgraded the distro-shipped Python packages using
   Python-specific tools, those commands may break.
6. Conda specifically supports the use of non-conda tools like pip
   to install software not available in the Conda repositories. In
   this context, Conda acts as the external package manager / distro
   and pip as the Python-specific one.
   In some sense, this is similar to the first case, since Conda
   provides its own installation of the Python interpreter.
   We don’t believe this PEP requires any changes to Conda, and
   versions of pip that have implemented the changes in this PEP will
   continue to behave as they currently do inside Conda environments.
   (That said, it may be worth considering whether to use separate
   sysconfig schemes for pip-installed and Conda-installed
   software, for the same reasons it’s a good idea for other distros.)
7. By a “developer-facing distro,” we mean a specific type of distro
   where direct users of Python or other languages in the distro are
   expected or encouraged to make changes to the distro itself if they
   wish to add libraries. Common examples include private “monorepos”
   at software development companies, where a single repository builds
   both third-party and in-house software, and the direct users of the
   distro’s Python interpreter are generally software developers
   writing said in-house software. User-level package managers like
   Nixpkgs may also count, because they encourage users of Nix who
   are Python developers to package their software for Nix.
   In these cases, the distro may want to respond to an attempted
   pip install with guidance encouraging use of the distro’s own
   facilities for adding new packages, along with a link to
   documentation.
   If the distro supports/encourages creating a virtual environment
   from the distro’s Python interpreter, there may also be custom
   instructions for how to properly set up a virtual environment (as
   for example Nixpkgs does).
8. When building distro Python packages for a distro Python (case 2),
   it may be useful to have pip install be usable as part of the
   distro’s package build process. (Consider, for instance, building a
   python-xyz RPM by using pip install . inside an sdist /
   source tarball for xyz.) The distro may also want to use a more
   targeted but still Python-specific installation tool such as
   installer.
   For this case, the build process will need to find some way to
   suppress the marker file to allow pip install to work, and will
   probably need to point the Python-specific tool at the distro’s
   sysconfig scheme instead of the shipped default. See the
   Recommendations for distros section for more discussion on how
   to implement this.
   As a result of this PEP, pip will no longer be able to remove
   packages already on the system. However, this behavior change is
   fine because a package build process should not (and generally
   cannot) include instructions to delete some other files on the
   system; it can only package up its own files.
9. A distro Python used with PYTHONHOME to set up an alternative
   Python environment (as opposed to a virtual environment), where
   PYTHONHOME is set to some directory copied directly from the
   distro Python (e.g., cp -a /usr/lib/python3.x pyhome/lib).
   Assuming there are no modifications, then the behavior is just like
   the underlying distro Python (case 2). So there are behavior
   changes - you can no longer pip install by default, and if you
   override it, it will no longer delete externally-installed packages
   (i.e., Python packages that were copied from the OS and live in the
   OS-managed sys.path entry).
   This behavior change seems to be defensible, in that if your
   PYTHONHOME is a straight copy of the distro’s Python, it should
   behave like the distro’s Python.
10. A distro Python (or any Python interpreter) used with a
    PYTHONHOME taken from a compatible unmodified upstream Python.
    Because the behavior changes in this PEP are keyed off of files in
    the standard library (the marker file in stdlib and the
    behavior of the sysconfig module), the behavior is just like
    an unmodified upstream CPython (case 1).
Before a Python-specific package installer (that is, a tool such as
pip - not an external tool such as apt) installs a package into a
certain Python context, it should make the following checks by
default:
- Is it running outside of a virtual environment? It can determine
  this by whether sys.prefix == sys.base_prefix (but see
  Backwards Compatibility).
- Is there an EXTERNALLY-MANAGED file in the directory identified
  by sysconfig.get_path("stdlib", sysconfig.get_default_scheme())?
If both of these conditions are true, the installer should exit with
an error message indicating that package installation into this Python
interpreter’s directory is disabled outside of a virtual environment.
The installer should have a way for the user to override these rules,
such as a command-line flag --break-system-packages. This option
should not be enabled by default and should carry some connotation
that its use is risky.
The EXTERNALLY-MANAGED file is an INI-style metadata file intended
to be parsable by the standard library configparser module. If the
file can be parsed by configparser.ConfigParser(interpolation=None)
using the UTF-8 encoding, and it contains a section
[externally-managed], then the installer should look for an error
message specified in the file and output it as part of its error.
If the first element of the tuple returned by
locale.getlocale(locale.LC_MESSAGES), i.e., the language code, is not
None, it should look for the error message as the value of a key
named Error- followed by the language code.
If that key does not exist, and if the language code contains an
underscore or hyphen, it should look for a key named Error-
followed by the portion of the language code before the underscore or
hyphen. If it cannot find either of those, or if the language code is
None, it should look for a key simply named Error.
If the installer cannot find an error message in the file (either
because the file cannot be parsed or because no suitable error key
exists), then the installer should just use a pre-defined error
message of its own, which should suggest that the user create a
virtual environment to install packages.
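The key lookup order described above can be sketched like this. The
helper names are hypothetical and the sample messages are invented;
treating a platform without locale.LC_MESSAGES (such as Windows) as
having no language code is an assumption of this sketch, not
something the text above specifies. A caller would read the marker
file as UTF-8 and fall back to its own message when None is returned.

```python
import configparser
import locale
import re
from typing import List, Optional


def current_language_code() -> Optional[str]:
    """Language code from LC_MESSAGES, or None where that category is missing."""
    category = getattr(locale, "LC_MESSAGES", None)  # not defined on Windows
    if category is None:
        return None
    return locale.getlocale(category)[0]


def candidate_keys(language_code: Optional[str]) -> List[str]:
    """Keys to try, most specific first."""
    keys = []
    if language_code is not None:
        keys.append(f"Error-{language_code}")
        # Portion of the code before the first underscore or hyphen.
        prefix = re.split(r"[_-]", language_code, maxsplit=1)[0]
        if prefix != language_code:
            keys.append(f"Error-{prefix}")
    keys.append("Error")
    return keys


def read_marker_message(text: str, language_code: Optional[str]) -> Optional[str]:
    """Return the best-matching message, or None if the installer should
    fall back to its own pre-defined message."""
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
        section = parser["externally-managed"]
    except (configparser.Error, KeyError):
        return None
    for key in candidate_keys(language_code):
        # configparser lower-cases keys by default, so this lookup is
        # case-insensitive.
        if key in section:
            return section[key]
    return None


sample = """[externally-managed]
Error=Use a virtual environment.
Error-de=Bitte eine virtuelle Umgebung verwenden.
"""
print(candidate_keys("de_DE"))
print(read_marker_message(sample, "de_DE"))
```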
Software distributors who have a non-Python-specific package manager
that manages libraries in the sys.path of their Python package
should, in general, ship an EXTERNALLY-MANAGED file in their
standard library directory. For instance, Debian may ship a file in
/usr/lib/python3.9/EXTERNALLY-MANAGED consisting of something like
    [externally-managed]
    Error=To install Python packages system-wide, try apt install
     python3-xyz, where xyz is the package you are trying to
     install.

     If you wish to install a non-Debian-packaged Python package,
     create a virtual environment using python3 -m venv path/to/venv.
     Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
     sure you have python3-full installed.

     If you wish to install a non-Debian packaged Python application,
     it may be easiest to use pipx install xyz, which will manage a
     virtual environment for you. Make sure you have pipx installed.

     See /usr/share/doc/python3.9/README.venv for more information.
which provides useful and distro-relevant information
to a user trying to install a package. Optionally,
translations can be provided in the same file:
    Error-de_DE=Wenn ist das Nunstück git und Slotermeyer?
     Ja! Beiherhund das Oder die Virtualenvironment gersput!
In certain contexts, such as single-application container images that
aren’t updated after creation, a distributor may choose not to ship an
EXTERNALLY-MANAGED file, so that users can install whatever they
like (as they can today) without having to manually override this
rule.
Usually, a Python package installer installs to directories in a
scheme returned by the sysconfig standard library package.
Ordinarily, this is the scheme returned by
sysconfig.get_default_scheme(), but based on configuration (e.g.
pip install --user), it may use a different scheme.
Whenever the installer is installing to a sysconfig scheme, this
PEP specifies that the installer should never modify or delete files
outside of that scheme. For instance, if it’s upgrading a package, and
the package is already installed in a directory outside that scheme
(perhaps in a directory from another scheme), it should leave the
existing files alone.
If the installer does end up shadowing an existing installation during
an upgrade, we recommend that it produces a warning at the end of its
run.
If the installer is installing to a location outside of a
sysconfig scheme (e.g., pip install --target), then this
subsection does not apply.
This section is non-normative. It provides best practices we believe
distros should follow unless they have a specific reason otherwise.
Distros should create an EXTERNALLY-MANAGED file in their stdlib
directory.
The file should contain a useful and distro-relevant error message
indicating both how to install system-wide packages via the distro’s
package manager and how to set up a virtual environment. If your
distro is often used by users in a state where the python3 command
is available (and especially where pip or get-pip is available)
but python3 -m venv does not work, the message should indicate
clearly how to make python3 -m venv work properly.
Consider packaging pipx, a tool for installing Python-language
applications, and suggesting it in the error. pipx automatically
creates a virtual environment for that application alone, which is a
much better default for end users who want to install some
Python-language software (which isn’t available in the distro) but are
not themselves Python users. Packaging pipx in the distro avoids the
irony of instructing users to
pip install --user --break-system-packages pipx
to avoid breaking system packages.
Consider arranging things so your distro’s package / environment for
Python for end users (e.g., python3 on Fedora or python3-full
on Debian) depends on pipx.
Distros that produce official images for single-application containers
(e.g., Docker container images) should keep the EXTERNALLY-MANAGED
file, preferably in a way that makes it not go away if a user of
that image installs package updates inside their image (think
RUN apt-get dist-upgrade).
Distros should place two separate paths on the system interpreter’s
sys.path, one for distro-installed packages and one for packages
installed by the local system administrator, and configure
sysconfig.get_default_scheme() to point at the latter path. This
ensures that tools like pip will not modify distro-installed packages.
The path for the local system administrator should come before the
distro path on sys.path so that local installs take precedence
over distro packages.
For example, Fedora and Debian (and their derivatives) both implement
this split by using /usr/local for locally-installed packages and
/usr for distro-installed packages. Fedora uses
/usr/local/lib/python3.x/site-packages vs.
/usr/lib/python3.x/site-packages. (Debian uses
/usr/local/lib/python3/dist-packages vs.
/usr/lib/python3/dist-packages as an additional layer of
separation from a locally-compiled Python interpreter: if you build
and install upstream CPython in /usr/local/bin, it will look at
/usr/local/lib/python3/site-packages, and Debian wishes to make
sure that packages installed via the locally-built interpreter don’t
show up on sys.path for the distro interpreter.)
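A distro could sanity-check the ordering of the two directories with
something like the following sketch. The function name is
hypothetical, and the example list reuses the Fedora-style paths
above with "3.x" left as a placeholder; on a real system one would
pass sys.path and the distro’s actual directories instead.

```python
from typing import Sequence


def local_precedes_distro(
    search_path: Sequence[str], local_dir: str, distro_dir: str
) -> bool:
    """True if both directories are on the path and local_dir comes first."""
    entries = list(search_path)
    if local_dir not in entries or distro_dir not in entries:
        return False
    return entries.index(local_dir) < entries.index(distro_dir)


# Illustrative search path only; not read from a real interpreter.
example_path = [
    "/usr/lib/python3.x",
    "/usr/local/lib/python3.x/site-packages",
    "/usr/lib/python3.x/site-packages",
]
print(
    local_precedes_distro(
        example_path,
        "/usr/local/lib/python3.x/site-packages",
        "/usr/lib/python3.x/site-packages",
    )
)  # prints True
```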
Note that the /usr/local vs. /usr split is analogous to how
the PATH environment variable typically includes
/usr/local/bin:/usr/bin and non-distro software installs to
/usr/local by default. This split is recommended by the
Filesystem Hierarchy Standard.
There are two ways you could do this. One is, if you are building and
packaging Python libraries directly (e.g., your packaging helpers
unpack a PEP 517-built wheel or call setup.py install), arrange
for those tools to use a directory that is not in a sysconfig
scheme but is still on sys.path.
The other is to arrange for the default sysconfig scheme to change
when running inside a package build versus when running on an
installed system. The sysconfig customization hooks from
bpo-43976 should make this easy (once accepted and implemented):
make your packaging tool set an environment variable or some other
detectable configuration, and define a get_preferred_schemes
function to return a different scheme when called from inside a
package build. Then you can use pip install as part of your distro
packaging.
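A get_preferred_schemes function along these lines might look like
the sketch below. Everything specific here is an assumption made for
illustration: the environment variable name DISTRO_PACKAGE_BUILD, the
scheme names posix_distro and posix_local, the "prefix" / "home" /
"user" keys (chosen to mirror what we understand CPython 3.10 and
later to use for preferred schemes), and the environ parameter, which
exists only so the sketch can be exercised without touching the real
environment. How the function is hooked into sysconfig is left to the
distro and is not shown.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable that a distro's packaging tool would set during builds.
BUILD_ENV_VAR = "DISTRO_PACKAGE_BUILD"


def get_preferred_schemes(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick the distro scheme inside a package build, the local one otherwise."""
    environ = os.environ if environ is None else environ
    if environ.get(BUILD_ENV_VAR) == "1":
        prefix_scheme = "posix_distro"  # hypothetical distro-managed scheme
    else:
        prefix_scheme = "posix_local"   # hypothetical scheme for local installs
    return {
        "prefix": prefix_scheme,
        "home": "posix_home",
        "user": "posix_user",
    }


print(get_preferred_schemes({BUILD_ENV_VAR: "1"})["prefix"])  # posix_distro
print(get_preferred_schemes({})["prefix"])                    # posix_local
```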
We propose adding a --scheme=... option to instruct pip to run
against a specific scheme. (See Implementation Notes below for how
pip currently determines schemes.) Once that’s available, for local
testing and possibly for actual packaging, you would be able to run
something like pip install --scheme=posix_distro to explicitly
install a package into your distro’s location (bypassing
get_preferred_schemes). One could also, if absolutely needed, use
pip uninstall --scheme=posix_distro to use pip to remove packages
from the system-managed directory, which addresses the (hopefully
theoretical) regression in use case 5 in Rationale.
To install packages with pip, you would also need to either suppress
the EXTERNALLY-MANAGED
marker file to allow pip to run or to
override it on the command line. You may want to use the same means
for suppressing the marker file in build chroots as you do in
container images.
The advantage of setting these up to be automatic (suppressing the
marker file in your build environment and having
get_preferred_schemes
automatically return your distro’s scheme)
is that an unadorned pip install
will work inside a package build,
which generally means that an unmodified upstream build script that
happens to internally call pip install
will do the right thing.
You can, of course, just ensure that your packaging process always
calls pip install --scheme=posix_distro --break-system-packages, which
would work too.
The best approach here depends a lot on your distro’s conventions and
mechanisms for packaging.
Similarly, the sysconfig paths that are not for importable Python
code - that is, include, platinclude, scripts, and data - should also
have two variants, one for use by distro-packaged software and one for
use by locally-installed software, and the distro should be set up
such that both are usable.
For instance, a typical FHS-compliant distro will use
/usr/local/include for the default scheme’s include and /usr/include
for distro-packaged headers and place both on the compiler’s search
path, and it will use /usr/local/bin for the default scheme’s scripts
and /usr/bin for distro-packaged entry points and place both on $PATH.
All of these mechanisms are proposed for new distro releases and new
versions of tools like pip only.
In particular, we strongly recommend that distros with a concept of
major versions only add the marker file or change sysconfig
schemes in a new major version; otherwise there is a risk that, on an
existing system, software installed via a Python-specific package
manager now becomes unmanageable (without an override option). For a
rolling-release distro, if possible, only add the marker file or
change sysconfig
schemes in a new Python minor version.
One particular backwards-compatibility difficulty for package
installation tools is likely to be managing environments created by
old versions of virtualenv
which have the latest version of the
tool installed. A “virtual environment” now has a fairly precise
definition: it uses the pyvenv.cfg
mechanism, which causes
sys.base_prefix != sys.prefix
. It is possible, however, that a
user may have an old virtual environment created by an older version
of virtualenv
; as of this writing, pip supports Python 3.6
onwards, which is in turn supported by virtualenv
15.1.0 onwards,
so this scenario is possible. In older versions of virtualenv
, the
mechanism is instead to set a new attribute, sys.real_prefix
, and
it does not use the standard library support for virtual environments,
so sys.base_prefix
is the same as sys.prefix
. So the logic for
robustly detecting a virtual environment is something like:
import sys

def is_virtual_environment():
    return sys.base_prefix != sys.prefix or hasattr(sys, "real_prefix")
The purpose of this feature is not to implement a security boundary;
it is to discourage well-intended changes from unexpectedly breaking a
user’s environment. That is to say, the reason this PEP restricts
pip install
outside a virtual environment is not that it’s a
security risk to be able to do so; it’s that “There should be one--
and preferably only one --obvious way to do it,” and that way should
be using a virtual environment. pip install
outside a virtual
environment is rather too obvious for what is almost always the wrong
way to do it.
If there is a case where a user should not be able to sudo pip
install
or pip install --user
and add files to sys.path
for
security reasons, that needs to be implemented either via access
control rules on what files the user can write to or an explicitly
secured sys.path
for the program in question. Neither of the
mechanisms in this PEP should be interpreted as a way to address such
a scenario.
For those reasons, an attempted install with a marker file present is
not a security incident, and there is no need to raise an auditing
event for it. If the calling user legitimately has access to sudo
pip install
or pip install --user
, they can accomplish the same
installation entirely outside of Python; if they do not legitimately
have such access, that’s a problem outside the scope of this PEP.
The marker file itself is located in the standard library directory,
which is a trusted location (i.e., anyone who can write to the marker
file used by a particular installer could, presumably, run arbitrary
code inside the installer). Therefore, there is generally no need to
filter out terminal escape sequences or other potentially-malicious
content in the error message.
There are a number of similar proposals we considered that this PEP
rejects or defers, largely to preserve the behavior in the
case-by-case analysis in Rationale.
Should the marker file be in sys.path
, marking a particular
directory as not to be written to by a Python-specific package
manager? This would help with the second problem addressed by this PEP
(not overwriting or deleting distro-owned files) but not the first
(incompatible installs). A directory-specific marker in
/usr/lib/python3.x/site-packages
would not discourage
installations into either /usr/local/lib/python3.x/site-packages
or ~/.local/lib/python3.x/site-packages
, both of which are on
sys.path
for /usr/bin/python3
. In other words, the marker file
should not be interpreted as marking a single directory as
externally managed (even though it happens to be in a directory on
sys.path
); it marks the entire Python installation as externally
managed.
Another variant of the above: should the marker file be in
sys.path
, where if it can be found in any directory in
sys.path
, it marks the installation as externally managed? An
apparent advantage of this approach is that it automatically disables
itself in virtual environments. Unfortunately, this has the wrong
behavior with a --system-site-packages
virtual environment, where
the system-wide sys.path
is visible but package installations are
allowed. (It could work if the rule of exempting virtual environments
is preserved, but that seems to have no advantage over the current
scheme.)
Should the marker just be a new attribute of a sysconfig
scheme?
There is some conceptual cleanliness to this, except that it’s hard to
override. We want to make it easy for container images, package build
environments, etc. to suppress the marker file. A file that you can
remove is easy; code in sysconfig
is much harder to modify.
Should the file be in /etc
? No, because again, it refers to a
specific Python installation. A user who installs their own Python may
well want to install packages within the global context of that
interpreter.
Should the configuration setting be in pip.conf
or
distutils.cfg
? Apart from the above objections about marking an
installation, this mechanism isn’t specific to either of those tools.
(It seems reasonable for pip to also implement a configuration flag
for users to prevent themselves from performing accidental
non-virtual-environment installs in any Python installation, but that
is outside the scope of this PEP.)
Should the file be TOML? TOML is gaining popularity for packaging (see
e.g. PEP 517) but does not yet have an implementation in the standard
library. Strictly speaking, this isn’t a blocker - distros need only
write the file, not read it, so they don’t need a TOML library (the
file will probably be written by hand, regardless of format), and
packaging tools likely have a TOML reader already. However, the INI
format is currently used for various other forms of packaging metadata
(e.g., pydistutils.cfg
and setup.cfg
), meets our needs, and is
parsable by the standard library, and the pip maintainers expressed a
preference to avoid using TOML for this yet.
Should the file be email.message
-style? While this format is also
used for packaging metadata (e.g. sdist and wheel metadata) and is
also parsable by the standard library, it doesn’t handle multi-line
entries quite as clearly, and that is our primary use case.
Should the marker file be executable Python code that evaluates
whether installation should be allowed or not? Apart from the concerns
above about having the file in sys.path
, we have a concern that
making it executable is committing to too powerful an API and risks
making behavior harder to understand. (Note that the
get_default_scheme
hook of bpo-43976 is in fact executable, but
that code needs to be supplied when the interpreter builds; it isn’t
intended to be supplied post-build.)
When overriding the marker, should a Python-specific package manager
be disallowed from shadowing a package installed by the external
package manager (i.e., installing modules of the same name)? This
would minimize the risk of breaking system software, but it’s not
clear it’s worth the additional user experience complexity. There are
legitimate use cases for shadowing system packages, and an additional
command-line option to permit it would be more confusing. Meanwhile,
not passing that option wouldn’t eliminate the risk of breaking system
software, which may be relying on a try: import xyz
failing,
finding a limited set of entry points, etc. Communicating this
distinction seems difficult. We think it’s a good idea for
Python-specific package managers to print a warning if they shadow a
package, but we think it’s not worth disabling it by default.
Why not use the INSTALLER
file from PEP 376 to determine who
installed a package and whether it can be removed? First, it’s
specific to a particular package (it’s in the package’s dist-info
directory), so like some of the alternatives above, it doesn’t provide
information on an entire environment and whether package installations
are permissible. PEP 627 also updates PEP 376 to prevent programmatic
use of INSTALLER
, specifying that the file is “to be used for
informational purposes only. […] Our goal is supporting
interoperating tools, and basing any action on which tool happened to
install a package runs counter to that goal.” Finally, as PEP 627
envisions, there are legitimate use cases for one tool knowing how to
handle packages installed by another tool; for instance, conda
can
safely remove a package installed by pip
into a Conda environment.
Why does the specification give no means for disabling package
installations inside a virtual environment? We can’t see a
particularly strong use case for it (at least not one related to the
purposes of this PEP). If you need it, it’s simple enough to pip
uninstall pip
inside that environment, which should discourage at
least unintentional changes to the environment (and this specification
makes no provision to disable intentional changes, since after all
the marker file can be easily removed).
Shouldn’t distro software just run with the distro site-packages
directory alone on sys.path
and ignore the local system
administrator’s site-packages
as well as the user-specific one?
This is a worthwhile idea, and various versions of it have been
circulating for a while under the name of “system Python” or “platform
Python” (with a separate “user Python” for end users writing Python or
installing Python software separate from the system). However, it is
a much more involved change. First, it would be a
backwards-incompatible change. As mentioned in the Motivation
section, there are valid use cases for running distro-installed Python
applications like Sphinx or Ansible with locally-installed Python
libraries available on their sys.path
. A wholesale switch to
ignoring local packages would break these use cases, and a distro
would have to make a case-by-case analysis of whether an application
ought to see locally-installed libraries or not.
Furthermore, Fedora attempted this change and reverted it, finding,
ironically, that their implementation of the change broke their
package manager. Given that experience, there are clearly details to
be worked out before distros can reliably implement that approach, and
a PEP recommending it would be premature.
This PEP is intended to be a complete and self-contained change that
is independent of a distributor’s decision for or against “system
Python” or similar proposals. It is not incompatible with a distro
implementing “system Python” in the future, and even though both
proposals address the same class of problems, there are still
arguments in favor of implementing something like “system Python” even
after implementing this PEP. At the same time, though, this PEP
specifically tries to make a more targeted and minimal change, such
that it can be implemented by distributors who don’t expect to adopt
“system Python” (or don’t expect to implement it immediately). The
changes in this PEP stand on their own merits and are not an
intermediate step for some future proposal. This PEP reduces (but does
not eliminate) the risk of breaking system software while minimizing
(but not completely avoiding) breaking changes. It should therefore be
much easier to implement than the full “system Python” idea, which
comes with the downsides mentioned above.
We expect that the guidance in this PEP - that users should use
virtual environments whenever possible and that distros should have
separate sys.path
directories for distro-managed and
locally-managed modules - should make further experiments easier in
the future. These may include distributing wholly separate “system”
and “user” Python interpreters, running system software out of a
distro-owned virtual environment or PYTHONHOME
(but shipping a
single interpreter), or modifying the entry points for certain
software (such as the distro’s package manager) to use a sys.path
that only sees distro-managed directories. Those ideas themselves,
however, remain outside the scope of this PEP.
This section is non-normative and contains notes relevant to both the
specification and potential implementations.
Currently, pip does not directly expose a way to choose a target
sysconfig scheme, but it has three ways of looking up schemes when
installing:
- pip install: Calls sysconfig.get_default_scheme(), which is usually
  (in upstream CPython and most current distros) the same as
  get_preferred_scheme('prefix').
- pip install --prefix=/some/path: Calls
  sysconfig.get_preferred_scheme('prefix').
- pip install --user: Calls sysconfig.get_preferred_scheme('user').
Finally, pip install --target=/some/path writes directly to /some/path
without looking up any schemes.
Debian currently carries a patch to change the default install
location inside a virtual environment, using a few heuristics
(including checking for the VIRTUAL_ENV environment variable), largely
so that the directory used in a virtual environment remains
site-packages and not dist-packages. This does not particularly affect
this proposal, because the implementation of that patch does not
actually change the default sysconfig scheme, and notably does not
change the result of sysconfig.get_path("stdlib").
Fedora currently carries a patch to change the default install
location when not running inside rpmbuild, which they use to
implement the two-system-wide-directories approach. This is
conceptually the sort of hook envisioned by bpo-43976, except
implemented as a code patch to distutils
instead of as a changed
sysconfig
scheme.
The implementation of is_virtual_environment above, as well as the
logic to load the EXTERNALLY-MANAGED file and find the error message
from it, may as well get added to the standard library (sys and
sysconfig, respectively), to centralize their implementations, but
they don’t need to be added yet.
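Until such helpers exist, an installer could carry its own copy of
this logic. The following is a minimal sketch under stated
assumptions: it expects the INI layout from this PEP's specification
(an [externally-managed] section with an Error key), it ignores
localized Error-xx variants, and the function names read_marker_error
and installation_blocked are invented for this example.

```python
import configparser
import os
import sys
import sysconfig


def is_virtual_environment():
    return sys.base_prefix != sys.prefix or hasattr(sys, "real_prefix")


def read_marker_error(stdlib_dir=None):
    """Return the marker's error text, "" if it has none, None if no marker."""
    if stdlib_dir is None:
        stdlib_dir = sysconfig.get_path("stdlib")
    marker = os.path.join(stdlib_dir, "EXTERNALLY-MANAGED")
    if not os.path.isfile(marker):
        return None
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(marker, encoding="utf-8")
        return parser.get("externally-managed", "Error", fallback="")
    except (configparser.Error, UnicodeDecodeError):
        # An unparsable marker still marks the installation as managed.
        return ""


def installation_blocked(stdlib_dir=None):
    """True when outside a virtual environment and a marker file exists."""
    return not is_virtual_environment() and read_marker_error(stdlib_dir) is not None
```

Taking the directory as a parameter keeps the function testable
without touching the real standard library directory.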
For additional background on these problems and previous attempts to
solve them, see Debian bug 771794 “pip silently removes/updates
system provided python packages” from 2014, Fedora’s 2018 article
Making sudo pip safe about pointing sudo pip
at /usr/local
(which acknowledges that the changes still do not make sudo pip
completely safe), pip issues 5605 (“Disable upgrades to existing
python modules which were not installed via pip”) and 5722 (“pip
should respect /usr/local”) from 2018, and the post-PyCon US 2019
discussion thread Playing nice with external package managers.
- [1] https://pip.pypa.io/en/stable/
- [2] https://setuptools.readthedocs.io/en/latest/deprecated/easy_install.html
  (Note that the easy_install command was removed in setuptools
  version 52, released 23 January 2021.)
- [3] https://conda.io
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0668.rst
Last modified: 2024-05-17 01:32:43 GMT