Reads a list of objects from the standard input, and writes either one or
more packed archives with the specified base-name to disk, or a packed
archive to the standard output.
A packed archive is an efficient way to transfer a set of objects
between two repositories as well as an access-efficient archival
format. In a packed archive, an object is either stored as a
compressed whole or as a difference from some other object.
The latter is often called a delta.
The packed archive format (.pack) is designed to be self-contained
so that it can be unpacked without any further information. Therefore,
each object that a delta depends upon must be present within the pack.
A pack index file (.idx) is generated for fast, random access to the
objects in the pack. Placing both the index file (.idx) and the packed
archive (.pack) in the pack/ subdirectory of $GIT_OBJECT_DIRECTORY (or
any of the directories on $GIT_ALTERNATE_OBJECT_DIRECTORIES)
enables Git to read from the pack archive.
Write into pairs of files (.pack and .idx), using
<base-name> to determine the names of the created files.
When this option is used, the two files in a pair are written as
<base-name>-<SHA-1>.{pack,idx}. <SHA-1> is a hash
based on the pack content and is written to the standard
output of the command.
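The naming rule above can be illustrated with a minimal sketch. The helper name pack_pair_paths and the hash value used in the example are hypothetical and chosen only for illustration; the hash that the command really prints depends on the pack content.

```python
from pathlib import Path
from typing import Tuple


def pack_pair_paths(base_name: str, pack_hash: str) -> Tuple[Path, Path]:
    """Build the .pack and .idx paths for one written pack.

    Hypothetical helper: it only mirrors the documented naming pattern
    <base-name>-<SHA-1>.{pack,idx}; it does not call Git.
    """
    stem = f"{base_name}-{pack_hash}"
    # Plain string concatenation is used so that dots inside the
    # base name are never mistaken for a file suffix.
    return Path(stem + ".pack"), Path(stem + ".idx")


# Placeholder hash, made up for the example.
example_hash = "0123456789abcdef0123456789abcdef01234567"
pack_path, idx_path = pack_pair_paths("objects/pack/pack", example_hash)
```

Both paths share the same directory and stem, which is what lets the index be located from the pack name and vice versa.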
Packs unreachable objects into a separate "cruft" pack, denoted
by the existence of a .mtimes file. Typically used by
git repack --cruft. Callers provide a list of pack names and
indicate which packs will remain in the repository, along with
which packs will be deleted (indicated by the - prefix). The
contents of the cruft pack are all objects not contained in the
surviving packs which have not exceeded the grace period (see
--cruft-expiration below), or which have exceeded the grace
period, but are reachable from another object which hasn’t.
When the input lists a pack containing all reachable objects (and lists
all other packs as pending deletion), the corresponding cruft pack will
contain all unreachable objects (with mtime newer than the
--cruft-expiration) along with any unreachable objects whose mtime is
older than the --cruft-expiration, but are reachable from an
unreachable object whose mtime is newer than the --cruft-expiration.
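The selection rule for cruft pack contents can be sketched as a small graph walk. This is an illustrative model only, not Git's implementation: the function name, the dictionary-based object graph, and the numeric mtimes are all assumptions made for the example.

```python
def cruft_pack_contents(mtimes, surviving, references, expiration):
    """Model which objects end up in the cruft pack.

    mtimes:     dict mapping object name -> mtime (hypothetical numbers)
    surviving:  set of object names contained in the surviving packs
    references: dict mapping object name -> names it refers to directly
    expiration: cutoff; an mtime newer than this is within the grace period
    """
    # Only objects outside the surviving packs are candidates.
    candidates = {name for name in mtimes if name not in surviving}
    # Objects that have not exceeded the grace period are always kept.
    recent = {name for name in candidates if mtimes[name] > expiration}

    kept = set(recent)
    stack = list(recent)
    # Older objects are rescued when a kept object can reach them.
    while stack:
        current = stack.pop()
        for child in references.get(current, ()):
            if child in candidates and child not in kept:
                kept.add(child)
                stack.append(child)
    return kept
```

In this model an old object that nothing recent refers to is simply left out, which corresponds to it being dropped once the listed packs are deleted.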
Incompatible with --unpack-unreachable, --keep-unreachable,
--pack-loose-unreachable, --stdin-packs, as well as any other
options which imply --revs.
A debug option to help with future "partial clone" development.
This option specifies how missing objects are handled.
The form --missing=error requests that pack-objects stop with an error if
a missing object is encountered. If the repository is a partial clone, an
attempt to fetch missing objects will be made before declaring them missing.
This is the default action.
The form --missing=allow-any will allow object traversal to continue
if a missing object is encountered. No fetch of a missing object will occur.
Missing objects will silently be omitted from the results.
The form --missing=allow-promisor is like allow-any, but will only
allow object traversal to continue for EXPECTED promisor missing objects.
No fetch of a missing object will occur. An unexpected missing object will
raise an error.
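The three forms can be compared side by side in a minimal decision sketch. All names here (MissingAction, MissingObjectError, handle_missing, and the fetch callback) are hypothetical and exist only for this illustration; they are not part of Git.

```python
from enum import Enum


class MissingAction(Enum):
    ERROR = "error"
    ALLOW_ANY = "allow-any"
    ALLOW_PROMISOR = "allow-promisor"


class MissingObjectError(Exception):
    """Raised in this sketch when traversal must stop."""


def handle_missing(action, oid, expected_promisor=False, fetch=None):
    """Decide what happens when object `oid` is missing.

    fetch: optional callable modelling a partial-clone fetch; it
           returns True when the object could be obtained. None
           models a repository that is not a partial clone.
    Returns "fetched" or "omitted", or raises MissingObjectError.
    """
    if action is MissingAction.ERROR:
        # Only this form tries to fetch before giving up.
        if fetch is not None and fetch(oid):
            return "fetched"
        raise MissingObjectError(oid)
    if action is MissingAction.ALLOW_ANY:
        # No fetch; the object is silently left out.
        return "omitted"
    if action is MissingAction.ALLOW_PROMISOR:
        # No fetch; only expected promisor objects may be skipped.
        if expected_promisor:
            return "omitted"
        raise MissingObjectError(oid)
    raise ValueError(f"unknown action: {action!r}")
```

The sketch makes the main difference visible: only the error form ever fetches, and only allow-any never raises.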
Restrict delta matches based on "islands". See DELTA ISLANDS
below.
When possible, pack-objects tries to reuse existing on-disk deltas to
avoid having to search for new ones on the fly. This is an important
optimization for serving fetches, because it means the server can avoid
inflating most objects at all and just send the bytes directly from
disk. This optimization can’t work when an object is stored as a delta
against a base which the receiver does not have (and which we are not
already sending). In that case the server "breaks" the delta and has to
find a new one, which has a high CPU cost. Therefore it’s important for
performance that the set of objects in on-disk delta relationships match
what a client would fetch.
In a normal repository, this tends to work automatically. The objects
are mostly reachable from the branches and tags, and that’s what clients
fetch. Any deltas we find on the server are likely to be between objects
the client has or will have.
But in some repository setups, you may have several related but separate
groups of ref tips, with clients tending to fetch those groups
independently. For example, imagine that you are hosting several "forks"
of a repository in a single shared object store, and letting clients
view them as separate repositories through GIT_NAMESPACE or separate
repos using the alternates mechanism. A naive repack may find that the
optimal delta for an object is against a base that is only found in
another fork. But when a client fetches, they will not have the base
object, and we’ll have to find a new delta on the fly.
A similar situation may exist if you have many refs outside of
refs/heads/ and refs/tags/ that point to related objects (e.g.,
refs/pull or refs/changes used by some hosting providers). By
default, clients fetch only heads and tags, and deltas against objects
found only in those other groups cannot be sent as-is.
Delta islands solve this problem by allowing you to group your refs into
distinct "islands". Pack-objects computes which objects are reachable
from which islands, and refuses to make a delta from an object A
against a base which is not present in all of A's islands. This
results in slightly larger packs (because we miss some delta
opportunities), but guarantees that a fetch of one island will not have
to recompute deltas on the fly due to crossing island boundaries.
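The rule stated above amounts to a subset test on island membership. The following is a minimal sketch under that reading; the function name and the island labels are hypothetical, and Git's own bookkeeping for island membership is not modelled here.

```python
def may_use_as_base(object_islands, base_islands):
    """Return True if a delta from the object against the base is allowed.

    Both arguments are sets of island names from which the respective
    object is reachable. The base must be present in every island
    that the object belongs to.
    """
    return set(object_islands) <= set(base_islands)


# The base is reachable from both forks, the object only from fork1:
# any client fetching fork1 also receives the base.
allowed = may_use_as_base({"fork1"}, {"fork1", "fork2"})

# The object is in both forks but the base only in fork1: a client
# fetching fork2 would not have the base, so the delta is refused.
refused = may_use_as_base({"fork1", "fork2"}, {"fork1"})
```

The asymmetry is the important part: a base may live in more islands than the object, but never in fewer.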
When repacking with delta islands the delta window tends to get
clogged with candidates that are forbidden by the config. Repacking
with a big --window helps (and doesn’t take as long as it otherwise
might because we can reject some object pairs based on islands before
doing any computation on the content).
Islands are configured via the pack.island option, which can be
specified multiple times. Each value is a left-anchored regular
expression matching refnames. For example:
[pack]
island = refs/heads/
island = refs/tags/
puts heads and tags into an island (whose name is the empty string; see
below for more on naming). Any ref which does not match those regular
expressions (e.g., refs/pull/123) is not in any island. Any object
which is reachable only from refs/pull/ (but not heads or tags) is
therefore not a candidate to be used as a base for refs/heads/.
Refs are grouped into islands based on their "names", and two regexes
that produce the same name are considered to be in the same
island. The names are computed from the regexes by concatenating any
capture groups from the regex, with a - dash in between. (And if
there are no capture groups, then the name is the empty string, as in
the above example.) This allows you to create arbitrary numbers of
islands. Only up to 14 such capture groups are supported though.
For example, imagine you store the refs for each fork in
refs/virtual/ID, where ID is a numeric identifier. You might then
configure:
[pack]
island = refs/virtual/([0-9]+)/heads/
island = refs/virtual/([0-9]+)/tags/
island = refs/virtual/([0-9]+)/(pull)/
That puts the heads and tags for each fork in their own island (named
"1234" or similar), and the pull refs for each go into their own
"1234-pull".
Note that we pick a single island for each regex to go into, using "last
one wins" ordering (which allows repo-specific config to take precedence
over user-wide config, and so forth).
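The naming rules described above can be approximated in a few lines. This sketch uses Python's re module, whose regex dialect is not necessarily the one Git uses, and the function name island_name is hypothetical; the patterns are the ones from the example configuration.

```python
import re


def island_name(refname, patterns):
    """Return the island name for a ref, or None if it is in no island.

    patterns: island regexes in configuration order. re.match only
    matches at the start of the string, which models the
    left-anchored behaviour. Later patterns override earlier ones
    ("last one wins").
    """
    name = None
    for pattern in patterns:
        match = re.match(pattern, refname)
        if match:
            # Join the capture groups with a dash; with no capture
            # groups the name is the empty string.
            groups = [g for g in match.groups() if g is not None]
            name = "-".join(groups)
    return name


fork_patterns = [
    r"refs/virtual/([0-9]+)/heads/",
    r"refs/virtual/([0-9]+)/tags/",
    r"refs/virtual/([0-9]+)/(pull)/",
]

heads_island = island_name("refs/virtual/1234/heads/main", fork_patterns)
pull_island = island_name("refs/virtual/1234/pull/7/head", fork_patterns)
```

Here heads_island is "1234" and pull_island is "1234-pull", matching the names given in the example. A ref such as refs/heads/main matches none of these patterns and yields None.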