VYPR
Medium severity 5.7 · GHSA Advisory · Published Jun 10, 2026

CVE-2026-47734

Description

Dulwich before 1.2.5 allocates excessive memory on crafted thin packs, enabling DoS against push-enabled servers.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability

In Dulwich versions 0.1.0 through 1.2.4, the add_thin_pack and apply_delta functions allocate memory based on an attacker-controlled dest_size value in a thin pack delta header, with no correlation to the actual bytes received. A push client can send a crafted thin pack as small as ~174 bytes declaring a huge dest_size, causing the server to allocate hundreds of megabytes of memory. This affects any Dulwich-based server exposing git-receive-pack (e.g., via dulwich.server, HTTP smart server, or ReceivePackHandler) [1][2][3].
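
To illustrate why a tiny pack can declare a huge result, the sketch below encodes and decodes the two size fields that open a git delta: the base size and dest_size, each stored as a little-endian base-128 varint. The helper names `encode_delta_size` and `decode_delta_size` are illustrative assumptions, not Dulwich functions, and the sketch only shows the header arithmetic, not Dulwich's actual parsing code.

```python
def encode_delta_size(size: int) -> bytes:
    """Encode a size as the little-endian base-128 varint used in delta headers."""
    out = bytearray()
    while True:
        byte = size & 0x7F
        size >>= 7
        if size:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_delta_size(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one varint; return (value, offset just past the varint)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, offset


# A delta begins with the base (source) size followed by dest_size.
# Hypothetical values: a 10-byte base and a declared 512 MiB result.
header = encode_delta_size(10) + encode_delta_size(512 * 1024 * 1024)
src_size, pos = decode_delta_size(header)
dest_size, pos = decode_delta_size(header, pos)
print(len(header), src_size, dest_size)
```

The declared dest_size costs only a handful of header bytes, so a receiver that sizes a buffer from it alone is trusting a number the sender chose freely.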

Exploitation

An attacker with push access to a Dulwich repository sends a crafted thin pack with a delta header specifying an inflated dest_size. The server processes the pack through add_thin_pack, allocating memory proportional to the declared dest_size without checking the actual data size. The attacker does not need to send a large payload; the tiny crafted pack triggers the large allocation [2][3].

Impact

Successful exploitation causes uncontrolled memory consumption (CWE-400/CWE-789), leading to a denial-of-service condition. The server may become unresponsive or crash due to memory exhaustion, potentially affecting other services on the same host [2][3].

Mitigation

Dulwich 1.2.5 [4] patches the issue by adding a max_input_size keyword parameter to add_thin_pack and having ReceivePackHandler enforce the receive.maxInputSize config option; a pack that exceeds the cap raises PackInputTooLarge. As in git, an unset or zero receive.maxInputSize means unlimited, so upgrading alone does not enforce a cap: users should upgrade to 1.2.5 and set a sensible receive.maxInputSize in the repository config. On unpatched versions, receive.maxInputSize has no effect. Until upgrading, restrict push access to trusted clients, disable push entirely on fetch-only servers, or use OS-level memory limits (e.g., ulimit, cgroups) to contain a malicious push [2][3].
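
As a rough illustration of the OS-level containment mentioned above, the following sketch launches a command as a child process with a capped address space using Python's standard `resource` module. It assumes a Linux host where `RLIMIT_AS` is enforced and a server that can be started as a separate process; `run_with_memory_cap` is a hypothetical helper, and the 2 GiB cap is an arbitrary example value rather than a recommendation.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv: list[str], max_bytes: int) -> int:
    """Run argv as a child process with a capped address space (Unix only).

    Returns the child's exit code.
    """

    def _apply_limit() -> None:
        # Executed in the child between fork and exec.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        # An unprivileged process cannot go above the existing hard limit.
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    return subprocess.run(argv, preexec_fn=_apply_limit).returncode


# Demonstration: a child that tries to allocate 8 GiB under a 2 GiB cap
# should fail inside the child rather than exhaust host memory.
exit_code = run_with_memory_cap(
    [sys.executable, "-c", "bytearray(8 * 1024**3)"], 2 * 1024**3
)
print("child exit code:", exit_code)
```

A limit like this contains the damage from a malicious push to one process, but it does not stop that process from failing; it is a stopgap until the upgrade and the receive.maxInputSize setting are in place.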

AI Insight generated on Jun 11, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package           Affected versions     Patched versions
dulwich (PyPI)    >= 0.1.0, < 1.2.5     1.2.5

Affected products

2
  • Dulwich Project/Dulwich · GHSA · 2 versions
    • (no CPE) range: >= 0.1.0, < 1.2.5
    • (no CPE) range: >=0.1.0 <1.2.5

Patches

4
f860ca489d63

server: Honour receive.maxInputSize to bound received packs

https://github.com/jelmer/dulwich · Jelmer Vernooij · May 19, 2026 · Fixed in dulwich-1.2.5 via ghsa-release-walk
4 files changed · +117 -4
  • dulwich/object_store.py · +58 -0 · modified
    @@ -43,6 +43,7 @@
         "PackBasedObjectStore",
         "PackCapableObjectStore",
         "PackContainer",
    +    "PackInputTooLarge",
         "commit_tree_changes",
         "find_shallow",
         "get_depth",
    @@ -221,6 +222,51 @@ def get_tree_objects(
     DEFAULT_TEMPFILE_GRACE_PERIOD = 14 * 24 * 60 * 60  # 2 weeks
     
     
    +class PackInputTooLarge(OSError):
    +    """Raised when a received pack exceeds the configured input size cap.
    +
    +    Mirrors the failure mode of git's ``receive.maxInputSize`` /
    +    ``git index-pack --max-input-size``.
    +    """
    +
    +
    +def _bound_read_callables(
    +    read_all: Callable[[int], bytes],
    +    read_some: Callable[[int], bytes] | None,
    +    max_input_size: int,
    +) -> tuple[Callable[[int], bytes], Callable[[int], bytes] | None]:
    +    """Wrap pack-stream read callbacks so total bytes are capped.
    +
    +    When the cumulative number of bytes returned across ``read_all`` and
    +    ``read_some`` exceeds ``max_input_size``, the next read raises
    +    ``PackInputTooLarge``. This is the in-process analogue of
    +    ``git index-pack --max-input-size``.
    +    """
    +    bytes_read = [0]
    +
    +    def _check(n: int) -> None:
    +        bytes_read[0] += n
    +        if bytes_read[0] > max_input_size:
    +            raise PackInputTooLarge(
    +                f"pack exceeds maximum input size of {max_input_size} bytes"
    +            )
    +
    +    def wrapped_read_all(n: int) -> bytes:
    +        data = read_all(n)
    +        _check(len(data))
    +        return data
    +
    +    if read_some is None:
    +        return wrapped_read_all, None
    +
    +    def wrapped_read_some(n: int) -> bytes:
    +        data = read_some(n)
    +        _check(len(data))
    +        return data
    +
    +    return wrapped_read_all, wrapped_read_some
    +
    +
     def find_shallow(
         store: ObjectContainer, heads: Iterable[ObjectID], depth: int
     ) -> tuple[set[ObjectID], set[ObjectID]]:
    @@ -2133,6 +2179,8 @@ def add_thin_pack(
             read_all: Callable[[int], bytes],
             read_some: Callable[[int], bytes] | None,
             progress: Callable[..., None] | None = None,
    +        *,
    +        max_input_size: int | None = None,
         ) -> "Pack":
             """Add a new thin pack to this object store.
     
    @@ -2146,11 +2194,21 @@ def add_thin_pack(
               read_some: Read function that returns at least one byte, but may
                 not return the number of bytes requested.
               progress: Optional progress reporting function.
    +          max_input_size: Maximum number of bytes that may be read from
    +            the wire while ingesting this pack. Matches git's
    +            ``receive.maxInputSize`` / ``index-pack --max-input-size``
    +            semantics: ``None`` (the default) or ``0`` mean unlimited.
    +            Exceeding the cap raises ``PackInputTooLarge``.
             Returns: A Pack object pointing at the now-completed thin pack in the
                 objects/pack directory.
             """
             import tempfile
     
    +        if max_input_size:
    +            read_all, read_some = _bound_read_callables(
    +                read_all, read_some, max_input_size
    +            )
    +
             fd, path = tempfile.mkstemp(dir=self.path, prefix="tmp_pack_")
             with os.fdopen(fd, "w+b") as f:
                 os.chmod(path, PACK_MODE)
    
  • dulwich/server.py · +24 -1 · modified
    @@ -1433,6 +1433,25 @@ def capabilities(self) -> Iterable[bytes]:
                 capability_object_format(self.repo.object_format.name),
             ]
     
    +    def _receive_max_input_size(self) -> int | None:
    +        """Return the configured ``receive.maxInputSize`` for this repo.
    +
    +        Mirrors git: the value is in bytes, and ``0`` (the default) means
    +        unlimited. Returned as ``None`` when unset or zero so it can be
    +        passed verbatim as ``add_thin_pack(..., max_input_size=...)``.
    +        """
    +        config = self.repo.get_config_stack()  # type: ignore[attr-defined]
    +        try:
    +            raw = config.get((b"receive",), b"maxInputSize")
    +        except KeyError:
    +            return None
    +        try:
    +            value = int(raw.decode())
    +        except ValueError:
    +            logger.warning("Ignoring invalid receive.maxInputSize value %r", raw)
    +            return None
    +        return value if value > 0 else None
    +
         def _apply_pack(
             self, refs: list[tuple[ObjectID, ObjectID, Ref]]
         ) -> Iterator[tuple[bytes, bytes]]:
    @@ -1466,7 +1485,11 @@ def _apply_pack(
                 # string
                 try:
                     recv = getattr(self.proto, "recv", None)
    -                self.repo.object_store.add_thin_pack(self.proto.read, recv)  # type: ignore[attr-defined]
    +                self.repo.object_store.add_thin_pack(  # type: ignore[attr-defined]
    +                    self.proto.read,
    +                    recv,
    +                    max_input_size=self._receive_max_input_size(),
    +                )
                     yield (b"unpack", b"ok")
                 except all_exceptions as e:
                     yield (b"unpack", str(e).replace("\n", "").encode("utf-8"))
    
  • NEWS · +12 -0 · modified
    @@ -31,6 +31,18 @@
        user had a merge driver configured that referenced ``%P``.
        Reported by Ravishanker Kusuma (hayageek). (Jelmer Vernooij)
     
    + * SECURITY: Honour ``receive.maxInputSize`` in
    +   ``ReceivePackHandler``. Previously a remote unauthenticated client
    +   could send a tiny crafted pack (~174 bytes) that declared a huge
    +   ``dest_size`` in its delta header and trigger hundreds of MB of
    +   allocation in ``apply_delta`` / ``add_thin_pack``, exhausting
    +   server memory over ``git-receive-pack``. ``add_thin_pack`` now
    +   accepts a ``max_input_size`` keyword (in bytes, ``0`` / ``None`` =
    +   unlimited, matching git's semantics) and ``ReceivePackHandler``
    +   reads ``receive.maxInputSize`` from the repository config and
    +   passes it through. Exceeding the cap raises ``PackInputTooLarge``.
    +   (Jelmer Vernooij; Reported by Liyi, Ziyue, Strick, Maurice and Chenchen @ Univeristy of Sydney)
    +
     1.2.4	2026-05-21
     
      * Tolerate ref names with empty path components (e.g. ``refs/tags//v1.0``)
    
  • tests/test_object_store.py · +23 -3 · modified
    @@ -29,14 +29,15 @@
     from contextlib import closing
     from io import BytesIO
     
    -from dulwich.errors import NotTreeError
    +from dulwich.errors import NotTreeError, ObjectFormatException
     from dulwich.index import commit_tree
     from dulwich.object_format import DEFAULT_OBJECT_FORMAT
     from dulwich.object_store import (
         DiskObjectStore,
         MemoryObjectStore,
         ObjectStoreGraphWalker,
         OverlayObjectStore,
    +    PackInputTooLarge,
         commit_tree_changes,
         read_packs_file,
         tree_lookup_path,
    @@ -380,8 +381,6 @@ def test_add_pack_rejects_malformed_tree(self) -> None:
             # be ingested: MemoryObjectStore and ``git fsck`` already reject
             # such objects, so DiskObjectStore must too. Otherwise a malicious
             # remote can poison the repository.
    -        from dulwich.errors import ObjectFormatException
    -
             o = DiskObjectStore(self.store_dir)
             self.addCleanup(o.close)
             f, commit, abort = o.add_pack()
    @@ -397,6 +396,27 @@ def test_add_pack_rejects_malformed_tree(self) -> None:
             # No pack/index files should have been left behind.
             self.assertEqual([], os.listdir(o.pack_dir))
     
    +    def test_add_thin_pack_max_input_size(self) -> None:
    +        """Bounding wire input rejects packs exceeding the cap.
    +
    +        Mirrors git's ``receive.maxInputSize`` semantics.
    +        """
    +        o = DiskObjectStore(self.store_dir)
    +        self.addCleanup(o.close)
    +
    +        blob = make_object(Blob, data=b"yummy data")
    +        o.add_object(blob)
    +
    +        f = BytesIO()
    +        build_pack(
    +            f,
    +            [(REF_DELTA, (blob.id, b"more yummy data"))],
    +            store=o,
    +        )
    +
    +        with self.assertRaises(PackInputTooLarge):
    +            o.add_thin_pack(f.read, None, max_input_size=8)
    +
         def test_pack_index_version_config(self) -> None:
             # Test that pack.indexVersion configuration is respected
             from dulwich.config import ConfigDict
    
82dcfd922e2e

index: reject reserved Windows device names in NTFS validator

https://github.com/jelmer/dulwich · Jelmer Vernooij · May 19, 2026 · Fixed in dulwich-1.2.5 via ghsa-release-walk
3 files changed · +80 -0
  • dulwich/index.py · +21 -0 · modified
    @@ -1978,6 +1978,25 @@ def _is_ntfs_dotgit_short_name(normalized: bytes) -> bool:
         return len(tail) > 0 and tail.isdigit()
     
     
    +# Reserved Windows device names. Opening any of these on Windows
    +# resolves to a device rather than a file, regardless of any
    +# extension or trailing dots/spaces (``NUL``, ``NUL.txt``,
    +# ``aux.foo.bar`` all hit the device).
    +RESERVED_WINDOWS_DEVICE_NAMES = frozenset(
    +    [b"con", b"prn", b"aux", b"nul"]
    +    + [b"com%d" % i for i in range(1, 10)]
    +    + [b"lpt%d" % i for i in range(1, 10)]
    +)
    +
    +
    +def _is_reserved_windows_device_name(normalized: bytes) -> bool:
    +    """Match Windows reserved device names regardless of extension."""
    +    # The "stem" is the portion before the first ``.``; Windows
    +    # also strips trailing spaces from that stem when resolving.
    +    stem = normalized.split(b".", 1)[0].rstrip(b" ")
    +    return stem in RESERVED_WINDOWS_DEVICE_NAMES
    +
    +
     def validate_path_element_ntfs(element: bytes) -> bool:
         """Validate a path element using NTFS filesystem rules.
     
    @@ -2002,6 +2021,8 @@ def validate_path_element_ntfs(element: bytes) -> bool:
             return False
         if _is_ntfs_dotgit_short_name(normalized):
             return False
    +    if _is_reserved_windows_device_name(normalized):
    +        return False
         return True
     
     
    
  • NEWS · +9 -0 · modified
    @@ -64,6 +64,15 @@
        / ``core.protectHFS`` configuration keys are now read under their
        documented names. Reported by Christopher Toth.
     
    + * Reject tree entries whose name resolves to a reserved Windows
    +   device (``CON``, ``PRN``, ``AUX``, ``NUL``, ``COM1``-``COM9``,
    +   ``LPT1``-``LPT9``), with or without an extension. ``NUL.txt`` and
    +   ``AUX.foo`` open the device rather than a disk file on Windows,
    +   so a tree authored on POSIX containing such a name would either
    +   fail to check out or write to the device on a Windows clone —
    +   matching Git's behaviour under ``core.protectNTFS``. Reported by
    +   Christopher Toth.
    +
      * Deduplicate objects when writing a multi-pack-index. Objects present
        in multiple packs (e.g. after ``git gc`` creates a cruft pack) would
        otherwise produce an OIDL chunk with repeated SHAs, causing ``git
    
  • tests/test_index.py · +50 -0 · modified
    @@ -1567,6 +1567,56 @@ def test_ntfs_rejects_alternate_data_stream(self) -> None:
             self.assertFalse(validate_path_element_ntfs(b".git:evil"))
             self.assertFalse(validate_path_element_ntfs(b"foo:bar"))
     
    +    def test_ntfs_rejects_reserved_device_names(self) -> None:
    +        # CON, PRN, AUX, NUL and COM1..9 / LPT1..9 are reserved
    +        # devices on Windows. Opening them resolves to the device
    +        # rather than a disk file, with or without an extension and
    +        # regardless of case.
    +        for name in (
    +            b"NUL",
    +            b"nul",
    +            b"NuL",
    +            b"CON",
    +            b"PRN",
    +            b"AUX",
    +            b"COM1",
    +            b"COM9",
    +            b"LPT1",
    +            b"LPT9",
    +        ):
    +            self.assertFalse(
    +                validate_path_element_ntfs(name),
    +                f"{name!r} should be rejected on NTFS",
    +            )
    +
    +    def test_ntfs_rejects_reserved_device_names_with_extension(self) -> None:
    +        # Extensions do not make a reserved name safe on Windows —
    +        # ``NUL.txt`` still opens the NUL device.
    +        self.assertFalse(validate_path_element_ntfs(b"NUL.txt"))
    +        self.assertFalse(validate_path_element_ntfs(b"aux.foo"))
    +        self.assertFalse(validate_path_element_ntfs(b"COM1.bar"))
    +        # Multiple extensions still match the stem.
    +        self.assertFalse(validate_path_element_ntfs(b"nul.tar.gz"))
    +        # Trailing dots/spaces are stripped by NTFS before resolution.
    +        self.assertFalse(validate_path_element_ntfs(b"NUL."))
    +        self.assertFalse(validate_path_element_ntfs(b"NUL "))
    +        self.assertFalse(validate_path_element_ntfs(b"NUL ..."))
    +        # A trailing space on the stem itself is also stripped, so
    +        # ``NUL .txt`` still resolves to the NUL device.
    +        self.assertFalse(validate_path_element_ntfs(b"NUL .txt"))
    +
    +    def test_ntfs_accepts_names_that_only_resemble_devices(self) -> None:
    +        # Only the exact reserved names are devices; longer names
    +        # that merely start with one of them are fine.
    +        self.assertTrue(validate_path_element_ntfs(b"null"))
    +        self.assertTrue(validate_path_element_ntfs(b"console"))
    +        self.assertTrue(validate_path_element_ntfs(b"prnt"))
    +        self.assertTrue(validate_path_element_ntfs(b"myaux"))
    +        # COM0/LPT0 and COM10+ are not in the reserved range.
    +        self.assertTrue(validate_path_element_ntfs(b"com0"))
    +        self.assertTrue(validate_path_element_ntfs(b"com10"))
    +        self.assertTrue(validate_path_element_ntfs(b"lpt0"))
    +
     
     class TestDecodeUTF8WithFallback(TestCase):
         """Tests for the xutftowcsn-style lossy UTF-8 decoder."""
    
a248efd10f36

index: harden validate_path_element_ntfs against Windows path smuggling

https://github.com/jelmer/dulwich · Jelmer Vernooij · Apr 22, 2026 · Fixed in dulwich-1.2.5 via ghsa-release-walk
3 files changed · +141 -1
  • dulwich/index.py · +19 -1 · modified
    @@ -1970,6 +1970,14 @@ def validate_path_element_default(element: bytes) -> bool:
         return _normalize_path_element_default(element) not in INVALID_DOTNAMES
     
     
    +def _is_ntfs_dotgit_short_name(normalized: bytes) -> bool:
    +    """Match NTFS 8.3 short-name forms of ``.git`` (``git~<digits>``)."""
    +    if not normalized.startswith(b"git~"):
    +        return False
    +    tail = normalized[4:]
    +    return len(tail) > 0 and tail.isdigit()
    +
    +
     def validate_path_element_ntfs(element: bytes) -> bool:
         """Validate a path element using NTFS filesystem rules.
     
    @@ -1979,10 +1987,20 @@ def validate_path_element_ntfs(element: bytes) -> bool:
         Returns:
           True if path element is valid for NTFS, False otherwise
         """
    +    # A backslash is a path separator on Windows, so accepting it
    +    # here would let a tree authored on POSIX escape the work tree
    +    # or plant files under ``.git\`` when checked out on Windows.
    +    if b"\\" in element:
    +        return False
    +    # NTFS alternate data streams are addressed as ``name:stream``;
    +    # reject any element containing ``:`` so ``.git::$INDEX_ALLOCATION``
    +    # and similar forms cannot bypass the ``.git`` check.
    +    if b":" in element:
    +        return False
         normalized = _normalize_path_element_ntfs(element)
         if normalized in INVALID_DOTNAMES:
             return False
    -    if normalized == b"git~1":
    +    if _is_ntfs_dotgit_short_name(normalized):
             return False
         return True
     
    
  • NEWS · +14 -0 · modified
    @@ -50,6 +50,20 @@
        that did not exist on disk, leaving LFS-tracked files as pointers when
        cloning from a local repo. (Jelmer Vernooij)
     
    + * SECURITY: Reject tree entries whose path components would be
    +   interpreted as path separators or alternate-data-stream markers on
    +   Windows. A malicious repository could previously craft entries such
    +   as ``.git\hooks\pre-commit.exe``, ``..\outside.txt``,
    +   ``.git::$INDEX_ALLOCATION`` or any ``git~<digits>`` 8.3 short-name
    +   alias of ``.git`` and have them materialized on a Windows clone,
    +   planting files under ``.git\`` (which Git for Windows then
    +   executes) or escaping the work tree. The NTFS path-element
    +   validator now rejects ``\``, ``:``, and all ``git~<digits>`` forms.
    +   ``core.protectNTFS`` now defaults to true on every platform
    +   (matching Git's ``PROTECT_NTFS_DEFAULT=1``), and ``core.protectNTFS``
    +   / ``core.protectHFS`` configuration keys are now read under their
    +   documented names. Reported by Christopher Toth.
    +
      * Deduplicate objects when writing a multi-pack-index. Objects present
        in multiple packs (e.g. after ``git gc`` creates a cruft pack) would
        otherwise produce an OIDL chunk with repeated SHAs, causing ``git
    
  • tests/test_index.py · +108 -0 · modified
    @@ -621,6 +621,73 @@ def test_git_dir(self) -> None:
                 )
                 self.assertFileContents(epath, b"d")
     
    +    def test_ntfs_malicious_entries_dropped(self) -> None:
    +        # A malicious tree authored on POSIX containing NTFS-hostile
    +        # entries must not materialize any of them under the NTFS
    +        # validator — the combination would let an attacker plant
    +        # ``.git\hooks\pre-commit.exe`` or ``.git::$INDEX_ALLOCATION``
    +        # on a Windows clone.
    +        from dulwich.index import validate_path_element_ntfs
    +
    +        repo_dir = tempfile.mkdtemp()
    +        self.addCleanup(shutil.rmtree, repo_dir)
    +        with Repo.init(repo_dir) as repo:
    +            hook = Blob.from_string(b"malicious hook")
    +            escape = Blob.from_string(b"outside payload")
    +            shortname = Blob.from_string(b"masquerading as .git")
    +            ads = Blob.from_string(b"alternate data stream payload")
    +            benign = Blob.from_string(b"ok")
    +
    +            tree = Tree()
    +            tree[b".git\\hooks\\pre-commit.exe"] = (
    +                stat.S_IFREG | 0o755,
    +                hook.id,
    +            )
    +            tree[b"..\\outside.txt"] = (stat.S_IFREG | 0o644, escape.id)
    +            tree[b"git~1"] = (stat.S_IFREG | 0o644, shortname.id)
    +            tree[b".git::$INDEX_ALLOCATION"] = (
    +                stat.S_IFREG | 0o644,
    +                ads.id,
    +            )
    +            tree[b"ok.txt"] = (stat.S_IFREG | 0o644, benign.id)
    +
    +            repo.object_store.add_objects(
    +                [(o, None) for o in [hook, escape, shortname, ads, benign, tree]]
    +            )
    +
    +            build_index_from_tree(
    +                repo.path,
    +                repo.index_path(),
    +                repo.object_store,
    +                tree.id,
    +                validate_path_element=validate_path_element_ntfs,
    +            )
    +
    +            index = repo.open_index()
    +            self.assertEqual(list(index), [b"ok.txt"])
    +
    +            # Nothing written under the literal paths (the POSIX form)
    +            # or under `.git/` (the Windows decomposition of `\`).
    +            self.assertFalse(
    +                os.path.exists(os.path.join(repo.path, ".git\\hooks\\pre-commit.exe"))
    +            )
    +            self.assertFalse(
    +                os.path.exists(
    +                    os.path.join(repo.path, ".git", "hooks", "pre-commit.exe")
    +                )
    +            )
    +            # ``git~1`` and ``.git::$INDEX_ALLOCATION`` would resolve
    +            # against the existing ``.git`` directory on NTFS (8.3
    +            # short-name and alternate-data-stream resolution), so
    +            # ``os.path.exists`` can return true even when nothing was
    +            # materialized as a literal entry. Check the directory
    +            # listing instead.
    +            work_tree_entries = os.listdir(repo.path)
    +            self.assertNotIn("git~1", work_tree_entries)
    +            self.assertNotIn(".git::$INDEX_ALLOCATION", work_tree_entries)
    +            # Nothing escaped the work tree either.
    +            self.assertNotIn("outside.txt", os.listdir(os.path.dirname(repo.path)))
    +
         def test_nonempty(self) -> None:
             repo_dir = tempfile.mkdtemp()
             self.addCleanup(shutil.rmtree, repo_dir)
    @@ -1459,6 +1526,47 @@ def test_hfs(self) -> None:
             self.assertTrue(validate_path_element_hfs(b".g\xc3\xaft"))  # .gït
             self.assertTrue(validate_path_element_hfs(b"git"))  # git without dot
     
    +    def test_ntfs_rejects_backslash(self) -> None:
    +        # A backslash is a path separator on Windows, so a tree entry
    +        # containing one would materialize as nested directories and
    +        # let an attacker plant ``.git\hooks\pre-commit`` or escape
    +        # the work tree with ``..\outside``.
    +        self.assertFalse(validate_path_element_ntfs(b".git\\hooks\\pre-commit"))
    +        self.assertFalse(validate_path_element_ntfs(b"..\\outside"))
    +        self.assertFalse(validate_path_element_ntfs(b"a\\b"))
    +        self.assertFalse(validate_path_element_ntfs(b"foo\\"))
    +
    +    def test_non_ntfs_validators_accept_backslash(self) -> None:
    +        # On POSIX/HFS a backslash is a valid filename byte. The
    +        # protection is gated on the NTFS validator (selected by
    +        # core.protectNTFS), so the other validators still accept it.
    +        self.assertTrue(validate_path_element_default(b"a\\b"))
    +        self.assertTrue(validate_path_element_hfs(b"a\\b"))
    +
    +    def test_ntfs_rejects_all_short_name_variants(self) -> None:
    +        # Git's is_ntfs_dotgit rejects any ``git~<digits>`` 8.3
    +        # short-name form; previously only the literal ``git~1`` was
    +        # checked.
    +        for name in (b"git~1", b"git~2", b"git~10", b"GIT~1", b"gIt~3"):
    +            self.assertFalse(
    +                validate_path_element_ntfs(name),
    +                f"{name!r} should be rejected on NTFS",
    +            )
    +        # Trailing ``.``/space is stripped by NTFS — same names.
    +        self.assertFalse(validate_path_element_ntfs(b"git~1."))
    +        self.assertFalse(validate_path_element_ntfs(b"git~1 "))
    +        # Names that merely contain ``git~`` are still accepted.
    +        self.assertTrue(validate_path_element_ntfs(b"git~foo"))
    +        self.assertTrue(validate_path_element_ntfs(b"mygit~1"))
    +
    +    def test_ntfs_rejects_alternate_data_stream(self) -> None:
    +        # NTFS alternate data streams are addressed as ``name:stream``;
    +        # a ``:`` anywhere in an element can smuggle a write to
    +        # ``.git::$INDEX_ALLOCATION`` etc.
    +        self.assertFalse(validate_path_element_ntfs(b".git::$INDEX_ALLOCATION"))
    +        self.assertFalse(validate_path_element_ntfs(b".git:evil"))
    +        self.assertFalse(validate_path_element_ntfs(b"foo:bar"))
    +
     
     class TestDecodeUTF8WithFallback(TestCase):
         """Tests for the xutftowcsn-style lossy UTF-8 decoder."""
    
1ca18147a1d0

submodule: Reject unsafe submodule paths in submodule_update

https://github.com/jelmer/dulwich · Jelmer Vernooij · May 28, 2026 · Fixed in dulwich-1.2.5 via ghsa-release-walk
6 files changed · +171 -30
  • dulwich/index.py · +25 -0 · modified
    @@ -60,6 +60,7 @@
         "commit_tree",
         "detect_case_only_renames",
         "get_path_element_normalizer",
    +    "get_path_element_validator",
         "get_unstaged_changes",
         "index_entry_from_stat",
         "make_path_normalizer",
    @@ -2073,6 +2074,30 @@ def validate_path_element_hfs(element: bytes) -> bool:
         return True
     
     
    +def get_path_element_validator(config: "Config") -> Callable[[bytes], bool]:
    +    """Get the path-element validator to use when checking out a tree.
    +
    +    ``core.protectNTFS`` defaults to true on every platform (matching Git's
    +    ``PROTECT_NTFS_DEFAULT=1``) because a repository authored on POSIX can
    +    still be cloned on Windows later; ``core.protectHFS`` defaults to true on
    +    macOS. With both disabled this falls back to the default validator, which
    +    only refuses ``.git``, ``.`` and ``..``.
    +
    +    Args:
    +        config: Repository configuration object
    +
    +    Returns:
    +        Function that validates a single path element for the configured
    +        filesystem protections.
    +    """
    +    if config.get_boolean(b"core", b"protectNTFS", True):
    +        return validate_path_element_ntfs
    +    elif config.get_boolean(b"core", b"protectHFS", sys.platform == "darwin"):
    +        return validate_path_element_hfs
    +    else:
    +        return validate_path_element_default
    +
    +
     def validate_path(
         path: bytes,
         element_validator: Callable[[bytes], bool] = validate_path_element_default,
    
  • dulwich/porcelain/submodule.py · +28 -2 · modified
    @@ -22,7 +22,7 @@
     """Porcelain functions for working with submodules."""
     
     import os
    -from collections.abc import Iterator, Sequence
    +from collections.abc import Callable, Iterator, Sequence
     from typing import TYPE_CHECKING, BinaryIO
     
     from ..config import ConfigFile, read_submodules
    @@ -87,6 +87,29 @@ def submodule_init(repo: str | os.PathLike[str] | Repo) -> None:
             config.write_to_path()
     
     
    +def _check_submodule_path(path: bytes, validator: Callable[[bytes], bool]) -> None:
    +    """Reject submodule paths that would escape the working tree.
    +
    +    Args:
    +      path: Submodule path as it appears in the tree gitlink entry.
    +      validator: Path-element validator selected for this repository.
    +
    +    Raises:
    +      Error: If the path is absolute or carries a component (e.g. ``.git`` or
    +        ``..``) that the validator rejects. This is the same bar git applies
    +        to submodule paths, not a stricter one.
    +    """
    +    from ..index import validate_path
    +    from . import Error
    +
    +    # Tree paths always use "/" as the separator; a leading "/" or "\\" would
    +    # make os.path.join discard the repository root, so treat it as absolute.
    +    if path.startswith((b"/", b"\\")):
    +        raise Error(f"refusing submodule with absolute path: {path!r}")
    +    if not validate_path(path, validator):
    +        raise Error(f"refusing submodule with unsafe path: {path!r}")
    +
    +
     def submodule_list(repo: "RepoPath") -> Iterator[tuple[str, str]]:
         """List submodules.
     
    @@ -122,7 +145,7 @@ def submodule_update(
           errstream: Error stream for error messages
         """
         from ..client import get_transport_and_path
    -    from ..index import build_index_from_tree
    +    from ..index import build_index_from_tree, get_path_element_validator
         from ..refs import HEADREF
         from ..submodule import iter_cached_submodules
         from . import (
    @@ -138,12 +161,14 @@ def submodule_update(
     
             config = r.get_config()
             gitmodules_path = os.path.join(r.path, ".gitmodules")
    +        path_validator = get_path_element_validator(config)
     
             # Get list of submodules to update
             submodules_to_update = []
             head_commit = r[r.head()]
             assert isinstance(head_commit, Commit)
             for path, sha in iter_cached_submodules(r.object_store, head_commit.tree):
    +            _check_submodule_path(path, path_validator)
                 path_str = (
                     path.decode(DEFAULT_ENCODING) if isinstance(path, bytes) else path
                 )
    @@ -231,6 +256,7 @@ def submodule_update(
                             sub_repo.index_path(),
                             sub_repo.object_store,
                             tree_id,
    +                        validate_path_element=path_validator,
                         )
                 else:
                     # Fetch and checkout in existing submodule
    
  • dulwich/stash.py · +2 -14 · modified
    @@ -28,7 +28,6 @@
     ]
     
     import os
    -import sys
     from typing import TYPE_CHECKING, TypedDict
     
     from .diff_tree import tree_changes
    @@ -38,14 +37,12 @@
         _tree_to_fs_path,
         build_file_from_blob,
         commit_tree,
    +    get_path_element_validator,
         index_entry_from_stat,
         iter_fresh_objects,
         symlink,
         update_working_tree,
         validate_path,
    -    validate_path_element_default,
    -    validate_path_element_hfs,
    -    validate_path_element_ntfs,
     )
     from .object_store import iter_tree_contents
     from .objects import S_IFGITLINK, Blob, Commit, ObjectID, TreeEntry
    @@ -163,16 +160,7 @@ def pop(self, index: int, *, config: "Config | None" = None) -> "Entry":
             # Get config for working directory update
             config = self._repo.get_config()
             honor_filemode = config.get_boolean(b"core", b"filemode", os.name != "nt")
    -
    -        # core.protectNTFS defaults to True on all platforms (matching
    -        # Git's PROTECT_NTFS_DEFAULT=1) because a repo authored on
    -        # POSIX can still be cloned on Windows later.
    -        if config.get_boolean(b"core", b"protectNTFS", True):
    -            validate_path_element = validate_path_element_ntfs
    -        elif config.get_boolean(b"core", b"protectHFS", sys.platform == "darwin"):
    -            validate_path_element = validate_path_element_hfs
    -        else:
    -            validate_path_element = validate_path_element_default
    +        validate_path_element = get_path_element_validator(config)
     
             if config.get_boolean(b"core", b"symlinks", True):
                 symlink_fn = symlink
    
  • dulwich/worktree.py +2 -13 modified
    @@ -43,7 +43,6 @@
     import os
     import shutil
     import stat
    -import sys
     import tempfile
     import time
     import warnings
    @@ -807,10 +806,8 @@ def reset_index(
             stacked_config = config
             from .index import (
                 build_index_from_tree,
    +            get_path_element_validator,
                 symlink,
    -            validate_path_element_default,
    -            validate_path_element_hfs,
    -            validate_path_element_ntfs,
             )
     
             if tree is None:
    @@ -824,15 +821,7 @@ def reset_index(
                 tree = head.tree
             config = self._repo.get_config()
             honor_filemode = config.get_boolean(b"core", b"filemode", os.name != "nt")
    -        # core.protectNTFS defaults to True on all platforms (matching
    -        # Git's PROTECT_NTFS_DEFAULT=1) because a repo authored on
    -        # POSIX can still be cloned on Windows later.
    -        if config.get_boolean(b"core", b"protectNTFS", True):
    -            validate_path_element = validate_path_element_ntfs
    -        elif config.get_boolean(b"core", b"protectHFS", sys.platform == "darwin"):
    -            validate_path_element = validate_path_element_hfs
    -        else:
    -            validate_path_element = validate_path_element_default
    +        validate_path_element = get_path_element_validator(config)
             if config.get_boolean(b"core", b"symlinks", True):
                 symlink_fn = symlink
             else:
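
Both `stash.py` and `worktree.py` now delegate the choice of validator to `get_path_element_validator`. Its body is not part of this excerpt, so the sketch below assumes it preserves the inline logic that the diff removes. The function `pick_validator`, the helper `get_bool`, the string return values, and the plain-dict config are hypothetical simplifications for illustration.

```python
import sys
from typing import Mapping, Optional


def get_bool(config: Mapping[str, bool], key: str, default: bool) -> bool:
    """Look up a boolean setting, falling back to the default when unset."""
    value: Optional[bool] = config.get(key)
    return default if value is None else value


def pick_validator(config: Mapping[str, bool], platform: str = sys.platform) -> str:
    """Return the name of the validator the removed inline code would select."""
    # protectNTFS defaults to True everywhere: a repository authored on POSIX
    # may still be cloned on Windows later.
    if get_bool(config, "core.protectNTFS", True):
        return "ntfs"
    # protectHFS defaults to True only on macOS.
    if get_bool(config, "core.protectHFS", platform == "darwin"):
        return "hfs"
    return "default"
```

With an empty config the NTFS validator is chosen on every platform. The HFS and default validators only come into play once `core.protectNTFS` is explicitly set to false.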
    
  • NEWS +14 -1 modified
    @@ -1,5 +1,18 @@
     1.2.5	UNRELEASED
     
    + * SECURITY(GHSA-gfhv-vqv2-4544): Validate submodule paths in
    +   ``porcelain.submodule_update`` (and thus
    +   ``porcelain.clone(recurse_submodules=True)``). A crafted upstream
    +   repository could carry a submodule whose path was ``.git/hooks`` (or
    +   any other path inside ``.git`` or above the work tree), causing the
    +   submodule's tree contents to be written there with their executable
    +   bits intact -- dropping a hook that later commands would run. Submodule
    +   paths are now rejected if they are absolute or carry a component that
    +   the configured path validator refuses, and the submodule's own tree is
    +   materialized with the same validator. This is the dulwich analogue of git's
    +   CVE-2024-32002 / CVE-2024-32004.
    +   (Jelmer Vernooij; reported by tonghuaroot)
    +
      * SECURITY(CVE-2026-42305): Harden tree path validation against entry
        names that are harmless on POSIX but dangerous when checked out on
        Windows. A crafted tree could previously carry such names through to
    @@ -47,7 +60,7 @@
        unlimited, matching git's semantics) and ``ReceivePackHandler``
        reads ``receive.maxInputSize`` from the repository config and
        passes it through. Exceeding the cap raises ``PackInputTooLarge``.
    -   (Jelmer Vernooij; Reported by Liyi, Ziyue, Strick, Maurice and Chenchen @ Univeristy of Sydney)
    +   (Jelmer Vernooij; Reported by Liyi, Ziyue, Strick, Maurice and Chenchen @ University of Sydney)
     
     1.2.4	2026-05-21
     
    
  • tests/porcelain/__init__.py +100 -0 modified
    @@ -5645,6 +5645,106 @@ def test_update_recursive(self) -> None:
             with open(nested_submodule_file) as f:
                 self.assertEqual(f.read(), "nested submodule content")
     
    +    def _build_malicious_submodule_repo(self, submodule_path):
    +        """Build a parent repo whose gitlink path is attacker-controlled.
    +
    +        Returns the path to a bare attacker submodule repository and commits a
    +        matching ``.gitmodules`` plus tree gitlink entry into ``self.repo``,
    +        both pointing at ``submodule_path``.
    +        """
    +        attacker_path = tempfile.mkdtemp()
    +        self.addCleanup(shutil.rmtree, attacker_path)
    +        attacker = Repo.init_bare(attacker_path, mkdir=False)
    +        self.addCleanup(attacker.close)
    +
    +        payload = Blob.from_string(b"#!/bin/sh\necho PWNED\n")
    +        attacker.object_store.add_object(payload)
    +        tree = Tree()
    +        tree.add(b"post-checkout", 0o100755, payload.id)
    +        attacker.object_store.add_object(tree)
    +        commit = Commit()
    +        commit.tree = tree.id
    +        commit.author = commit.committer = b"a <a@a>"
    +        commit.author_time = commit.commit_time = 0
    +        commit.author_timezone = commit.commit_timezone = 0
    +        commit.message = b"payload"
    +        attacker.object_store.add_object(commit)
    +        attacker.refs[b"refs/heads/master"] = commit.id
    +        attacker.refs.set_symbolic_ref(b"HEAD", b"refs/heads/master")
    +
    +        gitmodules = (
    +            b'[submodule "evil"]\n'
    +            b"\tpath = " + submodule_path + b"\n"
    +            b"\turl = " + attacker_path.encode() + b"\n"
    +        )
    +        # A real clone checks .gitmodules out into the work tree; write it
    +        # directly so submodule_update can read it without a full checkout.
    +        with open(os.path.join(self.repo.path, ".gitmodules"), "wb") as f:
    +            f.write(gitmodules)
    +        gmb = Blob.from_string(gitmodules)
    +        self.repo.object_store.add_object(gmb)
    +        vt = Tree()
    +        vt.add(b".gitmodules", 0o100644, gmb.id)
    +        vt.add(submodule_path, 0o160000, commit.id)
    +        self.repo.object_store.add_object(vt)
    +        vc = Commit()
    +        vc.tree = vt.id
    +        vc.author = vc.committer = b"a <a@a>"
    +        vc.author_time = vc.commit_time = 0
    +        vc.author_timezone = vc.commit_timezone = 0
    +        vc.message = b"parent"
    +        self.repo.object_store.add_object(vc)
    +        self.repo.refs[b"refs/heads/master"] = vc.id
    +        self.repo.refs.set_symbolic_ref(b"HEAD", b"refs/heads/master")
    +        return attacker_path
    +
    +    def test_update_rejects_dotgit_path(self) -> None:
    +        # A submodule path of .git/hooks would drop the attacker's tree
    +        # into the parent's .git/hooks directory (CVE-style RCE via hooks).
    +        self._build_malicious_submodule_repo(b".git/hooks")
    +        self.assertRaises(
    +            porcelain.Error,
    +            porcelain.submodule_update,
    +            self.repo,
    +            init=True,
    +        )
    +        hook = os.path.join(self.repo.path, ".git", "hooks", "post-checkout")
    +        self.assertFalse(os.path.exists(hook))
    +
    +    def test_update_rejects_parent_traversal_path(self) -> None:
    +        self._build_malicious_submodule_repo(b"../escape")
    +        self.assertRaises(
    +            porcelain.Error,
    +            porcelain.submodule_update,
    +            self.repo,
    +            init=True,
    +        )
    +
    +    def test_check_submodule_path(self) -> None:
    +        from dulwich.index import (
    +            validate_path_element_default,
    +            validate_path_element_ntfs,
    +        )
    +        from dulwich.porcelain.submodule import _check_submodule_path
    +
    +        # .git and .. components are rejected on every platform.
    +        for bad in (b".git/hooks", b"..", b"a/../b", b"/abs", b".git"):
    +            self.assertRaises(
    +                porcelain.Error, _check_submodule_path, bad, validate_path_element_ntfs
    +            )
    +
    +        # A path that is only unsafe on NTFS (a reserved device name) is
    +        # refused under the default protectNTFS validator but, like git with
    +        # core.protectNTFS=false, accepted by the default validator so an
    +        # existing POSIX repository can still be updated.
    +        self.assertRaises(
    +            porcelain.Error, _check_submodule_path, b"aux", validate_path_element_ntfs
    +        )
    +        _check_submodule_path(b"aux", validate_path_element_default)
    +
    +        # Ordinary nested paths pass under either validator.
    +        _check_submodule_path(b"libs/foo", validate_path_element_ntfs)
    +
     
     class PushTests(PorcelainTestCase):
         def test_simple(self) -> None:
    

