Medium severity 4.7 · NVD Advisory · Published Apr 1, 2026 · Updated Apr 15, 2026
CVE-2026-34446
Description
Open Neural Network Exchange (ONNX) is an open standard for machine learning interoperability. Prior to version 1.21.0, onnx.load checks external data paths for symlinks to prevent path traversal but misses hardlinks, because a hardlink is indistinguishable from a regular file unless its link count is inspected. This issue is patched in version 1.21.0.
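To make the gap concrete, the following sketch creates a hardlink and a symlink to the same file and reports what a symlink-only check would see for each. It is not taken from the ONNX code base: `describe_link` and `demo` are hypothetical helper names, and the sketch assumes a POSIX filesystem that supports both link types.

```python
import os
import tempfile


def describe_link(path: str) -> dict:
    """Report what a symlink check and a link-count check see for a path."""
    st = os.lstat(path)  # lstat does not follow a final-component symlink
    return {
        "islink": os.path.islink(path),
        "regular_file": os.path.isfile(path),
        "nlink": st.st_nlink,
    }


def demo() -> dict:
    """Create a file plus a hardlink and a symlink to it, then describe both."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "secret.txt")
        with open(target, "w") as f:
            f.write("SENSITIVE DATA")

        hard = os.path.join(tmp, "hard.bin")
        soft = os.path.join(tmp, "soft.bin")
        os.link(target, hard)     # second directory entry for the same inode
        os.symlink(target, soft)  # separate inode that stores a path

        return {"hard": describe_link(hard), "soft": describe_link(soft)}


if __name__ == "__main__":
    for name, info in demo().items():
        print(name, info)
```

On a typical Linux filesystem the symlink is flagged by `os.path.islink`, while the hardlink is not and only stands out through its link count of 2.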
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| onnx (PyPI) | < 1.21.0 | 1.21.0 |
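One way to apply the affected range to an installed environment is to compare version strings numerically. The sketch below is an illustration only: `is_affected` is a hypothetical helper, and it deliberately ignores pre-release suffixes such as `rc1`.

```python
import re


def is_affected(version: str, patched: tuple = (1, 21, 0)) -> bool:
    """Return True if `version` is numerically older than the patched release."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. "2.0") is treated as 0.
    parsed = tuple(int(part or 0) for part in match.groups())
    return parsed < patched
```

The installed version string could be obtained with `importlib.metadata.version("onnx")`. Comparing integer tuples rather than raw strings avoids treating 1.9.0 as newer than 1.21.0.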
Affected products (1)
Patches (1)
4755f8053928 · Improve external data file handling - onnx.load (#7717)
6 files changed · +407 −46
docs/Security.md +95 −0 added
@@ -0,0 +1,95 @@
+<!--
+Copyright (c) ONNX Project Contributors
+
+SPDX-License-Identifier: Apache-2.0
+-->
+
+# External Data Security
+
+This document describes the security model for loading and saving external data files in ONNX models. It is intended for maintainers working on the external data code paths.
+
+## Threat Model
+
+When an ONNX model references external data files via relative paths, an attacker who controls the model file can attempt:
+
+- **Symlink traversal**: A final-component symlink in the external data path pointing to a sensitive file (e.g., `/etc/shadow`), causing ONNX to read or overwrite arbitrary files.
+- **Parent-directory symlink**: A symlink in a parent directory component of the external data path, bypassing a check that only inspects the final component.
+- **Hardlink attacks**: A hardlink to a sensitive file appearing as a normal file, bypassing symlink-only checks while still exposing unintended data.
+- **Path traversal**: Using `..` segments or absolute paths to escape the model directory.
+
+## Defense Layers
+
+We use a 4-layer defense-in-depth approach. Each layer is applied at every entry point that opens external data files.
+
+### Layer 1: Canonical Path Containment
+
+- **C++**: `std::filesystem::weakly_canonical()` resolves the path, then verifies it starts with the canonical base directory.
+- **Python**: `os.path.realpath()` resolves all symlinks in the full path, then verifies the result is within the model base directory.
+
+This catches `..` traversal and symlinks in any path component (not just the final one).
+
+### Layer 2: Symlink Detection
+
+- **C++**: `std::filesystem::is_symlink(data_path)` rejects the final-component symlink.
+- **Python**: `os.path.islink(path)` rejects the final-component symlink.
+
+This is a belt-and-suspenders check alongside containment. It provides a clear, specific error message when the final path component is a symlink.
+
+### Layer 3: O_NOFOLLOW on File Open (Python only)
+
+- **Python**: `os.O_NOFOLLOW` added to `os.open()` flags where available (`hasattr(os, "O_NOFOLLOW")`).
+
+The C++ checker validates paths but does not open files, so `O_NOFOLLOW` is not applicable there. In Python, this is the last-resort defense: even if a symlink is created between the check and the open (TOCTOU race), the kernel rejects the open with `ELOOP` on Linux/macOS.
+
+### Layer 4: Hardlink Count Check
+
+- **C++**: `std::filesystem::hard_link_count(data_path) > 1` rejects files with multiple hardlinks.
+- **Python**: `os.stat(path).st_nlink > 1` rejects files with multiple hardlinks.
+
+This prevents an attacker from using a hardlink (which is not a symlink) to point external data at a sensitive file. Note that `O_NOFOLLOW` does **not** protect against hardlinks — only this explicit check does.
+
+## Protected Entry Points
+
+Not all layers apply at every entry point. The C++ checker validates paths but does not open files, so Layer 3 (O_NOFOLLOW) is Python-only.
+
+| Entry Point | File | Layers |
+|---|---|---|
+| `_resolve_external_data_location` | `onnx/checker.cc` | 1, 2, 4 |
+| `load_external_data_for_tensor` | `onnx/external_data_helper.py` | 1, 2, 3, 4 |
+| `save_external_data` | `onnx/external_data_helper.py` | 1, 2, 3, 4 |
+| `ModelContainer._load_large_initializers` | `onnx/model_container.py` | 1, 2, 3, 4 |
+
+The C++ checker runs first for all Python load paths (via `c_checker._resolve_external_data_location`). The Python checks serve as defense-in-depth.
+
+## Known Limitations
+
+### TOCTOU (Time-of-Check-to-Time-of-Use)
+
+There is an inherent race window between the security checks (Layers 1-2, 4) and the file open (Layer 3). An attacker with write access to the model directory could:
+
+1. Place a legitimate file to pass checks.
+2. Replace it with a symlink or hardlink between the check and the open.
+
+**Mitigation**: `O_NOFOLLOW` (Layer 3) catches late symlink replacement on Linux/macOS at the kernel level. However, `O_NOFOLLOW` does **not** protect against hardlink replacement — this TOCTOU gap cannot be fully closed at the application level.
+
+### Windows
+
+- `O_NOFOLLOW` is **not available** on Windows (`hasattr(os, "O_NOFOLLOW")` returns `False`). The TOCTOU window for symlink attacks is fully open on Windows, relying solely on Layers 1-2.
+- Symlink and hardlink tests are skipped on Windows in the test suite.
+
+### Case-Insensitive Filesystems
+
+The canonical path containment check uses string comparison. On case-insensitive filesystems (Windows NTFS, macOS HFS+), paths with different casing may incorrectly fail containment. This fails closed (false rejection, not a bypass).
+
+## Testing
+
+Test coverage is in:
+
+- **C++**: `onnx/test/cpp/checker_test.cc` — `SymLink*` tests for symlink detection and containment.
+- **Python**: `onnx/test/test_external_data.py`:
+  - `TestSaveExternalDataSymlinkProtection` — save-side symlink rejection.
+  - `TestLoadExternalDataSymlinkProtection` — load-side symlink rejection, parent-directory symlink, `load_external_data_for_model` rejection.
+  - `TestLoadExternalDataHardlinkProtection` — load-side hardlink rejection.
+  - `TestSaveExternalDataAbsolutePathValidation` — absolute path rejection.
+
+Symlink and hardlink tests are skipped on Windows (`os.name == "nt"`).
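The Layer 3 behaviour described in docs/Security.md can be observed with a short experiment. This is a minimal sketch rather than the patched ONNX code: `open_nofollow` and `demo` are hypothetical names, and the result assumes a platform where `os.O_NOFOLLOW` exists (Linux or macOS).

```python
import errno
import os
import tempfile


def open_nofollow(path: str) -> int:
    """Open `path` read-only, refusing a final-component symlink where supported."""
    flags = os.O_RDONLY
    if hasattr(os, "O_NOFOLLOW"):
        flags |= os.O_NOFOLLOW
    return os.open(path, flags)


def try_open(path: str) -> str:
    """Return "opened" on success, otherwise the symbolic errno name."""
    try:
        fd = open_nofollow(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "opened"


def demo() -> dict:
    """Try the guarded open on a regular file and on a symlink to it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.bin")
        with open(target, "wb") as f:
            f.write(b"data")
        link = os.path.join(tmp, "link.bin")
        os.symlink(target, link)
        return {"regular": try_open(target), "symlink": try_open(link)}
```

The regular file opens normally, while the symlink is refused by the kernel at open time. A hardlink would open without error, which is why the link-count check is a separate layer.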
onnx/checker.cc +39 −0 modified
@@ -1021,6 +1021,45 @@ std::string resolve_external_data_location(
         data_path_str,
         ", but it is a symbolic link.");
   }
+  // Verify the resolved path stays within the base directory to prevent
+  // path traversal via symlinks in parent directory components.
+  // is_symlink() only checks the final component; a path like
+  // "symlink_subdir/real_file.data" would bypass it.
+  if (data_path_str[0] != '#') {
+    std::error_code ec;
+    auto canonical_base = std::filesystem::weakly_canonical(base_dir_path, ec);
+    if (ec) {
+      fail_check(
+          "Data of TensorProto ( tensor name: ",
+          tensor_name,
+          ") references external data at ",
+          data_path_str,
+          ", but the model directory path could not be resolved.");
+    }
+    auto canonical_data = std::filesystem::weakly_canonical(data_path, ec);
+    if (ec) {
+      fail_check(
+          "Data of TensorProto ( tensor name: ",
+          tensor_name,
+          ") references external data at ",
+          data_path_str,
+          ", but the data path could not be resolved.");
+    }
+    auto canonical_base_native = canonical_base.native();
+    auto canonical_data_native = canonical_data.native();
+    if (!canonical_base_native.empty() && canonical_base_native.back() != std::filesystem::path::preferred_separator) {
+      canonical_base_native += std::filesystem::path::preferred_separator;
+    }
+    if (canonical_data_native.find(canonical_base_native) != 0) {
+      fail_check(
+          "Data of TensorProto ( tensor name: ",
+          tensor_name,
+          ") at ",
+          data_path_str,
+          " resolves to a location outside the model directory, "
+          "indicating a potential path traversal attack via symbolic links in directory components.");
+    }
+  }
   if (data_path_str[0] != '#' && !std::filesystem::is_regular_file(data_path)) {
     fail_check(
         "Data of TensorProto ( tensor name: ",
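The parent-directory case that the checker.cc comment describes can be reproduced in Python. The sketch below is an analogue of the C++ check and not a translation of it: `check_path` and `demo` are hypothetical names, and `os.path.realpath` stands in for `std::filesystem::weakly_canonical`.

```python
import os
import tempfile


def check_path(base_dir: str, location: str) -> dict:
    """Compare a final-component symlink check with canonical containment."""
    path = os.path.join(base_dir, location)
    real_base = os.path.realpath(base_dir)
    real_path = os.path.realpath(path)  # resolves symlinks in every component
    contained = real_path == real_base or real_path.startswith(real_base + os.sep)
    return {"final_is_symlink": os.path.islink(path), "contained": contained}


def demo() -> dict:
    """Build model_dir/subdir as a symlink to a directory outside model_dir."""
    with tempfile.TemporaryDirectory() as tmp:
        outside = os.path.join(tmp, "outside")
        model_dir = os.path.join(tmp, "model_dir")
        os.makedirs(outside)
        os.makedirs(model_dir)
        with open(os.path.join(outside, "secret.data"), "wb") as f:
            f.write(b"sensitive data")
        os.symlink(outside, os.path.join(model_dir, "subdir"))
        return check_path(model_dir, os.path.join("subdir", "secret.data"))
```

Here the final component is an ordinary file, so the symlink check alone passes, and only the containment check on the resolved path rejects the location.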
onnx/external_data_helper.py +60 −16 modified
@@ -43,6 +43,53 @@ def __init__(self, tensor: TensorProto) -> None:
             self.length = int(self.length)


+def _validate_external_data_path(
+    base_dir: str,
+    data_path: str,
+    tensor_name: str,
+    *,
+    check_exists: bool = True,
+) -> str:
+    """Validate that an external data path is safe to open.
+
+    Performs three security checks:
+    1. Canonical path containment — resolved path must stay within base_dir.
+    2. Symlink rejection — final-component symlinks are not allowed.
+    3. Hardlink count — files with multiple hard links are rejected.
+
+    Args:
+        base_dir: The model base directory that data_path must be contained in.
+        data_path: The external data file path to validate.
+        tensor_name: Tensor name for error messages.
+        check_exists: If True (default), check hardlink count. Set to False
+            for save-side paths where the file may not exist yet.
+
+    Returns:
+        The validated data_path (unchanged).
+
+    Raises:
+        onnx.checker.ValidationError: If any security check fails.
+    """
+    real_base = os.path.realpath(base_dir)
+    real_path = os.path.realpath(data_path)
+    if not real_path.startswith(real_base + os.sep) and real_path != real_base:
+        raise onnx_checker.ValidationError(
+            f"Tensor {tensor_name!r} external data path resolves to "
+            f"{real_path!r} which is outside the model directory {real_base!r}."
+        )
+    if os.path.islink(data_path):
+        raise onnx_checker.ValidationError(
+            f"Tensor {tensor_name!r} external data path {data_path!r} "
+            f"is a symbolic link, which is not allowed for security reasons."
+        )
+    if check_exists and os.path.exists(data_path) and os.stat(data_path).st_nlink > 1:
+        raise onnx_checker.ValidationError(
+            f"Tensor {tensor_name!r} external data path {data_path!r} "
+            f"has multiple hard links, which is not allowed for security reasons."
+        )
+    return data_path
+
+
 def load_external_data_for_tensor(tensor: TensorProto, base_dir: str) -> None:
     """Loads data from an external file for tensor.
     Ideally TensorProto should not hold any raw data but if it does it will be ignored.
@@ -55,7 +102,14 @@ def load_external_data_for_tensor(tensor: TensorProto, base_dir: str) -> None:
     external_data_file_path = c_checker._resolve_external_data_location(  # type: ignore[attr-defined]
         base_dir, info.location, tensor.name
     )
-    with open(external_data_file_path, "rb") as data_file:
+    # Security checks (symlink, containment, hardlink) already performed
+    # by C++ _resolve_external_data_location() above.
+    # Use O_NOFOLLOW where available as defense-in-depth for symlink protection
+    open_flags = os.O_RDONLY
+    if hasattr(os, "O_NOFOLLOW"):
+        open_flags |= os.O_NOFOLLOW
+    fd = os.open(external_data_file_path, open_flags)
+    with os.fdopen(fd, "rb") as data_file:
         if info.offset:
             data_file.seek(info.offset)

@@ -219,21 +273,11 @@ def save_external_data(tensor: TensorProto, base_path: str) -> None:

     external_data_file_path = os.path.join(base_path, info.location)

-    # Verify the resolved path stays within base_path (prevent symlink-based path traversal)
-    real_base = os.path.realpath(base_path)
-    real_path = os.path.realpath(external_data_file_path)
-    if not real_path.startswith(real_base + os.sep) and real_path != real_base:
-        raise onnx_checker.ValidationError(
-            f"Tensor {tensor.name!r} external data path resolves to "
-            f"{real_path!r} which is outside the model directory {real_base!r}."
-        )
-
-    # Reject symlinks to prevent arbitrary file overwrites
-    if os.path.islink(external_data_file_path):
-        raise onnx_checker.ValidationError(
-            f"Tensor {tensor.name!r} external data path {external_data_file_path!r} "
-            f"is a symbolic link, which is not allowed for security reasons."
-        )
+    # C++ _resolve_external_data_location() cannot be used on save path
+    # (file may not exist yet), so Python performs its own security validation.
+    _validate_external_data_path(
+        base_path, external_data_file_path, tensor.name, check_exists=True
+    )

     # Retrieve the tensor's data from raw_data or load external file
     if not tensor.HasField("raw_data"):
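Both the Python check (`real_base + os.sep`) and the C++ check (appending `preferred_separator`) add a trailing separator before the prefix comparison. The sketch below illustrates why, using hypothetical helper names and made-up directory names. It operates on strings that are assumed to be canonicalized already.

```python
import os


def naive_contains(real_base: str, real_path: str) -> bool:
    """Prefix test without a separator: also matches sibling directories."""
    return real_path.startswith(real_base)


def contains(real_base: str, real_path: str) -> bool:
    """Prefix test with a trailing separator, as in the patched check."""
    return real_path == real_base or real_path.startswith(real_base + os.sep)


# Hypothetical, already-canonical paths used only for illustration.
base = os.path.join(os.sep, "srv", "models")
inside = os.path.join(base, "weights.bin")
sibling = os.path.join(os.sep, "srv", "models_private", "key.bin")
```

With these inputs, `naive_contains(base, sibling)` accepts the sibling directory because the string `models_private` begins with `models`, whereas `contains(base, sibling)` rejects it and still accepts `inside`.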
onnx/model_container.py +8 −1 modified
@@ -293,10 +293,17 @@ def _load_large_initializers(self, file_path):
             external_data_file_path = c_checker._resolve_external_data_location(  # type: ignore[attr-defined]
                 base_dir, info.location, tensor.name
             )
+            # Security checks (symlink, containment, hardlink) already performed
+            # by C++ _resolve_external_data_location() above.
             key = f"#t{i}"
             _set_external_data(tensor, location=key)

-            with open(external_data_file_path, "rb") as data_file:
+            # Use O_NOFOLLOW where available for symlink protection
+            open_flags = os.O_RDONLY
+            if hasattr(os, "O_NOFOLLOW"):
+                open_flags |= os.O_NOFOLLOW
+            fd = os.open(external_data_file_path, open_flags)
+            with os.fdopen(fd, "rb") as data_file:
                 if info.offset:
                     data_file.seek(info.offset)

onnx/test/cpp/checker_test.cc +61 −13 modified
@@ -3,6 +3,7 @@
 // SPDX-License-Identifier: Apache-2.0

 #include <filesystem>
+#include <fstream>
 #include <memory>
 #include <string>

@@ -32,23 +33,70 @@ TEST(CHECKER, ValidDataLocationTest) {
 }

 TEST(CHECKER, ValidDataLocationSymLinkTest) {
-#ifndef ONNX_NO_EXCEPTIONS
-  fs::path tempDir = fs::temp_directory_path() / "symlink_test-%%%%%%"; // NOSONAR
-  fs::create_directories(tempDir);
-  fs::path target = tempDir / "model.data";
-  fs::path link = tempDir / "link.data";
+#if !defined(ONNX_NO_EXCEPTIONS) && !defined(_WIN32)
+  // Use a temp directory as the base_dir (simulating the model directory).
+  // We pass a relative filename as the location so that the absolute-path
+  // rejection (checker.cc line 986) is NOT triggered, and the is_symlink()
+  // check (checker.cc line 1016) is actually exercised.
+  fs::path modelDir = fs::temp_directory_path() / "onnx_symlink_checker_test";
+  fs::remove_all(modelDir);
+  fs::create_directories(modelDir);
+
+  // Create a regular target file so the symlink has a valid target.
+  fs::path target = modelDir / "target.data";
+  {
+    std::ofstream ofs(target);
+    ofs << "test data";
+  }
+
+  // Create a symlink pointing to the target file.
+  fs::path link = modelDir / "link.data";
   fs::create_symlink(target, link);
-#ifdef WIN32
-  std::string location = link.u8string();
-#else
-  std::string location = link.c_str();
+
+  // Pass relative filename "link.data" — the checker resolves it to
+  // modelDir/link.data and should reject it because it is a symlink.
+  EXPECT_THROW(
+      ONNX_NAMESPACE::checker::resolve_external_data_location(modelDir.string(), "link.data", "tensor_name"),
+      ONNX_NAMESPACE::checker::ValidationError);
+
+  fs::remove_all(modelDir);
 #endif
+}
+
+TEST(CHECKER, ValidDataLocationParentDirSymLinkTest) {
+#if !defined(ONNX_NO_EXCEPTIONS) && !defined(_WIN32)
+  // Test that symlinks in parent directory components are detected.
+  // A location like "symlink_subdir/real_file.data" where symlink_subdir
+  // is a symlink to an outside directory should be rejected by the
+  // canonical path containment check in checker.cc.
+  fs::path modelDir = fs::temp_directory_path() / "onnx_parent_symlink_test";
+  fs::remove_all(modelDir);
+  fs::create_directories(modelDir);
+
+  // Create a target directory outside the model directory.
+  fs::path outsideDir = fs::temp_directory_path() / "onnx_outside_target";
+  fs::remove_all(outsideDir);
+  fs::create_directories(outsideDir);
+
+  // Create a real file in the outside directory.
+  fs::path targetFile = outsideDir / "secret.data";
+  {
+    std::ofstream ofs(targetFile);
+    ofs << "sensitive data";
+  }
+
+  // Create a directory symlink inside modelDir pointing outside.
+  fs::path symlinkSubdir = modelDir / "subdir";
+  fs::create_directory_symlink(outsideDir, symlinkSubdir);
+
+  // "subdir/secret.data" is a relative path where "subdir" is a symlink.
+  // The canonical path resolves outside modelDir, so this should be rejected.
   EXPECT_THROW(
-      ONNX_NAMESPACE::checker::resolve_external_data_location("localfolder", location, "tensor_name"),
+      ONNX_NAMESPACE::checker::resolve_external_data_location(modelDir.string(), "subdir/secret.data", "tensor_name"),
       ONNX_NAMESPACE::checker::ValidationError);
-  fs::remove(link);
-  fs::remove(target);
-  fs::remove(tempDir);
+
+  fs::remove_all(modelDir);
+  fs::remove_all(outsideDir);
 #endif
 }
onnx/test/test_external_data.py +144 −16 modified
@@ -6,6 +6,7 @@
 import itertools
 import os
 import pathlib
+import shutil
 import tempfile
 import unittest
 import uuid
@@ -885,6 +886,21 @@ def test_subgraph(self) -> None:
         self._check(model, constant_nodes)


+def _make_external_data_test_model() -> tuple[ModelProto, np.ndarray]:
+    """Create a simple model with a large initializer suitable for external data tests."""
+    model = parser.parse_model(
+        """
+        <ir_version: 7, opset_import: ["": 17]>
+        agraph (float[100, 100] input) => (float[100, 100] output) {
+            output = Identity(input)
+        }
+        """
+    )
+    array = np.ones((100, 100), dtype=np.float32)
+    model.graph.initializer.append(from_array(array, name="weight"))
+    return model, array
+
+
 @unittest.skipIf(
     os.name == "nt", reason="Symlinks require elevated privileges on Windows"
 )
@@ -897,22 +913,7 @@ def test_save_rejects_symlink_target(self) -> None:
         with open(sensitive_file, "w") as f:
             f.write("SENSITIVE DATA")

-        # Create a model with external data
-        array = np.ones((100, 100), dtype=np.float32)
-        tensor = from_array(array, name="weight")
-        model = helper.make_model(
-            helper.make_graph(
-                [helper.make_node("Identity", ["input"], ["output"])],
-                "test",
-                [helper.make_tensor_value_info("input", TensorProto.FLOAT, [100, 100])],
-                [
-                    helper.make_tensor_value_info(
-                        "output", TensorProto.FLOAT, [100, 100]
-                    )
-                ],
-                [tensor],
-            )
-        )
+        model, array = _make_external_data_test_model()
         model_path = os.path.join(self.temp_dir, "model.onnx")
         ext_data = "data.bin"
         onnx.save_model(
@@ -947,6 +948,133 @@ def test_save_rejects_symlink_target(self) -> None:
             self.assertEqual(f.read(), "SENSITIVE DATA")


+@unittest.skipIf(
+    os.name == "nt", reason="Symlinks require elevated privileges on Windows"
+)
+class TestLoadExternalDataSymlinkProtection(TestLoadExternalDataBase):
+    """Test that loading external data rejects symlinks to prevent arbitrary file reads."""
+
+    def test_load_rejects_symlink_external_data(self) -> None:
+        """Loading a model whose external data is a symlink must raise ValidationError."""
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(self.temp_dir, "model.onnx")
+        ext_data = "data.bin"
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location=ext_data,
+            size_threshold=1024,
+        )
+
+        # Create a target file and replace external data with a symlink to it
+        target_file = os.path.join(self.temp_dir, "target.txt")
+        with open(target_file, "w") as f:
+            f.write("SENSITIVE DATA")
+
+        ext_data_path = os.path.join(self.temp_dir, ext_data)
+        os.remove(ext_data_path)
+        os.symlink(target_file, ext_data_path)
+
+        # Loading with onnx.load (which loads external data) must fail
+        with self.assertRaises(checker.ValidationError):
+            onnx.load(model_path)
+
+    def test_load_external_data_for_model_rejects_symlink(self) -> None:
+        """load_external_data_for_model must reject symlinked external data."""
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(self.temp_dir, "model.onnx")
+        ext_data = "data.bin"
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location=ext_data,
+            size_threshold=1024,
+        )
+
+        # Replace external data with a symlink
+        target_file = os.path.join(self.temp_dir, "target.txt")
+        with open(target_file, "w") as f:
+            f.write("SENSITIVE DATA")
+
+        ext_data_path = os.path.join(self.temp_dir, ext_data)
+        os.remove(ext_data_path)
+        os.symlink(target_file, ext_data_path)
+
+        # Load model without external data, then try to load external data explicitly
+        loaded_model = onnx.load(model_path, load_external_data=False)
+        with self.assertRaises(checker.ValidationError):
+            load_external_data_for_model(loaded_model, self.temp_dir)
+
+    def test_load_rejects_parent_directory_symlink(self) -> None:
+        """A symlink in the parent directory must be caught by realpath containment."""
+        # Create a "sensitive" directory outside the model directory with a data file
+        sensitive_dir = os.path.join(self.temp_dir, "sensitive")
+        os.makedirs(sensitive_dir)
+        secret_file = os.path.join(sensitive_dir, "secret.bin")
+        with open(secret_file, "wb") as f:
+            f.write(b"SENSITIVE DATA" * 100)
+
+        # Create a model directory with a real subdir for saving
+        model_dir = os.path.join(self.temp_dir, "model_dir")
+        os.makedirs(model_dir)
+        subdir_path = os.path.join(model_dir, "subdir")
+        os.makedirs(subdir_path)
+
+        # Create model with external data location "subdir/secret.bin"
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(model_dir, "model.onnx")
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location="subdir/secret.bin",
+            size_threshold=1024,
+        )
+
+        # Replace the real subdir with a symlink to the sensitive directory
+        shutil.rmtree(subdir_path)
+        os.symlink(sensitive_dir, subdir_path)
+
+        # Loading must fail because realpath resolves outside model_dir
+        loaded_model = onnx.load(model_path, load_external_data=False)
+        with self.assertRaises(checker.ValidationError):
+            load_external_data_for_model(loaded_model, model_dir)
+
+
+@unittest.skipIf(os.name == "nt", reason="Hardlinks behave differently on Windows")
+class TestLoadExternalDataHardlinkProtection(TestLoadExternalDataBase):
+    """Test that loading external data rejects files with multiple hardlinks."""
+
+    def test_load_rejects_hardlinked_external_data(self) -> None:
+        """Loading a model whose external data has multiple hardlinks must raise ValidationError."""
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(self.temp_dir, "model.onnx")
+        ext_data = "data.bin"
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location=ext_data,
+            size_threshold=1024,
+        )
+
+        # Create a hardlink to the external data file
+        ext_data_path = os.path.join(self.temp_dir, ext_data)
+        hardlink_path = os.path.join(self.temp_dir, "hardlink_data.bin")
+        os.link(ext_data_path, hardlink_path)
+
+        # Loading must fail because the external data file has multiple hardlinks.
+        # Either the C++ checker or Python code catches this as ValidationError.
+        with self.assertRaises(checker.ValidationError):
+            onnx.load(model_path)
+
+
 class TestSaveExternalDataAbsolutePathValidation(TestLoadExternalDataBase):
     """Test that save_external_data rejects absolute paths."""

References (4)
- github.com/onnx/onnx/commit/4755f8053928dce18a61db8fec71b69c74f786cb · nvd · Patch · WEB
- github.com/advisories/GHSA-cmw6-hcpp-c6jp · ghsa · ADVISORY
- github.com/onnx/onnx/security/advisories/GHSA-cmw6-hcpp-c6jp · nvd · Vendor Advisory · WEB
- nvd.nist.gov/vuln/detail/CVE-2026-34446 · ghsa · ADVISORY