High severity · 7.5 · NVD Advisory · Published Apr 1, 2026 · Updated Apr 7, 2026
CVE-2026-27489
Description
Open Neural Network Exchange (ONNX) is an open standard for machine learning interoperability. Prior to version 1.21.0, a symlink-based path traversal vulnerability allows an attacker to read arbitrary files outside the model directory or a user-provided directory. This issue has been patched in version 1.21.0.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| onnx (PyPI) | < 1.21.0 | 1.21.0 |
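To check a version string against the table above, something like the following may be enough. It is a minimal sketch: the helper names `parse_release`, `is_affected` and `installed_onnx_is_affected` are hypothetical, and the parser looks only at the leading numeric release segment, so a pre-release such as `1.21.0rc1` is treated as `1.21.0` even though it may not contain the fix.

```python
from __future__ import annotations

import re
from importlib import metadata

# First patched release according to the advisory table.
PATCHED = (1, 21, 0)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '1.20.1rc1' -> (1, 20, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad short versions so that '1.21' compares equal to '1.21.0'.
    while len(parts) < len(PATCHED):
        parts.append(0)
    return tuple(parts)


def is_affected(version: str) -> bool:
    """True if the release segment is older than the patched release."""
    return parse_release(version) < PATCHED


def installed_onnx_is_affected() -> bool | None:
    """Check the installed 'onnx' distribution; None if it is not installed."""
    try:
        return is_affected(metadata.version("onnx"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_onnx_is_affected()` in the environment that loads untrusted models gives a quick yes/no answer; a proper version parser is preferable if pre-release builds matter.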
Affected products (1)
Patches (1)
4755f8053928 Improve external data file handling - onnx.load (#7717)
6 files changed · +407 −46
docs/Security.md · +95 −0 · added
@@ -0,0 +1,95 @@
+<!--
+Copyright (c) ONNX Project Contributors
+
+SPDX-License-Identifier: Apache-2.0
+-->
+
+# External Data Security
+
+This document describes the security model for loading and saving external data files in ONNX models. It is intended for maintainers working on the external data code paths.
+
+## Threat Model
+
+When an ONNX model references external data files via relative paths, an attacker who controls the model file can attempt:
+
+- **Symlink traversal**: A final-component symlink in the external data path pointing to a sensitive file (e.g., `/etc/shadow`), causing ONNX to read or overwrite arbitrary files.
+- **Parent-directory symlink**: A symlink in a parent directory component of the external data path, bypassing a check that only inspects the final component.
+- **Hardlink attacks**: A hardlink to a sensitive file appearing as a normal file, bypassing symlink-only checks while still exposing unintended data.
+- **Path traversal**: Using `..` segments or absolute paths to escape the model directory.
+
+## Defense Layers
+
+We use a 4-layer defense-in-depth approach. Each layer is applied at every entry point that opens external data files.
+
+### Layer 1: Canonical Path Containment
+
+- **C++**: `std::filesystem::weakly_canonical()` resolves the path, then verifies it starts with the canonical base directory.
+- **Python**: `os.path.realpath()` resolves all symlinks in the full path, then verifies the result is within the model base directory.
+
+This catches `..` traversal and symlinks in any path component (not just the final one).
+
+### Layer 2: Symlink Detection
+
+- **C++**: `std::filesystem::is_symlink(data_path)` rejects the final-component symlink.
+- **Python**: `os.path.islink(path)` rejects the final-component symlink.
+
+This is a belt-and-suspenders check alongside containment. It provides a clear, specific error message when the final path component is a symlink.
+
+### Layer 3: O_NOFOLLOW on File Open (Python only)
+
+- **Python**: `os.O_NOFOLLOW` added to `os.open()` flags where available (`hasattr(os, "O_NOFOLLOW")`).
+
+The C++ checker validates paths but does not open files, so `O_NOFOLLOW` is not applicable there. In Python, this is the last-resort defense: even if a symlink is created between the check and the open (TOCTOU race), the kernel rejects the open with `ELOOP` on Linux/macOS.
+
+### Layer 4: Hardlink Count Check
+
+- **C++**: `std::filesystem::hard_link_count(data_path) > 1` rejects files with multiple hardlinks.
+- **Python**: `os.stat(path).st_nlink > 1` rejects files with multiple hardlinks.
+
+This prevents an attacker from using a hardlink (which is not a symlink) to point external data at a sensitive file. Note that `O_NOFOLLOW` does **not** protect against hardlinks — only this explicit check does.
+
+## Protected Entry Points
+
+Not all layers apply at every entry point. The C++ checker validates paths but does not open files, so Layer 3 (O_NOFOLLOW) is Python-only.
+
+| Entry Point | File | Layers |
+|---|---|---|
+| `_resolve_external_data_location` | `onnx/checker.cc` | 1, 2, 4 |
+| `load_external_data_for_tensor` | `onnx/external_data_helper.py` | 1, 2, 3, 4 |
+| `save_external_data` | `onnx/external_data_helper.py` | 1, 2, 3, 4 |
+| `ModelContainer._load_large_initializers` | `onnx/model_container.py` | 1, 2, 3, 4 |
+
+The C++ checker runs first for all Python load paths (via `c_checker._resolve_external_data_location`). The Python checks serve as defense-in-depth.
+
+## Known Limitations
+
+### TOCTOU (Time-of-Check-to-Time-of-Use)
+
+There is an inherent race window between the security checks (Layers 1-2, 4) and the file open (Layer 3). An attacker with write access to the model directory could:
+
+1. Place a legitimate file to pass checks.
+2. Replace it with a symlink or hardlink between the check and the open.
+
+**Mitigation**: `O_NOFOLLOW` (Layer 3) catches late symlink replacement on Linux/macOS at the kernel level. However, `O_NOFOLLOW` does **not** protect against hardlink replacement — this TOCTOU gap cannot be fully closed at the application level.
+
+### Windows
+
+- `O_NOFOLLOW` is **not available** on Windows (`hasattr(os, "O_NOFOLLOW")` returns `False`). The TOCTOU window for symlink attacks is fully open on Windows, relying solely on Layers 1-2.
+- Symlink and hardlink tests are skipped on Windows in the test suite.
+
+### Case-Insensitive Filesystems
+
+The canonical path containment check uses string comparison. On case-insensitive filesystems (Windows NTFS, macOS HFS+), paths with different casing may incorrectly fail containment. This fails closed (false rejection, not a bypass).
+
+## Testing
+
+Test coverage is in:
+
+- **C++**: `onnx/test/cpp/checker_test.cc` — `SymLink*` tests for symlink detection and containment.
+- **Python**: `onnx/test/test_external_data.py`:
+  - `TestSaveExternalDataSymlinkProtection` — save-side symlink rejection.
+  - `TestLoadExternalDataSymlinkProtection` — load-side symlink rejection, parent-directory symlink, `load_external_data_for_model` rejection.
+  - `TestLoadExternalDataHardlinkProtection` — load-side hardlink rejection.
+  - `TestSaveExternalDataAbsolutePathValidation` — absolute path rejection.
+
+Symlink and hardlink tests are skipped on Windows (`os.name == "nt"`).
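The difference between Layer 1 and Layer 2 can be reproduced with the standard library alone. The sketch below assumes a POSIX system where unprivileged symlink creation works. The helper names `is_contained` and `demo` are hypothetical and are not part of ONNX. It builds a model directory whose `subdir` component is a symlink to an outside directory, then compares a final-component `os.path.islink` check with `os.path.realpath` containment.

```python
import os
import tempfile


def is_contained(base_dir: str, candidate: str) -> bool:
    """True if candidate resolves to base_dir itself or to a path below it."""
    real_base = os.path.realpath(base_dir)
    real_path = os.path.realpath(candidate)
    return real_path == real_base or real_path.startswith(real_base + os.sep)


def demo() -> dict:
    with tempfile.TemporaryDirectory() as root:
        model_dir = os.path.join(root, "model_dir")
        outside = os.path.join(root, "outside")
        os.makedirs(model_dir)
        os.makedirs(outside)
        with open(os.path.join(outside, "secret.bin"), "wb") as f:
            f.write(b"secret")

        # "subdir" is a directory symlink that points outside model_dir.
        os.symlink(outside, os.path.join(model_dir, "subdir"))
        data_path = os.path.join(model_dir, "subdir", "secret.bin")

        return {
            # The final component is a regular file, so this check passes.
            "final_component_is_symlink": os.path.islink(data_path),
            # Resolving every component shows the path escapes model_dir.
            "contained": is_contained(model_dir, data_path),
        }
```

Both values come back `False`: the final-component check sees nothing wrong, while the containment check reports the escape. This is the parent-directory symlink case from the threat model.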
onnx/checker.cc · +39 −0 · modified
@@ -1021,6 +1021,45 @@ std::string resolve_external_data_location(
         data_path_str,
         ", but it is a symbolic link.");
   }
+  // Verify the resolved path stays within the base directory to prevent
+  // path traversal via symlinks in parent directory components.
+  // is_symlink() only checks the final component; a path like
+  // "symlink_subdir/real_file.data" would bypass it.
+  if (data_path_str[0] != '#') {
+    std::error_code ec;
+    auto canonical_base = std::filesystem::weakly_canonical(base_dir_path, ec);
+    if (ec) {
+      fail_check(
+          "Data of TensorProto ( tensor name: ",
+          tensor_name,
+          ") references external data at ",
+          data_path_str,
+          ", but the model directory path could not be resolved.");
+    }
+    auto canonical_data = std::filesystem::weakly_canonical(data_path, ec);
+    if (ec) {
+      fail_check(
+          "Data of TensorProto ( tensor name: ",
+          tensor_name,
+          ") references external data at ",
+          data_path_str,
+          ", but the data path could not be resolved.");
+    }
+    auto canonical_base_native = canonical_base.native();
+    auto canonical_data_native = canonical_data.native();
+    if (!canonical_base_native.empty() && canonical_base_native.back() != std::filesystem::path::preferred_separator) {
+      canonical_base_native += std::filesystem::path::preferred_separator;
+    }
+    if (canonical_data_native.find(canonical_base_native) != 0) {
+      fail_check(
+          "Data of TensorProto ( tensor name: ",
+          tensor_name,
+          ") at ",
+          data_path_str,
+          " resolves to a location outside the model directory, "
+          "indicating a potential path traversal attack via symbolic links in directory components.");
+    }
+  }
   if (data_path_str[0] != '#' && !std::filesystem::is_regular_file(data_path)) {
     fail_check(
         "Data of TensorProto ( tensor name: ",
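The hunk above appends `preferred_separator` to the canonical base before the prefix comparison. A small Python sketch, kept in Python to match the other examples here, shows why that step matters. It assumes both strings are already canonical, and the function names and example paths are hypothetical.

```python
import os


def naive_prefix_check(base: str, path: str) -> bool:
    """Plain string prefix test; accepts sibling directories sharing a prefix."""
    return path.startswith(base)


def separator_prefix_check(base: str, path: str) -> bool:
    """Prefix test against the base with a trailing separator appended."""
    if not base.endswith(os.sep):
        base += os.sep
    return path.startswith(base)


# Example paths built with the platform separator.
base = os.sep.join(["", "models", "resnet"])
inside = os.sep.join(["", "models", "resnet", "weights.bin"])
sibling = os.sep.join(["", "models", "resnet_evil", "weights.bin"])

print(naive_prefix_check(base, sibling))      # sibling directory slips through
print(separator_prefix_check(base, sibling))  # rejected once the separator is appended
print(separator_prefix_check(base, inside))   # a genuine child is still accepted
```

Without the trailing separator, a directory named `resnet_evil` next to `resnet` would pass the containment test purely because the names share a prefix.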
onnx/external_data_helper.py · +60 −16 · modified
@@ -43,6 +43,53 @@ def __init__(self, tensor: TensorProto) -> None:
             self.length = int(self.length)
+def _validate_external_data_path(
+    base_dir: str,
+    data_path: str,
+    tensor_name: str,
+    *,
+    check_exists: bool = True,
+) -> str:
+    """Validate that an external data path is safe to open.
+
+    Performs three security checks:
+    1. Canonical path containment — resolved path must stay within base_dir.
+    2. Symlink rejection — final-component symlinks are not allowed.
+    3. Hardlink count — files with multiple hard links are rejected.
+
+    Args:
+        base_dir: The model base directory that data_path must be contained in.
+        data_path: The external data file path to validate.
+        tensor_name: Tensor name for error messages.
+        check_exists: If True (default), check hardlink count. Set to False
+            for save-side paths where the file may not exist yet.
+
+    Returns:
+        The validated data_path (unchanged).
+
+    Raises:
+        onnx.checker.ValidationError: If any security check fails.
+    """
+    real_base = os.path.realpath(base_dir)
+    real_path = os.path.realpath(data_path)
+    if not real_path.startswith(real_base + os.sep) and real_path != real_base:
+        raise onnx_checker.ValidationError(
+            f"Tensor {tensor_name!r} external data path resolves to "
+            f"{real_path!r} which is outside the model directory {real_base!r}."
+        )
+    if os.path.islink(data_path):
+        raise onnx_checker.ValidationError(
+            f"Tensor {tensor_name!r} external data path {data_path!r} "
+            f"is a symbolic link, which is not allowed for security reasons."
+        )
+    if check_exists and os.path.exists(data_path) and os.stat(data_path).st_nlink > 1:
+        raise onnx_checker.ValidationError(
+            f"Tensor {tensor_name!r} external data path {data_path!r} "
+            f"has multiple hard links, which is not allowed for security reasons."
+        )
+    return data_path
+
+
 def load_external_data_for_tensor(tensor: TensorProto, base_dir: str) -> None:
     """Loads data from an external file for tensor.
     Ideally TensorProto should not hold any raw data but if it does it will be ignored.
@@ -55,7 +102,14 @@ def load_external_data_for_tensor(tensor: TensorProto, base_dir: str) -> None:
     external_data_file_path = c_checker._resolve_external_data_location(  # type: ignore[attr-defined]
         base_dir, info.location, tensor.name
     )
-    with open(external_data_file_path, "rb") as data_file:
+    # Security checks (symlink, containment, hardlink) already performed
+    # by C++ _resolve_external_data_location() above.
+    # Use O_NOFOLLOW where available as defense-in-depth for symlink protection
+    open_flags = os.O_RDONLY
+    if hasattr(os, "O_NOFOLLOW"):
+        open_flags |= os.O_NOFOLLOW
+    fd = os.open(external_data_file_path, open_flags)
+    with os.fdopen(fd, "rb") as data_file:
         if info.offset:
             data_file.seek(info.offset)
@@ -219,21 +273,11 @@ def save_external_data(tensor: TensorProto, base_path: str) -> None:
     external_data_file_path = os.path.join(base_path, info.location)
-    # Verify the resolved path stays within base_path (prevent symlink-based path traversal)
-    real_base = os.path.realpath(base_path)
-    real_path = os.path.realpath(external_data_file_path)
-    if not real_path.startswith(real_base + os.sep) and real_path != real_base:
-        raise onnx_checker.ValidationError(
-            f"Tensor {tensor.name!r} external data path resolves to "
-            f"{real_path!r} which is outside the model directory {real_base!r}."
-        )
-
-    # Reject symlinks to prevent arbitrary file overwrites
-    if os.path.islink(external_data_file_path):
-        raise onnx_checker.ValidationError(
-            f"Tensor {tensor.name!r} external data path {external_data_file_path!r} "
-            f"is a symbolic link, which is not allowed for security reasons."
-        )
+    # C++ _resolve_external_data_location() cannot be used on save path
+    # (file may not exist yet), so Python performs its own security validation.
+    _validate_external_data_path(
+        base_path, external_data_file_path, tensor.name, check_exists=True
+    )
     # Retrieve the tensor's data from raw_data or load external file
     if not tensor.HasField("raw_data"):
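The hardlink branch of `_validate_external_data_path` relies on `st_nlink`. As a rough illustration of why a symlink check alone does not cover this case, the sketch below creates a hard link to a file and inspects both the link count and `os.path.islink`. It assumes a POSIX filesystem that supports hard links, and the helper names `has_extra_hardlinks` and `demo` are hypothetical.

```python
import os
import tempfile


def has_extra_hardlinks(path: str) -> bool:
    """True if more than one directory entry refers to the file's inode."""
    return os.stat(path).st_nlink > 1


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as root:
        data = os.path.join(root, "data.bin")
        alias = os.path.join(root, "alias.bin")
        with open(data, "wb") as f:
            f.write(b"weights")

        before = has_extra_hardlinks(data)  # single entry so far
        os.link(data, alias)                # second name for the same inode
        after = has_extra_hardlinks(data)   # link count is now 2

        # A hard link is an ordinary directory entry, not a symlink.
        alias_is_symlink = os.path.islink(alias)
        return before, after, alias_is_symlink
```

`demo()` returns `(False, True, False)`: the link count changes, while `os.path.islink` stays `False` for the alias, so only the `st_nlink` comparison notices the extra name.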
onnx/model_container.py · +8 −1 · modified
@@ -293,10 +293,17 @@ def _load_large_initializers(self, file_path):
             external_data_file_path = c_checker._resolve_external_data_location(  # type: ignore[attr-defined]
                 base_dir, info.location, tensor.name
             )
+            # Security checks (symlink, containment, hardlink) already performed
+            # by C++ _resolve_external_data_location() above.
             key = f"#t{i}"
             _set_external_data(tensor, location=key)
-            with open(external_data_file_path, "rb") as data_file:
+            # Use O_NOFOLLOW where available for symlink protection
+            open_flags = os.O_RDONLY
+            if hasattr(os, "O_NOFOLLOW"):
+                open_flags |= os.O_NOFOLLOW
+            fd = os.open(external_data_file_path, open_flags)
+            with os.fdopen(fd, "rb") as data_file:
                 if info.offset:
                     data_file.seek(info.offset)
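The `os.open` plus `os.fdopen` pattern in the two Python hunks can be tried in isolation. The following is a minimal sketch, assuming a platform where `os.O_NOFOLLOW` exists and symlinks can be created without elevated privileges. The names `open_no_follow` and `demo` are hypothetical and are not taken from the patch.

```python
from __future__ import annotations

import errno
import os
import tempfile


def open_no_follow(path: str):
    """Open path read-only, refusing a final-component symlink where supported."""
    flags = os.O_RDONLY
    if hasattr(os, "O_NOFOLLOW"):
        flags |= os.O_NOFOLLOW
    fd = os.open(path, flags)
    return os.fdopen(fd, "rb")


def demo() -> tuple[bytes, int | None]:
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "target.bin")
        link = os.path.join(root, "link.bin")
        with open(target, "wb") as f:
            f.write(b"payload")
        os.symlink(target, link)

        # A regular file opens normally.
        with open_no_follow(target) as f:
            regular = f.read()

        # A symlink in the final component is refused by the kernel.
        try:
            with open_no_follow(link) as f:
                f.read()
            err = None
        except OSError as exc:
            err = exc.errno
        return regular, err
```

On Linux and macOS the second open is expected to fail with `errno.ELOOP`, as the security document in this patch describes. Other platforms may report a different error code, and where `O_NOFOLLOW` is missing the symlink is simply followed.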
onnx/test/cpp/checker_test.cc · +61 −13 · modified
@@ -3,6 +3,7 @@
 // SPDX-License-Identifier: Apache-2.0
 #include <filesystem>
+#include <fstream>
 #include <memory>
 #include <string>
@@ -32,23 +33,70 @@ TEST(CHECKER, ValidDataLocationTest) {
 }
 TEST(CHECKER, ValidDataLocationSymLinkTest) {
-#ifndef ONNX_NO_EXCEPTIONS
-  fs::path tempDir = fs::temp_directory_path() / "symlink_test-%%%%%%"; // NOSONAR
-  fs::create_directories(tempDir);
-  fs::path target = tempDir / "model.data";
-  fs::path link = tempDir / "link.data";
+#if !defined(ONNX_NO_EXCEPTIONS) && !defined(_WIN32)
+  // Use a temp directory as the base_dir (simulating the model directory).
+  // We pass a relative filename as the location so that the absolute-path
+  // rejection (checker.cc line 986) is NOT triggered, and the is_symlink()
+  // check (checker.cc line 1016) is actually exercised.
+  fs::path modelDir = fs::temp_directory_path() / "onnx_symlink_checker_test";
+  fs::remove_all(modelDir);
+  fs::create_directories(modelDir);
+
+  // Create a regular target file so the symlink has a valid target.
+  fs::path target = modelDir / "target.data";
+  {
+    std::ofstream ofs(target);
+    ofs << "test data";
+  }
+
+  // Create a symlink pointing to the target file.
+  fs::path link = modelDir / "link.data";
   fs::create_symlink(target, link);
-#ifdef WIN32
-  std::string location = link.u8string();
-#else
-  std::string location = link.c_str();
+
+  // Pass relative filename "link.data" — the checker resolves it to
+  // modelDir/link.data and should reject it because it is a symlink.
+  EXPECT_THROW(
+      ONNX_NAMESPACE::checker::resolve_external_data_location(modelDir.string(), "link.data", "tensor_name"),
+      ONNX_NAMESPACE::checker::ValidationError);
+
+  fs::remove_all(modelDir);
 #endif
+}
+
+TEST(CHECKER, ValidDataLocationParentDirSymLinkTest) {
+#if !defined(ONNX_NO_EXCEPTIONS) && !defined(_WIN32)
+  // Test that symlinks in parent directory components are detected.
+  // A location like "symlink_subdir/real_file.data" where symlink_subdir
+  // is a symlink to an outside directory should be rejected by the
+  // canonical path containment check in checker.cc.
+  fs::path modelDir = fs::temp_directory_path() / "onnx_parent_symlink_test";
+  fs::remove_all(modelDir);
+  fs::create_directories(modelDir);
+
+  // Create a target directory outside the model directory.
+  fs::path outsideDir = fs::temp_directory_path() / "onnx_outside_target";
+  fs::remove_all(outsideDir);
+  fs::create_directories(outsideDir);
+
+  // Create a real file in the outside directory.
+  fs::path targetFile = outsideDir / "secret.data";
+  {
+    std::ofstream ofs(targetFile);
+    ofs << "sensitive data";
+  }
+
+  // Create a directory symlink inside modelDir pointing outside.
+  fs::path symlinkSubdir = modelDir / "subdir";
+  fs::create_directory_symlink(outsideDir, symlinkSubdir);
+
+  // "subdir/secret.data" is a relative path where "subdir" is a symlink.
+  // The canonical path resolves outside modelDir, so this should be rejected.
   EXPECT_THROW(
-      ONNX_NAMESPACE::checker::resolve_external_data_location("localfolder", location, "tensor_name"),
+      ONNX_NAMESPACE::checker::resolve_external_data_location(modelDir.string(), "subdir/secret.data", "tensor_name"),
       ONNX_NAMESPACE::checker::ValidationError);
-  fs::remove(link);
-  fs::remove(target);
-  fs::remove(tempDir);
+
+  fs::remove_all(modelDir);
+  fs::remove_all(outsideDir);
 #endif
 }
onnx/test/test_external_data.py · +144 −16 · modified
@@ -6,6 +6,7 @@
 import itertools
 import os
 import pathlib
+import shutil
 import tempfile
 import unittest
 import uuid
@@ -885,6 +886,21 @@ def test_subgraph(self) -> None:
         self._check(model, constant_nodes)
+def _make_external_data_test_model() -> tuple[ModelProto, np.ndarray]:
+    """Create a simple model with a large initializer suitable for external data tests."""
+    model = parser.parse_model(
+        """
+        <ir_version: 7, opset_import: ["": 17]>
+        agraph (float[100, 100] input) => (float[100, 100] output) {
+            output = Identity(input)
+        }
+        """
+    )
+    array = np.ones((100, 100), dtype=np.float32)
+    model.graph.initializer.append(from_array(array, name="weight"))
+    return model, array
+
+
 @unittest.skipIf(
     os.name == "nt", reason="Symlinks require elevated privileges on Windows"
 )
@@ -897,22 +913,7 @@ def test_save_rejects_symlink_target(self) -> None:
         with open(sensitive_file, "w") as f:
             f.write("SENSITIVE DATA")
-        # Create a model with external data
-        array = np.ones((100, 100), dtype=np.float32)
-        tensor = from_array(array, name="weight")
-        model = helper.make_model(
-            helper.make_graph(
-                [helper.make_node("Identity", ["input"], ["output"])],
-                "test",
-                [helper.make_tensor_value_info("input", TensorProto.FLOAT, [100, 100])],
-                [
-                    helper.make_tensor_value_info(
-                        "output", TensorProto.FLOAT, [100, 100]
-                    )
-                ],
-                [tensor],
-            )
-        )
+        model, array = _make_external_data_test_model()
         model_path = os.path.join(self.temp_dir, "model.onnx")
         ext_data = "data.bin"
         onnx.save_model(
@@ -947,6 +948,133 @@ def test_save_rejects_symlink_target(self) -> None:
             self.assertEqual(f.read(), "SENSITIVE DATA")
+@unittest.skipIf(
+    os.name == "nt", reason="Symlinks require elevated privileges on Windows"
+)
+class TestLoadExternalDataSymlinkProtection(TestLoadExternalDataBase):
+    """Test that loading external data rejects symlinks to prevent arbitrary file reads."""
+
+    def test_load_rejects_symlink_external_data(self) -> None:
+        """Loading a model whose external data is a symlink must raise ValidationError."""
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(self.temp_dir, "model.onnx")
+        ext_data = "data.bin"
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location=ext_data,
+            size_threshold=1024,
+        )
+
+        # Create a target file and replace external data with a symlink to it
+        target_file = os.path.join(self.temp_dir, "target.txt")
+        with open(target_file, "w") as f:
+            f.write("SENSITIVE DATA")
+
+        ext_data_path = os.path.join(self.temp_dir, ext_data)
+        os.remove(ext_data_path)
+        os.symlink(target_file, ext_data_path)
+
+        # Loading with onnx.load (which loads external data) must fail
+        with self.assertRaises(checker.ValidationError):
+            onnx.load(model_path)
+
+    def test_load_external_data_for_model_rejects_symlink(self) -> None:
+        """load_external_data_for_model must reject symlinked external data."""
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(self.temp_dir, "model.onnx")
+        ext_data = "data.bin"
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location=ext_data,
+            size_threshold=1024,
+        )
+
+        # Replace external data with a symlink
+        target_file = os.path.join(self.temp_dir, "target.txt")
+        with open(target_file, "w") as f:
+            f.write("SENSITIVE DATA")
+
+        ext_data_path = os.path.join(self.temp_dir, ext_data)
+        os.remove(ext_data_path)
+        os.symlink(target_file, ext_data_path)
+
+        # Load model without external data, then try to load external data explicitly
+        loaded_model = onnx.load(model_path, load_external_data=False)
+        with self.assertRaises(checker.ValidationError):
+            load_external_data_for_model(loaded_model, self.temp_dir)
+
+    def test_load_rejects_parent_directory_symlink(self) -> None:
+        """A symlink in the parent directory must be caught by realpath containment."""
+        # Create a "sensitive" directory outside the model directory with a data file
+        sensitive_dir = os.path.join(self.temp_dir, "sensitive")
+        os.makedirs(sensitive_dir)
+        secret_file = os.path.join(sensitive_dir, "secret.bin")
+        with open(secret_file, "wb") as f:
+            f.write(b"SENSITIVE DATA" * 100)
+
+        # Create a model directory with a real subdir for saving
+        model_dir = os.path.join(self.temp_dir, "model_dir")
+        os.makedirs(model_dir)
+        subdir_path = os.path.join(model_dir, "subdir")
+        os.makedirs(subdir_path)
+
+        # Create model with external data location "subdir/secret.bin"
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(model_dir, "model.onnx")
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location="subdir/secret.bin",
+            size_threshold=1024,
+        )
+
+        # Replace the real subdir with a symlink to the sensitive directory
+        shutil.rmtree(subdir_path)
+        os.symlink(sensitive_dir, subdir_path)
+
+        # Loading must fail because realpath resolves outside model_dir
+        loaded_model = onnx.load(model_path, load_external_data=False)
+        with self.assertRaises(checker.ValidationError):
+            load_external_data_for_model(loaded_model, model_dir)
+
+
+@unittest.skipIf(os.name == "nt", reason="Hardlinks behave differently on Windows")
+class TestLoadExternalDataHardlinkProtection(TestLoadExternalDataBase):
+    """Test that loading external data rejects files with multiple hardlinks."""
+
+    def test_load_rejects_hardlinked_external_data(self) -> None:
+        """Loading a model whose external data has multiple hardlinks must raise ValidationError."""
+        model, _ = _make_external_data_test_model()
+        model_path = os.path.join(self.temp_dir, "model.onnx")
+        ext_data = "data.bin"
+        onnx.save_model(
+            model,
+            model_path,
+            save_as_external_data=True,
+            all_tensors_to_one_file=True,
+            location=ext_data,
+            size_threshold=1024,
+        )
+
+        # Create a hardlink to the external data file
+        ext_data_path = os.path.join(self.temp_dir, ext_data)
+        hardlink_path = os.path.join(self.temp_dir, "hardlink_data.bin")
+        os.link(ext_data_path, hardlink_path)
+
+        # Loading must fail because the external data file has multiple hardlinks.
+        # Either the C++ checker or Python code catches this as ValidationError.
+        with self.assertRaises(checker.ValidationError):
+            onnx.load(model_path)
+
+
 class TestSaveExternalDataAbsolutePathValidation(TestLoadExternalDataBase):
     """Test that save_external_data rejects absolute paths."""
References (4)
- github.com/onnx/onnx/commit/4755f8053928dce18a61db8fec71b69c74f786cb · nvd · Patch · WEB
- github.com/onnx/onnx/security/advisories/GHSA-3r9x-f23j-gc73 · nvd · Exploit, Mitigation, Patch, Vendor Advisory · WEB
- github.com/advisories/GHSA-3r9x-f23j-gc73 · ghsa · ADVISORY
- nvd.nist.gov/vuln/detail/CVE-2026-27489 · ghsa · ADVISORY