Apache Airflow SFTP provider: Path traversal in SFTPHook.retrieve_directory allows local file write outside the destination directory via malicious server-supplied directory-entry names
Description
Path traversal in Apache Airflow SFTP provider allows a malicious SFTP server to write files outside the intended destination directory.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability
The vulnerability resides in the SFTPHook.retrieve_directory method of the Apache Airflow SFTP provider, with the same path construction also present in SFTPHook.retrieve_directory_concurrently, and it is reachable through SFTPOperator(operation=get). When downloading a directory recursively, the code builds each local destination path by joining the configured local directory with path components derived from directory-entry names returned by the remote SFTP server. Because these names can contain .. sequences, the resulting path can point outside the intended local destination directory, and the download writes there. All versions of apache-airflow-providers-sftp prior to 5.8.1 are affected [1].
Exploitation
An attacker must control or compromise an SFTP server that the Airflow deployment connects to. No Airflow account is required; the attack surface is any deployment that downloads directories from an untrusted SFTP server. The attacker crafts directory-entry names containing .. components. When Airflow recursively downloads the directory, the path traversal causes files to be written to arbitrary locations on the filesystem where the Airflow worker runs [1].
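The effect of a `..` component on the pre-fix path computation can be reproduced with the standard library alone. The sketch below mirrors the `os.path.join(local, os.path.relpath(entry, remote))` expression that the patch wraps; the function name `unpatched_local_path` and the local directory `/tmp/download` are hypothetical, and `posixpath` is used only so the result is the same on any operating system.

```python
import posixpath


def unpatched_local_path(local_dir: str, remote_dir: str, remote_entry: str) -> str:
    """Mirror the pre-fix computation (illustrative, not the provider's code).

    The local path is the local directory joined with the entry's path
    relative to the remote directory, with no containment check.
    """
    return posixpath.join(local_dir, posixpath.relpath(remote_entry, remote_dir))


local_dir = "/tmp/download"  # hypothetical destination directory
remote_dir = "/srv/export"   # remote directory used in the patch's test

# A well-behaved entry stays under the destination directory.
benign = unpatched_local_path(local_dir, remote_dir, "/srv/export/reports/a.csv")

# An entry name containing ".." (the value used in the patch's test) does not.
hostile = unpatched_local_path(local_dir, remote_dir, "/srv/export/../victim/payload")

print(benign)                       # /tmp/download/reports/a.csv
print(hostile)                      # /tmp/download/../victim/payload
print(posixpath.normpath(hostile))  # /tmp/victim/payload
```

The last line shows where the write would actually land: one level above the configured destination directory.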
Impact
Successful exploitation enables arbitrary file write outside the configured local destination directory. Depending on the write location, this could lead to code execution, privilege escalation, or other compromise of the Airflow worker environment [1].
Mitigation
Upgrade apache-airflow-providers-sftp to version 5.8.1 or later. The fix introduces a containment check (_validate_within_directory) that resolves each computed local path and refuses to write when it falls outside the destination directory. This check is applied to both the serial and concurrent retrieval paths. No workarounds are documented [1].
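One way to check whether an environment still runs an affected release is to compare the installed distribution's version against 5.8.1. The following is a minimal sketch using only the standard library; `parse_release`, `is_affected`, and `installed_provider_is_affected` are hypothetical helper names, and the parser deliberately ignores pre-release suffixes, so a dedicated version-parsing library would be more robust.

```python
from __future__ import annotations

from importlib import metadata

FIXED_RELEASE = (5, 8, 1)  # first fixed version according to the advisory


def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. "5.8.0" -> (5, 8, 0).

    Simplification: suffixes such as "rc1" are dropped, not ordered.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected(version: str) -> bool:
    """Return True when the version is below the fixed release."""
    return parse_release(version) < FIXED_RELEASE


def installed_provider_is_affected(
    dist_name: str = "apache-airflow-providers-sftp",
) -> bool | None:
    """Check the installed distribution; None means it is not installed."""
    try:
        return is_affected(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_provider_is_affected())
```

Running the module on an Airflow worker prints `True` for an affected install, `False` for a fixed one, and `None` when the provider is absent.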
AI Insight generated on Jun 17, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products
- Range: <5.8.1
Patches
16547dfef2e0 Validate downloaded paths stay within the destination directory in SFTPHook.retrieve_directory (#67985)
2 files changed · +56 −4
providers/sftp/src/airflow/providers/sftp/hooks/sftp.py +33 −4 modified
@@ -384,6 +384,25 @@ def delete_file(self, path: str) -> None:
         """
         self.conn.remove(path)  # type: ignore[arg-type, union-attr]

+    @staticmethod
+    def _validate_within_directory(base_dir: str, candidate: str) -> str:
+        """
+        Ensure ``candidate`` resolves to a path inside ``base_dir``.
+
+        Directory-entry names are returned by the remote SFTP server and may
+        contain ``..`` components; joining them into the local destination path
+        could otherwise write outside it. Containment is verified before any
+        local write or ``mkdir``.
+        """
+        base_real = os.path.realpath(base_dir)
+        candidate_real = os.path.realpath(candidate)
+        if candidate_real != base_real and os.path.commonpath([base_real, candidate_real]) != base_real:
+            raise ValueError(
+                f"Refusing to write outside the destination directory: "
+                f"{candidate!r} resolves outside {base_dir!r}"
+            )
+        return candidate
+
     def retrieve_directory(self, remote_full_path: str, local_full_path: str, prefetch: bool = True) -> None:
         """
         Transfer the remote directory to a local location.
@@ -400,10 +419,14 @@ def retrieve_directory(self, remote_full_path: str, local_full_path: str, prefet
         Path(local_full_path).mkdir(parents=True)
         files, dirs, _ = self.get_tree_map(remote_full_path)
         for dir_path in dirs:
-            new_local_path = os.path.join(local_full_path, os.path.relpath(dir_path, remote_full_path))
+            new_local_path = self._validate_within_directory(
+                local_full_path, os.path.join(local_full_path, os.path.relpath(dir_path, remote_full_path))
+            )
             Path(new_local_path).mkdir(parents=True, exist_ok=True)
         for file_path in files:
-            new_local_path = os.path.join(local_full_path, os.path.relpath(file_path, remote_full_path))
+            new_local_path = self._validate_within_directory(
+                local_full_path, os.path.join(local_full_path, os.path.relpath(file_path, remote_full_path))
+            )
             self.retrieve_file(file_path, new_local_path, prefetch)

     def retrieve_directory_concurrently(
@@ -438,12 +461,18 @@ def retrieve_file_chunk(
         new_local_file_paths, remote_file_paths = [], []
         files, dirs, _ = self.get_tree_map(remote_full_path)
         for dir_path in dirs:
-            new_local_path = os.path.join(local_full_path, os.path.relpath(dir_path, remote_full_path))
+            new_local_path = self._validate_within_directory(
+                local_full_path,
+                os.path.join(local_full_path, os.path.relpath(dir_path, remote_full_path)),
+            )
             Path(new_local_path).mkdir(parents=True, exist_ok=True)
         for file in files:
             remote_file_paths.append(file)
             new_local_file_paths.append(
-                os.path.join(local_full_path, os.path.relpath(file, remote_full_path))
+                self._validate_within_directory(
+                    local_full_path,
+                    os.path.join(local_full_path, os.path.relpath(file, remote_full_path)),
+                )
             )
         remote_file_chunks = [remote_file_paths[i::workers] for i in range(workers)]
         local_file_chunks = [new_local_file_paths[i::workers] for i in range(workers)]
providers/sftp/tests/unit/sftp/hooks/test_sftp.py +23 −0 modified
@@ -633,6 +633,29 @@ def test_store_and_retrieve_directory_concurrently(self):
         )
         assert retrieved_dir_name in os.listdir(os.path.join(self.temp_dir, TMP_DIR_FOR_TESTS))

+    def test_validate_within_directory_rejects_escape(self):
+        base = os.path.join(self.temp_dir, "download")
+        with pytest.raises(ValueError, match="outside the destination directory"):
+            SFTPHook._validate_within_directory(base, os.path.join(base, "..", "victim"))
+        # An in-bounds candidate is returned unchanged.
+        inside = os.path.join(base, "sub", "file")
+        assert SFTPHook._validate_within_directory(base, inside) == inside
+
+    def test_retrieve_directory_rejects_server_path_traversal(self):
+        # A remote SFTP server can return a directory-entry name containing ".."
+        # so the recursive download would escape the local destination directory.
+        remote = "/srv/export"
+        local = os.path.join(self.temp_dir, "download_traversal")
+        escaping_file = "/srv/export/../victim/payload"
+        with (
+            patch.object(SFTPHook, "get_tree_map", return_value=([escaping_file], [], [])),
+            patch.object(SFTPHook, "retrieve_file") as mock_retrieve,
+        ):
+            with pytest.raises(ValueError, match="outside the destination directory"):
+                self.hook.retrieve_directory(remote_full_path=remote, local_full_path=local)
+        mock_retrieve.assert_not_called()
+        assert not os.path.exists(os.path.join(self.temp_dir, "victim"))
+
     @patch("paramiko.SSHClient")
     @patch("paramiko.ProxyCommand")
     @patch("airflow.providers.sftp.hooks.sftp.SFTPHook.get_connection")
Vulnerability mechanics
Root cause
"Missing containment validation when joining directory-entry names from the remote SFTP server into local destination paths allows path traversal via crafted '..' components."
Attack vector
An attacker who controls or compromises an SFTP server can craft directory-entry names containing `..` components. When an Airflow deployment uses `SFTPOperator(operation=get)` or calls `SFTPHook.retrieve_directory` to download a directory from that server, the recursive download joins the local destination directory with the attacker-controlled entry name, causing files to be written outside the intended local directory. No Airflow account or authentication is required; the attack surface is any deployment that downloads directories from an untrusted SFTP server.
Affected code
The vulnerability resides in `SFTPHook.retrieve_directory` and `SFTPHook.retrieve_directory_concurrently` within `providers/sftp/src/airflow/providers/sftp/hooks/sftp.py`. Both methods build local destination paths by joining the configured local directory with paths derived from directory-entry names returned by the remote SFTP server, without validating that the resulting path stays within the intended destination directory [patch_id=6240365].
What the fix does
The patch introduces a static helper `_validate_within_directory` that resolves both the base directory and the candidate path via `os.path.realpath` and uses `os.path.commonpath` to verify the candidate is contained within the base directory. This check is inserted before every `mkdir` and `retrieve_file` call in both the serial (`retrieve_directory`) and concurrent (`retrieve_directory_concurrently`) code paths. If the candidate path resolves outside the destination directory, a `ValueError` is raised and no write occurs. The test suite confirms that a remote entry like `/srv/export/../victim/payload` is rejected and `retrieve_file` is never called [patch_id=6240365].
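The containment logic described above can be exercised outside Airflow. The sketch below copies the comparison from the patch into a free function (`validate_within_directory` without the leading underscore, plus a hypothetical `is_contained` wrapper) and runs it against three candidates. It also records what a plain string-prefix comparison would report for a sibling directory, which is an illustrative aside and not something the patch discusses.

```python
import os
import tempfile


def validate_within_directory(base_dir: str, candidate: str) -> str:
    """Standalone copy of the patch's check: raise if candidate escapes base_dir."""
    base_real = os.path.realpath(base_dir)
    candidate_real = os.path.realpath(candidate)
    if candidate_real != base_real and os.path.commonpath([base_real, candidate_real]) != base_real:
        raise ValueError(
            f"Refusing to write outside the destination directory: "
            f"{candidate!r} resolves outside {base_dir!r}"
        )
    return candidate


def is_contained(base_dir: str, candidate: str) -> bool:
    """Hypothetical convenience wrapper that turns the exception into a bool."""
    try:
        validate_within_directory(base_dir, candidate)
        return True
    except ValueError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "download")
    sibling = os.path.join(tmp, "download_evil", "file")
    results = {
        # Path below the destination directory: accepted.
        "inside": is_contained(base, os.path.join(base, "sub", "file")),
        # ".." climbs out of the destination directory: rejected.
        "dotdot": is_contained(base, os.path.join(base, "..", "victim")),
        # Sibling directory sharing a name prefix: rejected.
        "sibling": is_contained(base, sibling),
        # A naive string-prefix test would wrongly accept that sibling.
        "sibling_prefix_match": os.path.realpath(sibling).startswith(os.path.realpath(base)),
    }

print(results)
```

Because `os.path.commonpath` compares whole path components, `download_evil` is not treated as being inside `download`, whereas a character-level prefix comparison would treat it so.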
Preconditions
- config: The Airflow deployment must use `SFTPOperator(operation=get)` or call `SFTPHook.retrieve_directory` to download a directory from an SFTP server.
- input: The remote SFTP server must be malicious or compromised so that it returns directory-entry names containing `..` components.
- auth: No Airflow account or authentication is required; the attacker only needs to control the SFTP server that serves the crafted directory listings.
Generated on Jun 17, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/apache/airflow/pull/67985 (mitre, patch)
- lists.apache.org/thread/7f4b284oh44c1n95oq8mh1qc7y1lr9dx (mitre, vendor-advisory)