CVE-2026-40861
Description
A symlink or path traversal in Apache Airflow's FileTaskHandler allows a Dag author to read or write arbitrary files on the API server when worker logs are shared.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability
In Apache Airflow versions prior to 3.2.2, the FileTaskHandler in airflow-core/src/airflow/utils/log/file_task_handler.py resolves log file paths without validating that the resolved path stays within the configured base_log_folder. A Dag author can either (a) create a symlink under their task's log directory pointing to an arbitrary file readable by the API server process (read-path attack, e.g., /etc/passwd or airflow.cfg) or (b) supply a task_id containing .. sequences accepted by the Task SDK's KEY_REGEX (write-path attack). In the read case, FileTaskHandler._read_from_local opens the symlink target outside the intended log directory; in the write case, the handler constructs a log path that resolves outside base_log_folder. This only affects deployments where the worker log folder is shared with the API server. [1]
Exploitation
For the read-path attack, the attacker must be a Dag author able to create a symlink in the task's log directory (e.g., via a custom operator or a pre-existing symlink). The symlink is named to match the log glob pattern, which is the log file name followed by any suffix (e.g., 1.log.external next to 1.log), and points to the target file. When the API server's log viewer requests logs for that task, FileTaskHandler._read_from_local globs the directory and opens the symlink, streaming the target file's content. For the write-path attack, the attacker supplies a task_id containing .. sequences (e.g., ../../etc/cron.d/malicious) that passes the KEY_REGEX validation. The handler constructs a path outside base_log_folder and writes log data there, potentially overwriting sensitive files. [1]
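The read-path behaviour can be reproduced safely inside a temporary directory. The following is a minimal sketch, not Airflow's code: `read_matching_logs` is a hypothetical stand-in that mimics the glob-then-open pattern visible on the removed lines of the patch, and all file names are illustrative.

```python
import os
import tempfile
from pathlib import Path


def read_matching_logs(worker_log_path: Path) -> dict[str, str]:
    """Hypothetical stand-in for the pre-fix pattern: glob, then open every hit."""
    contents = {}
    for path in sorted(worker_log_path.parent.glob(worker_log_path.name + "*")):
        # open() follows symlinks, so a link is read as if it were a log file.
        with open(path, encoding="utf-8") as handle:
            contents[path.name] = handle.read()
    return contents


def demo() -> dict[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        log_dir = root / "logs" / "dag" / "run" / "task"
        log_dir.mkdir(parents=True)
        (log_dir / "1.log").write_text("legitimate log line\n")

        # A file outside the log tree, standing in for a sensitive target.
        secret = root / "secret.txt"
        secret.write_text("outside content\n")

        # The link name only has to start with the log file name to match "1.log*".
        os.symlink(secret, log_dir / "1.log.external")

        return read_matching_logs(log_dir / "1.log")


if __name__ == "__main__":
    for name, text in demo().items():
        print(name, "->", text.strip())
```

Nothing in the stand-in distinguishes the link from a regular log file, which is why the content of the outside file ends up in the returned mapping alongside the legitimate log.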
Impact
Successful exploitation allows an authenticated Dag author to read arbitrary files readable by the API server process (information disclosure) or write arbitrary files writable by the API server process (potential privilege escalation or code execution). The attack is limited to deployments where the worker log folder is shared with the API server; if separate volumes are used, the attack is not possible. [1]
Mitigation
Upgrade to Apache Airflow 3.2.2 or later, which includes the fix from pull request #65325. The fix canonicalizes self.local_base via os.path.realpath and, for each glob hit, resolves the path and skips it if the result is not contained in the canonicalized base log folder. As a defense-in-depth measure, deploy the worker and API server with separate log volumes so that worker-controlled paths cannot reach the API server's filesystem. [1]
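The containment check described above can be sketched as a small standalone helper. This is an illustration of the technique under stated assumptions, not the patched Airflow method: `is_within_base` and `demo` are hypothetical names, and only `os.path.realpath` and `os.path.commonpath` are taken from the patch.

```python
import os
import tempfile
from pathlib import Path


def is_within_base(base_log_folder: str, candidate: str) -> bool:
    """Return True if candidate's real path is base_log_folder or lies inside it."""
    base = os.path.realpath(base_log_folder)
    resolved = os.path.realpath(candidate)
    try:
        return os.path.commonpath([base, resolved]) == base
    except ValueError:
        # Raised, for example, for paths on different Windows drives:
        # treat that as "not contained".
        return False


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "logs"
        base.mkdir()
        inside = base / "1.log"
        inside.write_text("ok\n")
        outside = Path(tmp) / "other.txt"
        outside.write_text("no\n")
        link = base / "1.log.external"
        os.symlink(outside, link)
        return (
            is_within_base(str(base), str(inside)),
            is_within_base(str(base), str(link)),
        )


if __name__ == "__main__":
    print(demo())
```

Resolving both sides before comparing means a base folder that is itself a symlink is still handled, since the comparison is between two canonical paths.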
AI Insight generated on Jun 1, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products (1)
Patches (2)
3eda84547e74 Refuse to follow log symlinks that resolve outside the base log folder (#65325)
3 files changed · +125 −4
airflow-core/src/airflow/utils/log/file_task_handler.py +25 −3 modified
@@ -856,20 +856,42 @@ def _init_file(self, ti, *, identifier: str | None = None):
         return full_path

-    @staticmethod
     def _read_from_local(
+        self,
         worker_log_path: Path,
     ) -> StreamingLogResponse:
         sources: LogSourceInfo = []
         log_streams: list[RawLogStream] = []
+        # The glob below can match symlinks as well as regular files, so
+        # resolve each hit and only open the ones that stay inside the base
+        # log folder. Canonicalising ``self.local_base`` once up front makes
+        # the containment check compare two already-resolved paths.
+        base_log_folder = os.path.realpath(self.local_base)
         paths = sorted(worker_log_path.parent.glob(worker_log_path.name + "*"))
         if not paths:
             return sources, log_streams

         for path in paths:
+            resolved_path = os.path.realpath(path)
+            try:
+                if os.path.commonpath([base_log_folder, resolved_path]) != base_log_folder:
+                    continue
+            except ValueError:
+                # ``os.path.commonpath`` raises ``ValueError`` when the two
+                # paths have nothing in common (e.g. different drives on
+                # Windows); treat that as "not contained" and skip the file.
+                continue
+
+            # Open the resolved path so the file we read is the same one we
+            # just validated above. Append to ``sources`` only after a
+            # successful ``open`` so ``sources`` and ``log_streams`` stay
+            # aligned.
+            try:
+                log_stream = _stream_lines_by_chunk(open(resolved_path, encoding="utf-8"))
+            except OSError:
+                continue
             sources.append(os.fspath(path))
-            # Read the log file and yield lines
-            log_streams.append(_stream_lines_by_chunk(open(path, encoding="utf-8")))
+            log_streams.append(log_stream)
         return sources, log_streams

     def _read_from_logs_server(
airflow-core/tests/unit/utils/log/test_file_task_handler.py +99 −0 modified
@@ -90,3 +90,102 @@ def test_403_shows_secret_key_message(self, mock_get_url, mock_fetch):
         assert len(sources) == 1
         assert "secret_key" in sources[0]
         assert streams == []
+
+
+class TestFileTaskHandlerReadFromLocal:
+    """Tests for ``FileTaskHandler._read_from_local`` path containment."""
+
+    @staticmethod
+    def _drain(stream) -> str:
+        return "".join(list(stream))
+
+    def test_reads_regular_log_file_inside_base(self, tmp_path):
+        """A regular file under ``base_log_folder`` is streamed as before."""
+        log_dir = tmp_path / "dag" / "run" / "task"
+        log_dir.mkdir(parents=True)
+        log_file = log_dir / "1.log"
+        log_file.write_text("legitimate log line\n")
+
+        handler = FileTaskHandler(base_log_folder=str(tmp_path))
+        sources, streams = handler._read_from_local(log_file)
+
+        assert sources == [str(log_file)]
+        assert len(streams) == 1
+        assert "legitimate log line" in self._drain(streams[0])
+
+    def test_skips_symlink_resolving_outside_base_log_folder(self, tmp_path):
+        """A glob hit that resolves outside ``base_log_folder`` is not streamed.
+
+        This documents the intended containment behaviour: a file under the
+        task's log directory that is actually a symlink whose real path is
+        outside the configured base log folder must be skipped, even though
+        it matches the glob pattern used to discover the task's log files.
+        """
+        base_log_folder = tmp_path / "logs"
+        log_dir = base_log_folder / "dag" / "run" / "task"
+        log_dir.mkdir(parents=True)
+
+        # A regular log file inside the base log folder.
+        legit = log_dir / "1.log"
+        legit.write_text("legitimate log line\n")
+
+        # A file that lives outside the base log folder.
+        external_dir = tmp_path / "external"
+        external_dir.mkdir()
+        external_file = external_dir / "other.txt"
+        external_file.write_text("external content\n")
+
+        # A glob hit that matches ``1.log*`` but resolves outside the base.
+        escape_link = log_dir / "1.log.external"
+        escape_link.symlink_to(external_file)
+
+        handler = FileTaskHandler(base_log_folder=str(base_log_folder))
+        sources, streams = handler._read_from_local(legit)
+
+        assert str(legit) in sources
+        assert str(escape_link) not in sources
+        content = "".join(self._drain(s) for s in streams)
+        assert "legitimate log line" in content
+        assert "external content" not in content
+
+    def test_follows_symlink_within_base_log_folder(self, tmp_path):
+        """A symlink that resolves back into the base log folder is allowed.
+
+        The containment check compares the realpath of the glob hit to the
+        realpath of the base log folder, so a symlink that stays entirely
+        inside the log tree (for example from log rotation) still works.
+        """
+        base_log_folder = tmp_path / "logs"
+        log_dir = base_log_folder / "dag" / "run" / "task"
+        log_dir.mkdir(parents=True)
+
+        real_file = log_dir / "real.log"
+        real_file.write_text("inner content\n")
+
+        link = log_dir / "1.log.link"
+        link.symlink_to(real_file)
+
+        handler = FileTaskHandler(base_log_folder=str(base_log_folder))
+        sources, streams = handler._read_from_local(log_dir / "1.log")
+
+        assert str(link) in sources
+        assert "inner content" in "".join(self._drain(s) for s in streams)
+
+    def test_handles_base_log_folder_that_is_itself_a_symlink(self, tmp_path):
+        """``base_log_folder`` itself is realpath'd so a base that is a
+        symlink to the actual log directory is treated as contained."""
+        real_base = tmp_path / "real_logs"
+        real_base.mkdir()
+        base_link = tmp_path / "logs"
+        base_link.symlink_to(real_base)
+
+        log_dir = base_link / "dag" / "run" / "task"
+        log_dir.mkdir(parents=True)
+        log_file = log_dir / "1.log"
+        log_file.write_text("through-symlink content\n")
+
+        handler = FileTaskHandler(base_log_folder=str(base_link))
+        sources, streams = handler._read_from_local(log_file)
+
+        assert len(sources) == 1
+        assert "through-symlink content" in self._drain(streams[0])
airflow-core/tests/unit/utils/test_log_handlers.py +1 −1 modified
@@ -510,7 +510,7 @@ def test__read_from_local(self, tmp_path):
         path2 = tmp_path / "hello1.log.suffix.log"
         path1.write_text("file1 content\nfile1 content2")
         path2.write_text("file2 content\nfile2 content2")
-        fth = FileTaskHandler("")
+        fth = FileTaskHandler(str(tmp_path))
         log_source_info, log_streams = fth._read_from_local(path1)
         assert log_source_info == [str(path1), str(path2)]
         assert len(log_streams) == 2
cde4885818be Updating release notes for 3.2.2rc3
2 files changed · +5 −4
RELEASE_NOTES.rst +3 −2 modified
@@ -24,7 +24,7 @@

 .. towncrier release notes start

-Airflow 3.2.2 (2026-05-27)
+Airflow 3.2.2 (2026-05-29)
 --------------------------

 Significant Changes
@@ -81,7 +81,8 @@ Significant Changes

 Bug Fixes
 ^^^^^^^^^
-
+- Fix ``Callback.handle_event`` triggerer crash when OpenTelemetry metrics receive dict typed tag values (#67527) (#67529)
+- UI: Rewrite ``modulepreload hrefs`` to the api-server static path (#67548) (#67556)
 - Correctly pre-allocate ``external_executor_id`` with multiple executors on PostgreSQL (#67388) (#67458)
 - Return raw import-error stacktrace when a Dag file has no registered Dag (#67465) (#67478)
 - UI: Fix Expand/Collapse All on XComs and Audit Log JSON cells (#67316) (#67361)
reproducible_build.yaml +2 −2 modified
@@ -1,2 +1,2 @@
-release-notes-hash: 6407b48d1054fe3ce68c09bf4435d91d
-source-date-epoch: 1779745327
+release-notes-hash: 504288db9a9dc13a0db859232fab98d0
+source-date-epoch: 1779811737
Vulnerability mechanics
Root cause
"FileTaskHandler._read_from_local did not validate that a symlink's resolved real path stays within the configured base_log_folder, allowing path traversal via symlinks or `..` sequences."
Attack vector
A Dag author can exploit two attack vectors. In the read-path attack, they create a symlink under their task's log directory pointing to an arbitrary file readable by the API server process (e.g. `/etc/passwd` or `airflow.cfg`). In the write-path attack, they supply a `task_id` containing `..` sequences accepted by the Task SDK's `KEY_REGEX`, causing the log path to resolve outside `base_log_folder`. Both vectors succeed only when the worker log folder is shared with the API server [CWE-59].
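Operators who share a log folder between workers and the API server may want to check an existing tree for links of the kind used in the read-path vector. The following is a minimal audit sketch under stated assumptions: `find_escaping_links` and `demo` are hypothetical helpers, not part of Airflow, and the directory layout in `demo` is illustrative.

```python
import os
import tempfile
from pathlib import Path


def find_escaping_links(base_log_folder: str) -> list[str]:
    """List symlinks under base_log_folder whose real path leaves that folder."""
    base = os.path.realpath(base_log_folder)
    escaping = []
    # os.walk does not descend into symlinked directories by default, but it
    # still reports them in dirnames, so both lists are checked.
    for dirpath, dirnames, filenames in os.walk(base):
        for name in dirnames + filenames:
            entry = os.path.join(dirpath, name)
            if not os.path.islink(entry):
                continue
            resolved = os.path.realpath(entry)
            try:
                inside = os.path.commonpath([base, resolved]) == base
            except ValueError:
                inside = False
            if not inside:
                escaping.append(entry)
    return sorted(escaping)


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "logs"
        task_dir = base / "dag" / "run" / "task"
        task_dir.mkdir(parents=True)
        (task_dir / "1.log").write_text("log\n")
        outside = Path(tmp) / "outside.txt"
        outside.write_text("x\n")
        os.symlink(outside, task_dir / "1.log.external")  # leaves the base
        os.symlink(task_dir / "1.log", task_dir / "1.log.link")  # stays inside
        return [os.path.basename(p) for p in find_escaping_links(str(base))]


if __name__ == "__main__":
    print(demo())
```

A scan like this only reports the state of the tree at the moment it runs, so it complements the upgrade rather than replacing it.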
Affected code
The vulnerability resides in `FileTaskHandler._read_from_local` in `airflow-core/src/airflow/utils/log/file_task_handler.py`. Before the fix, the method was a `@staticmethod` with no access to `self.local_base`; it passed each glob hit straight to `open` without calling `os.path.realpath` on it or checking that its resolved path stayed within the configured `base_log_folder`.
What the fix does
The patch canonicalises `self.local_base` once via `os.path.realpath` and, for every glob hit, resolves the path with `os.path.realpath` and skips it if the resolved form is not contained in the canonicalised base log folder (using `os.path.commonpath`, with a `ValueError` fallback for different-drive scenarios on Windows). It opens the resolved path rather than the original glob hit so the file read is the one that was validated, and appends to `sources` only after a successful `open` to keep `sources` and `log_streams` aligned. The `@staticmethod` decorator was removed so the method can read `self.local_base`.
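One reason a component-wise comparison such as `os.path.commonpath` suits this kind of check is that a plain string-prefix test accepts sibling directories that merely share a prefix with the base. The sketch below illustrates that general point; the helper names and the POSIX-style paths are hypothetical, and both inputs are assumed to be absolute and already resolved.

```python
import os


def contained_by_prefix(base: str, candidate: str) -> bool:
    """Naive string check: also accepts siblings that share a name prefix."""
    return candidate.startswith(base)


def contained_by_commonpath(base: str, candidate: str) -> bool:
    """Component-wise check in the style of the patch's os.path.commonpath test."""
    try:
        return os.path.commonpath([base, candidate]) == base
    except ValueError:
        # No common path at all (e.g. different Windows drives): not contained.
        return False


base = "/opt/airflow/logs"
sibling = "/opt/airflow/logs_backup/secret.txt"  # not inside base
child = "/opt/airflow/logs/dag/run/task/1.log"   # inside base

print("prefix check, sibling:    ", contained_by_prefix(base, sibling))
print("commonpath check, sibling:", contained_by_commonpath(base, sibling))
print("commonpath check, child:  ", contained_by_commonpath(base, child))
```

On a POSIX system the prefix check reports the sibling as contained, while the `commonpath` check rejects the sibling and accepts the child.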
Preconditions
- config: Worker log folder must be shared with the API server (same filesystem mount).
- auth: Attacker must be a Dag author able to create tasks or symlinks under the task log directory.
- input: For the write-path variant, the Task SDK's KEY_REGEX must accept `..` sequences in task_id.
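The input precondition can be illustrated with a small sketch of how `..` components in an identifier move a joined path out of the base folder, and how a lexical containment check catches it. Everything here is an assumption made for illustration: `build_log_path`, `stays_inside`, and the directory layout are hypothetical and do not reproduce Airflow's actual log filename template or the Task SDK's validation.

```python
import os


def build_log_path(base_log_folder: str, dag_id: str, task_id: str) -> str:
    """Hypothetical path builder that joins identifiers into a log file path."""
    return os.path.join(base_log_folder, dag_id, task_id, "attempt=1.log")


def stays_inside(base_log_folder: str, path: str) -> bool:
    """Collapse ``..`` components lexically, then compare against the base."""
    base = os.path.normpath(base_log_folder)
    normalised = os.path.normpath(path)
    try:
        return os.path.commonpath([base, normalised]) == base
    except ValueError:
        return False


base = "/opt/airflow/logs"  # illustrative POSIX-style base folder
ok = build_log_path(base, "my_dag", "extract")
bad = build_log_path(base, "my_dag", "../../etc/cron.d/malicious")

print(ok, stays_inside(base, ok))
print(os.path.normpath(bad), stays_inside(base, bad))
```

`os.path.normpath` works purely on the string and does not follow symlinks, so a check for files that already exist on disk would use `os.path.realpath` instead, as the patch does for the read path.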
Generated on Jun 1, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References (2)
News mentions (0)
No linked articles in our index yet.