Medium severity 6.5 · NVD Advisory · Published May 11, 2026 · Updated May 13, 2026

CVE-2026-43826

Description

The OpenSearch logging provider, when configured with a host URL that embeds credentials (for example https://user:password@server.example.com:9200), wrote the full host URL — including the embedded credentials — into task logs. Any user with task-log read permission could harvest the backend credentials. Users are advised to upgrade to apache-airflow-providers-opensearch 1.9.1 or later and, as a defense-in-depth measure, configure the backend credentials via a secret backend rather than embedding them in the [opensearch] host URL.
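
As a rough way to check whether a configured host value is of the affected shape, the sketch below tests for a userinfo component using only the standard library. The function name `host_embeds_credentials` is hypothetical and is not part of the provider; the example URLs are the placeholder ones from the description.

```python
from urllib.parse import urlsplit


def host_embeds_credentials(host: str) -> bool:
    """Return True if ``host`` parses as a URL carrying a user or password.

    Hypothetical audit helper, not provider code. Values that cannot be
    parsed as a URL are treated as carrying no credentials.
    """
    try:
        parts = urlsplit(host)
    except (TypeError, ValueError):
        return False
    # ``username`` and ``password`` are None when the netloc has no userinfo.
    return bool(parts.username or parts.password)


if __name__ == "__main__":
    for candidate in (
        "https://user:password@server.example.com:9200",
        "https://server.example.com:9200",
    ):
        print(candidate, host_embeds_credentials(candidate))
```

A value that returns True here is one where the credentials should be moved out of the URL, in line with the defense-in-depth advice above.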

Patches

Commit 6a6b6ff409fb

Strip userinfo from OpenSearch host URL before using it as task-log label (#65509)

https://github.com/apache/airflow · Jarek Potiuk · Apr 19, 2026 · via nvd-ref
5 files changed · +106 -2
  • providers/elasticsearch/AGENTS.md+28 0 added
    @@ -0,0 +1,28 @@
    +<!-- SPDX-License-Identifier: Apache-2.0
    +     https://www.apache.org/licenses/LICENSE-2.0 -->
    +
    +# Elasticsearch Provider — Agent Instructions
    +
    +## Keep in sync with `providers/opensearch`
    +
    +OpenSearch was forked from Elasticsearch and the two providers share most of
    +their task-log handler code and surface. File layouts mirror each other:
    +
    +| Elasticsearch                                   | OpenSearch                              |
    +| ----------------------------------------------- | --------------------------------------- |
    +| `log/es_task_handler.py`                        | `log/os_task_handler.py`                |
    +| `log/es_response.py`                            | `log/os_response.py`                    |
    +| `log/es_json_formatter.py`                      | `log/os_json_formatter.py`              |
    +| `ElasticsearchTaskHandler`                      | `OpensearchTaskHandler`                 |
    +| `ElasticsearchRemoteLogIO`                      | `OpensearchRemoteLogIO`                 |
    +
    +**When fixing a bug or changing behaviour here, check whether the equivalent
    +change is needed in `providers/opensearch` (and vice-versa).** This applies
    +especially to: task-log handler logic, log grouping / formatting, connection
    +handling, URL/credential treatment, and response parsing. The two packages
    +ship on independent release cadences, so the fix should usually land as two
    +separate PRs on the same day.
    +
    +Legitimate reasons to diverge: upstream client API differences (`elasticsearch`
    +vs `opensearchpy`), provider-specific features that only one side has, or
    +changes gated on config that only exists in one provider.
    
  • providers/opensearch/AGENTS.md+28 0 added
    @@ -0,0 +1,28 @@
    +<!-- SPDX-License-Identifier: Apache-2.0
    +     https://www.apache.org/licenses/LICENSE-2.0 -->
    +
    +# OpenSearch Provider — Agent Instructions
    +
    +## Keep in sync with `providers/elasticsearch`
    +
    +OpenSearch was forked from Elasticsearch and the two providers share most of
    +their task-log handler code and surface. File layouts mirror each other:
    +
    +| OpenSearch                              | Elasticsearch                                   |
    +| --------------------------------------- | ----------------------------------------------- |
    +| `log/os_task_handler.py`                | `log/es_task_handler.py`                        |
    +| `log/os_response.py`                    | `log/es_response.py`                            |
    +| `log/os_json_formatter.py`              | `log/es_json_formatter.py`                      |
    +| `OpensearchTaskHandler`                 | `ElasticsearchTaskHandler`                      |
    +| `OpensearchRemoteLogIO`                 | `ElasticsearchRemoteLogIO`                      |
    +
    +**When fixing a bug or changing behaviour here, check whether the equivalent
    +change is needed in `providers/elasticsearch` (and vice-versa).** This applies
    +especially to: task-log handler logic, log grouping / formatting, connection
    +handling, URL/credential treatment, and response parsing. The two packages
    +ship on independent release cadences, so the fix should usually land as two
    +separate PRs on the same day.
    +
    +Legitimate reasons to diverge: upstream client API differences (`opensearchpy`
    +vs `elasticsearch`), provider-specific features that only one side has (e.g.
    +`write_to_os`), or changes gated on config that only exists in one provider.
    
  • providers/opensearch/docs/changelog.rst+8 0 modified
    @@ -27,6 +27,14 @@
     Changelog
     ---------
     
    +When the ``[opensearch] host`` config embeds credentials
    +(``https://user:password@opensearch.example.com:9200``), the log-source
    +label shown in task logs is now the host URL with the ``user:password@``
    +portion stripped. Previously the full URL (including credentials) could
    +appear as a dictionary key in the task-log output when log-hits did not
    +carry a ``host`` field. The OpenSearch client is still connected using
    +the full URL, so authentication is unaffected.
    +
     1.9.0
     .....
     
    
  • providers/opensearch/src/airflow/providers/opensearch/log/os_task_handler.py+26 2 modified
    @@ -197,6 +197,28 @@ def _create_opensearch_client(
         )
     
     
    +def _strip_userinfo(url: str) -> str:
    +    """
    +    Return ``url`` with any ``user:password@`` userinfo removed.
    +
    +    The OpenSearch ``[opensearch] host`` config commonly embeds
    +    credentials (``https://user:password@opensearch.example.com:9200``).
    +    This value is reused as a display label for log-source grouping, so
    +    the credentials would otherwise end up in task logs. Anything that
    +    is not a valid URL is returned unchanged.
    +    """
    +    try:
    +        parsed = urlparse(url)
    +    except (TypeError, ValueError):
    +        return url
    +    if not parsed.hostname or (not parsed.username and not parsed.password):
    +        return url
    +    netloc = parsed.hostname
    +    if parsed.port is not None:
    +        netloc = f"{netloc}:{parsed.port}"
    +    return parsed._replace(netloc=netloc).geturl()
    +
    +
     def _render_log_id(
         log_id_template: str, ti: TaskInstance | TaskInstanceKey | RuntimeTI, try_number: int
     ) -> str:
    @@ -713,8 +735,9 @@ def _get_result(self, hit: dict[Any, Any], parent_class=None) -> Hit:
     
         def _group_logs_by_host(self, response: OpensearchResponse) -> dict[str, list[Hit]]:
             grouped_logs = defaultdict(list)
    +        host_fallback = _strip_userinfo(self.host)
             for hit in response:
    -            key = getattr_nested(hit, self.host_field, None) or self.host
    +            key = getattr_nested(hit, self.host_field, None) or host_fallback
                 grouped_logs[key].append(hit)
             return grouped_logs
     
    @@ -937,8 +960,9 @@ def _get_index_patterns(self, ti: RuntimeTI | None) -> str:
     
         def _group_logs_by_host(self, response: OpensearchResponse) -> dict[str, list[Hit]]:
             grouped_logs = defaultdict(list)
    +        host_fallback = _strip_userinfo(self.host)
             for hit in response:
    -            key = getattr_nested(hit, self.host_field, None) or self.host
    +            key = getattr_nested(hit, self.host_field, None) or host_fallback
                 grouped_logs[key].append(hit)
             return grouped_logs
     
    
  • providers/opensearch/tests/unit/opensearch/log/test_os_task_handler.py+16 0 modified
    @@ -38,6 +38,7 @@
         _build_log_fields,
         _format_error_detail,
         _render_log_id,
    +    _strip_userinfo,
         get_os_kwargs_from_config,
         getattr_nested,
     )
    @@ -222,6 +223,21 @@ def test_format_url(self, host, expected):
             else:
                 assert OpensearchTaskHandler.format_url(host) == expected
     
    +    @pytest.mark.parametrize(
    +        ("host", "expected"),
    +        [
    +            ("https://user:pass@opensearch.example.com:9200", "https://opensearch.example.com:9200"),
    +            ("http://USER:PASS@opensearch.example.com", "http://opensearch.example.com"),
    +            ("https://opensearch.example.com:9200", "https://opensearch.example.com:9200"),
    +            ("http://localhost:9200", "http://localhost:9200"),
    +            ("https://user@opensearch.example.com", "https://opensearch.example.com"),
    +            ("not-a-url", "not-a-url"),
    +            ("", ""),
    +        ],
    +    )
    +    def test_strip_userinfo(self, host, expected):
    +        assert _strip_userinfo(host) == expected
    +
         def test_client(self):
             assert isinstance(self.os_task_handler.client, opensearchpy.OpenSearch)
             assert self.os_task_handler.index_patterns == "_all"
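
The effect of the patch on log grouping can be illustrated with a simplified, standalone sketch. It assumes hits are plain dictionaries rather than the provider's `Hit` objects, and the names `strip_userinfo` and `group_logs_by_host` are stand-ins for the private helpers in the diff, not the provider's API. The point is that a hit without a host field falls back to the configured host URL, so that fallback has to be sanitized before it becomes a dictionary key.

```python
from collections import defaultdict
from urllib.parse import urlparse


def strip_userinfo(url: str) -> str:
    """Simplified stand-in for the patched helper: drop ``user:password@``."""
    parsed = urlparse(url)
    if not parsed.hostname or not (parsed.username or parsed.password):
        return url
    netloc = parsed.hostname
    if parsed.port is not None:
        netloc = f"{netloc}:{parsed.port}"
    return parsed._replace(netloc=netloc).geturl()


def group_logs_by_host(hits, host_field, configured_host):
    """Group hits by their host field, falling back to the sanitized host URL."""
    fallback = strip_userinfo(configured_host)
    grouped = defaultdict(list)
    for hit in hits:
        # Assumption: hits are dicts; the real handler reads nested attributes.
        key = hit.get(host_field) or fallback
        grouped[key].append(hit)
    return dict(grouped)


if __name__ == "__main__":
    hits = [{"host": "worker-1", "message": "a"}, {"message": "b"}]
    grouped = group_logs_by_host(
        hits, "host", "https://user:pass@opensearch.example.com:9200"
    )
    for label in grouped:
        print(label)
```

Before the fix, the second hit in this example would have been grouped under the full URL including `user:pass@`, which is the label that reached the task log.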
    
