MLflow Tracking Server Model Creation Directory Traversal Remote Code Execution Vulnerability
Description
This vulnerability allows remote attackers to execute arbitrary code on affected installations of MLflow Tracking Server. Authentication is not required to exploit this vulnerability.
The specific flaw exists within the handling of model file paths. The issue results from the lack of proper validation of a user-supplied path prior to using it in file operations. An attacker can leverage this vulnerability to execute code in the context of the service account. This issue was previously tracked as ZDI-CAN-26921.
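The class of flaw described above can be illustrated with a generic containment check. The sketch below is not MLflow's code and not its fix; `resolve_under` and the `/tmp/model-root` directory are hypothetical names chosen for illustration. It resolves a user-supplied path against a base directory and rejects any result that lands outside that directory.

```python
from pathlib import Path


def resolve_under(base_dir: str, user_path: str) -> Path:
    """Resolve user_path inside base_dir, rejecting paths that escape it.

    Hypothetical helper for illustration only; not part of MLflow.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base entirely, and ".."
    # segments can climb out of it, so compare the fully resolved result.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate


print(resolve_under("/tmp/model-root", "run1/model"))
try:
    resolve_under("/tmp/model-root", "../../etc/passwd")
except ValueError as exc:
    print("rejected:", exc)
```

Comparing resolved paths, rather than scanning the raw string for `..`, also covers absolute paths and symlinks that point outside the base directory.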
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| mlflow (PyPI) | >= 3.0.0rc0, < 3.0.0 | 3.0.0 |
| mlflow (PyPI) | < 2.22.4 | 2.22.4 |
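
The two ranges in the table can be checked mechanically. The following is a minimal sketch that applies them literally; `is_affected` is a hypothetical helper, and it only understands versions of the form `X.Y.Z` or `X.Y.ZrcN` (an assumption made to keep the example free of third-party dependencies).

```python
import re

# Accepts only "X.Y.Z" and "X.Y.ZrcN"; other PEP 440 forms are out of scope.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?")


def is_affected(version: str) -> bool:
    """Return True if version falls in a range listed as affected above."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    release = tuple(int(part) for part in match.group(1, 2, 3))
    is_release_candidate = match.group(4) is not None
    # Row 1: >= 3.0.0rc0, < 3.0.0 (the 3.0.0 release candidates).
    if release == (3, 0, 0) and is_release_candidate:
        return True
    # Row 2: < 2.22.4. A release candidate sorts before its final release,
    # so 2.22.4rcN also falls below 2.22.4.
    if release < (2, 22, 4):
        return True
    return release == (2, 22, 4) and is_release_candidate


for candidate in ("2.22.3", "2.22.4", "3.0.0rc2", "3.0.0"):
    print(candidate, is_affected(candidate))
```

Release components are compared as integer tuples, so `2.9.2` correctly sorts below `2.22.4`, which a plain string comparison would get wrong.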
Affected products (1)
Patches (3)
2e02bc7bb70d Introduce `MLFLOW_CREATE_MODEL_VERSION_SOURCE_REGEX` to validate source parameter of `/model-versions/create` request (#16081)
4 files changed · +152 −1
docs/docs/tracking/server/index.mdx · +73 −0 · modified
@@ -263,6 +263,79 @@ response = requests.get("http://<mlflow-host>:<mlflow-port>/version")
 assert response.text == mlflow.__version__ # Checking for a strict version match
 ```
 
+## Model Version Source Validation
+
+The tracking server can be configured to validate model version sources using a regular expression pattern. This security feature helps ensure that only model versions from approved sources are registered in your model registry.
+
+### Configuration
+
+Set the `MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX` environment variable when starting the tracking server:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^mlflow-artifacts:/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+### Usage
+
+When this environment variable is set, the tracking server will validate the `source` parameter in model version creation requests against the specified regular expression pattern. If the source doesn't match the pattern, the request will be rejected with an error.
+
+#### Example: Restricting to MLflow Artifacts
+
+To only allow model versions from MLflow artifacts storage:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^mlflow-artifacts:/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+With this configuration:
+
+```python
+import mlflow
+from mlflow import MlflowClient
+
+client = MlflowClient("http://localhost:5000")
+
+# This will work - source matches the pattern
+client.create_model_version(
+    name="my-model",
+    source="mlflow-artifacts://1/artifacts/model",
+    run_id="abc123",
+)
+
+# This will fail - source doesn't match the pattern
+client.create_model_version(
+    name="my-model",
+    source="s3://my-bucket/model",
+    run_id="def456",
+) # Raises MlflowException: Invalid model version source
+```
+
+#### Example: Restricting to Specific S3 Buckets
+
+To only allow model versions from specific S3 buckets:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^s3://(production-models|staging-models)/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+This pattern would allow sources like:
+- `s3://production-models/model-v1/`
+- `s3://staging-models/experiment-123/model/`
+
+But reject sources like:
+- `s3://untrusted-bucket/model/`
+- `file:///local/path/model`
+
+:::note
+- If the environment variable is not set, no source validation is performed.
+- The validation only applies to the `/mlflow/model-versions/create` API endpoint.
+- The regular expression is applied using Python's `re.search()` function.
+- Use standard regular expression syntax for pattern matching.
+:::
+
 ## Handling timeout when uploading/downloading large artifacts
 
 When uploading or downloading large artifacts through the tracking server with the artifact proxy enabled, the server may take a long time to process the request. If it exceeds the timeout limit (30 seconds by default), the server will restart the worker process, resulting in a request failure on the client side.
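The example patterns from the documentation diff above can be tried offline, without a tracking server. This is a minimal sketch using only Python's `re` module; `is_allowed` and the constant names are hypothetical, and the final source string is an invented input used to show what the pattern does and does not inspect.

```python
import re

# Patterns copied from the documentation examples in the diff.
ARTIFACTS_ONLY = r"^mlflow-artifacts:/.*$"
S3_BUCKETS = r"^s3://(production-models|staging-models)/.*$"


def is_allowed(pattern: str, source: str) -> bool:
    """Return True when re.search finds the pattern in the source."""
    return re.search(pattern, source) is not None


for source in (
    "s3://production-models/model-v1/",
    "s3://staging-models/experiment-123/model/",
    "s3://untrusted-bucket/model/",
    "file:///local/path/model",
):
    print(f"{source}: {is_allowed(S3_BUCKETS, source)}")

# Path segments are not inspected: ".." still satisfies the ".*" tail.
print(is_allowed(ARTIFACTS_ONLY, "mlflow-artifacts:/../../outside"))
```

A pattern of this shape constrains only the scheme and prefix of the `source` string. Whether a source containing `..` segments is rejected afterwards depends on checks outside the pattern, which this sketch does not model.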
mlflow/environment_variables.py · +13 −0 · modified
@@ -825,3 +825,16 @@ def get(self):
 MLFLOW_SUPPRESS_PRINTING_URL_TO_STDOUT = _BooleanEnvironmentVariable(
     "MLFLOW_SUPPRESS_PRINTING_URL_TO_STDOUT", False
 )
+
+#: If True, MLflow locks both direct and transitive model dependencies when logging a model.
+#: (default: ``False``).
+MLFLOW_LOCK_MODEL_DEPENDENCIES = _BooleanEnvironmentVariable(
+    "MLFLOW_LOCK_MODEL_DEPENDENCIES", False
+)
+
+#: If specified, tracking server rejects model `/mlflow/model-versions/create` requests with
+#: a source that does not match the specified regular expression.
+#: (default: ``None``).
+MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX = _EnvironmentVariable(
+    "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX", str, None
+)
mlflow/server/handlers.py · +13 −1 · modified
@@ -38,7 +38,10 @@
 from mlflow.entities.multipart_upload import MultipartUploadPart
 from mlflow.entities.trace_info_v2 import TraceInfoV2
 from mlflow.entities.trace_status import TraceStatus
-from mlflow.environment_variables import MLFLOW_DEPLOYMENTS_TARGET
+from mlflow.environment_variables import (
+    MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX,
+    MLFLOW_DEPLOYMENTS_TARGET,
+)
 from mlflow.exceptions import MlflowException, _UnsupportedMultipartUploadException
 from mlflow.models import Model
 from mlflow.protos import databricks_pb2
@@ -1974,6 +1977,15 @@ def _create_model_version():
         },
     )
 
+    if request_message.source and (
+        regex := MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX.get()
+    ):
+        if not re.search(regex, request_message.source):
+            raise MlflowException(
+                f"Invalid model version source: '{request_message.source}'.",
+                error_code=INVALID_PARAMETER_VALUE,
+            )
+
     # If the model version is a prompt, we don't validate the source
     if not _is_prompt_request(request_message):
         if request_message.model_id:
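The check added to `_create_model_version` can be reproduced in isolation to see how it behaves with different patterns. The sketch below is a stand-in, not the handler itself: `validate_source` is a hypothetical function, `ValueError` replaces `MlflowException`, and the environment is passed in as a plain mapping so the example does not depend on process state. The `sneaky` source string is an invented input.

```python
import os
import re

ENV_VAR = "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX"


def validate_source(source: str, environ=None) -> None:
    """Raise ValueError when a configured pattern does not match source.

    Stand-in for the handler logic; ValueError replaces MlflowException.
    """
    env = os.environ if environ is None else environ
    regex = env.get(ENV_VAR)
    # Mirrors the diff: skip the check for an empty source or an unset pattern.
    if source and regex:
        if not re.search(regex, source):
            raise ValueError(f"Invalid model version source: '{source}'.")


anchored = {ENV_VAR: r"^mlflow-artifacts:/.*$"}
unanchored = {ENV_VAR: r"mlflow-artifacts:/"}
sneaky = "file:///tmp/mlflow-artifacts:/model"

# re.search matches anywhere in the string, so this raises nothing.
validate_source(sneaky, unanchored)
try:
    validate_source(sneaky, anchored)
except ValueError as exc:
    print(exc)
```

Because `re.search()` succeeds on a match at any position, a pattern without a leading `^` accepts sources that merely contain the expected prefix. Operators configuring this variable would likely want anchored patterns such as the ones in the documentation examples.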
tests/tracking/test_rest_tracking.py · +53 −0 · modified
@@ -9,10 +9,12 @@
 import os
 import pathlib
 import posixpath
+import subprocess
 import sys
 import time
 import urllib.parse
 from io import StringIO
+from pathlib import Path
 from unittest import mock
 
 import flask
@@ -61,6 +63,7 @@
 from mlflow.utils.proto_json_utils import message_to_json
 from mlflow.utils.time import get_current_time_millis
 
+from tests.helper_functions import get_safe_port
 from tests.integration.utils import invoke_cli_runner
 from tests.tracking.integration_test_utils import (
     _init_server,
@@ -1640,6 +1643,56 @@ def test_create_model_version_with_file_uri(mlflow_client):
     assert "is not a valid remote uri" in response.json()["message"]
 
 
+def test_create_model_version_with_validation_regex(tmp_path: Path):
+    port = get_safe_port()
+    with subprocess.Popen(
+        [
+            sys.executable,
+            "-m",
+            "mlflow",
+            "server",
+            "--port",
+            str(port),
+            "--backend-store-uri",
+            f"sqlite:///{tmp_path / 'mlflow.db'}",
+        ],
+        env=(
+            os.environ.copy()
+            | {
+                "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX": r"^mlflow-artifacts:/.*$",
+            }
+        ),
+    ) as proc:
+        try:
+            # Wait for the server to start
+            for _ in range(10):
+                try:
+                    if requests.get(f"http://localhost:{port}/health").ok:
+                        break
+                except requests.ConnectionError:
+                    time.sleep(1)
+            else:
+                raise RuntimeError("Failed to connect to the MLflow server")
+
+            # Test that the validation regex works as expected
+            client = MlflowClient(f"http://localhost:{port}")
+            name = "test"
+            client.create_registered_model(name)
+            # Invalid source
+            with pytest.raises(MlflowException, match="Invalid model version source"):
+                client.create_model_version(name, source="s3://path/to/model")
+            # Valid source
+            experiment_id = client.create_experiment("test")
+            run = client.create_run(experiment_id=experiment_id)
+            assert run.info.artifact_uri.startswith("mlflow-artifacts:/")
+            client.create_model_version(
+                name, source=f"{run.info.artifact_uri}/model", run_id=run.info.run_id
+            )
+        finally:
+            proc.terminate()
+            proc.wait()
+
+
 @pytest.mark.xfail(reason="Tracking server does not support logged-model endpoints yet")
 def test_logging_model_with_local_artifact_uri(mlflow_client):
     from sklearn.linear_model import LogisticRegression
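The test above waits for the server with a `for`/`else` loop: the `else` branch runs only when the loop finishes without hitting `break`. A minimal, dependency-free sketch of that polling pattern follows. `wait_until_ready` is a hypothetical helper, and the built-in `ConnectionError` stands in for `requests.ConnectionError`; unlike the test, this variant also sleeps after a response that is not yet ready.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool], attempts: int = 10, delay: float = 1.0
) -> None:
    """Poll check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        try:
            if check():
                break  # ready: leave the loop and skip the else branch
        except ConnectionError:
            pass  # server not accepting connections yet
        time.sleep(delay)
    else:
        # Reached only when every attempt finished without a break.
        raise RuntimeError("Failed to connect to the server")


state = {"calls": 0}


def ready_on_third_call() -> bool:
    """Fake health check that refuses twice, then succeeds."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("connection refused")
    return True


wait_until_ready(ready_on_third_call, attempts=5, delay=0)
print("ready after", state["calls"], "calls")
```

Sleeping on every unsuccessful attempt avoids a tight loop when the server accepts connections but is not healthy yet.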
5f98ff98659d Introduce `MLFLOW_CREATE_MODEL_VERSION_SOURCE_REGEX` to validate source parameter of `/model-versions/create` request (#16081)
4 files changed · +148 −1
docs/docs/tracking/server/index.mdx · +73 −0 · modified
@@ -263,6 +263,79 @@ response = requests.get("http://<mlflow-host>:<mlflow-port>/version")
 assert response.text == mlflow.__version__ # Checking for a strict version match
 ```
 
+## Model Version Source Validation
+
+The tracking server can be configured to validate model version sources using a regular expression pattern. This security feature helps ensure that only model versions from approved sources are registered in your model registry.
+
+### Configuration
+
+Set the `MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX` environment variable when starting the tracking server:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^mlflow-artifacts:/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+### Usage
+
+When this environment variable is set, the tracking server will validate the `source` parameter in model version creation requests against the specified regular expression pattern. If the source doesn't match the pattern, the request will be rejected with an error.
+
+#### Example: Restricting to MLflow Artifacts
+
+To only allow model versions from MLflow artifacts storage:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^mlflow-artifacts:/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+With this configuration:
+
+```python
+import mlflow
+from mlflow import MlflowClient
+
+client = MlflowClient("http://localhost:5000")
+
+# This will work - source matches the pattern
+client.create_model_version(
+    name="my-model",
+    source="mlflow-artifacts://1/artifacts/model",
+    run_id="abc123",
+)
+
+# This will fail - source doesn't match the pattern
+client.create_model_version(
+    name="my-model",
+    source="s3://my-bucket/model",
+    run_id="def456",
+) # Raises MlflowException: Invalid model version source
+```
+
+#### Example: Restricting to Specific S3 Buckets
+
+To only allow model versions from specific S3 buckets:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^s3://(production-models|staging-models)/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+This pattern would allow sources like:
+- `s3://production-models/model-v1/`
+- `s3://staging-models/experiment-123/model/`
+
+But reject sources like:
+- `s3://untrusted-bucket/model/`
+- `file:///local/path/model`
+
+:::note
+- If the environment variable is not set, no source validation is performed.
+- The validation only applies to the `/mlflow/model-versions/create` API endpoint.
+- The regular expression is applied using Python's `re.search()` function.
+- Use standard regular expression syntax for pattern matching.
+:::
+
 ## Handling timeout when uploading/downloading large artifacts
 
 When uploading or downloading large artifacts through the tracking server with the artifact proxy enabled, the server may take a long time to process the request. If it exceeds the timeout limit (30 seconds by default), the server will restart the worker process, resulting in a request failure on the client side.
mlflow/environment_variables.py · +7 −0 · modified
@@ -781,3 +781,10 @@ def get(self):
 MLFLOW_ASYNC_TRACE_LOGGING_RETRY_TIMEOUT = _EnvironmentVariable(
     "MLFLOW_ASYNC_TRACE_LOGGING_RETRY_TIMEOUT", int, 60
 )
+
+#: If specified, tracking server rejects model `/mlflow/model-versions/create` requests with
+#: a source that does not match the specified regular expression.
+#: (default: ``None``).
+MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX = _EnvironmentVariable(
+    "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX", str, None
+)
mlflow/server/handlers.py · +13 −1 · modified
@@ -31,7 +31,10 @@
 from mlflow.entities.multipart_upload import MultipartUploadPart
 from mlflow.entities.trace_info import TraceInfo
 from mlflow.entities.trace_status import TraceStatus
-from mlflow.environment_variables import MLFLOW_DEPLOYMENTS_TARGET
+from mlflow.environment_variables import (
+    MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX,
+    MLFLOW_DEPLOYMENTS_TARGET,
+)
 from mlflow.exceptions import MlflowException, _UnsupportedMultipartUploadException
 from mlflow.models import Model
 from mlflow.protos import databricks_pb2
@@ -1899,6 +1902,15 @@ def _create_model_version():
         },
     )
 
+    if request_message.source and (
+        regex := MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX.get()
+    ):
+        if not re.search(regex, request_message.source):
+            raise MlflowException(
+                f"Invalid model version source: '{request_message.source}'.",
+                error_code=INVALID_PARAMETER_VALUE,
+            )
+
     # If the model version is a prompt, we don't validate the source
     if not _is_prompt_request(request_message):
         _validate_source(request_message.source, request_message.run_id)
tests/tracking/test_rest_tracking.py · +55 −0 · modified
@@ -9,9 +9,12 @@
 import os
 import pathlib
 import posixpath
+import subprocess
 import sys
 import time
 import urllib.parse
+from io import StringIO
+from pathlib import Path
 from unittest import mock
 
 import flask
@@ -57,6 +60,7 @@
 from mlflow.utils.proto_json_utils import message_to_json
 from mlflow.utils.time import get_current_time_millis
 
+from tests.helper_functions import get_safe_port
 from tests.integration.utils import invoke_cli_runner
 from tests.tracking.integration_test_utils import (
     _init_server,
@@ -1575,6 +1579,57 @@ def test_create_model_version_with_file_uri(mlflow_client):
     assert "is not a valid remote uri" in response.json()["message"]
 
 
+def test_create_model_version_with_validation_regex(tmp_path: Path):
+    port = get_safe_port()
+    with subprocess.Popen(
+        [
+            sys.executable,
+            "-m",
+            "mlflow",
+            "server",
+            "--port",
+            str(port),
+            "--backend-store-uri",
+            f"sqlite:///{tmp_path / 'mlflow.db'}",
+        ],
+        env=(
+            os.environ.copy()
+            | {
+                "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX": r"^mlflow-artifacts:/.*$",
+            }
+        ),
+    ) as proc:
+        try:
+            # Wait for the server to start
+            for _ in range(10):
+                try:
+                    if requests.get(f"http://localhost:{port}/health").ok:
+                        break
+                except requests.ConnectionError:
+                    time.sleep(1)
+            else:
+                raise RuntimeError("Failed to connect to the MLflow server")
+
+            # Test that the validation regex works as expected
+            client = MlflowClient(f"http://localhost:{port}")
+            name = "test"
+            client.create_registered_model(name)
+            # Invalid source
+            with pytest.raises(MlflowException, match="Invalid model version source"):
+                client.create_model_version(name, source="s3://path/to/model")
+            # Valid source
+            experiment_id = client.create_experiment("test")
+            run = client.create_run(experiment_id=experiment_id)
+            assert run.info.artifact_uri.startswith("mlflow-artifacts:/")
+            client.create_model_version(
+                name, source=f"{run.info.artifact_uri}/model", run_id=run.info.run_id
+            )
+        finally:
+            proc.terminate()
+            proc.wait()
+
+
+@pytest.mark.xfail(reason="Tracking server does not support logged-model endpoints yet")
 def test_logging_model_with_local_artifact_uri(mlflow_client):
     from sklearn.linear_model import LogisticRegression
e7dc0574fa34 Introduce `MLFLOW_CREATE_MODEL_VERSION_SOURCE_REGEX` to validate source parameter of `/model-versions/create` request (#16081)
4 files changed · +146 −1
docs/docs/tracking/server/index.mdx · +73 −0 · modified
@@ -263,6 +263,79 @@ response = requests.get("http://<mlflow-host>:<mlflow-port>/version")
 assert response.text == mlflow.__version__ # Checking for a strict version match
 ```
 
+## Model Version Source Validation
+
+The tracking server can be configured to validate model version sources using a regular expression pattern. This security feature helps ensure that only model versions from approved sources are registered in your model registry.
+
+### Configuration
+
+Set the `MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX` environment variable when starting the tracking server:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^mlflow-artifacts:/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+### Usage
+
+When this environment variable is set, the tracking server will validate the `source` parameter in model version creation requests against the specified regular expression pattern. If the source doesn't match the pattern, the request will be rejected with an error.
+
+#### Example: Restricting to MLflow Artifacts
+
+To only allow model versions from MLflow artifacts storage:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^mlflow-artifacts:/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+With this configuration:
+
+```python
+import mlflow
+from mlflow import MlflowClient
+
+client = MlflowClient("http://localhost:5000")
+
+# This will work - source matches the pattern
+client.create_model_version(
+    name="my-model",
+    source="mlflow-artifacts://1/artifacts/model",
+    run_id="abc123",
+)
+
+# This will fail - source doesn't match the pattern
+client.create_model_version(
+    name="my-model",
+    source="s3://my-bucket/model",
+    run_id="def456",
+) # Raises MlflowException: Invalid model version source
+```
+
+#### Example: Restricting to Specific S3 Buckets
+
+To only allow model versions from specific S3 buckets:
+
+```bash
+export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^s3://(production-models|staging-models)/.*$"
+mlflow server --host 0.0.0.0 --port 5000
+```
+
+This pattern would allow sources like:
+- `s3://production-models/model-v1/`
+- `s3://staging-models/experiment-123/model/`
+
+But reject sources like:
+- `s3://untrusted-bucket/model/`
+- `file:///local/path/model`
+
+:::note
+- If the environment variable is not set, no source validation is performed.
+- The validation only applies to the `/mlflow/model-versions/create` API endpoint.
+- The regular expression is applied using Python's `re.search()` function.
+- Use standard regular expression syntax for pattern matching.
+:::
+
 ## Handling timeout when uploading/downloading large artifacts
 
 When uploading or downloading large artifacts through the tracking server with the artifact proxy enabled, the server may take a long time to process the request. If it exceeds the timeout limit (30 seconds by default), the server will restart the worker process, resulting in a request failure on the client side.
mlflow/environment_variables.py · +7 −0 · modified
@@ -835,3 +835,10 @@ def get(self):
 MLFLOW_LOCK_MODEL_DEPENDENCIES = _BooleanEnvironmentVariable(
     "MLFLOW_LOCK_MODEL_DEPENDENCIES", False
 )
+
+#: If specified, tracking server rejects model `/mlflow/model-versions/create` requests with
+#: a source that does not match the specified regular expression.
+#: (default: ``None``).
+MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX = _EnvironmentVariable(
+    "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX", str, None
+)
mlflow/server/handlers.py · +13 −1 · modified
@@ -38,7 +38,10 @@
 from mlflow.entities.multipart_upload import MultipartUploadPart
 from mlflow.entities.trace_info_v2 import TraceInfoV2
 from mlflow.entities.trace_status import TraceStatus
-from mlflow.environment_variables import MLFLOW_DEPLOYMENTS_TARGET
+from mlflow.environment_variables import (
+    MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX,
+    MLFLOW_DEPLOYMENTS_TARGET,
+)
 from mlflow.exceptions import MlflowException, _UnsupportedMultipartUploadException
 from mlflow.models import Model
 from mlflow.protos import databricks_pb2
@@ -1974,6 +1977,15 @@ def _create_model_version():
         },
     )
 
+    if request_message.source and (
+        regex := MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX.get()
+    ):
+        if not re.search(regex, request_message.source):
+            raise MlflowException(
+                f"Invalid model version source: '{request_message.source}'.",
+                error_code=INVALID_PARAMETER_VALUE,
+            )
+
     # If the model version is a prompt, we don't validate the source
     if not _is_prompt_request(request_message):
         if request_message.model_id:
tests/tracking/test_rest_tracking.py · +53 −0 · modified
@@ -9,10 +9,12 @@
 import os
 import pathlib
 import posixpath
+import subprocess
 import sys
 import time
 import urllib.parse
 from io import StringIO
+from pathlib import Path
 from unittest import mock
 
 import flask
@@ -61,6 +63,7 @@
 from mlflow.utils.proto_json_utils import message_to_json
 from mlflow.utils.time import get_current_time_millis
 
+from tests.helper_functions import get_safe_port
 from tests.integration.utils import invoke_cli_runner
 from tests.tracking.integration_test_utils import (
     _init_server,
@@ -1640,6 +1643,56 @@ def test_create_model_version_with_file_uri(mlflow_client):
     assert "is not a valid remote uri" in response.json()["message"]
 
 
+def test_create_model_version_with_validation_regex(tmp_path: Path):
+    port = get_safe_port()
+    with subprocess.Popen(
+        [
+            sys.executable,
+            "-m",
+            "mlflow",
+            "server",
+            "--port",
+            str(port),
+            "--backend-store-uri",
+            f"sqlite:///{tmp_path / 'mlflow.db'}",
+        ],
+        env=(
+            os.environ.copy()
+            | {
+                "MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX": r"^mlflow-artifacts:/.*$",
+            }
+        ),
+    ) as proc:
+        try:
+            # Wait for the server to start
+            for _ in range(10):
+                try:
+                    if requests.get(f"http://localhost:{port}/health").ok:
+                        break
+                except requests.ConnectionError:
+                    time.sleep(1)
+            else:
+                raise RuntimeError("Failed to connect to the MLflow server")
+
+            # Test that the validation regex works as expected
+            client = MlflowClient(f"http://localhost:{port}")
+            name = "test"
+            client.create_registered_model(name)
+            # Invalid source
+            with pytest.raises(MlflowException, match="Invalid model version source"):
+                client.create_model_version(name, source="s3://path/to/model")
+            # Valid source
+            experiment_id = client.create_experiment("test")
+            run = client.create_run(experiment_id=experiment_id)
+            assert run.info.artifact_uri.startswith("mlflow-artifacts:/")
+            client.create_model_version(
+                name, source=f"{run.info.artifact_uri}/model", run_id=run.info.run_id
+            )
+        finally:
+            proc.terminate()
+            proc.wait()
+
+
 @pytest.mark.xfail(reason="Tracking server does not support logged-model endpoints yet")
 def test_logging_model_with_local_artifact_uri(mlflow_client):
     from sklearn.linear_model import LogisticRegression
Vulnerability mechanics
Generated by null/stub on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/B-Step62/mlflow/commit/2e02bc7bb70df243e6eb792689d9b8eba0013161 (ghsa, vendor-advisory, WEB)
- github.com/advisories/GHSA-5cvj-7rg6-jggj (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2025-11201 (ghsa, ADVISORY)
- www.zerodayinitiative.com/advisories/ZDI-25-931/ (mitre, x_research-advisory)
- github.com/mlflow/mlflow/commit/5f98ff98659dddb188591ecf6b10a4e276a0dba7 (ghsa, WEB)
- github.com/mlflow/mlflow/commit/e7dc0574fa3459e0003cfeb68d4e4a625491f03d (ghsa, WEB)
- www.zerodayinitiative.com/advisories/ZDI-25-931 (ghsa, WEB)