VYPR
Critical severity 9.3 · NVD Advisory · Published Jun 4, 2026

MCP-for-Stata: Command injection via log_file_name parameter in Stata command wrapper

CVE-2026-47708

Description

Summary

The log_file_name parameter in the stata_do API and CLI is interpolated directly into a Stata command string without sanitization. The security guard (GuardValidator) scans only the do-file content and does not validate this parameter. An attacker can inject arbitrary Stata commands (including shell, python, erase, etc.) by crafting a malicious log_file_name containing quotes, newlines, or Stata command separators.

Details

In src/stata_mcp/stata/stata_do/do.py, both _execute_unix_like and _execute_windows construct a Stata command string using Python f-strings:

commands = f"""
capture log close
{self.generate_log_command(log_file, is_replace)}
...
do "{dofile_path}"
...
"""

The generate_log_command method returns:

log_cmd = f'log using "{log_file.as_posix()}", {replace_clause} {log_type} name({log_type}_log)'

Where log_file is constructed from user-supplied log_name:

def generate_log_file(self, log_name: str, extension='log'):
    return self.log_file_path / f"{log_name}.{extension}"

The log_name parameter comes directly from user input (via MCP tool stata_do or CLI stata-mcp tool do) without any validation. Since the path is embedded inside double quotes in a Stata command string, an attacker can break out of the string context and inject arbitrary commands.
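The string construction described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: the two helpers are simplified stand-ins for generate_log_file and generate_log_command, and the log directory, the "append" fallback, and the payload are hypothetical values chosen for illustration. It shows only that a name containing a double quote and newlines turns the single log command into several lines of text.

```python
from pathlib import PurePosixPath

LOG_DIR = PurePosixPath("/logs")  # hypothetical log directory


def generate_log_file(log_name: str, extension: str = "log") -> PurePosixPath:
    # Simplified stand-in for the vulnerable helper: no validation of log_name.
    return LOG_DIR / f"{log_name}.{extension}"


def generate_log_command(log_file: PurePosixPath, is_replace: bool = True,
                         log_type: str = "text") -> str:
    # "append" as the non-replace clause is an assumption made for this sketch.
    replace_clause = "replace" if is_replace else "append"
    return f'log using "{log_file.as_posix()}", {replace_clause} {log_type} name({log_type}_log)'


# Hypothetical payload: closes the quoted path, starts a new line with an
# attacker-chosen command, then reopens a quoted string to absorb the suffix.
payload = 'x"\nshell echo pwned\nlog using "y'

command = generate_log_command(generate_log_file(payload))
lines = command.splitlines()
print(command)
```

Because the wrapper writes this text into the generated command block verbatim, every extra line would reach Stata as a separate command, outside anything the do-file guard inspects.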

Additionally, generate_log_file does not prevent path traversal via log_name, allowing arbitrary file write outside the intended log directory.
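The traversal issue can be illustrated the same way. This is a sketch under assumptions: the log directory and the log name are hypothetical, and posixpath.normpath performs a purely lexical normalization (it does not follow symlinks), which is enough to show where the joined path points.

```python
import posixpath
from pathlib import PurePosixPath

log_dir = PurePosixPath("/home/user/stata_logs")  # hypothetical log directory
log_name = "../../../tmp/evil"                    # hypothetical attacker-supplied name

# Same join as generate_log_file: the name is appended without checks.
log_file = log_dir / f"{log_name}.log"

# Collapse the ".." segments lexically to see the effective target.
normalized = PurePosixPath(posixpath.normpath(log_file.as_posix()))
stays_inside = log_dir in normalized.parents

print(normalized)
print(stays_inside)
```

Note that the extension is still appended by the wrapper, so in this sketch the attacker controls the directory and base name of the written file but not its suffix.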

Proof of Concept

When calling stata_do via MCP tool with:

{
  "dofile_path": "test.do",
  "log_file_name": "'; shell echo pwned > /tmp/pwned.txt; '"
}

The generated Stata commands become:

log using "<log_dir>/'; shell echo pwned > /tmp/pwned.txt; '.log", replace text name(text_log)

Stata interprets this as multiple commands, with shell echo pwned > /tmp/pwned.txt; executed as an arbitrary shell command.

Impact

  • Remote Code Execution via shell command injection
  • Arbitrary file write/overwrite via path traversal in log_name
  • Complete bypass of the security guard, as the guard only validates do-file content, not wrapper parameters

Remediation / Fix

  1. Apply strict allowlist validation to log_name (only alphanumeric, underscore, dot, hyphen; max 128 chars)
  2. Resolve and verify the constructed log path remains within the intended log directory
  3. Consider generating safe internal filenames (e.g., UUIDs) instead of accepting user-defined log names for command construction
  4. Apply similar sanitization to dofile_path before embedding it into Stata command strings
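Items 1 to 3 above could be combined roughly as follows. This is a minimal sketch, not the code from the fix commit: safe_log_path and _SAFE_NAME are hypothetical names, and the choice to fall back to a UUID when no name is supplied is an assumption made for illustration.

```python
import re
import uuid
from pathlib import Path
from typing import Optional

# Allowlist from item 1: alphanumeric, underscore, dot, hyphen; 1-128 chars.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]{1,128}")


def safe_log_path(log_dir: Path, log_name: Optional[str] = None,
                  extension: str = "log") -> Path:
    """Return a log path guaranteed to live inside log_dir (hypothetical helper)."""
    if log_name is None:
        # Item 3: generate an internal name instead of trusting user input.
        log_name = uuid.uuid4().hex

    # Item 1: reject anything outside the allowlist, and names made only of dots.
    if not _SAFE_NAME.fullmatch(log_name) or set(log_name) == {"."}:
        raise ValueError("invalid log name")

    # Item 2: resolve the final path and verify it stays under the log directory.
    base = log_dir.resolve()
    candidate = (base / f"{log_name}.{extension}").resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("log path escapes the log directory") from None
    return candidate
```

The containment check is redundant for names that pass the allowlist, since they cannot contain a path separator, but keeping both layers means the path check still holds if the allowlist is later relaxed.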

References

  • Issue: #74
  • Fix commit: https://github.com/SepineTam/stata-mcp/commit/e6f945941ae0c7cf5e74a428e0b3dc82b396382f

Affected products

2

Patches

1
e6f945941ae0

fix: validate log_file_name to prevent Stata command injection

1 file changed · +16 -3
  • src/stata_mcp/stata/stata_do/do.py (+16 -3, modified)
    @@ -9,13 +9,16 @@
     
     import logging
     import os
    +import re
     import subprocess
     import tempfile
     from pathlib import Path
     from typing import Dict, List, Literal, Optional
     
     from ...utils import get_nowtime
     
    +LOG_FILE_NAME_PATTERN = re.compile(r"^[A-Za-z0-9_.-]{1,128}$")
    +
     
     class StataDo:
         def __init__(self,
    @@ -79,7 +82,8 @@ def execute_dofile(
             """
             nowtime = get_nowtime()
             log_name = log_file_name or nowtime
    -        log_file = self.log_file_path / f"{log_name}.log"
    +        self._validate_log_name(log_name)
    +        log_file = self.generate_log_file(log_name)
     
             if self.is_unix:
                 args = dofile_path, log_name, is_replace, enable_smcl
    @@ -129,6 +133,15 @@ def generate_log_command(log_file: Path, is_replace: bool = True, log_type: Lite
             log_cmd = f'log using "{log_file.as_posix()}", {replace_clause} {log_type} name({log_type}_log)'
             return log_cmd
     
    +    @staticmethod
    +    def _validate_log_name(log_name: str) -> None:
    +        if not LOG_FILE_NAME_PATTERN.fullmatch(log_name):
    +            raise ValueError(
    +                "Invalid log_file_name. Use 1-128 characters from A-Z, a-z, 0-9, underscore, dot, or hyphen."
    +            )
    +        if any(part in {"", ".", ".."} for part in Path(log_name).parts):
    +            raise ValueError("Invalid log_file_name. Path traversal is not allowed.")
    +
         def generate_log_file(self, log_name: str, extension: Literal['smcl', 'log'] = 'log'):
             return self.log_file_path / f"{log_name}.{extension}"
     
    @@ -225,7 +238,7 @@ def _execute_windows(self, dofile_path: Path, log_file: Path, is_replace: bool =
                 cmd = [self.STATA_CLI, "/e", "do", batch_file.as_posix()]
                 result = subprocess.run(
                     cmd,
    -                shell=True,
    +                shell=True,  # Windows Stata requires shell execution for proper path resolution and startup
                     capture_output=True,
                     text=True,
                     cwd=self.cwd
    @@ -360,7 +373,7 @@ def _execute_windows_with_monitors(self, dofile_path: Path, log_file: Path, is_r
                 cmd = [self.STATA_CLI, "/e", "do", str(batch_file)]
                 proc = subprocess.Popen(
                     cmd,
    -                shell=True,
    +                shell=True,  # Windows Stata requires shell execution for proper path resolution and startup
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     text=True,
    

Vulnerability mechanics

Root cause

"The `log_file_name` parameter is directly interpolated into Stata command strings without proper sanitization or validation."

Attack vector

An attacker can exploit this vulnerability by providing a specially crafted `log_file_name` when calling the `stata_do` API or CLI. This malicious input can contain quotes, newlines, or Stata command separators, allowing the attacker to break out of the intended string context. The injected commands, such as `shell`, `python`, or `erase`, can then be executed by Stata, leading to remote code execution or arbitrary file writes. This bypasses the security guard, which only inspects do-file content, not wrapper parameters [ref_id=1].

Affected code

The vulnerability resides in the `stata_do` functionality within `src/stata_mcp/stata/stata_do/do.py`. Specifically, the `_execute_unix_like` and `_execute_windows` methods construct Stata command strings by interpolating the user-supplied `log_file_name` directly. The `generate_log_file` method is also implicated as it constructs the log file path from this unvalidated input [ref_id=1].

What the fix does

The patch introduces strict allowlist validation for the `log_file_name` parameter, permitting only alphanumeric characters, underscores, dots, and hyphens within a specified length limit [patch_id=4835175]. It also adds checks to reject path traversal attempts. This prevents malicious input from being interpolated into Stata commands, thereby closing the command injection and arbitrary file write vulnerabilities.
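To see how the patched check treats specific inputs, the validation logic from the diff can be exercised on its own. The sketch below copies the pattern and the two checks from the patch into a standalone function; validate_log_name, is_accepted, and the sample inputs are illustrative names and values, not part of the project.

```python
import re
from pathlib import Path

# Pattern taken from the fix commit.
LOG_FILE_NAME_PATTERN = re.compile(r"^[A-Za-z0-9_.-]{1,128}$")


def validate_log_name(log_name: str) -> None:
    # Standalone copy of the two checks in the patched _validate_log_name.
    if not LOG_FILE_NAME_PATTERN.fullmatch(log_name):
        raise ValueError("Invalid log_file_name.")
    if any(part in {"", ".", ".."} for part in Path(log_name).parts):
        raise ValueError("Invalid log_file_name. Path traversal is not allowed.")


def is_accepted(log_name: str) -> bool:
    # Illustrative helper: turn the exception into a boolean.
    try:
        validate_log_name(log_name)
    except ValueError:
        return False
    return True


samples = [
    "results_2026-06",                             # ordinary name
    "'; shell echo pwned > /tmp/pwned.txt; '",     # payload from the advisory
    "../../tmp/evil",                              # traversal with separators
    "..",                                          # passes the regex, fails the parts check
    "run1\n",                                      # trailing newline
    "a" * 129,                                     # over the length limit
]
results = {name: is_accepted(name) for name in samples}
for name, ok in results.items():
    print(repr(name), "accepted" if ok else "rejected")
```

One detail worth noting: because the patch calls fullmatch rather than match, a name with a trailing newline is rejected even though `$` on its own would match just before a final newline.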

Preconditions

  • Input: The attacker must control the `log_file_name` parameter passed to the `stata_do` API or CLI.

Reproduction

When calling `stata_do` via MCP tool with:

```json
{
  "dofile_path": "test.do",
  "log_file_name": "'; shell echo pwned > /tmp/pwned.txt; '"
}
```

The generated Stata commands become:

```stata
log using "<log_dir>/'; shell echo pwned > /tmp/pwned.txt; '.log", replace text name(text_log)
```

Stata interprets this as multiple commands, with `shell echo pwned > /tmp/pwned.txt;` executed as an arbitrary shell command [ref_id=1].

Generated on Jun 4, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
