LangChain serialization injection vulnerability enables secret extraction
Description
LangChain is a framework for building LLM-powered applications. Prior to @langchain/core versions 0.3.80 and 1.1.8, and prior to langchain versions 0.3.37 and 1.2.3, a serialization injection vulnerability exists in LangChain JS's toJSON() method (and subsequently when stringifying objects using JSON.stringify()). The method did not escape objects with 'lc' keys when serializing free-form data in kwargs. The 'lc' key is used internally by LangChain to mark serialized objects. When user-controlled data contains this key structure, it is treated as a legitimate LangChain object during deserialization rather than as plain user data. This issue has been patched in @langchain/core versions 0.3.80 and 1.1.8, and langchain versions 0.3.37 and 1.2.3.
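The flow described above can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not LangChain's actual code: `naiveToJSON`, `naiveRevive`, and `fakeEnv` are hypothetical names, and `fakeEnv` stands in for the process environment so the example runs anywhere.

```typescript
// Simplified model of the pre-patch behaviour; NOT the actual LangChain source.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical stand-in for process.env.
const fakeEnv: Record<string, string> = { OPENAI_API_KEY: "sk-demo-secret" };

// Pre-patch style serialization: free-form kwargs are copied verbatim,
// so an object with an 'lc' key is not marked as user data.
function naiveToJSON(additionalKwargs: { [key: string]: Json }): Json {
  return {
    lc: 1,
    type: "constructor",
    id: ["langchain_core", "messages", "human", "HumanMessage"],
    kwargs: { content: "Hello", additional_kwargs: additionalKwargs },
  };
}

// Pre-patch style revival: any {lc: 1, type: "secret"} object is resolved
// from the environment, wherever it appears in the tree.
function naiveRevive(value: Json): Json {
  if (value === null || typeof value !== "object") return value;
  if (Array.isArray(value)) return value.map(naiveRevive);
  const id = value.id;
  if (value.lc === 1 && value.type === "secret" && Array.isArray(id)) {
    const key = String(id[0]);
    return fakeEnv[key] ?? null;
  }
  const out: { [key: string]: Json } = {};
  for (const [k, v] of Object.entries(value)) out[k] = naiveRevive(v);
  return out;
}

// Attacker-supplied data that mimics a serialized secret reference.
const userData: Json = { lc: 1, type: "secret", id: ["OPENAI_API_KEY"] };
const wire = JSON.stringify(naiveToJSON({ data: userData }));
const revived = naiveRevive(JSON.parse(wire) as Json);

// True in this model: the secret value replaced the user-supplied object.
const leaked = JSON.stringify(revived).includes("sk-demo-secret");
```

In this model the user-supplied object comes back as the secret's value, which is the confusion between user data and LangChain objects that the advisory describes.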
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| @langchain/core (npm) | >= 1.0.0, < 1.1.8 | 1.1.8 |
| @langchain/core (npm) | < 0.3.80 | 0.3.80 |
| langchain (npm) | >= 1.0.0, < 1.2.3 | 1.2.3 |
| langchain (npm) | < 0.3.37 | 0.3.37 |
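The ranges in the table can be checked mechanically. The following is a minimal sketch: `isAffected`, `parseVersion`, and `PATCHED` are illustrative names, the thresholds are copied from the table, and only plain `major.minor.patch` versions are handled (pre-release tags are rejected rather than interpreted).

```typescript
// Patched thresholds copied from the table above; helper names are illustrative.
type Version = [number, number, number];

const PATCHED: Record<string, { legacy: Version; current: Version }> = {
  "@langchain/core": { legacy: [0, 3, 80], current: [1, 1, 8] },
  langchain: { legacy: [0, 3, 37], current: [1, 2, 3] },
};

// Accepts only "major.minor.patch"; anything else is rejected.
function parseVersion(text: string): Version {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (m === null) throw new Error(`Unsupported version format: ${text}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Negative if a < b, zero if equal, positive if a > b.
function compare(a: Version, b: Version): number {
  if (a[0] !== b[0]) return a[0] - b[0];
  if (a[1] !== b[1]) return a[1] - b[1];
  return a[2] - b[2];
}

function isAffected(pkg: string, installed: string): boolean {
  const thresholds = PATCHED[pkg];
  if (thresholds === undefined) return false; // package not listed in the table
  const v = parseVersion(installed);
  // 0.x releases are compared with the 0.3.x fix, 1.x releases with the 1.x fix.
  const patched = v[0] >= 1 ? thresholds.current : thresholds.legacy;
  return compare(v, patched) < 0;
}
```

For real projects a maintained semver library and the package manager's audit tooling are the safer choice; the sketch only restates the table as code.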
Affected products
1- Range: @langchain/anthropic==1.0.0, @langchain/anthropic@1.1.0, @langchain/anthropic@1.1.1, …
Patches
1e5063f9c6e99 fix!(core/langchain): hardening for `load` (#9707)
6 files changed · +1023 −115
.changeset/lucky-roses-reply.md +6 −0 added
@@ -0,0 +1,6 @@
+---
+"@langchain/core": patch
+"langchain": patch
+---
+
+add security hardening for `load`
libs/langchain-core/src/load/index.ts+290 −95 modified@@ -1,3 +1,39 @@ +/** + * Load LangChain objects from JSON strings or objects. + * + * ## How it works + * + * Each `Serializable` LangChain object has a unique identifier (its "class path"), + * which is a list of strings representing the module path and class name. For example: + * + * - `AIMessage` -> `["langchain_core", "messages", "ai", "AIMessage"]` + * - `ChatPromptTemplate` -> `["langchain_core", "prompts", "chat", "ChatPromptTemplate"]` + * + * When deserializing, the class path is validated against supported namespaces. + * + * ## Security model + * + * The `secretsFromEnv` parameter controls whether secrets can be loaded from environment + * variables: + * + * - `false` (default): Secrets must be provided in `secretsMap`. If a secret is not + * found, `null` is returned instead of loading from environment variables. + * - `true`: If a secret is not found in `secretsMap`, it will be loaded from + * environment variables. Use this only in trusted environments. + * + * ### Injection protection (escape-based) + * + * During serialization, plain objects that contain an `'lc'` key are escaped by wrapping + * them: `{"__lc_escaped__": {...}}`. During deserialization, escaped objects are unwrapped + * and returned as plain objects, NOT instantiated as LC objects. + * + * This is an allowlist approach: only objects explicitly produced by + * `Serializable.toJSON()` (which are NOT escaped) are treated as LC objects; + * everything else is user data. 
+ * + * @module + */ + import { Serializable, SerializedConstructor, @@ -10,6 +46,107 @@ import * as coreImportMap from "./import_map.js"; import type { OptionalImportMap, SecretMap } from "./import_type.js"; import { type SerializedFields, keyFromJson, mapKeys } from "./map_keys.js"; import { getEnvironmentVariable } from "../utils/env.js"; +import { isEscapedObject, unescapeValue } from "./validation.js"; + +/** + * Options for loading serialized LangChain objects. + * + * @remarks + * **Security considerations:** + * + * Deserialization can instantiate arbitrary classes from the allowed namespaces. + * When loading untrusted data, be aware that: + * + * 1. **`secretsFromEnv`**: Defaults to `false`. Setting to `true` allows the + * deserializer to read environment variables, which could leak secrets if + * the serialized data contains malicious secret references. + * + * 2. **`importMap` / `optionalImportsMap`**: These allow extending which classes + * can be instantiated. Never populate these from user input. Only include + * modules you explicitly trust. + * + * 3. **Class instantiation**: Allowed classes will have their constructors called + * with the deserialized kwargs. If a class performs side effects in its + * constructor (network calls, file I/O, etc.), those will execute. + */ +export interface LoadOptions { + /** + * A map of secrets to load. Keys are secret identifiers, values are the secret values. + * + * If a secret is not found in this map and `secretsFromEnv` is `false`, an error is + * thrown. If `secretsFromEnv` is `true`, the secret will be loaded from environment + * variables (if not found there either, an error is thrown). + */ + secretsMap?: SecretMap; + + /** + * Whether to load secrets from environment variables when not found in `secretsMap`. 
+ * + * @default false + * + * @remarks + * **Security warning:** Setting this to `true` allows the deserializer to read + * environment variables, which could be a security risk if the serialized data + * is not trusted. Only set this to `true` when deserializing data from trusted + * sources (e.g., your own database, not user input). + */ + secretsFromEnv?: boolean; + + /** + * A map of optional imports. Keys are namespace paths (e.g., "langchain_community/llms"), + * values are the imported modules. + * + * @remarks + * **Security warning:** This extends which classes can be instantiated during + * deserialization. Never populate this map with values derived from user input. + * Only include modules that you explicitly trust and have reviewed. + * + * Classes in these modules can be instantiated with attacker-controlled kwargs + * if the serialized data is untrusted. + */ + optionalImportsMap?: OptionalImportMap; + + /** + * Additional optional import entrypoints to allow beyond the defaults. + * + * @remarks + * **Security warning:** This extends which namespace paths are considered valid + * for deserialization. Never populate this array with values derived from user + * input. Each entrypoint you add expands the attack surface for deserialization. + */ + optionalImportEntrypoints?: string[]; + + /** + * Additional import map for the "langchain" namespace. + * + * @remarks + * **Security warning:** This extends which classes can be instantiated during + * deserialization. Never populate this map with values derived from user input. + * Only include modules that you explicitly trust and have reviewed. + * + * Any class exposed through this map can be instantiated with attacker-controlled + * kwargs if the serialized data is untrusted. + */ + importMap?: Record<string, unknown>; + + /** + * Maximum recursion depth allowed during deserialization. 
+ * + * @default 50 + * + * @remarks + * This limit protects against denial-of-service attacks using deeply nested + * JSON structures that could cause stack overflow. If your legitimate data + * requires deeper nesting, you can increase this limit. + */ + maxDepth?: number; +} + +/** + * Default maximum recursion depth for deserialization. + * This provides protection against DoS attacks via deeply nested structures. + */ +const DEFAULT_MAX_DEPTH = 50; function combineAliasesAndInvert(constructor: typeof Serializable) { const aliases: { [key: string]: string } = {}; @@ -26,74 +163,114 @@ function combineAliasesAndInvert(constructor: typeof Serializable) { }, {} as Record<string, string>); } -async function reviver( - this: { - optionalImportsMap?: OptionalImportMap; - optionalImportEntrypoints?: string[]; - secretsMap?: SecretMap; - importMap?: Record<string, unknown>; - path?: string[]; - }, - value: unknown -): Promise<unknown> { +interface ReviverContext { + optionalImportsMap: OptionalImportMap; + optionalImportEntrypoints: string[]; + secretsMap: SecretMap; + secretsFromEnv: boolean; + importMap: Record<string, unknown>; + path: string[]; + depth: number; + maxDepth: number; +} + +/** + * Recursively revive a value, handling escape markers and LC objects. + * + * This function handles: + * 1. Escaped dicts - unwrapped and returned as plain objects + * 2. LC secret objects - resolved from secretsMap or env + * 3. LC constructor objects - instantiated + * 4. 
Regular objects/arrays - recursed into + */ +async function reviver(this: ReviverContext, value: unknown): Promise<unknown> { const { - optionalImportsMap = {}, - optionalImportEntrypoints = [], - importMap = {}, - secretsMap = {}, - path = ["$"], + optionalImportsMap, + optionalImportEntrypoints, + importMap, + secretsMap, + secretsFromEnv, + path, + depth, + maxDepth, } = this; const pathStr = path.join("."); + + // Check recursion depth to prevent DoS via deeply nested structures + if (depth > maxDepth) { + throw new Error( + `Maximum recursion depth (${maxDepth}) exceeded during deserialization. ` + + `This may indicate a malicious payload or you may need to increase maxDepth.` + ); + } + + // If not an object, return as-is + if (typeof value !== "object" || value == null) { + return value; + } + + // Handle arrays - recurse into elements + if (Array.isArray(value)) { + return Promise.all( + value.map((v, i) => + reviver.call({ ...this, path: [...path, `${i}`], depth: depth + 1 }, v) + ) + ); + } + + // It's an object - check for escape marker FIRST + const record = value as Record<string, unknown>; + if (isEscapedObject(record)) { + // This is an escaped user object - unwrap and return as-is (no LC processing) + return unescapeValue(record); + } + + // Check for LC secret object if ( - typeof value === "object" && - value !== null && - !Array.isArray(value) && - "lc" in value && - "type" in value && - "id" in value && - value.lc === 1 && - value.type === "secret" + "lc" in record && + "type" in record && + "id" in record && + record.lc === 1 && + record.type === "secret" ) { - const serialized = value as SerializedSecret; + const serialized = record as unknown as SerializedSecret; const [key] = serialized.id; if (key in secretsMap) { return secretsMap[key as keyof SecretMap]; - } else { + } else if (secretsFromEnv) { const secretValueInEnv = getEnvironmentVariable(key); if (secretValueInEnv) { return secretValueInEnv; - } else { - throw new Error( - `Missing 
key "${key}" for ${pathStr} in load(secretsMap={})` - ); } } - } else if ( - typeof value === "object" && - value !== null && - !Array.isArray(value) && - "lc" in value && - "type" in value && - "id" in value && - value.lc === 1 && - value.type === "not_implemented" + throw new Error(`Missing secret "${key}" at ${pathStr}`); + } + + // Check for LC not_implemented object + if ( + "lc" in record && + "type" in record && + "id" in record && + record.lc === 1 && + record.type === "not_implemented" ) { - const serialized = value as SerializedNotImplemented; + const serialized = record as unknown as SerializedNotImplemented; const str = JSON.stringify(serialized); throw new Error( `Trying to load an object that doesn't implement serialization: ${pathStr} -> ${str}` ); - } else if ( - typeof value === "object" && - value !== null && - !Array.isArray(value) && - "lc" in value && - "type" in value && - "id" in value && - "kwargs" in value && - value.lc === 1 + } + + // Check for LC constructor object + if ( + "lc" in record && + "type" in record && + "id" in record && + "kwargs" in record && + record.lc === 1 && + record.type === "constructor" ) { - const serialized = value as SerializedConstructor; + const serialized = record as unknown as SerializedConstructor; const str = JSON.stringify(serialized); const [name, ...namespaceReverse] = serialized.id.slice().reverse(); const namespace = namespaceReverse.reverse(); @@ -186,60 +363,78 @@ async function reviver( // Recurse on the arguments, which may be serialized objects themselves const kwargs = await reviver.call( - { ...this, path: [...path, "kwargs"] }, + { ...this, path: [...path, "kwargs"], depth: depth + 1 }, serialized.kwargs ); // Construct the object - if (serialized.type === "constructor") { - // eslint-disable-next-line @typescript-eslint/no-explicit-any - const instance = new (builder as any)( - mapKeys( - kwargs as SerializedFields, - keyFromJson, - combineAliasesAndInvert(builder) - ) - ); + // 
eslint-disable-next-line @typescript-eslint/no-explicit-any + const instance = new (builder as any)( + mapKeys( + kwargs as SerializedFields, + keyFromJson, + combineAliasesAndInvert(builder) + ) + ); - // Minification in severless/edge runtimes will mange the - // name of classes presented in traces. As the names in import map - // are present as-is even with minification, use these names instead - Object.defineProperty(instance.constructor, "name", { value: name }); + // Minification in severless/edge runtimes will mange the + // name of classes presented in traces. As the names in import map + // are present as-is even with minification, use these names instead + Object.defineProperty(instance.constructor, "name", { value: name }); - return instance; - } else { - throw new Error(`Invalid type: ${pathStr} -> ${str}`); - } - } else if (typeof value === "object" && value !== null) { - if (Array.isArray(value)) { - return Promise.all( - value.map((v, i) => - reviver.call({ ...this, path: [...path, `${i}`] }, v) - ) - ); - } else { - return Object.fromEntries( - await Promise.all( - Object.entries(value).map(async ([key, value]) => [ - key, - await reviver.call({ ...this, path: [...path, key] }, value), - ]) - ) - ); - } + return instance; } - return value; -} -export async function load<T>( - text: string, - mappings?: { - secretsMap?: SecretMap; - optionalImportsMap?: OptionalImportMap; - optionalImportEntrypoints?: string[]; - importMap?: Record<string, unknown>; + // Regular object - recurse into values + const result: Record<string, unknown> = {}; + for (const [key, val] of Object.entries(record)) { + result[key] = await reviver.call( + { ...this, path: [...path, key], depth: depth + 1 }, + val + ); } -): Promise<T> { + return result; +} + +/** + * Load a LangChain object from a JSON string. + * + * @param text - The JSON string to parse and load. + * @param options - Options for loading. + * @returns The loaded LangChain object. 
+ * + * @example + * ```typescript + * import { load } from "@langchain/core/load"; + * import { AIMessage } from "@langchain/core/messages"; + * + * // Basic usage - secrets must be provided explicitly + * const msg = await load<AIMessage>(jsonString); + * + * // With secrets from a map + * const msg = await load<AIMessage>(jsonString, { + * secretsMap: { OPENAI_API_KEY: "sk-..." } + * }); + * + * // Allow loading secrets from environment (use with caution) + * const msg = await load<AIMessage>(jsonString, { + * secretsFromEnv: true + * }); + * ``` + */ +export async function load<T>(text: string, options?: LoadOptions): Promise<T> { const json = JSON.parse(text); - return reviver.call({ ...mappings }, json) as Promise<T>; + + const context: ReviverContext = { + optionalImportsMap: options?.optionalImportsMap ?? {}, + optionalImportEntrypoints: options?.optionalImportEntrypoints ?? [], + secretsMap: options?.secretsMap ?? {}, + secretsFromEnv: options?.secretsFromEnv ?? false, + importMap: options?.importMap ?? {}, + path: ["$"], + depth: 0, + maxDepth: options?.maxDepth ?? DEFAULT_MAX_DEPTH, + }; + + return reviver.call(context, json) as Promise<T>; }
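The secret-resolution order and the depth limit introduced in the diff above can be modelled in a few lines. This is a simplified, synchronous sketch, not the patched `reviver` itself: it leaves out constructor instantiation and escape unwrapping, and `reviveSketch`, `SketchOptions`, and the `env` option (a stand-in for the process environment) are hypothetical.

```typescript
// Simplified, synchronous model of two checks from the diff above:
// the recursion depth limit and the secret lookup order.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface SketchOptions {
  secretsMap?: Record<string, string>;
  secretsFromEnv?: boolean;
  env?: Record<string, string>; // hypothetical stand-in for process.env
  maxDepth?: number;
}

function reviveSketch(
  value: Json,
  options: SketchOptions = {},
  depth = 0,
  path = "$"
): Json {
  const { secretsMap = {}, secretsFromEnv = false, env = {}, maxDepth = 50 } = options;

  // Reject overly deep structures before doing any other work.
  if (depth > maxDepth) {
    throw new Error(`Maximum recursion depth (${maxDepth}) exceeded at ${path}`);
  }
  if (value === null || typeof value !== "object") return value;
  if (Array.isArray(value)) {
    return value.map((v, i) => reviveSketch(v, options, depth + 1, `${path}.${i}`));
  }

  const id = value.id;
  if (value.lc === 1 && value.type === "secret" && Array.isArray(id)) {
    const key = String(id[0]);
    // 1. An explicitly provided secret always wins.
    const fromMap = secretsMap[key];
    if (fromMap !== undefined) return fromMap;
    // 2. The environment is consulted only when the caller opted in.
    if (secretsFromEnv) {
      const fromEnv = env[key];
      if (fromEnv) return fromEnv;
    }
    // 3. Otherwise fail closed.
    throw new Error(`Missing secret "${key}" at ${path}`);
  }

  const out: { [key: string]: Json } = {};
  for (const [k, v] of Object.entries(value)) {
    out[k] = reviveSketch(v, options, depth + 1, `${path}.${k}`);
  }
  return out;
}

const secretRef: Json = { lc: 1, type: "secret", id: ["API_KEY"] };
const env = { API_KEY: "from-env" };

// Opt-in environment lookup resolves to "from-env".
const viaEnv = reviveSketch(secretRef, { secretsFromEnv: true, env });
// An explicit map entry takes precedence and resolves to "from-map".
const viaMap = reviveSketch(secretRef, {
  secretsFromEnv: true,
  env,
  secretsMap: { API_KEY: "from-map" },
});
```

With the default options the same `secretRef` throws a "Missing secret" error, which matches the fail-closed behaviour exercised by the tests in the patch.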
libs/langchain-core/src/load/serializable.ts+30 −5 modified@@ -1,4 +1,5 @@ import { type SerializedFields, keyToJson, mapKeys } from "./map_keys.js"; +import { escapeIfNeeded } from "./validation.js"; export interface BaseSerialized<T extends string> { lc: number; @@ -75,6 +76,21 @@ export function get_lc_unique_name( } } +/** + * Interface for objects that can be serialized. + * This is a duck-typed interface to avoid circular imports. + */ +export interface SerializableLike { + lc_serializable: boolean; + lc_secrets?: Record<string, string>; + toJSON(): { + lc: number; + type: string; + id: string[]; + kwargs?: Record<string, unknown>; + }; +} + export interface SerializableInterface { get lc_id(): string[]; } @@ -220,15 +236,24 @@ export abstract class Serializable implements SerializableInterface { } }); + const escapedKwargs: SerializedFields = {}; + for (const [key, value] of Object.entries(kwargs)) { + escapedKwargs[key] = escapeIfNeeded(value); + } + + // Now add secret markers - these are added AFTER escaping so they won't be escaped + const kwargsWithSecrets = Object.keys(secrets).length + ? replaceSecrets(escapedKwargs, secrets) + : escapedKwargs; + + // Finally transform keys to JSON format + const processedKwargs = mapKeys(kwargsWithSecrets, keyToJson, aliases); + return { lc: 1, type: "constructor", id: this.lc_id, - kwargs: mapKeys( - Object.keys(secrets).length ? replaceSecrets(kwargs, secrets) : kwargs, - keyToJson, - aliases - ), + kwargs: processedKwargs, }; }
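The escaping step that `toJSON()` now applies to kwargs can be sketched as a round trip. This is a minimal model rather than the patched `validation.ts`: `escapeSketch` and `unwrapSketch` are hypothetical names, the sketch ignores `Serializable` instances, and it assumes unwrapping simply returns the wrapped object as plain data.

```typescript
// Marker name taken from the patch; everything else here is illustrative.
const ESCAPE_KEY = "__lc_escaped__";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// An object is ambiguous if it has an 'lc' key or already looks like an escape wrapper.
function needsEscaping(obj: JsonObject): boolean {
  return "lc" in obj || (Object.keys(obj).length === 1 && ESCAPE_KEY in obj);
}

// Serialization side: wrap ambiguous objects whole, recurse into everything else.
function escapeSketch(value: Json): Json {
  if (value === null || typeof value !== "object") return value;
  if (Array.isArray(value)) return value.map(escapeSketch);
  if (needsEscaping(value)) return { [ESCAPE_KEY]: value };
  const out: JsonObject = {};
  for (const [k, v] of Object.entries(value)) out[k] = escapeSketch(v);
  return out;
}

// Deserialization side: a wrapper is unwrapped and returned as plain data,
// without interpreting its contents as a secret or constructor.
function unwrapSketch(value: Json): Json {
  if (value === null || typeof value !== "object") return value;
  if (Array.isArray(value)) return value.map(unwrapSketch);
  if (Object.keys(value).length === 1 && ESCAPE_KEY in value) {
    return value[ESCAPE_KEY] ?? null;
  }
  const out: JsonObject = {};
  for (const [k, v] of Object.entries(value)) out[k] = unwrapSketch(v);
  return out;
}

// User data that mimics a serialized secret reference.
const kwargs: Json = {
  content: "Hello",
  additional_kwargs: { data: { lc: 1, type: "secret", id: ["OPENAI_API_KEY"] } },
};

const escapedWire = JSON.stringify(escapeSketch(kwargs));
const restored = unwrapSketch(JSON.parse(escapedWire) as Json);
const roundTripOk = JSON.stringify(restored) === JSON.stringify(kwargs);
```

Wrapping the whole object before recursing, as the patch comments describe, means nested content inside an escaped object is not escaped a second time.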
libs/langchain-core/src/load/tests/index.test.ts+432 −0 added@@ -0,0 +1,432 @@ +import { describe, it, expect, vi, beforeEach, afterEach } from "vitest"; +import { load } from "../index.js"; +import { HumanMessage, AIMessage } from "../../messages/index.js"; + +const SENTINEL_ENV_VAR = "TEST_SECRET_INJECTION_VAR"; +/** Sentinel value that should NEVER appear in serialized output. */ +const SENTINEL_VALUE = "LEAKED_SECRET_MEOW_12345"; + +/** The malicious secret-like object that tries to read the env var */ +const MALICIOUS_SECRET_DICT: Record<string, unknown> = { + lc: 1, + type: "secret", + id: [SENTINEL_ENV_VAR], +}; + +/** + * Assert that serializing/deserializing payload doesn't leak the secret. + */ +async function assertNoSecretLeak(payload: unknown): Promise<void> { + // First serialize using JSON.stringify (which calls toJSON on Serializable objects) + const serialized = JSON.stringify(payload); + + // Deserialize with `secretsFromEnv: true` (the dangerous setting) + const deserialized = await load(serialized, { secretsFromEnv: true }); + + // Re-serialize to string + const reserialized = JSON.stringify(deserialized); + + expect(reserialized).not.toContain(SENTINEL_VALUE); + expect(String(deserialized)).not.toContain(SENTINEL_VALUE); +} + +describe("`load()`", () => { + describe("secret injection prevention", () => { + beforeEach(() => { + vi.stubEnv(SENTINEL_ENV_VAR, SENTINEL_VALUE); + }); + + afterEach(() => { + vi.unstubAllEnvs(); + }); + + describe("Serializable top-level objects", () => { + it("HumanMessage with secret-like object in content", async () => { + const msg = new HumanMessage({ + content: [ + { type: "text", text: "Hello" }, + { type: "text", text: JSON.stringify(MALICIOUS_SECRET_DICT) }, + ], + }); + await assertNoSecretLeak(msg); + }); + + it("HumanMessage with secret-like object in additional_kwargs", async () => { + const msg = new HumanMessage({ + content: "Hello", + additional_kwargs: { data: MALICIOUS_SECRET_DICT }, + }); + await 
assertNoSecretLeak(msg); + }); + + it("HumanMessage with secret-like object nested in additional_kwargs", async () => { + const msg = new HumanMessage({ + content: "Hello", + additional_kwargs: { nested: { deep: MALICIOUS_SECRET_DICT } }, + }); + await assertNoSecretLeak(msg); + }); + + it("HumanMessage with secret-like object in list in additional_kwargs", async () => { + const msg = new HumanMessage({ + content: "Hello", + additional_kwargs: { items: [MALICIOUS_SECRET_DICT] }, + }); + await assertNoSecretLeak(msg); + }); + + it("AIMessage with secret-like object in response_metadata", async () => { + const msg = new AIMessage({ + content: "Hello", + response_metadata: { data: MALICIOUS_SECRET_DICT }, + }); + await assertNoSecretLeak(msg); + }); + + it("nested Serializable with secret", async () => { + const inner = new HumanMessage({ + content: "Hello", + additional_kwargs: { secret: MALICIOUS_SECRET_DICT }, + }); + const outer = new AIMessage({ + content: "Outer", + additional_kwargs: { nested: [inner.toJSON()] }, + }); + await assertNoSecretLeak(outer); + }); + }); + + describe("Plain top-level objects", () => { + it("object with serializable containing secret", async () => { + const msg = new HumanMessage({ + content: "Hello", + additional_kwargs: { data: MALICIOUS_SECRET_DICT }, + }); + // When a object contains a Serializable, JSON.stringify calls toJSON + const payload = { message: msg }; + await assertNoSecretLeak(payload); + }); + + // Note: Plain objects without Serializable objects don't get + // escaping because JSON.stringify doesn't call toJSON on plain objects. + // The `secretsFromEnv: false` default protects against these cases by + // throwing an error when a secret is not found. This is fail-safe behavior. 
+ it("plain object with secret throws with `secretsFromEnv: false`", async () => { + const payload = { data: MALICIOUS_SECRET_DICT }; + const serialized = JSON.stringify(payload); + + // With `secretsFromEnv: false` (default), missing secrets throw + await expect(load(serialized)).rejects.toThrow(/Missing secret/); + }); + + it("object mimicking lc constructor throws for missing secrets", async () => { + // Even a malicious payload that looks like an LC constructor + // is safe because missing secrets throw an error + const payload = { + lc: 1, + type: "constructor", + id: ["langchain_core", "messages", "ai", "AIMessage"], + kwargs: { + content: "Hello", + additional_kwargs: { secret: MALICIOUS_SECRET_DICT }, + }, + }; + const serialized = JSON.stringify(payload); + + // Missing secrets throw an error, preventing instantiation + await expect(load(serialized)).rejects.toThrow(/Missing secret/); + }); + }); + + describe("toJSON in kwargs", () => { + it("AIMessage with toJSON(HumanMessage) in additional_kwargs", async () => { + const h = new HumanMessage({ content: "Hello" }); + const a = new AIMessage({ + content: "foo", + additional_kwargs: { bar: [h.toJSON()] }, + }); + await assertNoSecretLeak(a); + }); + + it("AIMessage with toJSON(HumanMessage with secret) in additional_kwargs", async () => { + const h = new HumanMessage({ + content: "Hello", + additional_kwargs: { secret: MALICIOUS_SECRET_DICT }, + }); + const a = new AIMessage({ + content: "foo", + additional_kwargs: { bar: [h.toJSON()] }, + }); + await assertNoSecretLeak(a); + }); + + it("double toJSON nesting", async () => { + const h = new HumanMessage({ + content: "Hello", + additional_kwargs: { secret: MALICIOUS_SECRET_DICT }, + }); + const a = new AIMessage({ + content: "foo", + additional_kwargs: { bar: [h.toJSON()] }, + }); + const outer = new AIMessage({ + content: "outer", + additional_kwargs: { nested: [a.toJSON()] }, + }); + await assertNoSecretLeak(outer); + }); + }); + + describe("Round-trip 
preservation", () => { + it("HumanMessage with secret-like object round-trip", async () => { + const msg = new HumanMessage({ + content: "Hello", + additional_kwargs: { data: MALICIOUS_SECRET_DICT }, + }); + + const serialized = JSON.stringify(msg); + const deserialized = await load<HumanMessage>(serialized, { + secretsFromEnv: true, + }); + + // The secret-like object should be preserved as a plain object + expect(deserialized.additional_kwargs.data).toEqual( + MALICIOUS_SECRET_DICT + ); + expect(typeof deserialized.additional_kwargs.data).toBe("object"); + }); + }); + + describe("Escaping efficiency", () => { + it("no triple escaping", async () => { + const h = new HumanMessage({ + content: "Hello", + additional_kwargs: { bar: [MALICIOUS_SECRET_DICT] }, + }); + const a = new AIMessage({ + content: "foo", + additional_kwargs: { bar: [h.toJSON()] }, + }); + + const serialized = JSON.stringify(a); + // Count nested escape markers - should be max 2 + const escapeCount = (serialized.match(/__lc_escaped__/g) || []).length; + + // Should be 2, not 4+ which would indicate re-escaping + expect(escapeCount).toBeLessThanOrEqual(2); + }); + + it("double nesting no quadruple escape", async () => { + const h = new HumanMessage({ + content: "Hello", + additional_kwargs: { secret: MALICIOUS_SECRET_DICT }, + }); + const a = new AIMessage({ + content: "middle", + additional_kwargs: { nested: [h.toJSON()] }, + }); + const outer = new AIMessage({ + content: "outer", + additional_kwargs: { deep: [a.toJSON()] }, + }); + + const serialized = JSON.stringify(outer); + const escapeCount = (serialized.match(/__lc_escaped__/g) || []).length; + + // Should be 3, not 6+ which would indicate re-escaping + expect(escapeCount).toBeLessThanOrEqual(3); + }); + }); + + describe("Constructor injection", () => { + it("constructor in additional_kwargs not instantiated", async () => { + const maliciousConstructor = { + lc: 1, + type: "constructor", + id: ["langchain_core", "messages", "ai", 
"AIMessage"], + kwargs: { content: "injected" }, + }; + + const msg = new AIMessage({ + content: "Hello", + additional_kwargs: { data: maliciousConstructor }, + }); + + const serialized = JSON.stringify(msg); + const deserialized = await load<AIMessage>(serialized, { + secretsFromEnv: true, + }); + + // The constructor-like object should be a plain object, NOT an AIMessage + expect(typeof deserialized.additional_kwargs.data).toBe("object"); + expect(deserialized.additional_kwargs.data).toEqual( + maliciousConstructor + ); + // Verify it's NOT an AIMessage instance + expect(deserialized.additional_kwargs.data).not.toBeInstanceOf( + AIMessage + ); + }); + + it("constructor in content not instantiated", async () => { + const maliciousConstructor = { + lc: 1, + type: "constructor", + id: ["langchain_core", "messages", "human", "HumanMessage"], + kwargs: { content: "injected" }, + }; + + const msg = new AIMessage({ + content: "Hello", + additional_kwargs: { nested: maliciousConstructor }, + }); + + const serialized = JSON.stringify(msg); + const deserialized = await load<AIMessage>(serialized, { + secretsFromEnv: true, + }); + + // The constructor-like object should be a plain object, NOT a HumanMessage + expect(typeof deserialized.additional_kwargs.nested).toBe("object"); + expect(deserialized.additional_kwargs.nested).toEqual( + maliciousConstructor + ); + // Verify it's NOT a HumanMessage instance + expect(deserialized.additional_kwargs.nested).not.toBeInstanceOf( + HumanMessage + ); + }); + }); + + describe("secretsFromEnv behavior", () => { + it("`secretsFromEnv: false` throws for missing secrets", async () => { + const secretPayload = JSON.stringify({ + lc: 1, + type: "secret", + id: [SENTINEL_ENV_VAR], + }); + + // With `secretsFromEnv: false` (default), should throw + await expect( + load(secretPayload, { secretsFromEnv: false }) + ).rejects.toThrow(/Missing secret/); + }); + + it("`secretsFromEnv: true` loads from env when not in map", async () => { + const 
secretPayload = JSON.stringify({ + lc: 1, + type: "secret", + id: [SENTINEL_ENV_VAR], + }); + + // With `secretsFromEnv: true`, should load from env + const result = await load(secretPayload, { secretsFromEnv: true }); + expect(result).toBe(SENTINEL_VALUE); + }); + + it("secretsMap takes precedence over env", async () => { + const secretPayload = JSON.stringify({ + lc: 1, + type: "secret", + id: [SENTINEL_ENV_VAR], + }); + + const mapValue = "from_map"; + const result = await load(secretPayload, { + secretsFromEnv: true, + secretsMap: { [SENTINEL_ENV_VAR]: mapValue }, + }); + expect(result).toBe(mapValue); + }); + + it("default behavior throws for missing secrets", async () => { + const secretPayload = JSON.stringify({ + lc: 1, + type: "secret", + id: [SENTINEL_ENV_VAR], + }); + + // Default behavior should throw for missing secrets + await expect(load(secretPayload)).rejects.toThrow(/Missing secret/); + }); + }); + }); + describe("DoS protection via recursion depth limit", () => { + /** + * Create a deeply nested object structure. 
+ */ + function createDeeplyNested(depth: number): Record<string, unknown> { + let obj: Record<string, unknown> = { value: "leaf" }; + for (let i = 0; i < depth; i++) { + obj = { nested: obj }; + } + return obj; + } + + it("allows nesting within default limit", async () => { + // 30 levels should be fine (default limit is 50) + const nested = createDeeplyNested(30); + const serialized = JSON.stringify(nested); + + const result = await load<Record<string, unknown>>(serialized); + expect(result).toBeDefined(); + }); + + it("throws error when exceeding default depth limit", async () => { + // 60 levels should exceed the default limit of 50 + const nested = createDeeplyNested(60); + const serialized = JSON.stringify(nested); + + await expect(load(serialized)).rejects.toThrow(/Maximum recursion depth/); + }); + + it("respects custom maxDepth option", async () => { + // 40 levels with a limit of 30 should fail + const nested = createDeeplyNested(40); + const serialized = JSON.stringify(nested); + + await expect(load(serialized, { maxDepth: 30 })).rejects.toThrow( + /Maximum recursion depth \(30\) exceeded/ + ); + }); + + it("allows increasing maxDepth for legitimate deep structures", async () => { + // 60 levels with a limit of 100 should work + const nested = createDeeplyNested(60); + const serialized = JSON.stringify(nested); + + const result = await load<Record<string, unknown>>(serialized, { + maxDepth: 100, + }); + expect(result).toBeDefined(); + }); + + it("protects against deeply nested arrays", async () => { + // Create deeply nested arrays + let arr: unknown[] = ["leaf"]; + for (let i = 0; i < 60; i++) { + arr = [arr]; + } + const serialized = JSON.stringify(arr); + + await expect(load(serialized)).rejects.toThrow(/Maximum recursion depth/); + }); + + it("protects against deeply nested LC constructor kwargs", async () => { + // Create a deeply nested structure inside kwargs + const nested = createDeeplyNested(60); + const payload = { + lc: 1, + type: 
"constructor", + id: ["langchain_core", "messages", "ai", "AIMessage"], + kwargs: { + content: "Hello", + additional_kwargs: nested, + }, + }; + const serialized = JSON.stringify(payload); + + await expect(load(serialized)).rejects.toThrow(/Maximum recursion depth/); + }); + }); +});
libs/langchain-core/src/load/validation.ts +262 −0 added
@@ -0,0 +1,262 @@
+/**
+ * Sentinel key used to mark escaped user objects during serialization.
+ *
+ * When a plain object contains 'lc' key (which could be confused with LC objects),
+ * we wrap it as `{"__lc_escaped__": {...original...}}`.
+ */
+export const LC_ESCAPED_KEY = "__lc_escaped__";
+
+/**
+ * Check if an object needs escaping to prevent confusion with LC objects.
+ *
+ * An object needs escaping if:
+ * 1. It has an `'lc'` key (could be confused with LC serialization format)
+ * 2. It has only the escape key (would be mistaken for an escaped object)
+ */
+export function needsEscaping(obj: Record<string, unknown>): boolean {
+  return (
+    "lc" in obj || (Object.keys(obj).length === 1 && LC_ESCAPED_KEY in obj)
+  );
+}
+
+/**
+ * Wrap an object in the escape marker.
+ *
+ * @example
+ * ```typescript
+ * {"key": "value"} // becomes {"__lc_escaped__": {"key": "value"}}
+ * ```
+ */
+export function escapeObject(
+  obj: Record<string, unknown>
+): Record<string, unknown> {
+  return { [LC_ESCAPED_KEY]: obj };
+}
+
+/**
+ * Check if an object is an escaped user object.
+ *
+ * @example
+ * ```typescript
+ * {"__lc_escaped__": {...}} // is an escaped object
+ * ```
+ */
+export function isEscapedObject(obj: Record<string, unknown>): boolean {
+  return Object.keys(obj).length === 1 && LC_ESCAPED_KEY in obj;
+}
+
+/**
+ * Interface for objects that can be serialized.
+ * This is a duck-typed interface to avoid circular imports.
+ */
+interface SerializableLike {
+  lc_serializable: boolean;
+  lc_secrets?: Record<string, string>;
+  toJSON(): {
+    lc: number;
+    type: string;
+    id: string[];
+    kwargs?: Record<string, unknown>;
+  };
+}
+
+/**
+ * Check if an object looks like a Serializable instance (duck typing).
+ */
+function isSerializableLike(obj: unknown): obj is SerializableLike {
+  return (
+    obj !== null &&
+    typeof obj === "object" &&
+    "lc_serializable" in obj &&
+    typeof (obj as SerializableLike).toJSON === "function"
+  );
+}
+
+/**
+ * Create a "not_implemented" serialization result for objects that cannot be serialized.
+ */
+function createNotImplemented(obj: unknown): {
+  lc: 1;
+  type: "not_implemented";
+  id: string[];
+} {
+  let id: string[];
+  if (obj !== null && typeof obj === "object") {
+    if ("lc_id" in obj && Array.isArray(obj.lc_id)) {
+      id = obj.lc_id as string[];
+    } else {
+      id = [obj.constructor?.name ?? "Object"];
+    }
+  } else {
+    id = [typeof obj];
+  }
+  return {
+    lc: 1,
+    type: "not_implemented",
+    id,
+  };
+}
+
+/**
+ * Serialize a value with escaping of user objects.
+ *
+ * Called recursively on kwarg values to escape any plain objects that could be
+ * confused with LC objects.
+ *
+ * @param obj - The value to serialize.
+ * @returns The serialized value with user objects escaped as needed.
+ */
+export function serializeValue(obj: unknown): unknown {
+  if (isSerializableLike(obj)) {
+    // This is an LC object - serialize it properly (not escaped)
+    return serializeLcObject(obj);
+  }
+
+  if (obj !== null && typeof obj === "object" && !Array.isArray(obj)) {
+    const record = obj as Record<string, unknown>;
+    // Check if object needs escaping BEFORE recursing into values.
+    // If it needs escaping, wrap it as-is - the contents are user data that
+    // will be returned as-is during deserialization (no instantiation).
+    // This prevents re-escaping of already-escaped nested content.
+    if (needsEscaping(record)) {
+      return escapeObject(record);
+    }
+    // Safe object (no 'lc' key) - recurse into values
+    const result: Record<string, unknown> = {};
+    for (const [key, value] of Object.entries(record)) {
+      result[key] = serializeValue(value);
+    }
+    return result;
+  }
+
+  if (Array.isArray(obj)) {
+    return obj.map((item) => serializeValue(item));
+  }
+
+  if (
+    typeof obj === "string" ||
+    typeof obj === "number" ||
+    typeof obj === "boolean" ||
+    obj === null
+  ) {
+    return obj;
+  }
+
+  // Non-JSON-serializable object (Date, custom objects, etc.)
+  return createNotImplemented(obj);
+}
+
+/**
+ * Serialize a `Serializable` object with escaping of user data in kwargs.
+ *
+ * @param obj - The `Serializable` object to serialize.
+ * @returns The serialized object with user data in kwargs escaped as needed.
+ *
+ * @remarks
+ * Kwargs values are processed with `serializeValue` to escape user data (like
+ * metadata) that contains `'lc'` keys. Secret fields (from `lc_secrets`) are
+ * skipped because `toJSON()` replaces their values with secret markers.
+ */
+export function serializeLcObject(obj: SerializableLike): {
+  lc: number;
+  type: string;
+  id: string[];
+  kwargs?: Record<string, unknown>;
+} {
+  // Secret fields are handled by toJSON() - it replaces values with secret markers
+  const secretFields = new Set(Object.keys(obj.lc_secrets ?? {}));
+
+  const serialized = { ...obj.toJSON() };
+
+  // Process kwargs to escape user data that could be confused with LC objects
+  // Skip secret fields - toJSON() already converted them to secret markers
+  if (serialized.type === "constructor" && serialized.kwargs) {
+    const newKwargs: Record<string, unknown> = {};
+    for (const [key, value] of Object.entries(serialized.kwargs)) {
+      if (secretFields.has(key)) {
+        newKwargs[key] = value;
+      } else {
+        newKwargs[key] = serializeValue(value);
+      }
+    }
+    serialized.kwargs = newKwargs;
+  }
+
+  return serialized;
+}
+
+/**
+ * Escape a value if it needs escaping (contains `lc` key).
+ *
+ * This is a simpler version of `serializeValue` that doesn't handle Serializable
+ * objects - it's meant to be called on kwargs values that have already been
+ * processed by `toJSON()`.
+ *
+ * @param value - The value to potentially escape.
+ * @returns The value with any `lc`-containing objects wrapped in escape markers.
+ */
+export function escapeIfNeeded(value: unknown): unknown {
+  if (value !== null && typeof value === "object" && !Array.isArray(value)) {
+    // Preserve Serializable objects - they have their own toJSON() that will be
+    // called by JSON.stringify. We don't want to convert them to plain objects.
+    if (isSerializableLike(value)) {
+      return value;
+    }
+    const record = value as Record<string, unknown>;
+    // Check if object needs escaping BEFORE recursing into values.
+    // If it needs escaping, wrap it as-is - the contents are user data that
+    // will be returned as-is during deserialization (no instantiation).
+    if (needsEscaping(record)) {
+      return escapeObject(record);
+    }
+    // Safe object (no 'lc' key) - recurse into values
+    const result: Record<string, unknown> = {};
+    for (const [key, val] of Object.entries(record)) {
+      result[key] = escapeIfNeeded(val);
+    }
+    return result;
+  }
+
+  if (Array.isArray(value)) {
+    return value.map((item) => escapeIfNeeded(item));
+  }
+
+  return value;
+}
+
+/**
+ * Unescape a value, processing escape markers in object values and arrays.
+ *
+ * When an escaped object is encountered (`{"__lc_escaped__": ...}`), it's
+ * unwrapped and the contents are returned AS-IS (no further processing).
+ * The contents represent user data that should not be modified.
+ *
+ * For regular objects and arrays, we recurse to find any nested escape markers.
+ *
+ * @param obj - The value to unescape.
+ * @returns The unescaped value.
+ */
+export function unescapeValue(obj: unknown): unknown {
+  if (obj !== null && typeof obj === "object" && !Array.isArray(obj)) {
+    const record = obj as Record<string, unknown>;
+    if (isEscapedObject(record)) {
+      // Unwrap and return the user data as-is (no further unescaping).
+      // The contents are user data that may contain more escape keys,
+      // but those are part of the user's actual data.
+      return record[LC_ESCAPED_KEY];
+    }
+
+    // Regular object - recurse into values to find nested escape markers
+    const result: Record<string, unknown> = {};
+    for (const [key, value] of Object.entries(record)) {
+      result[key] = unescapeValue(value);
+    }
+    return result;
+  }
+
+  if (Array.isArray(obj)) {
+    return obj.map((item) => unescapeValue(item));
+  }
+
+  return obj;
+}
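To make the escape-and-unwrap behavior in validation.ts concrete, here is a minimal standalone sketch. It re-declares simplified copies of the patch's helpers so it runs without installing `@langchain/core`. The `isSerializableLike` branch is omitted, the `isPlainObject` helper and `Plain` type are added for this sketch only, and the `userMetadata` payload is a hypothetical example of attacker-controlled data shaped like a serialized secret marker. It is an illustration under those assumptions, not the library's code path.

```typescript
// Simplified copies of the patch helpers (names follow the diff; the
// Serializable handling is intentionally left out of this sketch).
const LC_ESCAPED_KEY = "__lc_escaped__";

type Plain = Record<string, unknown>;

// Sketch-only helper: true for non-null, non-array objects.
function isPlainObject(v: unknown): v is Plain {
  return v !== null && typeof v === "object" && !Array.isArray(v);
}

function needsEscaping(obj: Plain): boolean {
  return "lc" in obj || (Object.keys(obj).length === 1 && LC_ESCAPED_KEY in obj);
}

function escapeIfNeeded(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => escapeIfNeeded(item));
  if (!isPlainObject(value)) return value;
  // Wrap the whole object as-is; its contents are treated as opaque user data.
  if (needsEscaping(value)) return { [LC_ESCAPED_KEY]: value };
  const out: Plain = {};
  for (const [k, v] of Object.entries(value)) out[k] = escapeIfNeeded(v);
  return out;
}

function unescapeValue(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => unescapeValue(item));
  if (!isPlainObject(value)) return value;
  if (Object.keys(value).length === 1 && LC_ESCAPED_KEY in value) {
    // Unwrap once and stop: the inner object is returned as plain data.
    return value[LC_ESCAPED_KEY];
  }
  const out: Plain = {};
  for (const [k, v] of Object.entries(value)) out[k] = unescapeValue(v);
  return out;
}

// Hypothetical user-supplied metadata that imitates an LC secret marker.
const userMetadata = {
  note: "hello",
  payload: { lc: 1, type: "secret", id: ["OPENAI_API_KEY"] },
};

const escaped = escapeIfNeeded(userMetadata) as Plain;
const roundTripped = unescapeValue(JSON.parse(JSON.stringify(escaped)));

console.log(JSON.stringify(escaped, null, 2));
console.log(JSON.stringify(roundTripped) === JSON.stringify(userMetadata));
```

In this sketch the `payload` object is wrapped under `__lc_escaped__` before it reaches JSON, so a deserializer that unwraps escaped objects first would hand it back as plain data rather than resolving it as a secret. Objects without an `lc` key, such as the top-level metadata object, pass through unchanged.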
libs/langchain/src/load/tests/load.test.ts +3 −15 modified
@@ -141,18 +141,6 @@ test("serialize + deserialize llm", async () => {
   );
   expect(llm2).toBeInstanceOf(OpenAI);
   expect(JSON.stringify(llm2, null, 2)).toBe(str);
-  // Accept secret as env var
-  const llm3 = await load<OpenAI>(
-    str,
-    {},
-    {},
-    {
-      llms__openai: { OpenAI },
-    }
-  );
-  expect(llm3).toBeInstanceOf(OpenAI);
-  expect(llm.openAIApiKey).toBe(llm3.openAIApiKey);
-  expect(JSON.stringify(llm3, null, 2)).toBe(str);
 });

 test("serialize + deserialize with new and old ids", async () => {
@@ -193,7 +181,7 @@ test("serialize + deserialize runnable sequence with new and old ids", async ()
   });
   const runnable2 = await load<RunnableSequence>(
     strWithOldId,
-    {},
+    { OPENAI_API_KEY: "openai-key" },
     {},
     {
       chat_models__openai: { ChatOpenAI },
@@ -202,7 +190,7 @@ test("serialize + deserialize runnable sequence with new and old ids", async ()
   expect(runnable2).toBeInstanceOf(RunnableSequence);
   const runnable3 = await load<RunnableSequence>(
     strWithNewId,
-    {},
+    { OPENAI_API_KEY: "openai-key" },
     {},
     {
       chat_models__openai: { ChatOpenAI },
@@ -234,7 +222,7 @@ test("Should load traces even if the constructor name changes (minified environm

   const llm2 = await load<OpenAI>(
     str,
-    { COHERE_API_KEY: "cohere-key" },
+    { OPENAI_API_KEY: "openai-key" },
     { "langchain/llms/openai": { OpenAI } }
   );
   // console.log(JSON.stringify(llm2, null, 2));
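The test changes above follow from the hardened default: the "Accept secret as env var" case is removed, and the remaining `load` calls now pass the key explicitly in the secrets map. The sketch below models that lookup rule in isolation. `resolveSecret` and its `env` parameter are hypothetical names invented for this illustration and are not part of the `@langchain/core` API; the behavior is an assumption based on the documented `secretsFromEnv` semantics (map first, environment only when explicitly enabled, otherwise `null`).

```typescript
type SecretsMap = Record<string, string>;

// Hypothetical helper, NOT the library implementation. The environment is
// passed in as a plain object so the sketch has no Node.js dependency.
function resolveSecret(
  name: string,
  secretsMap: SecretsMap,
  secretsFromEnv: boolean = false,
  env: Record<string, string | undefined> = {}
): string | null {
  // Explicitly supplied secrets always win.
  if (Object.prototype.hasOwnProperty.call(secretsMap, name)) {
    return secretsMap[name];
  }
  // Environment fallback only when the caller opted in.
  if (secretsFromEnv) {
    return env[name] ?? null;
  }
  // Default: never read the environment on behalf of serialized input.
  return null;
}

// Stand-in for process.env in this sketch.
const fakeEnv = { OPENAI_API_KEY: "from-env" };

const fromMap = resolveSecret("OPENAI_API_KEY", { OPENAI_API_KEY: "openai-key" }, false, fakeEnv);
const defaultMiss = resolveSecret("OPENAI_API_KEY", {}, false, fakeEnv);
const optInEnv = resolveSecret("OPENAI_API_KEY", {}, true, fakeEnv);

console.log(fromMap, defaultMiss, optInEnv);
```

Under these assumptions, an injected secret marker in untrusted data resolves to `null` unless the caller either supplied that secret in the map or opted in to environment lookup, which is why the updated tests provide `OPENAI_API_KEY` directly.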
References
- github.com/advisories/GHSA-r399-636x-v7f6 (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2025-68665 (ghsa, ADVISORY)
- github.com/langchain-ai/langchainjs/commit/e5063f9c6e9989ea067dfdff39262b9e7b6aba62 (ghsa, x_refsource_MISC, WEB)
- github.com/langchain-ai/langchainjs/releases/tag/%40langchain%2Fcore%401.1.8 (ghsa, x_refsource_MISC, WEB)
- github.com/langchain-ai/langchainjs/releases/tag/langchain%401.2.3 (ghsa, x_refsource_MISC, WEB)
- github.com/langchain-ai/langchainjs/security/advisories/GHSA-r399-636x-v7f6 (ghsa, x_refsource_CONFIRM, WEB)