Denial of Service (DoS) in run-llama/llama_index
Description
A vulnerability in the LangChainLLM class of the run-llama/llama_index repository, version v0.12.5, allows for a Denial of Service (DoS) attack. The stream_complete method runs the LLM in a separate thread and retrieves the result via the get_response_gen method of the StreamingGeneratorCallbackHandler class. If the thread terminates abnormally before _llm.predict is executed, nothing handles that failure, and get_response_gen loops forever. This can be triggered by providing an input of an incorrect type, which causes the thread to terminate while the calling process keeps running indefinitely.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
An unhandled thread termination in llama_index v0.12.5 leads to an infinite loop causing Denial of Service via malformed input.
Vulnerability Overview
In version v0.12.5 of the run-llama/llama_index repository, the LangChainLLM class contains a flaw in its stream_complete method that can lead to a Denial of Service (DoS) [1]. The method launches the LLM call in a separate thread and relies on get_response_gen from StreamingGeneratorCallbackHandler to retrieve the response. If the thread terminates abnormally before the underlying _llm.predict executes, get_response_gen enters an infinite loop: it never receives the expected completion signal, and it has no exception handling for this case [1].
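The hang can be illustrated with a minimal sketch that stands in for the handler's state instead of importing llama_index. The names `token_queue`, `done`, and `worker` are hypothetical. The sketch only shows that when the worker raises before any callback fires, neither exit condition of the unpatched loop can become true.

```python
import threading
from queue import Queue
from threading import Event

# Hypothetical stand-ins for the handler's internal state (not the library's objects).
token_queue: Queue = Queue()
done = Event()


def worker(prompt: object) -> None:
    # Simulates a failure before the LLM call: a non-string prompt raises here,
    # so no token is ever queued and `done` is never set.
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    token_queue.put(prompt)
    done.set()


# Silence the default traceback printed for uncaught thread exceptions.
threading.excepthook = lambda args: None

thread = threading.Thread(target=worker, args=(12345,))
thread.start()
thread.join()

# The unpatched loop can only yield when the queue is non-empty and can only
# break when `done` is set. After the worker dies, neither condition can change.
worker_dead = not thread.is_alive()
can_yield = not token_queue.empty()
can_break = done.is_set()
```

With the worker dead, the queue empty, and the event unset, a consumer running `while True` over these two checks has no way to terminate.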
Attack Vector
An attacker can trigger this condition by providing an input of an incorrect type to the stream_complete method. The malformed input causes the worker thread to fail prematurely, leaving the main process stuck in an indefinite wait cycle [1]. No authentication is required beyond normal access to the vulnerable API endpoint, making it exploitable in any deployment that accepts user-supplied inputs for LLM streaming.
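One caller-side precaution, sketched here as an assumption and not something the advisory prescribes, is to reject non-string prompts before any worker thread is started. The helper `validate_prompt` is hypothetical and is not part of llama_index.

```python
def validate_prompt(prompt: object) -> str:
    """Return the prompt unchanged if it is a string, otherwise raise TypeError."""
    if not isinstance(prompt, str):
        # Fail in the calling thread, where the error is visible to the caller,
        # instead of letting a background thread die silently.
        raise TypeError(f"prompt must be str, got {type(prompt).__name__}")
    return prompt
```

A check like this does not replace the upstream fix; it only keeps one known trigger from reaching the streaming path.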
Impact
A successful exploit results in a Denial of Service (DoS) where the affected process hangs indefinitely, consuming system resources and rendering the service unresponsive. This can impact availability for all users relying on the llama_index service [1].
Mitigation
The vulnerability has been addressed in a subsequent commit that adds a configurable timeout parameter to the get_response_gen method. Instead of spinning indefinitely, the loop now raises a TimeoutError once the total wait exceeds the timeout, which defaults to 120 seconds [3]. Users are strongly advised to update to a version containing this fix; the GitHub Security Advisory lists llama-index-core 0.12.6 as the patched version. As of the publication date, CVE-2024-12704 is not listed in CISA's Known Exploited Vulnerabilities catalog.
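The effect of the timeout can be sketched with a small stand-in class that mirrors the patched loop. `TimeoutStreamSketch` and `drain` are hypothetical names used only for illustration; this is not the library class, and the short timeout is chosen so the example finishes quickly.

```python
import time
from queue import Queue
from threading import Event
from typing import Generator


class TimeoutStreamSketch:
    """Hypothetical stand-in mirroring the patched loop; not the library class."""

    def __init__(self) -> None:
        self._token_queue: Queue = Queue()
        self._done = Event()

    def get_response_gen(self, timeout: float = 120.0) -> Generator:
        start_time = time.time()
        while True:
            # Bound the total time spent waiting for the complete response.
            if time.time() - start_time > timeout:
                raise TimeoutError(
                    f"Response generation timed out after {timeout} seconds"
                )
            if not self._token_queue.empty():
                yield self._token_queue.get_nowait()
            elif self._done.is_set():
                break
            else:
                time.sleep(0.01)  # avoid busy-waiting while idle


def drain(handler: TimeoutStreamSketch, timeout: float) -> str:
    """Consume the stream and report how it ended."""
    try:
        tokens = list(handler.get_response_gen(timeout=timeout))
    except TimeoutError:
        return "timed out"
    return "".join(tokens)


# Worker died before signalling: nothing queued, `_done` never set.
stalled = TimeoutStreamSketch()
stalled_result = drain(stalled, timeout=0.05)

# Normal completion: a token is queued, then `_done` is set.
healthy = TimeoutStreamSketch()
healthy._token_queue.put("hello")
healthy._done.set()
healthy_result = drain(healthy, timeout=0.05)
```

Because the clock starts once and is never reset, the timeout limits the whole response and not the gap between tokens, so long legitimate streams may need a larger value than the default.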
AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| llama-index-core (PyPI) | < 0.12.6 | 0.12.6 |
Affected products
- Range: = v0.12.5
- run-llama/run-llama/llama_index v5, Range: unspecified
Patches
d1ecfb77578d fix: add a timeout to langchain callback handler (#17296)
1 file changed · +17 −1
llama-index-core/llama_index/core/langchain_helpers/streaming.py (+17 −1, modified)
@@ -1,3 +1,4 @@
+import time
 from queue import Queue
 from threading import Event
 from typing import Any, Generator, List, Optional
@@ -35,10 +36,25 @@ def on_llm_error(
     ) -> None:
         self._done.set()

-    def get_response_gen(self) -> Generator:
+    def get_response_gen(self, timeout: float = 120.0) -> Generator:
+        """Get response generator with timeout.
+
+        Args:
+            timeout (float): Maximum time in seconds to wait for the complete response.
+                Defaults to 120 seconds.
+        """
+        start_time = time.time()
         while True:
+            if time.time() - start_time > timeout:
+                raise TimeoutError(
+                    f"Response generation timed out after {timeout} seconds"
+                )
+
             if not self._token_queue.empty():
                 token = self._token_queue.get_nowait()
                 yield token
             elif self._done.is_set():
                 break
+            else:
+                # Small sleep to prevent CPU spinning
+                time.sleep(0.01)
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.