Denial of Service in run-llama/llama_index
Description
A vulnerability in the KnowledgeBaseWebReader class of the run-llama/llama_index repository, version latest, allows an attacker to cause a Denial of Service (DoS) by controlling a URL variable to contain the root URL. This leads to infinite recursive calls to the get_article_urls method, exhausting system resources and potentially crashing the application.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Controlling a URL variable to contain the root URL in llama_index's KnowledgeBaseWebReader causes infinite recursive calls, leading to Denial of Service.
Vulnerability Overview
CVE-2024-12910 describes a Denial of Service (DoS) vulnerability in the KnowledgeBaseWebReader class of the run-llama/llama_index repository [1][2]. The issue arises when an attacker can control a URL variable to contain the root URL. This causes the get_article_urls method to enter an infinite recursive loop, exhausting system resources and potentially crashing the application [2][3].
Exploitation Method
An attacker can exploit this vulnerability by providing a crafted URL that points to the root of the knowledge base. No special authentication is mentioned as a prerequisite; the attacker only needs to be able to supply a URL to the reader. When the load_data method is called, it invokes get_article_urls with the attacker-controlled root URL. Because the method lacks a maximum depth or loop detection, it recursively calls itself indefinitely when the current URL equals the root URL [3].
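The recursion pattern described above can be reproduced without Playwright. The following is a minimal sketch, not the library's code: `ROOT`, `FAKE_SITE`, `naive_crawl`, and `crawl_terminates` are hypothetical names, and the dictionary stands in for the links a browser would find on each page. The root page contains a link that resolves back to the root, so the crawl re-enters the root on every call. In plain Python this ends in a `RecursionError`; the real reader would presumably also be holding a browser page open at each level, judging by the `page.close()` that only follows the loop in the patch.

```python
from typing import Dict, List

ROOT = "https://kb.example"  # hypothetical root URL, for illustration only

# Hypothetical site map: URL -> hrefs found on that page.
# The empty href on the root page resolves back to the root itself.
FAKE_SITE: Dict[str, List[str]] = {
    ROOT: ["", "/articles/1"],
}


def naive_crawl(site: Dict[str, List[str]], root_url: str, current_url: str) -> List[str]:
    """Follow every link with no depth limit and no visited set."""
    links = site.get(current_url, [])
    if not links:
        # A page with no further links is treated as an article.
        return [current_url]
    found: List[str] = []
    for href in links:
        # Same URL construction as in the patch: root_url + href.
        found.extend(naive_crawl(site, root_url, root_url + href))
    return found


def crawl_terminates() -> bool:
    """Return False if the crawl blows Python's recursion limit."""
    try:
        naive_crawl(FAKE_SITE, ROOT, ROOT)
    except RecursionError:
        return False
    return True
```

Calling `crawl_terminates()` on this site map returns `False`, because the first link on the root page sends the crawl straight back to the root.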
Impact
Successful exploitation results in a Denial of Service condition. The uncontrolled recursion consumes CPU and memory resources, which can degrade performance or crash the application, making it unavailable for legitimate users [2].
Mitigation Status
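The vulnerability has been addressed in a commit [3] that adds a max_depth parameter (default 100) to both the KnowledgeBaseWebReader constructor and get_article_urls. load_data now passes self.max_depth into the crawl, and each recursive call forwards it. The change is meant to bound recursion depth and prevent infinite loops. The same commit bumps llama-index-readers-web from 0.3.2 to 0.3.3, and the GitHub Security Advisory lists llama-index 0.12.9 as the patched version, so users should update to a release that includes the fix. The PyPI advisory database also tracks this issue [4].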
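One way to bound such a crawl is sketched below. This is an illustration under stated assumptions rather than the code from the commit: the `depth` counter, the `visited` set, and the names `SITE`, `ROOT`, and `bounded_crawl` are hypothetical, and only the `max_depth` parameter with its default of 100 comes from the patch. The dictionary again stands in for the links a browser would find on each page.

```python
from typing import Dict, List, Optional, Set

ROOT = "https://kb.example"  # hypothetical root URL, for illustration only

# Hypothetical site map: URL -> hrefs found on that page.
# Both the root and the section page link back to the root ("").
SITE: Dict[str, List[str]] = {
    ROOT: ["", "/a1", "/section"],
    ROOT + "/section": ["/a2", ""],
}


def bounded_crawl(
    site: Dict[str, List[str]],
    root_url: str,
    current_url: str,
    max_depth: int = 100,  # same default as the patched reader
    depth: int = 0,  # assumption: explicit depth counter
    visited: Optional[Set[str]] = None,  # assumption: loop detection
) -> List[str]:
    """Collect article URLs, stopping at max_depth and skipping seen pages."""
    if visited is None:
        visited = set()
    if depth > max_depth or current_url in visited:
        return []
    visited.add(current_url)

    links = site.get(current_url, [])
    if not links:
        # A page with no further links is treated as an article.
        return [current_url]

    found: List[str] = []
    for href in links:
        found.extend(
            bounded_crawl(site, root_url, root_url + href, max_depth, depth + 1, visited)
        )
    return found
```

With this site map, `bounded_crawl(SITE, ROOT, ROOT)` returns the two article URLs and ignores the links that point back to the root. The depth limit alone is enough to guarantee termination; the visited set additionally avoids re-fetching pages that were already crawled.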
AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| llama-index (PyPI) | < 0.12.9 | 0.12.9 |
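To check an environment against the patched version in the table, something like the following could be used. It is a rough sketch: `PATCHED`, `parse_release`, `is_patched`, and `check_installed` are hypothetical helpers, and the parser only understands plain dotted numeric versions (it ignores pre-release suffixes, so a library such as `packaging` would be a more robust choice).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

PATCHED: Tuple[int, ...] = (0, 12, 9)  # patched llama-index version from the advisory


def parse_release(text: str) -> Tuple[int, ...]:
    """Turn '0.12.9' into (0, 12, 9); stop at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(installed: str) -> bool:
    """True if the given version string is at or above the patched version."""
    return parse_release(installed) >= PATCHED


def check_installed(package: str = "llama-index") -> Optional[bool]:
    """Return None if the package is not installed, else whether it is patched."""
    try:
        return is_patched(version(package))
    except PackageNotFoundError:
        return None
```

`check_installed()` returns `None` when llama-index is not installed, which keeps "not present" distinct from "present but vulnerable".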
Affected products
- Range: latest
- run-llama/run-llama/llama_index (v5), Range: unspecified
Patches
159ce485a116 fix: prevent infinite recursion in get_article_urls (#17360)
2 files changed · +14 −9
llama-index-integrations/readers/llama-index-readers-web/llama_index/readers/web/knowledge_base/base.py (+13 −8, modified)

@@ -5,7 +5,8 @@


 class KnowledgeBaseWebReader(BaseReader):
-    """Knowledge base reader.
+    """
+    Knowledge base reader.

     Crawls and reads articles from a knowledge base/help center with Playwright.
     Tested on Zendesk and Intercom CMS, may work on others.
@@ -36,6 +37,7 @@ def __init__(
         title_selector: Optional[str] = None,
         subtitle_selector: Optional[str] = None,
         body_selector: Optional[str] = None,
+        max_depth: int = 100,
     ) -> None:
         """Initialize with parameters."""
         self.root_url = root_url
@@ -44,6 +46,7 @@ def __init__(
         self.title_selector = title_selector
         self.subtitle_selector = subtitle_selector
         self.body_selector = body_selector
+        self.max_depth = max_depth

     def load_data(self) -> List[Document]:
         """Load data from the knowledge base."""
@@ -54,9 +57,7 @@ def load_data(self) -> List[Document]:

             # Crawl
             article_urls = self.get_article_urls(
-                browser,
-                self.root_url,
-                self.root_url,
+                browser, self.root_url, self.root_url, self.max_depth
             )

             # Scrape
@@ -82,7 +83,8 @@ def scrape_article(
         browser: Any,
         url: str,
     ) -> Dict[str, str]:
-        """Scrape a single article url.
+        """
+        Scrape a single article url.

         Args:
             browser (Any): a Playwright Chromium browser.
@@ -125,9 +127,10 @@ def scrape_article(
         return {"title": title, "subtitle": subtitle, "body": body, "url": url}

     def get_article_urls(
-        self, browser: Any, root_url: str, current_url: str
+        self, browser: Any, root_url: str, current_url: str, max_depth: int = 100
     ) -> List[str]:
-        """Recursively crawl through the knowledge base to find a list of articles.
+        """
+        Recursively crawl through the knowledge base to find a list of articles.

         Args:
             browser (Any): a Playwright Chromium browser.
@@ -158,7 +161,9 @@ def get_article_urls(
         for link in links:
             url = root_url + page.evaluate("(node) => node.getAttribute('href')", link)
-            article_urls.extend(self.get_article_urls(browser, root_url, url))
+            article_urls.extend(
+                self.get_article_urls(browser, root_url, url, max_depth)
+            )

         page.close()
llama-index-integrations/readers/llama-index-readers-web/pyproject.toml (+1 −1, modified)

@@ -45,7 +45,7 @@ license = "MIT"
 maintainers = ["HawkClaws", "Hironsan", "NA", "an-bluecat", "bborn", "jasonwcfan", "kravetsmic", "pandazki", "ruze00", "selamanse", "thejessezhang"]
 name = "llama-index-readers-web"
 readme = "README.md"
-version = "0.3.2"
+version = "0.3.3"

 [tool.poetry.dependencies]
 python = ">=3.9,<4.0"
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/advisories/GHSA-jvpf-xf32-2w4q (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2024-12910 (ghsa, ADVISORY)
- github.com/pypa/advisory-database/tree/main/vulns/llama-index/PYSEC-2025-11.yaml (ghsa, WEB)
- github.com/run-llama/llama_index/commit/159ce485a1168100bb219dc1b93133f1121579d9 (ghsa, WEB)
- huntr.com/bounties/27883f22-35ff-49df-aaa5-05031c7d6ad8 (ghsa, WEB)