Butterfly has path/URL confusion in resource handling leading to multiple weaknesses
Description
The OpenRefine fork of the MIT Simile Butterfly server is a modular web application framework. The Butterfly framework uses the java.net.URL class to refer to (what are expected to be) local resource files, like images or templates. This works: "opening a connection" to these URLs opens the local file. However, prior to version 1.2.6, if a file:/ URL is directly given where a relative path (resource name) is expected, this is also accepted in some code paths; the app then fetches the file, from a remote machine if indicated, and uses it as if it were a trusted part of the app's codebase. This leads to multiple weaknesses and potential weaknesses. An attacker that has network access to the application could use it to gain access to files, either on the server's filesystem (path traversal) or shared by nearby machines (server-side request forgery with e.g. SMB). An attacker that can lead or redirect a user to a crafted URL belonging to the app could cause arbitrary attacker-controlled JavaScript to be loaded in the victim's browser (cross-site scripting). If an app is written in such a way that an attacker can influence the resource name used for a template, that attacker could cause the app to fetch and execute an attacker-controlled template (remote code execution). Version 1.2.6 contains a patch.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
The OpenRefine Butterfly framework before 1.2.6 mishandles file:/ URLs in resource loading, enabling path traversal, SSRF, XSS, and potential RCE.
Vulnerability Overview
CVE-2024-47883 affects the OpenRefine fork of the MIT Simile Butterfly web application framework prior to version 1.2.6. The core issue lies in the ButterflyModuleImpl.getResource method, which uses java.net.URL to reference local resource files. When a resource name is expected to be a relative path, the code incorrectly accepts absolute file:/ URLs without validation [1]. This allows an attacker to supply a file:/ URL pointing to any file, including those on remote SMB shares, bypassing the intended local-only access.
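The confusion described above can be illustrated with a minimal sketch that mirrors the branch removed by the patch. The class name `LegacyResourceLookup` and the host name `fileserver` are hypothetical and exist only for illustration; this is not the actual Butterfly class, just the same decision logic in isolation.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical stand-in for the pre-1.2.6 lookup; not the actual Butterfly class.
class LegacyResourceLookup {
    private final File moduleDir;

    LegacyResourceLookup(File moduleDir) {
        this.moduleDir = moduleDir;
    }

    URL getResource(String resource) {
        try {
            // Mirrors the removed branch: a "file:/" prefix bypasses moduleDir entirely.
            if (resource.startsWith("file:/")) {
                // new URL(String) is deprecated in recent JDKs but still available.
                return new URL(resource);
            }
            if (resource.charAt(0) == '/') resource = resource.substring(1);
            File f = new File(moduleDir, resource);
            // No containment check: the result may lie outside moduleDir.
            return f.exists() ? f.toURI().toURL() : null;
        } catch (MalformedURLException e) {
            return null;
        }
    }
}
```

Under these assumptions, `getResource("file:/etc/hostname")` yields a URL whose path is `/etc/hostname` regardless of the module directory, and `getResource("file://fileserver/share/notes.txt")` yields a URL whose host is `fileserver`, so opening it would reach out to another machine. The relative-path branch also joins the name to the directory without checking that the result stays inside it.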
Attack Vectors
The vulnerability can be exploited in several ways. An attacker with network access to the application can directly request file:/ URLs to read arbitrary files on the server or shared network locations (path traversal and SSRF) [3]. Additionally, if an attacker can trick a user into visiting a crafted URL, the application may load attacker-controlled JavaScript, leading to cross-site scripting (XSS) [1]. More critically, if the application permits an attacker to influence the resource name used for Velocity templates, the attacker could cause the server to fetch and execute a remote template, achieving remote code execution (RCE) [3]. The default process method serves named resources, making it inherently vulnerable without authentication [3].
Impact
Successful exploitation allows an unauthenticated attacker to read sensitive files (e.g., configuration secrets), perform server-side request forgery (SSRF) via SMB to interact with internal systems, execute arbitrary JavaScript in victim browsers, and potentially execute arbitrary code on the server through template injection. The attack surface is broad, as network access is sufficient for file disclosure and SSRF, while user interaction enables XSS [1][3].
Mitigation
The vulnerability is patched in version 1.2.6. The fix removes the branch of the getResource method that accepted file:/ URLs and adds a check on the normalized path, so that a resolved file is served only if it remains within the expected module directory [2]. Users should upgrade to 1.2.6 or later. No workarounds are provided, and no active exploitation has been reported as of the publication date. The CVE has not been added to CISA's Known Exploited Vulnerabilities catalog.
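The containment check can be sketched in isolation as follows. This is a minimal illustration that follows the logic of the patch, not the patched class itself: the class name `ContainedResourceLookup`, the rejection of empty names, and the handling of `InvalidPathException` are assumptions added here.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical stand-in that mirrors the patched check; not the actual Butterfly class.
class ContainedResourceLookup {
    private final File moduleDir;
    private final Path normalizedDir;

    ContainedResourceLookup(File moduleDir) {
        this.moduleDir = moduleDir;
        // Normalize the base directory once, as the patched setPath does.
        this.normalizedDir = moduleDir.toPath().toAbsolutePath().normalize();
    }

    URL getResource(String resource) {
        // Assumption of this sketch: null or empty names are rejected up front.
        if (resource == null || resource.isEmpty()) return null;
        if (resource.charAt(0) == '/') resource = resource.substring(1);
        try {
            File f = new File(moduleDir, resource);
            Path candidate = f.toPath().toAbsolutePath().normalize();
            // Serve the file only if it stays inside the module directory and exists.
            if (candidate.startsWith(normalizedDir) && f.exists()) {
                return f.toURI().toURL();
            }
        } catch (MalformedURLException | InvalidPathException e) {
            // Catching InvalidPathException is an addition of this sketch.
        }
        return null;
    }
}
```

With this shape, a name such as `../other_folder/file.txt` normalizes to a path outside the module directory and is rejected. A full `file:/` URI is treated as an ordinary relative name, which does not exist under the module directory, so it is rejected as well. `Path.normalize()` only collapses `.` and `..` segments and does not resolve symbolic links, so a deployment that must also guard against symlinks would need an additional check.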
AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| org.openrefine.dependencies:butterfly (Maven) | < 1.2.6 | 1.2.6 |
Affected products
- Range: < 1.2.6
Patches
537f64bfa727: Only serve resources within the expected directory
2 files changed · +63 −8
main/src/edu/mit/simile/butterfly/ButterflyModuleImpl.java (modified, +10 −8)
@@ -14,6 +14,7 @@
 import java.net.MalformedURLException;
 import java.net.URL;
 import java.net.URLConnection;
+import java.nio.file.Path;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.LinkedHashSet;
@@ -71,6 +72,7 @@ public class ButterflyModuleImpl implements ButterflyModule {
     protected Timer _timer;
     protected ServletConfig _config;
     protected File _path;
+    protected Path _normalizedPath;
     protected MountPoint _mountPoint;
     protected ButterflyMounter _mounter;
     protected String _name;
@@ -118,6 +120,7 @@ public void setClassLoader(ClassLoader classLoader) {
     public void setPath(File path) {
         _logger.trace("{} -(path)-> {}", this, path);
         this._path = path;
+        this._normalizedPath = path.toPath().toAbsolutePath().normalize();
     }
 
     public void setName(String name) {
@@ -259,6 +262,8 @@ public ButterflyModule getModule(String name) {
 
     protected Pattern super_pattern = Pattern.compile("^@@(.*)@@$");
 
+    // TODO 2025-10: migrate away from URL as a return type to File/Path as we don't want this to fetch anything remote
+    @Override
     public URL getResource(String resource) {
         _logger.trace("> getResource({}->{},{})", new Object[] { _name, _extended, resource });
         URL u = null;
@@ -283,14 +288,11 @@ public URL getResource(String resource) {
 
         if (u == null) {
             try {
-                if (resource.startsWith("file:/")) {
-                    u = new URL(resource);
-                } else {
-                    if (resource.charAt(0) == '/') resource = resource.substring(1);
-                    File f = new File(_path, resource);
-                    if (f.exists()) {
-                        u = f.toURI().toURL();
-                    }
+                if (resource.charAt(0) == '/') resource = resource.substring(1);
+                File f = new File(_path, resource);
+                // check that the file does not escape the expected directory
+                if (f.toPath().toAbsolutePath().normalize().startsWith(_normalizedPath) && f.exists()) {
+                    u = f.toURI().toURL();
                 }
             } catch (MalformedURLException e) {
                 _logger.error("Error", e);
main/tests/src/edu/mit/simile/butterfly/tests/ButterflyModuleImplTests.java (added, +53 −0)
@@ -0,0 +1,53 @@
+package edu.mit.simile.butterfly.tests;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.MalformedURLException;
+import java.nio.file.Files;
+
+import edu.mit.simile.butterfly.ButterflyModuleImpl;
+import org.testng.Assert;
+import org.testng.annotations.BeforeMethod;
+import org.testng.annotations.Test;
+
+public class ButterflyModuleImplTests {
+
+    ButterflyModuleImpl SUT;
+    File tempDir;
+    File firstFolder;
+    File secondFolder;
+    File textFile;
+    File testFile;
+
+    @BeforeMethod
+    public void setUp() throws IOException {
+        SUT = new ButterflyModuleImpl();
+        tempDir = Files.createTempDirectory("ButterflyModuleImplTests").toFile();
+        tempDir.deleteOnExit();
+        firstFolder = new File(tempDir, "first_folder");
+        firstFolder.mkdir();
+        secondFolder = new File(tempDir, "other_folder");
+        secondFolder.mkdir();
+        textFile = new File(secondFolder, "file.txt");
+        textFile.createNewFile();
+        testFile = new File(firstFolder, "test.txt");
+        testFile.createNewFile();
+        SUT.setPath(firstFolder);
+    }
+
+    @Test
+    public void testGetResource() throws MalformedURLException {
+        // file exists and is in the expected directory
+        Assert.assertEquals(SUT.getResource("test.txt"), testFile.toURI().toURL());
+        // file does not exist
+        Assert.assertNull(SUT.getResource("does_not_exist.xls"));
+
+        // file exists but escapes the expected directory (it would be a security issue to accept it)
+        Assert.assertEquals(SUT.getResource("../other_folder/file.txt"), null);
+        // we don't support passing full URIs (it would be a security issue to accept reading any resource)
+        String fullURI = testFile.toURI().toString();
+        Assert.assertTrue(fullURI.startsWith("file:/"));
+        Assert.assertEquals(SUT.getResource(fullURI), null);
+    }
+
+}
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/advisories/GHSA-3p8v-w8mr-m3x8 (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2024-47883 (ghsa, ADVISORY)
- github.com/OpenRefine/simile-butterfly/commit/537f64bfa72746f8b21d4bda461fad843435319c (ghsa, x_refsource_MISC, WEB)
- github.com/OpenRefine/simile-butterfly/security/advisories/GHSA-3p8v-w8mr-m3x8 (ghsa, x_refsource_CONFIRM, WEB)