docconv pdf_ocr.go ConvertPDFImages os command injection
Description
CVE-2022-4643 is a critical OS command injection vulnerability in docconv ≤1.2.0's ConvertPDFImages function, allowing remote code execution via a crafted path argument.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability Details
CVE-2022-4643 is a critical OS command injection vulnerability in docconv up to and including version 1.2.0. The flaw is attributed to the ConvertPDFImages function in pdf_ocr.go: the path argument reaches a shell command line unescaped, so an attacker who controls it can inject arbitrary operating system commands that the application then executes [1][2]. The fix commit hardens the `bash -c` invocation in PDFHasImage, in the same file, which interpolates path into a pdffonts pipeline.
Attack Vector
The attack can be initiated remotely without requiring prior authentication. An attacker must supply a specially crafted path value to the PDF OCR conversion process. Because the application passes user-controlled input directly to a system shell, the attacker can execute arbitrary commands on the underlying operating system [1][2].
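To make the injection concrete, the sketch below rebuilds the command string the way the pre-patch code in the fix diff does, using the hostile file name from the project's regression test. The helper name `buildUnsafeCommand` is hypothetical and not part of docconv; the program only prints the string and never runs it.

```go
package main

import "fmt"

// pdffontsTemplate mirrors the command template visible in the pre-patch
// PDFHasImage code; %s is replaced by the caller-supplied path.
const pdffontsTemplate = "pdffonts -l 5 %s | tail -n +3 | cut -d' ' -f1 | sort | uniq"

// buildUnsafeCommand is a hypothetical helper (not part of docconv). It
// returns the string that would be handed to `bash -c`, without executing it.
func buildUnsafeCommand(path string) string {
	return fmt.Sprintf(pdffontsTemplate, path)
}

func main() {
	// Same hostile file name as in the project's regression test.
	fmt.Println(buildUnsafeCommand("$(id >> foo).pdf"))
}
```

Because the `$(...)` substitution lands in the command line unquoted, a shell would evaluate it before pdffonts ever starts.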
Impact
Successful exploitation grants the attacker the ability to execute arbitrary OS commands with the privileges of the docconv process. This can lead to full system compromise, including data exfiltration, installation of malware, or further lateral movement within the network. The vulnerability has been assigned a critical severity rating [1].
Mitigation
The vulnerability has been patched in docconv version 1.2.1, released shortly after disclosure. The fix, identified by commit b19021ade3d0b71c89d35cb00eb9e589a121faa5, is included in the tagged release v1.2.1 [2][4]. Users of the github.com/sajari/docconv module should upgrade to at least version 1.2.1. Later versions, such as v1.3.5, also contain this fix [3]; for the code.sajari.com/docconv module path, the advisory lists 1.3.5 as the first patched version.
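Separately from upgrading, a common way to rule out this class of bug is to avoid the shell entirely. The sketch below is an alternative design and is not the project's fix, which keeps `bash -c` and escapes the argument. The names `pdffontsCmd` and `fontNames` are hypothetical, and `fontNames` only approximates the `tail | cut | sort | uniq` pipeline in Go (it also skips empty lines).

```go
package main

import (
	"fmt"
	"os/exec"
	"sort"
	"strings"
)

// pdffontsCmd builds the command without a shell. The path travels as one
// argv entry, so shell metacharacters in it are never interpreted.
func pdffontsCmd(path string) *exec.Cmd {
	return exec.Command("pdffonts", "-l", "5", path)
}

// fontNames roughly mirrors `tail -n +3 | cut -d' ' -f1 | sort | uniq`:
// skip the two header lines, keep the first space-separated field of each
// remaining non-empty line, then deduplicate and sort.
func fontNames(output string) []string {
	seen := map[string]bool{}
	var names []string
	for i, line := range strings.Split(output, "\n") {
		if i < 2 || line == "" {
			continue
		}
		name := strings.SplitN(line, " ", 2)[0]
		if !seen[name] {
			seen[name] = true
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// The command is only constructed here, not run.
	cmd := pdffontsCmd("$(id >> foo).pdf")
	fmt.Printf("%q\n", cmd.Args)

	// Made-up sample output, used only to exercise the parsing helper.
	sample := "name type\n---- ----\nArial TrueType\nArial TrueType\nTimes Type1\n"
	fmt.Println(fontNames(sample))
}
```

A path that begins with `-` could still be read as an option by the external tool, so callers may also want to reject or normalize such paths.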
AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| github.com/sajari/docconv (Go) | < 1.2.1 | 1.2.1 |
| code.sajari.com/docconv (Go) | >= 1.1.0, < 1.3.5 | 1.3.5 |
Affected products (3)
- ghsa-coords, 2 versions: >= 1.1.0, < 1.3.5 (+ 1 more)
- (no CPE) range: >= 1.1.0, < 1.3.5
- (no CPE) range: < 1.2.1
Patches
b19021ade3d0: Fix remote code execution vulnerability in the PDF OCR converter (#110)
2 files changed · +53 −1
pdf_ocr.go (+23 −1, modified)

    @@ -1,3 +1,4 @@
    +//go:build ocr
     // +build ocr
     
     package docconv
    @@ -10,6 +11,7 @@ import (
     	"os"
     	"os/exec"
     	"path/filepath"
    +	"regexp"
     	"strings"
     	"sync"
     )
    @@ -111,7 +113,8 @@ func ConvertPDFImages(path string) (BodyResult, error) {
     // PdfHasImage verify if `path` (PDF) has images
     func PDFHasImage(path string) bool {
     	cmd := "pdffonts -l 5 %s | tail -n +3 | cut -d' ' -f1 | sort | uniq"
    -	out, err := exec.Command("bash", "-c", fmt.Sprintf(cmd, path)).Output()
    +	out, err := exec.Command("bash", "-c", fmt.Sprintf(cmd, shellEscape(path))).CombinedOutput()
    +
     	if err != nil {
     		log.Println(err)
     		return false
    @@ -159,3 +162,22 @@ func ConvertPDF(r io.Reader) (string, map[string]string, error) {
     
     	return fullBody, metaResult.meta, nil
     }
    +
    +var shellEscapePattern *regexp.Regexp
    +
    +func init() {
    +	shellEscapePattern = regexp.MustCompile(`[^\w@%+=:,./-]`)
    +}
    +
    +// shellEscape returns a shell-escaped version of the string s. The returned value
    +// is a string that can safely be used as one token in a shell command line.
    +func shellEscape(s string) string {
    +	if len(s) == 0 {
    +		return "''"
    +	}
    +	if shellEscapePattern.MatchString(s) {
    +		return "'" + strings.Replace(s, "'", "'\"'\"'", -1) + "'"
    +	}
    +
    +	return s
    +}
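The escaping helper added by this patch can be exercised on its own. The sketch below copies `shellEscape` and its pattern from the diff (initialized inline rather than in `init`, purely for brevity) and prints the escaped form of a few sample paths; the sample inputs other than the one from the regression test are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Same pattern as in the patch: any character outside this set forces quoting.
var shellEscapePattern = regexp.MustCompile(`[^\w@%+=:,./-]`)

// shellEscape is copied from the fix commit. It wraps s in single quotes when
// needed and rewrites embedded single quotes as '"'"'.
func shellEscape(s string) string {
	if len(s) == 0 {
		return "''"
	}
	if shellEscapePattern.MatchString(s) {
		return "'" + strings.Replace(s, "'", "'\"'\"'", -1) + "'"
	}
	return s
}

func main() {
	samples := []string{"report.pdf", "", "$(id >> foo).pdf", "it's.pdf"}
	for _, p := range samples {
		fmt.Printf("%q -> %s\n", p, shellEscape(p))
	}
}
```

Inside single quotes a POSIX shell performs no expansion, so the `$(...)` in the hostile file name reaches pdffonts as literal text.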
pdf_ocr_test.go (+30 −0, added)

    @@ -0,0 +1,30 @@
    +//go:build ocr
    +// +build ocr
    +
    +package docconv
    +
    +import (
    +	"os"
    +	"testing"
    +)
    +
    +func TestPDFHasImage_CannotExecuteCode(t *testing.T) {
    +	// Try to inject code by passing a bad file path.
    +	// If the code was successful it will create a file called foo in the working directory
    +	badFilePath := "$(id >> foo).pdf"
    +	if got, want := PDFHasImage(badFilePath), false; got != want {
    +		t.Errorf("got %v, want %v", got, want)
    +	}
    +
    +	if got, want := fileExists("foo"), false; got != want {
    +		t.Errorf("got bad file exists, want not file to exist")
    +	}
    +}
    +
    +func fileExists(filename string) bool {
    +	info, err := os.Stat(filename)
    +	if os.IsNotExist(err) {
    +		return false
    +	}
    +	return !info.IsDir()
    +}
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/sajari/docconv/commit/b19021ade3d0b71c89d35cb00eb9e589a121faa5 [ghsa, mitigation, patch, WEB]
- github.com/advisories/GHSA-6m4h-hfpp-x8cx [ghsa, ADVISORY]
- nvd.nist.gov/vuln/detail/CVE-2022-4643 [ghsa, ADVISORY]
- github.com/sajari/docconv/pull/110 [ghsa, related, WEB]
- github.com/sajari/docconv/releases/tag/v1.2.1 [ghsa, mitigation, WEB]
- github.com/sajari/docconv/releases/tag/v1.3.5 [ghsa, WEB]
- pkg.go.dev/vuln/GO-2022-1184 [ghsa, WEB]
- vuldb.com [ghsa, technical-description, vdb-entry, WEB]