VYPR
Critical severity · OSV Advisory · Published Dec 21, 2022 · Updated Aug 3, 2024

docconv pdf_ocr.go ConvertPDFImages os command injection

CVE-2022-4643

Description

CVE-2022-4643 is a critical OS command injection vulnerability in docconv ≤1.2.0's ConvertPDFImages function, allowing remote code execution via a crafted path argument.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability Details

CVE-2022-4643 is a critical OS command injection vulnerability in the docconv Go package up to version 1.2.0. The CVE description attributes the flaw to the ConvertPDFImages function in pdf_ocr.go, where manipulating the path argument lets an attacker inject arbitrary operating system commands that the application then executes [1][2]. The fix commit changes PDFHasImage in the same file, which formatted path unquoted into a command line run through bash -c.

Attack Vector

The attack can be initiated remotely and does not require prior authentication. An attacker must supply a specially crafted path value to the PDF OCR conversion process. Because the application interpolates this user-controlled value into a shell command line without quoting, shell metacharacters in the path are interpreted by the shell and the attacker can execute arbitrary commands on the underlying operating system [1][2].
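To make the injection path concrete, here is a minimal Go sketch that formats the command template from pdf_ocr.go with a hostile path and prints the resulting string instead of running it. The helper name buildUnsafeCommand is hypothetical and exists only for illustration; the template and the sample path are taken from the fix commit and its test listed under Patches.

```go
package main

import "fmt"

// pdffontsTemplate is the command template used by PDFHasImage in pdf_ocr.go.
const pdffontsTemplate = "pdffonts -l 5 %s | tail -n +3 | cut -d' ' -f1 | sort | uniq"

// buildUnsafeCommand is a hypothetical helper that mirrors the pre-patch
// behavior: the path is placed into the shell command line without quoting.
func buildUnsafeCommand(path string) string {
	return fmt.Sprintf(pdffontsTemplate, path)
}

func main() {
	// The string is only printed here; nothing is executed.
	fmt.Println(buildUnsafeCommand("$(id >> foo).pdf"))
}
```

When such a string is handed to bash -c, the `$(...)` part is treated as a command substitution and runs before pdffonts is invoked.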

Impact

Successful exploitation grants the attacker the ability to execute arbitrary OS commands with the privileges of the docconv process. This can lead to full system compromise, including data exfiltration, installation of malware, or further lateral movement within the network. The vulnerability has been assigned a critical severity rating [1].

Mitigation

The vulnerability is patched in docconv version 1.2.1. The fix, commit b19021ade3d0b71c89d35cb00eb9e589a121faa5 (dated Jul 7, 2022), is included in the tagged release v1.2.1 [2][4]. Users of github.com/sajari/docconv should upgrade to at least version 1.2.1. For the code.sajari.com/docconv module path, the advisory lists 1.3.5 as the patched version [3].
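Where upgrading is not immediately possible, a common hardening pattern is to avoid the shell entirely. The sketch below is not the project's fix (the patch quotes the path instead); it is an alternative under the assumption that the pipeline's text processing can be done in Go. The helper names pdffontsCmd and fontNames are hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
	"sort"
	"strings"
)

// pdffontsCmd is a hypothetical helper that builds the command without a
// shell. The path travels as a single argv entry, so shell metacharacters
// in it are never interpreted.
func pdffontsCmd(path string) *exec.Cmd {
	return exec.Command("pdffonts", "-l", "5", path)
}

// fontNames is a hypothetical helper that approximates the original
// pipeline `tail -n +3 | cut -d' ' -f1 | sort | uniq` in Go: skip the two
// header lines, keep the first space-delimited field, de-duplicate, sort.
// Unlike the pipeline, it drops empty fields.
func fontNames(out string) []string {
	lines := strings.Split(out, "\n")
	if len(lines) <= 2 {
		return nil
	}
	seen := make(map[string]bool)
	var names []string
	for _, line := range lines[2:] {
		name := strings.SplitN(line, " ", 2)[0]
		if name == "" || seen[name] {
			continue
		}
		seen[name] = true
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	cmd := pdffontsCmd("$(id >> foo).pdf")
	fmt.Printf("%q\n", cmd.Args) // shows the path as one argument

	// Made-up sample text standing in for pdffonts output.
	fmt.Println(fontNames("name type\n---- ----\nB x\nA y\nB z\n"))
}
```

This removes shell interpretation but not every risk: a path beginning with `-` could still be read as an option by the called program, so callers may want to validate or absolutize paths first.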

AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package                          Affected versions    Patched versions
github.com/sajari/docconv (Go)   < 1.2.1              1.2.1
code.sajari.com/docconv (Go)     >= 1.1.0, < 1.3.5    1.3.5

Affected products

3

Patches

1
b19021ade3d0

Fix remote code execution vulnerability in the PDF OCR converter (#110)

https://github.com/sajari/docconv · Helena Mariano · Jul 7, 2022 · via ghsa
2 files changed · +53 -1
  • pdf_ocr.go +23 -1 modified
    @@ -1,3 +1,4 @@
    +//go:build ocr
     // +build ocr
     
     package docconv
    @@ -10,6 +11,7 @@ import (
     	"os"
     	"os/exec"
     	"path/filepath"
    +	"regexp"
     	"strings"
     	"sync"
     )
    @@ -111,7 +113,8 @@ func ConvertPDFImages(path string) (BodyResult, error) {
     // PdfHasImage verify if `path` (PDF) has images
     func PDFHasImage(path string) bool {
     	cmd := "pdffonts -l 5 %s | tail -n +3 | cut -d' ' -f1 | sort | uniq"
    -	out, err := exec.Command("bash", "-c", fmt.Sprintf(cmd, path)).Output()
    +	out, err := exec.Command("bash", "-c", fmt.Sprintf(cmd, shellEscape(path))).CombinedOutput()
    +
     	if err != nil {
     		log.Println(err)
     		return false
    @@ -159,3 +162,22 @@ func ConvertPDF(r io.Reader) (string, map[string]string, error) {
     	return fullBody, metaResult.meta, nil
     
     }
    +
    +var shellEscapePattern *regexp.Regexp
    +
    +func init() {
    +	shellEscapePattern = regexp.MustCompile(`[^\w@%+=:,./-]`)
    +}
    +
    +// shellEscape returns a shell-escaped version of the string s. The returned value
    +// is a string that can safely be used as one token in a shell command line.
    +func shellEscape(s string) string {
    +	if len(s) == 0 {
    +		return "''"
    +	}
    +	if shellEscapePattern.MatchString(s) {
    +		return "'" + strings.Replace(s, "'", "'\"'\"'", -1) + "'"
    +	}
    +
    +	return s
    +}
    
  • pdf_ocr_test.go +30 -0 added
    @@ -0,0 +1,30 @@
    +//go:build ocr
    +// +build ocr
    +
    +package docconv
    +
    +import (
    +	"os"
    +	"testing"
    +)
    +
    +func TestPDFHasImage_CannotExecuteCode(t *testing.T) {
    +	// Try to inject code by passing a bad file path.
    +	// If the code was successful it will create a file called foo in the working directory
    +	badFilePath := "$(id >> foo).pdf"
    +	if got, want := PDFHasImage(badFilePath), false; got != want {
    +		t.Errorf("got %v, want %v", got, want)
    +	}
    +
    +	if got, want := fileExists("foo"), false; got != want {
    +		t.Errorf("got bad file exists, want not file to exist")
    +	}
    +}
    +
    +func fileExists(filename string) bool {
    +	info, err := os.Stat(filename)
    +	if os.IsNotExist(err) {
    +		return false
    +	}
    +	return !info.IsDir()
    +}
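
The effect of the quoting helper added in the patch can be checked in isolation. The following sketch copies shellEscape and its pattern from the diff above into a standalone program and prints the result for a few sample inputs; the sample inputs other than the one from the test are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Copied from the patch: any character outside this set forces quoting.
var shellEscapePattern = regexp.MustCompile(`[^\w@%+=:,./-]`)

// shellEscape is copied from the patch. It wraps s in single quotes when it
// contains characters outside the safe set, and rewrites embedded single
// quotes as '"'"' so they cannot terminate the quoted token.
func shellEscape(s string) string {
	if len(s) == 0 {
		return "''"
	}
	if shellEscapePattern.MatchString(s) {
		return "'" + strings.Replace(s, "'", "'\"'\"'", -1) + "'"
	}
	return s
}

func main() {
	for _, in := range []string{"report.pdf", "$(id >> foo).pdf", "it's.pdf", ""} {
		fmt.Printf("%q -> %s\n", in, shellEscape(in))
	}
}
```

Inside single quotes the shell performs no expansion, so the `$(id >> foo)` from the test is passed to pdffonts as a literal file name rather than being executed.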
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

8
