Heap OOB access in unicode ops
Description
TensorFlow is an end-to-end open source platform for machine learning. An attacker can access data outside of bounds of heap allocated array in tf.raw_ops.UnicodeEncode. This is because the implementation(https://github.com/tensorflow/tensorflow/blob/472c1f12ad9063405737679d4f6bd43094e1d36d/tensorflow/core/kernels/unicode_ops.cc) assumes that the input_value/input_splits pair specify a valid sparse tensor. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
tensorflowPyPI | < 2.1.4 | 2.1.4 |
tensorflowPyPI | >= 2.2.0, < 2.2.3 | 2.2.3 |
tensorflowPyPI | >= 2.3.0, < 2.3.3 | 2.3.3 |
tensorflowPyPI | >= 2.4.0, < 2.4.2 | 2.4.2 |
tensorflow-cpuPyPI | < 2.1.4 | 2.1.4 |
tensorflow-cpuPyPI | >= 2.2.0, < 2.2.3 | 2.2.3 |
tensorflow-cpuPyPI | >= 2.3.0, < 2.3.3 | 2.3.3 |
tensorflow-cpuPyPI | >= 2.4.0, < 2.4.2 | 2.4.2 |
tensorflow-gpuPyPI | < 2.1.4 | 2.1.4 |
tensorflow-gpuPyPI | >= 2.2.0, < 2.2.3 | 2.2.3 |
tensorflow-gpuPyPI | >= 2.3.0, < 2.3.3 | 2.3.3 |
tensorflow-gpuPyPI | >= 2.4.0, < 2.4.2 | 2.4.2 |
Affected products
1- Range: < 2.1.4
Patches
151300ba1cc2fFix heap buffer overflow in tf.raw_ops.UnicodeEncode.
1 file changed · +19 −0
tensorflow/core/kernels/unicode_ops.cc+19 −0 modified@@ -533,6 +533,17 @@ class UnicodeEncodeOp : public OpKernel { const Tensor& input_splits = context->input(1); const auto input_splits_flat = input_splits.flat<SPLITS_TYPE>(); + // Operation will treat first argument in input_splits as if it were zero + // regardless of its actual value since splits should begin with zero and + // end with the length of the input values vector. + OP_REQUIRES( + context, input_splits_flat(0) == 0, + errors::InvalidArgument("First value in input_splits must be zero.")); + OP_REQUIRES(context, + input_splits_flat(input_splits_flat.size() - 1) == + input_tensor_flat.size(), + errors::InvalidArgument("Last value in input_splits must be " + "equal to length of input_tensor.")); // Since we limit to a 2-D input (flat_values of rank 1 and a single splits // tensor), our output dimension will be 1 with it's size equal to the // number of splits (outer dimension or ragged tensor). @@ -548,6 +559,14 @@ class UnicodeEncodeOp : public OpKernel { for (int i = 1; i < input_splits_flat.size(); ++i) { icu::UnicodeString unicode_string; icu::UnicodeStringAppendable appendable_unicode_string(unicode_string); + OP_REQUIRES( + context, input_splits_flat(i - 1) <= input_splits_flat(i), + errors::InvalidArgument( + "Values in input_splits must be equal or in ascending order.")); + OP_REQUIRES( + context, input_splits_flat(i) <= input_tensor_flat.size(), + errors::InvalidArgument("Values in input_splits must be less than or " + "equal to input_tensor length.")); for (; idx < input_splits_flat(i); ++idx) { int32 code_point = input_tensor_flat(idx); // Check for invalid code point
Vulnerability mechanics
Generated by null/stub on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
7- github.com/advisories/GHSA-59q2-x2qc-4c97ghsaADVISORY
- nvd.nist.gov/vuln/detail/CVE-2021-29559ghsaADVISORY
- github.com/pypa/advisory-database/tree/main/vulns/tensorflow-cpu/PYSEC-2021-487.yamlghsaWEB
- github.com/pypa/advisory-database/tree/main/vulns/tensorflow-gpu/PYSEC-2021-685.yamlghsaWEB
- github.com/pypa/advisory-database/tree/main/vulns/tensorflow/PYSEC-2021-196.yamlghsaWEB
- github.com/tensorflow/tensorflow/commit/51300ba1cc2f487aefec6e6631fef03b0e08b298ghsax_refsource_MISCWEB
- github.com/tensorflow/tensorflow/security/advisories/GHSA-59q2-x2qc-4c97ghsax_refsource_CONFIRMWEB
News mentions
0No linked articles in our index yet.