VYPR
Low severity · NVD Advisory · Published May 14, 2021 · Updated Aug 3, 2024

Heap OOB access in unicode ops

CVE-2021-29559

Description

TensorFlow is an end-to-end open source platform for machine learning. An attacker can access data outside the bounds of a heap-allocated array in tf.raw_ops.UnicodeEncode. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/472c1f12ad9063405737679d4f6bd43094e1d36d/tensorflow/core/kernels/unicode_ops.cc) assumes that the input_value/input_splits pair specifies a valid sparse tensor. The fix will be included in TensorFlow 2.5.0. The commit will also be cherry-picked to TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these versions are also affected and still within the supported range.
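
To make the description above concrete, here is a minimal plain-Python sketch of how a row-splits vector partitions a flat list of code points. It is not TensorFlow code: the helper name split_rows and the sample data are hypothetical, chosen only for illustration. Python slicing clamps out-of-range offsets, whereas the diff further down shows the kernel indexing the values buffer directly, which is why an unchecked offset can reach past the end of the heap allocation.

```python
def split_rows(values, splits):
    """Slice a flat list of code points into rows using row-split offsets.

    Row i covers values[splits[i - 1]:splits[i]], mirroring the loop
    structure visible in the patched kernel (hypothetical helper).
    """
    return [values[splits[i - 1]:splits[i]] for i in range(1, len(splits))]


# Illustrative data: the code points of "Hi!" followed by "OK".
values = [72, 105, 33, 79, 75]

# A well-formed splits vector starts at 0 and ends at len(values).
good_splits = [0, 3, 5]
decoded = ["".join(chr(c) for c in row) for row in split_rows(values, good_splits)]

# A malformed splits vector: the last offset points past the end of values.
# Python clamps the slice; code that indexes a raw buffer would not.
bad_splits = [0, 3, 9]
out_of_range = [s for s in bad_splits if s > len(values)]
```

With the sample data, decoded holds the two strings and out_of_range lists the single offset that exceeds the length of values.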

Affected packages

Versions sourced from the GitHub Security Advisory.

Package (ecosystem)      Affected versions      Patched versions
tensorflow (PyPI)        < 2.1.4                2.1.4
tensorflow (PyPI)        >= 2.2.0, < 2.2.3      2.2.3
tensorflow (PyPI)        >= 2.3.0, < 2.3.3      2.3.3
tensorflow (PyPI)        >= 2.4.0, < 2.4.2      2.4.2
tensorflow-cpu (PyPI)    < 2.1.4                2.1.4
tensorflow-cpu (PyPI)    >= 2.2.0, < 2.2.3      2.2.3
tensorflow-cpu (PyPI)    >= 2.3.0, < 2.3.3      2.3.3
tensorflow-cpu (PyPI)    >= 2.4.0, < 2.4.2      2.4.2
tensorflow-gpu (PyPI)    < 2.1.4                2.1.4
tensorflow-gpu (PyPI)    >= 2.2.0, < 2.2.3      2.2.3
tensorflow-gpu (PyPI)    >= 2.3.0, < 2.3.3      2.3.3
tensorflow-gpu (PyPI)    >= 2.4.0, < 2.4.2      2.4.2
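
The ranges in the table above can be checked mechanically. The following is a rough sketch, assuming plain MAJOR.MINOR.PATCH version strings; the names AFFECTED_RANGES, parse_version and is_affected are hypothetical, and pre-release suffixes such as "rc0" are ignored, so release candidates may be misclassified.

```python
import re

# Affected ranges copied from the table above, as
# (inclusive lower bound, exclusive upper bound) tuples.
AFFECTED_RANGES = [
    ((0, 0, 0), (2, 1, 4)),
    ((2, 2, 0), (2, 2, 3)),
    ((2, 3, 0), (2, 3, 3)),
    ((2, 4, 0), (2, 4, 2)),
]


def parse_version(text):
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """Return True if the version falls inside any affected range."""
    version = parse_version(version_text)
    return any(low <= version < high for low, high in AFFECTED_RANGES)
```

One way to use it would be to pass the installed version string, for example the value of tf.__version__, to is_affected.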

Patches

51300ba1cc2f

Fix heap buffer overflow in tf.raw_ops.UnicodeEncode.

https://github.com/tensorflow/tensorflow · Laura Pak · May 3, 2021 · via ghsa
1 file changed · +19 -0
  • tensorflow/core/kernels/unicode_ops.cc +19 -0 modified
    @@ -533,6 +533,17 @@ class UnicodeEncodeOp : public OpKernel {
         const Tensor& input_splits = context->input(1);
         const auto input_splits_flat = input_splits.flat<SPLITS_TYPE>();
     
    +    // Operation will treat first argument in input_splits as if it were zero
    +    // regardless of its actual value since splits should begin with zero and
    +    // end with the length of the input values vector.
    +    OP_REQUIRES(
    +        context, input_splits_flat(0) == 0,
    +        errors::InvalidArgument("First value in input_splits must be zero."));
    +    OP_REQUIRES(context,
    +                input_splits_flat(input_splits_flat.size() - 1) ==
    +                    input_tensor_flat.size(),
    +                errors::InvalidArgument("Last value in input_splits must be "
    +                                        "equal to length of input_tensor."));
         // Since we limit to a 2-D input (flat_values of rank 1 and a single splits
         // tensor), our output dimension will be 1 with it's size equal to the
         // number of splits (outer dimension or ragged tensor).
    @@ -548,6 +559,14 @@ class UnicodeEncodeOp : public OpKernel {
         for (int i = 1; i < input_splits_flat.size(); ++i) {
           icu::UnicodeString unicode_string;
           icu::UnicodeStringAppendable appendable_unicode_string(unicode_string);
    +      OP_REQUIRES(
    +          context, input_splits_flat(i - 1) <= input_splits_flat(i),
    +          errors::InvalidArgument(
    +              "Values in input_splits must be equal or in ascending order."));
    +      OP_REQUIRES(
    +          context, input_splits_flat(i) <= input_tensor_flat.size(),
    +          errors::InvalidArgument("Values in input_splits must be less than or "
    +                                  "equal to input_tensor length."));
           for (; idx < input_splits_flat(i); ++idx) {
             int32 code_point = input_tensor_flat(idx);
             // Check for invalid code point
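
The four checks added by the patch can be restated outside the kernel. The following plain-Python sketch mirrors them with the same error messages; it is an illustration under stated assumptions, not the TensorFlow implementation. The names validate_splits and first_error are hypothetical, the checks run up front rather than inside the encoding loop, and the empty-splits guard is an extra assumption of this sketch that does not appear in the diff.

```python
def validate_splits(values, splits):
    """Raise ValueError if splits is not a valid row-splits vector for values."""
    # Extra guard for this sketch only; the diff shown does not include it.
    if len(splits) == 0:
        raise ValueError("input_splits must not be empty.")
    if splits[0] != 0:
        raise ValueError("First value in input_splits must be zero.")
    if splits[-1] != len(values):
        raise ValueError(
            "Last value in input_splits must be equal to length of input_tensor.")
    for i in range(1, len(splits)):
        if splits[i - 1] > splits[i]:
            raise ValueError(
                "Values in input_splits must be equal or in ascending order.")
        if splits[i] > len(values):
            raise ValueError(
                "Values in input_splits must be less than or "
                "equal to input_tensor length.")


def first_error(values, splits):
    """Return the validation error message, or None if the pair is valid."""
    try:
        validate_splits(values, splits)
    except ValueError as err:
        return str(err)
    return None
```

For instance, first_error([72, 105], [0, 2]) returns None, while a splits vector that starts at 1 or ends past the values length yields the corresponding message.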
    
