VYPR
Low severity · NVD Advisory · Published Dec 16, 2019 · Updated Aug 5, 2024

Heap buffer overflow in `UnsortedSegmentSum` in TensorFlow

CVE-2019-16778

Description

In TensorFlow before 1.15, a heap buffer overflow in UnsortedSegmentSum can be produced when the Index template argument is int32. In this case data_size and num_segments fields are truncated from int64 to int32 and can produce negative numbers, resulting in accessing out of bounds heap memory. This is unlikely to be exploitable and was detected and fixed internally in TensorFlow 1.15 and 2.0.
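The narrowing described above is plain C++ integer truncation. Below is a minimal standalone sketch (ordinary C++, not TensorFlow source) of how an int64 size wraps negative when cast to a 32-bit Index:

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // data_size and num_segments arrive as int64 values; the vulnerable
  // kernel narrowed them to its 32-bit Index template argument.
  const int64_t data_size = (1LL << 31) + 4096;  // > INT32_MAX, valid int64
  const auto truncated = static_cast<int32_t>(data_size);

  std::cout << data_size << " -> " << truncated << "\n";
  // On two's-complement platforms this prints:
  //   2147487744 -> -2147479552
  // A negative size then poisons every stride, extent, and bounds
  // computation downstream, which is the out-of-bounds access the
  // advisory describes.
  return 0;
}
```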

Affected packages

Versions sourced from the GitHub Security Advisory.

Package          Ecosystem   Affected versions   Patched versions
tensorflow       PyPI        < 1.15.0            1.15.0
tensorflow-cpu   PyPI        < 1.15.0            1.15.0
tensorflow-gpu   PyPI        < 1.15.0            1.15.0

Affected products: 1

Patches (2)

590d6eef7e91

Merge pull request #31861 from tensorflow-jenkins/relnotes-1.15.0rc0-16184

https://github.com/tensorflow/tensorflow · Goldie Gadde · Oct 14, 2019 · via osv
1 file changed · +87 −0
  • RELEASE.md · +87 −0 · modified
    @@ -1,3 +1,90 @@
    +# Release 1.15.0
    +This is the last 1.x release for TensorFlow. We do not expect to update the 1.x branch with features, although we will issue patch releases to fix vulnerabilities for at least one year. 
    +
    +## Major Features and Improvements
    +* As [announced](https://groups.google.com/a/tensorflow.org/forum/#!topic/developers/iRCt5m4qUz0), `tensorflow` pip package will by default include GPU support (same as `tensorflow-gpu` now) for the platforms we currently have GPU support (Linux and Windows). It will work on machines with and without Nvidia GPUs. `tensorflow-gpu` will still be available, and CPU-only packages can be downloaded at `tensorflow-cpu` for users who are concerned about package size.
    +* TensorFlow 1.15 contains a complete implementation of the 2.0 API in its `compat.v2` module. It contains a copy of the 1.15 main module (without `contrib`) in the `compat.v1` module. TensorFlow 1.15 is able to emulate 2.0 behavior using the `enable_v2_behavior()` function.
    +This enables writing forward compatible code: by explicitly importing either `tensorflow.compat.v1` or `tensorflow.compat.v2`, you can ensure that your code works without modifications against an installation of 1.15 or 2.0.
    +* EagerTensor now supports numpy buffer interface for tensors.
    +* Add toggles `tf.enable_control_flow_v2()` and `tf.disable_control_flow_v2()` for enabling/disabling v2 control flow.
    +* Enable v2 control flow as part of `tf.enable_v2_behavior()` and `TF2_BEHAVIOR=1`.
    +* AutoGraph translates Python control flow into TensorFlow expressions, allowing users to write regular Python inside `tf.function`-decorated functions. AutoGraph is also applied in functions used with `tf.data`, `tf.distribute` and `tf.keras` APIS.
    +* Adds `enable_tensor_equality()`, which switches the behavior such that: 
    +  * Tensors are no longer hashable.
    +  * Tensors can be compared with `==` and `!=`, yielding a Boolean Tensor with element-wise comparison results. This will be the default behavior in 2.0.
    +
    +## Breaking Changes
    +* Tensorflow code now produces 2 different pip packages: `tensorflow_core` containing all the code (in the future it will contain only the private implementation) and `tensorflow` which is a virtual pip package doing forwarding to `tensorflow_core` (and in the future will contain only the public API of tensorflow). We don't expect this to be breaking, unless you were importing directly from the implementation.
    +* TensorFlow 1.15 is built using devtoolset7 (GCC7) on Ubuntu 16. This may lead to ABI incompatibilities with extensions built against earlier versions of TensorFlow.
    +* Deprecated the use of `constraint=` and `.constraint` with ResourceVariable.
    +* `tf.keras`:
    +  * `OMP_NUM_THREADS` is no longer used by the default Keras config. To configure the number of threads, use `tf.config.threading` APIs.
    +  * `tf.keras.model.save_model` and `model.save` now defaults to saving a TensorFlow SavedModel.
    +  * `keras.backend.resize_images` (and consequently, `keras.layers.Upsampling2D`) behavior has changed, a bug in the resizing implementation was fixed.
    +  * Layers now default to `float32`, and automatically cast their inputs to the layer's dtype. If you had a model that used `float64`, it will probably silently use `float32` in TensorFlow2, and a warning will be issued that starts with Layer "layer-name" is casting an input tensor from dtype float64 to the layer's dtype of float32. To fix, either set the default dtype to float64 with `tf.keras.backend.set_floatx('float64')`, or pass `dtype='float64'` to each of the Layer constructors. See `tf.keras.layers.Layer` for more information.
    +  * Some `tf.assert_*` methods now raise assertions at operation creation time (i.e. when this Python line executes) if the input tensors' values are known at that time, not during the session.run(). When this happens, a noop is returned and the input tensors are marked non-feedable. In other words, if they are used as keys in `feed_dict` argument to `session.run()`, an error will be raised. Also, because some assert ops don't make it into the graph, the graph structure changes. A different graph can result in different per-op random seeds when they are not given explicitly (most often).
    +
    +## Bug Fixes and Other Changes
    +* `tf.estimator`:
    +  * `tf.keras.estimator.model_to_estimator` now supports exporting to `tf.train.Checkpoint` format, which allows the saved checkpoints to be compatible with `model.load_weights`.
    +  * Fix tests in canned estimators.
    +  * Expose Head as public API.
    +  * Fixes critical bugs that help with `DenseFeatures` usability in TF2
    +* `tf.data`:
    +  * Promoting `unbatch` from experimental to core API.
    +  * Adding support for datasets as inputs to `from_tensors` and `from_tensor_slices` and batching and unbatching of nested datasets.
    +* `tf.keras`:
    +  * `tf.keras.estimator.model_to_estimator` now supports exporting to tf.train.Checkpoint format, which allows the saved checkpoints to be compatible with `model.load_weights`.
    +  * Saving a Keras Model using `tf.saved_model.save` now saves the list of variables, trainable variables, regularization losses, and the call function.
    +  * Deprecated `tf.keras.experimental.export_saved_model` and `tf.keras.experimental.function`. Please use `tf.keras.models.save_model(..., save_format='tf')` and `tf.keras.models.load_model` instead.
    +  * Add an `implementation=3` mode for `tf.keras.layers.LocallyConnected2D` and `tf.keras.layers.LocallyConnected1D` layers using `tf.SparseTensor` to store weights,  allowing a dramatic speedup for large sparse models.
    +  * Enable the Keras compile API `experimental_run_tf_function` flag by default. This flag enables single training/eval/predict execution path. With this 1. All input types are converted to `Dataset`. 2. When distribution strategy is not specified this goes through the no-op distribution strategy path. 3. Execution is wrapped in tf.function unless `run_eagerly=True` is set in compile.
    +  * Raise error if `batch_size` argument is used when input is dataset/generator/keras sequence.
    +* `tf.lite`
    +  * Add `GATHER` support to NN API delegate.
    +  * tflite object detection script has a debug mode.
    +  * Add delegate support for `QUANTIZE`.
    +  * Added evaluation script for COCO minival.
    +  * Add delegate support for `QUANTIZED_16BIT_LSTM`.
    +  * Converts hardswish subgraphs into atomic ops.
    +* Add support for defaulting the value of `cycle_length` argument of `tf.data.Dataset.interleave` to the number of schedulable CPU cores.
    +* `parallel_for`: Add converter for `MatrixDiag`.
    +* Add `narrow_range` attribute to `QuantizeAndDequantizeV2` and V3.
    +* Added new op: `tf.strings.unsorted_segment_join`.
    +* Add HW acceleration support for `topK_v2`.
    +* Add new `TypeSpec` classes.
    +* CloudBigtable version updated to v0.10.0.
    +* Expose `Head` as public API.
    +* Update docstring for gather to properly describe the non-empty `batch_dims` case.
    +* Added `tf.sparse.from_dense` utility function.
    +* Improved ragged tensor support in `TensorFlowTestCase`.
    +* Makes the a-normal form transformation in Pyct configurable as to which nodes are converted to variables and which are not.
    +* `ResizeInputTensor` now works for all delegates.
    +* Add `EXPAND_DIMS` support to NN API delegate TEST:  expand_dims_test
    +* `tf.cond` emits a StatelessIf op if the branch functions are stateless and do not touch any resources.
    +* `tf.cond`, `tf.while` and `if` and `while` in AutoGraph now accept a nonscalar predicate if has a single element. This does not affect non-V2 control flow.
    +* `tf.while_loop` emits a StatelessWhile op if the cond and body functions are stateless and do not touch any resources.
    +* Refactors code in Quant8 LSTM support to reduce TFLite binary size.
    +* Add support of local soft device placement for eager op.
    +* Add HW acceleration support for `LogSoftMax`.
    +* Added a function `nested_value_rowids` for ragged tensors.
    +* Add guard to avoid acceleration of L2 Normalization with input rank != 4
    +* Add `tf.math.cumulative_logsumexp operation`.
    +* Add `tf.ragged.stack`.
    +* Fix memory allocation problem when calling `AddNewInputConstantTensor`.
    +* Delegate application failure leaves interpreter in valid state.
    +* Add check for correct memory alignment to `MemoryAllocation::MemoryAllocation()`.
    +* Extracts `NNAPIDelegateKernel` from nnapi_delegate.cc
    +* Added support for `FusedBatchNormV3` in converter.
    +* A ragged to dense op for directly calculating tensors.
    +* Fix accidental quadratic graph construction cost in graph-mode `tf.gradients()`.
    +
    +## Thanks to our Contributors
    +
    +This release contains contributions from many people at Google, as well as:
    +
    +a6802739, Aaron Ma, Abdullah Selek, Abolfazl Shahbazi, Ag Ramesh, Albert Z. Guo, Albin Joy, Alex Itkes, Alex Sergeev, Alexander Pivovarov, Alexey Romanov, alhkad, Amit Srivastava, amoitra, Andrew Lihonosov, Andrii Prymostka, Anuj Rawat, Astropeak, Ayush Agrawal, Bairen Yi, Bas Aarts, Bastian Eichenberger, Ben Barsdell, Benjamin Peterson, bhack, Bharat Raghunathan, Bhavani Subramanian, Bryan Cutler, candy.dc, Cao Zongyan, Captain-Pool, Casper Da Costa-Luis, Chen Guoyin, Cheng Chang, chengchingwen, Chong Yan, Choong Yin Thong, Christopher Yeh, Clayne Robison, Coady, Patrick, Dan Ganea, David Norman, Denis Khalikov, Deven Desai, Diego Caballero, Duncan Dean, Duncan Riach, Dwight J Lyle, Eamon Ito-Fisher, eashtian3, EFanZh, ejot, Elroy Ashtian Jr, Eric Schweitz, Fangjun Kuang, Fei Hu, fo40225, formath, Fred Reiss, Frederic Bastien, Fredrik Knutsson, G. Hussain Chinoy, Gabriel, gehring, George Grzegorz Pawelczak, Gianluca Varisco, Gleb Popov, Greg Peatfield, Guillaume Klein, Gurpreet Singh, Gustavo Lima Chaves, haison, Haraldur TóMas HallgríMsson, HarikrishnanBalagopal, HåKon Sandsmark, I-Hong, Ilham Firdausi Putra, Imran Salam, Jason Zaman, Jason Zavaglia, jayhpark530, jefby, Jeff Daily, Jeffrey Poznanovic, Jekyll Lai, Jeroen BéDorf, Jerry Shih, jerryyin, jiakai, JiangXIAO, Joe Bowser, Joel Shapiro, Johan Gunnarsson, Jojimon Varghese, Joon, Josh Beal, Julian Niedermeier, Jun Wan, Junqin Zhang, Junyuan Xie, Justin Tunis, Kaixi Hou, Karl Lessard, Karthik Muthuraman, Kbhute-Ibm, khanhlvg, Koock Yoon, kstuedem, Kyuwon Kim, Lakshay Tokas, leike666666, leonard951, Leslie-Fang, Leslie-Fang-Intel, Li, Guizi, Lukas Folle, Lukas Geiger, Mahmoud Abuzaina, Manraj Singh Grover, Margaret Maynard-Reid, Mark Ryan, Matt Conley, Matthew Bentham, Matthew Denton, mbhuiyan, mdfaijul, Mei Jie, merturl, MichaelKonobeev, Michal W. Tarnowski, minds, mpppk, musikisomorphie, Nagy Mostafa, Nayana Thorat, Neil, Niels Ole Salscheider, Niklas SilfverströM, Niranjan Hasabnis, ocjosen, olramde, Pariksheet Pinjari, Patrick J. Lopresti, Patrik Gustavsson, per1234, PeterLee, Phan Van Nguyen Duc, Phillip Kravtsov, Pooya Davoodi, Pranav Marathe, Putra Manggala, Qingqing Cao, Rajeshwar Reddy T, Ramon ViñAs, Rasmus Diederichsen, Reuben Morais, richardbrks, robert, RonLek, Ryan Jiang, saishruthi, Saket Khandelwal, Saleem Abdulrasool, Sami Kama, Sana-Damani, Sergii Khomenko, Severen Redwood, Shubham Goyal, Sigrid Keydana, Siju Samuel, sleighsoft, smilu97, Son Tran, Srini511, srinivasan.narayanamoorthy, Sumesh Udayakumaran, Sungmann Cho, Tae-Hwan Jung, Taehoon Lee, Takeshi Watanabe, TengLu, terryky, TheMindVirus, ThisIsIsaac, Till Hoffmann, Timothy Liu, Tomer Gafner, Tongxuan Liu, Trent Lo, Trevor Morris, Uday Bondhugula, Vasileios Lioutas, vbvg2008, Vishnuvardhan Janapati, Vivek Suryamurthy, Wei Wang, Wen-Heng (Jack) Chung, wenxizhu, William D. Irons, winstonq, wyzhao, Xiaoming (Jason) Cui, Xinan Jiang, Xinping Wang, Yann-Yy, Yasir Modak, Yong Tang, Yongfeng Gu, Yuchen Ying, Yuxin Wu, zyeric, 王振华 (Zhenhua Wang)
    +
     # Release 1.14.0
     
     ## Major Features and Improvements
    
db4f9717c41b

Fix heap buffer overflow in UnsortedSegmentSum.

https://github.com/tensorflow/tensorflow · RJ Skerry-Ryan · Jul 3, 2019 · via ghsa
3 files changed · +32 −33
  • tensorflow/core/kernels/segment_reduction_ops.cc · +9 −10 · modified
    @@ -376,18 +376,17 @@ namespace functor {
     template <typename T, typename Index, typename InitialValueF,
               typename ReductionF>
     struct UnsortedSegmentFunctor<CPUDevice, T, Index, InitialValueF, ReductionF> {
    -  void operator()(OpKernelContext* ctx, const Index num_segments,
    -                  const TensorShape& segment_ids_shape,
    +  void operator()(OpKernelContext* ctx, const TensorShape& segment_ids_shape,
                       typename TTypes<Index>::ConstFlat segment_ids,
    -                  const Index data_size, const T* data,
    +                  typename TTypes<T, 2>::ConstTensor data,
                       typename TTypes<T, 2>::Tensor output) {
         output.setConstant(InitialValueF()());
    -    if (data_size == 0) {
    +    if (data.size() == 0) {
           return;
         }
         const int64 N = segment_ids.dimension(0);
    +    const int64 num_segments = output.dimension(0);
         ReductionF reduction;
    -    auto data_flat = typename TTypes<T, 2>::ConstTensor(data, N, data_size / N);
         for (int64 i = 0; i < N; ++i) {
           Index j = internal::SubtleMustCopy(segment_ids(i));
           if (j < 0) {
    @@ -397,7 +396,7 @@ struct UnsortedSegmentFunctor<CPUDevice, T, Index, InitialValueF, ReductionF> {
                       errors::InvalidArgument(
                           "segment_ids", SliceDebugString(segment_ids_shape, i),
                           " = ", j, " is out of range [0, ", num_segments, ")"));
    -      reduction(data_flat.template chip<0>(i), output.template chip<0>(j));
    +      reduction(data.template chip<0>(i), output.template chip<0>(j));
         }
       }
     };
    @@ -485,7 +484,7 @@ class UnsortedSegmentReductionOp : public OpKernel {
           return;
         }
         const auto segment_flat = segment_ids.flat<Index>();
    -    const Index output_rows = internal::SubtleMustCopy(static_cast<Index>(
    +    const int64 output_rows = internal::SubtleMustCopy(static_cast<int64>(
             num_segments.dtype() == DT_INT32 ? num_segments.scalar<int32>()()
                                              : num_segments.scalar<int64>()()));
         OP_REQUIRES(context, output_rows >= 0,
    @@ -499,9 +498,9 @@ class UnsortedSegmentReductionOp : public OpKernel {
         Tensor* output = nullptr;
         OP_REQUIRES_OK(context, context->allocate_output(0, output_shape, &output));
         auto output_flat = output->flat_outer_dims<T>();
    -    auto data_ptr = data.template flat<T>().data();
    -    reduction_functor_(context, output_rows, segment_ids.shape(), segment_flat,
    -                       data.NumElements(), data_ptr, output_flat);
    +    auto data_flat = data.flat_inner_outer_dims<T, 2>(segment_ids.dims() - 1);
    +    reduction_functor_(context, segment_ids.shape(), segment_flat, data_flat,
    +                       output_flat);
       }
     
      protected:
    
  • tensorflow/core/kernels/segment_reduction_ops_gpu.cu.cc · +21 −20 · modified
    @@ -106,21 +106,21 @@ __global__ void SortedSegmentSumCustomKernel(const Index input_outer_dim_size,
     // Each element is mapped from input to output by a combination of its
     // 'segment_ids' mapping and 'inner_dim_size'.
     template <typename T, typename Index, typename KernelReductionFunctor>
    -__global__ void UnsortedSegmentCustomKernel(const Index input_outer_dim_size,
    -                                            const Index inner_dim_size,
    -                                            const Index output_outer_dim_size,
    +__global__ void UnsortedSegmentCustomKernel(const int64 input_outer_dim_size,
    +                                            const int64 inner_dim_size,
    +                                            const int64 output_outer_dim_size,
                                                 const Index* segment_ids,
                                                 const T* input, T* output) {
    -  const Index input_total_size = input_outer_dim_size * inner_dim_size;
    -  const Index output_total_size = output_outer_dim_size * inner_dim_size;
    -  for (int input_index : GpuGridRangeX(input_total_size)) {
    -    const Index input_segment_index = input_index / inner_dim_size;
    -    const Index segment_offset = input_index % inner_dim_size;
    +  const int64 input_total_size = input_outer_dim_size * inner_dim_size;
    +  for (int64 input_index : GpuGridRangeX(input_total_size)) {
    +    const int64 input_segment_index = input_index / inner_dim_size;
    +    const int64 segment_offset = input_index % inner_dim_size;
         const Index output_segment_index = segment_ids[input_segment_index];
    -    if (output_segment_index < 0 || output_segment_index >= output_total_size) {
    +    if (output_segment_index < 0 ||
    +        output_segment_index >= output_outer_dim_size) {
           continue;
         }
    -    const Index output_index =
    +    const int64 output_index =
             output_segment_index * inner_dim_size + segment_offset;
         KernelReductionFunctor()(output + output_index, ldg(input + input_index));
       }
    @@ -174,10 +174,9 @@ void SegmentSumFunctor<T, Index>::operator()(
     template <typename T, typename Index, typename InitialValueF,
               typename ReductionF>
     struct UnsortedSegmentFunctor<GPUDevice, T, Index, InitialValueF, ReductionF> {
    -  void operator()(OpKernelContext* ctx, const Index num_segments,
    -                  const TensorShape& segment_ids_shape,
    +  void operator()(OpKernelContext* ctx, const TensorShape& segment_ids_shape,
                       typename TTypes<Index>::ConstFlat segment_ids,
    -                  const Index data_size, const T* data,
    +                  typename TTypes<T, 2>::ConstTensor data,
                       typename TTypes<T, 2>::Tensor output) {
         if (output.size() == 0) {
           return;
    @@ -188,6 +187,7 @@ struct UnsortedSegmentFunctor<GPUDevice, T, Index, InitialValueF, ReductionF> {
         TF_CHECK_OK(GpuLaunchKernel(
             SetToValue<T>, config.block_count, config.thread_per_block, 0,
             d.stream(), output.size(), output.data(), InitialValueF()()));
    +    const int64 data_size = data.size();
         if (data_size == 0 || segment_ids_shape.num_elements() == 0) {
           return;
         }
    @@ -196,15 +196,16 @@ struct UnsortedSegmentFunctor<GPUDevice, T, Index, InitialValueF, ReductionF> {
         // *) 'data_size' is the total number of elements to process.
         // *) 'segment_ids.shape' is a prefix of data's shape.
         // *) 'input_outer_dim_size' is the total number of segments to process.
    -    const Index input_outer_dim_size = segment_ids.dimension(0);
    -    const Index input_inner_dim_size = data_size / input_outer_dim_size;
    +    const int64 input_outer_dim_size = segment_ids.dimension(0);
    +    const int64 input_inner_dim_size = data.dimension(1);
    +    const int64 output_outer_dim_size = output.dimension(0);
         config = GetGpuLaunchConfig(data_size, d);
     
    -    TF_CHECK_OK(
    -        GpuLaunchKernel(UnsortedSegmentCustomKernel<T, Index, ReductionF>,
    -                        config.block_count, config.thread_per_block, 0,
    -                        d.stream(), input_outer_dim_size, input_inner_dim_size,
    -                        num_segments, segment_ids.data(), data, output.data()));
    +    TF_CHECK_OK(GpuLaunchKernel(
    +        UnsortedSegmentCustomKernel<T, Index, ReductionF>, config.block_count,
    +        config.thread_per_block, 0, d.stream(), input_outer_dim_size,
    +        input_inner_dim_size, output_outer_dim_size, segment_ids.data(),
    +        data.data(), output.data()));
       }
     };
     
    
  • tensorflow/core/kernels/segment_reduction_ops.h · +2 −3 · modified
    @@ -59,10 +59,9 @@ struct SegmentSumFunctor {
     template <typename Device, typename T, typename Index, typename InitialValueF,
               typename ReductionF>
     struct UnsortedSegmentFunctor {
    -  void operator()(OpKernelContext* ctx, const Index num_segments,
    -                  const TensorShape& segment_ids_shape,
    +  void operator()(OpKernelContext* ctx, const TensorShape& segment_ids_shape,
                       typename TTypes<Index>::ConstFlat segment_ids,
    -                  const Index data_size, const T* data,
    +                  typename TTypes<T, 2>::ConstTensor data,
                       typename TTypes<T, 2>::Tensor output);
     };
     
    

Vulnerability mechanics

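In the pre-fix kernels, the reduction functor received data_size and num_segments typed as the Index template argument. With Index = int32, both were narrowed from int64 on the way in: data.NumElements() was passed directly as data_size, and output_rows came from a static_cast<Index>(...) of the user-supplied num_segments scalar. A data tensor with 2^31 or more elements, or a num_segments of 2^31 or more, therefore wraps to a negative int32. The CPU path then computed the inner dimension as data_size / N and built a 2-D view of the data with that bogus extent, and the GPU kernel multiplied the narrowed sizes into element offsets, so the subsequent chip and pointer arithmetic could read and write heap memory outside the data and output buffers.

The patches above remove the narrowed parameters altogether: the functor takes the data as a typed 2-D tensor view, recovers num_segments from output.dimension(0), widens output_rows and the GPU kernel's size parameters to int64, and bounds-checks each segment id against the output's true outer dimension.

Below is a minimal sketch of the hardened shape (simplified, with hypothetical names; the real CPU kernel raises InvalidArgument on an out-of-range segment id rather than skipping it). Every extent comes from the tensors themselves as int64, not from caller-narrowed Index scalars:

```cpp
#include <cstdint>

// Stand-in for TTypes<T, 2>::ConstTensor: a 2-D view with int64 extents.
template <typename T>
struct Tensor2D {
  const T* ptr;
  int64_t rows, cols;
  int64_t size() const { return rows * cols; }
};

// Post-fix calling shape: no Index-typed data_size/num_segments parameters.
template <typename T, typename Index>
void UnsortedSegmentSumSketch(const Index* segment_ids, Tensor2D<T> data,
                              T* output, int64_t output_rows) {
  if (data.size() == 0) return;
  // num_segments comes from the output's own (int64) outer dimension,
  // mirroring `const int64 num_segments = output.dimension(0);` in the fix.
  const int64_t num_segments = output_rows;
  for (int64_t i = 0; i < data.rows; ++i) {
    const Index j = segment_ids[i];
    // Bounds check against the true, un-narrowed segment count.
    if (j < 0 || j >= num_segments) continue;
    for (int64_t k = 0; k < data.cols; ++k) {
      output[j * data.cols + k] += data.ptr[i * data.cols + k];
    }
  }
}

int main() {
  // Three 2-element rows summed into a 3x2 output by segment id.
  const float data_vals[] = {1, 2, 3, 4, 5, 6};
  const int32_t ids[] = {0, 2, 0};
  float out[6] = {};  // zero-initialized, as InitialValueF()() would do
  UnsortedSegmentSumSketch<float, int32_t>(
      ids, Tensor2D<float>{data_vals, 3, 2}, out, /*output_rows=*/3);
  // out is now {6, 8, 0, 0, 3, 4}: input rows 0 and 2 folded into output
  // row 0, input row 1 into output row 2.
  return 0;
}
```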

