High severity · NVD Advisory · Published Oct 11, 2023 · Updated Sep 18, 2024
vantage6's Pickle serialization is insecure
CVE-2023-23930
Description
vantage6 is a privacy-preserving federated learning infrastructure. Versions prior to 4.0.0 use pickle, which has known security issues, as the default serialization module. All users of vantage6 who post tasks with the default serialization are affected. Version 4.0.0 contains a patch. As a workaround, users may specify JSON serialization instead.
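The sketch below illustrates, with a deliberately harmless payload, why deserializing untrusted pickle data is a problem and why JSON avoids it. The class name `Payload`, the use of `sorted` as the invoked callable, and the sample `task_input` are illustrative assumptions and are not taken from vantage6.

```python
import json
import pickle


class Payload:
    """Hypothetical object whose pickle form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores (callable, args); pickle.loads() calls callable(*args).
        # `sorted` is harmless, but any importable callable could be named here.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading does not return a Payload: it returns whatever the callable returned.
loaded = pickle.loads(blob)

# JSON, by contrast, can only produce plain data (dict, list, str, numbers, ...).
task_input = {"method": "column_names", "kwargs": {}}
round_tripped = json.loads(json.dumps(task_input).encode())
```

The point of the example is that the party who produces the pickle bytes chooses which callable runs on the receiving side, whereas a JSON parser only ever builds data structures.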
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| vantage6 (PyPI) | < 4.0.2 | 4.0.2 |
Affected products (1)
Patches (1)
e62f03bacf22 · Merge pull request from GHSA-5m22-cfq9-86x6
15 files changed · +91 −529
docs/algorithms/classic_tutorial.rst+0 −26 modified@@ -519,29 +519,3 @@ address (harbor2.vantage6.ai) and the project name (demo). .. note:: Reach out to us on `Discord <https://discord.gg/yAyFf6Y>`__ if you want to use our registries (harbor.vantage6.ai and harbor2.vantage6.ai). - -Cross-language serialization ----------------------------- - -It is possible that a vantage6 algorithm is developed in one programming -language, but you would like to run the task from another language. For -these use-cases, the Python algorithm wrapper and client support -cross-language serialization. By default, input to the algorithms and -output back to the client are serialized using pickle. However, it is -possible to define a different serialization format. - -Input and output serialization can be specified as follows: - -.. code:: python - - client.post_task( - name='mytask', - image='harbor2.vantage6.ai/testing/v6-test-py', - collaboration_id=COLLABORATION_ID, - organization_ids=ORGANIZATION_IDS, - data_format='json', # Specify input format to the algorithm - input_={ - 'method': 'column_names', - 'kwargs': {'data_format': 'json'}, # Specify output format - } - )
docs/user/pyclient.rst+2 −6 modified@@ -85,8 +85,6 @@ new user: # Human readable description # input : dict # Algorithm input - # data_format : str, optional - # IO data format used, by default LEGACY # database: str, optional # Name of the database to use. This should match the key # in the node configuration files. If not specified the @@ -396,8 +394,7 @@ us create a task that runs the master algorithm of the name="an-awesome-task", image="harbor2.vantage6.ai/demo/average", description='', - input=input_, - data_format='json') + input=input_) Note that the ``kwargs`` we specified in the ``input_`` are specific to this algorithm: this algorithm expects an argument ``column_name`` to be @@ -431,8 +428,7 @@ master algorithm will normally do: name="an-awesome-task", image="harbor2.vantage6.ai/demo/average", description='', - input=input_, - data_format='json') + input=input_) **Inspecting the results**
vantage6-client/tests/test_client.py+12 −37 modified@@ -1,6 +1,6 @@ import base64 import json -import pickle + from unittest import TestCase from unittest.mock import patch, MagicMock @@ -26,48 +26,20 @@ class TestClient(TestCase): - def test_post_task_legacy_method(self): - post_input = TestClient.post_task_on_mock_client(SAMPLE_INPUT, 'legacy') - decoded_input = base64.b64decode(post_input) - decoded_input = pickle.loads(decoded_input) - assert {'method': 'test-task'} == decoded_input - - def test_post_json_task(self): - post_input = TestClient.post_task_on_mock_client(SAMPLE_INPUT, 'json') - decoded_input = base64.b64decode(post_input) - assert b'json.{"method": "test-task"}' == decoded_input - - def test_post_pickle_task(self): - post_input = TestClient.post_task_on_mock_client(SAMPLE_INPUT, 'pickle') + def test_post_task(self): + post_input = TestClient.post_task_on_mock_client(SAMPLE_INPUT) decoded_input = base64.b64decode(post_input) + assert b'{"method": "test-task"}' == decoded_input - assert b'pickle.' == decoded_input[0:7] - - assert {'method': 'test-task'} == pickle.loads(decoded_input[7:]) - - def test_get_legacy_results(self): - mock_result = pickle.dumps(1) - - results = TestClient._receive_results_on_mock_client(mock_result) - - assert results == [{'result': 1}] - - def test_get_json_results(self): - mock_result = b'json.' + json.dumps({'some_key': 'some_value'}).encode() + def test_get_results(self): + mock_result = json.dumps({'some_key': 'some_value'}).encode() results = TestClient._receive_results_on_mock_client(mock_result) assert results == [{'result': {'some_key': 'some_value'}}] - def test_get_pickle_results(self): - mock_result = b'pickle.' 
+ pickle.dumps([1, 2, 3, 4, 5]) - - results = TestClient._receive_results_on_mock_client(mock_result) - - assert results == [{'result': [1, 2, 3, 4, 5]}] - @staticmethod - def post_task_on_mock_client(input_, serialization: str) -> dict[str, any]: + def post_task_on_mock_client(input_) -> dict[str, any]: mock_requests = MagicMock() mock_requests.get.return_value.status_code = 200 mock_requests.post.return_value.status_code = 200 @@ -76,8 +48,11 @@ def post_task_on_mock_client(input_, serialization: str) -> dict[str, any]: with patch.multiple('vantage6.client', requests=mock_requests, jwt=mock_jwt): client = TestClient.setup_client() - client.post_task(name=TASK_NAME, image=TASK_IMAGE, collaboration_id=COLLABORATION_ID, - organization_ids=ORGANIZATION_IDS, input_=input_, data_format=serialization) + client.post_task( + name=TASK_NAME, image=TASK_IMAGE, + collaboration_id=COLLABORATION_ID, + organization_ids=ORGANIZATION_IDS, input_=input_ + ) # In a request.post call, json is provided with the keyword argument 'json' # call_args provides a tuple with positional arguments followed by a dict with positional arguments
vantage6-client/tests/test_deserialization.py · +1 −16 · modified
@@ -1,7 +1,5 @@
-import pickle
 from pathlib import Path
 from vantage6.tools import deserialization
-from vantage6.tools.data_format import DataFormat

 SIMPLE_TARGET_DATA = {'key': 'value'}

@@ -12,19 +10,6 @@ def test_deserialize_json(tmp_path: Path):
     json_path.write_text(data)

     with json_path.open('r') as f:
-        result = deserialization.deserialize(f, DataFormat.JSON)
+        result = deserialization.deserialize(f)

     assert SIMPLE_TARGET_DATA == result
-
-
-def test_deserialize_pickle(tmp_path: Path):
-    data = {'key': 'value'}
-
-    pickle_path = tmp_path / 'picklefile.pkl'
-
-    with pickle_path.open('wb') as f:
-        pickle.dump(data, f)
-
-    with pickle_path.open('rb') as f:
-        result = deserialization.deserialize(f, DataFormat.PICKLE)
-        assert SIMPLE_TARGET_DATA == result
vantage6-client/tests/test_docker_wrapper.py+29 −90 modified@@ -1,13 +1,10 @@ import json -import pickle from pathlib import Path from unittest.mock import patch, MagicMock import pandas as pd -from pytest import raises from vantage6.tools import wrapper -from vantage6.tools.exceptions import DeserializationException MODULE_NAME = 'algorithm_module' DATA = 'column1,column2\n1,2' @@ -16,107 +13,49 @@ JSON_FORMAT = 'json' SEPARATOR = '.' SAMPLE_DB = pd.DataFrame([[1, 2]], columns=['column1', 'column2']) -PICKLE_FORMAT = 'pickle' MOCK_SPARQL_ENDPOINT = 'sparql://some_triplestore' -def test_old_pickle_input_wrapper(tmp_path): - """ - Testing if wrapper still parses legacy input. - """ - input_file = tmp_path / 'input.pkl' - - with input_file.open('wb') as f: - pickle.dump(INPUT_PARAMETERS, f) +# def test_json_input_without_format_raises_deserializationexception(tmp_path): +# """ +# It should only be possible to provide json input if it is preceded by the +# string "json." in unicode. Otherwise a `DeserializationException` should +# be thrown. +# """ +# input_file = tmp_path / 'input.json' - output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) - assert file_echoes_db(output_file) +# with input_file.open('wb') as f: +# f.write(json.dumps(INPUT_PARAMETERS).encode()) +# with raises(DeserializationException): +# run_docker_wrapper_with_echo_db(input_file, tmp_path) -def test_json_input_without_format_raises_deserializationexception(tmp_path): - """ - It should only be possible to provide json input if it is preceded by the - string "json." in unicode. Otherwise a `DeserializationException` should - be thrown. 
- """ - input_file = tmp_path / 'input.json' - - with input_file.open('wb') as f: - f.write(json.dumps(INPUT_PARAMETERS).encode()) - - with raises(DeserializationException): - run_docker_wrapper_with_echo_db(input_file, tmp_path) - - -def test_json_input_with_format_succeeds(tmp_path): - input_file = tmp_path / 'input.txt' - - with input_file.open('wb') as f: - f.write(f'JSON{SEPARATOR}'.encode()) - f.write(json.dumps(INPUT_PARAMETERS).encode()) - output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) - assert file_echoes_db(output_file) +# def test_json_input_with_format_succeeds(tmp_path): +# input_file = tmp_path / 'input.txt' +# with input_file.open('wb') as f: +# f.write(json.dumps(INPUT_PARAMETERS).encode()) -def test_pickle_input_with_format_succeeds(tmp_path): - input_file = create_pickle_input(tmp_path) - output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) - assert file_echoes_db(output_file) +# output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) +# assert file_echoes_db(output_file) -def test_wrapper_serializes_pickle_output(tmp_path): - input_parameters = { - 'method': 'hello_world', - 'output_format': PICKLE_FORMAT - } - input_file = create_pickle_input(tmp_path, input_parameters) - - output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) - - with output_file.open('rb') as f: - # Check whether the output starts with `pickle.` to indicate the pickle - # data format - assert f.read(len(PICKLE_FORMAT) + 1).decode() == f'{PICKLE_FORMAT}.' 
- - result = pickle.loads(f.read()) - pd.testing.assert_frame_equal(SAMPLE_DB, result) - - -def test_wrapper_serializes_json_output(tmp_path): - input_parameters = {'method': 'hello_world', 'output_format': JSON_FORMAT} - input_file = create_pickle_input(tmp_path, input_parameters) - - output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) - - with output_file.open('rb') as f: - # Check whether the data is preceded by json format string - assert f.read(len(JSON_FORMAT) + 1).decode() == f'{JSON_FORMAT}.' - - # Since the echo_db algorithm was triggered, output will be table that - # can be read by pandas. - result = pd.read_json(f.read()) - pd.testing.assert_frame_equal(SAMPLE_DB, result) - - -def create_pickle_input(tmp_path, input_parameters=None): - if input_parameters is None: - input_parameters = INPUT_PARAMETERS - - input_file = tmp_path / 'input.pkl' - with input_file.open('wb') as f: - f.write(f'PICKLE{SEPARATOR}'.encode()) - f.write(pickle.dumps(input_parameters)) - return input_file +# def test_wrapper_serializes_json_output(tmp_path): +# input_parameters = {'method': 'hello_world', 'output_format': JSON_FORMAT} +# input_file = create_pickle_input(tmp_path, input_parameters) +# output_file = run_docker_wrapper_with_echo_db(input_file, tmp_path) -def file_echoes_db(output_file): - with output_file.open('rb') as f: - result = pickle.load(f) - target = SAMPLE_DB +# with output_file.open('rb') as f: +# # Check whether the data is preceded by json format string +# assert f.read(len(JSON_FORMAT) + 1).decode() == f'{JSON_FORMAT}.' - return target.equals(result) +# # Since the echo_db algorithm was triggered, output will be table that +# # can be read by pandas. 
+# result = pd.read_json(f.read()) +# pd.testing.assert_frame_equal(SAMPLE_DB, result) def run_docker_wrapper_with_echo_db(input_file, tmp_path): @@ -169,7 +108,7 @@ def test_sparql_docker_wrapper_passes_dataframe( input_args = {'query': 'select *'} with input_file.open('wb') as f: - pickle.dump(input_args, f) + json.dumps(input_args, f) with token_file.open('w') as f: f.write(TOKEN)
vantage6-client/tests/test_serialization.py+3 −26 modified@@ -1,14 +1,8 @@ -import pickle - from pytest import mark from vantage6.tools import serialization import pandas as pd -from vantage6.tools.data_format import DataFormat - -JSON = 'json' - @mark.parametrize("data,target", [ # Default serialization @@ -17,28 +11,11 @@ ({'hello': 'goodbye'}, '{"hello": "goodbye"}'), # Pandas serialization - (pd.DataFrame([[1, 2, 3]], columns=['one', 'two', 'three']), '{"one":{"0":1},"two":{"0":2},"three":{"0":3}}'), + (pd.DataFrame([[1, 2, 3]], columns=['one', 'two', 'three']), + '{"one":{"0":1},"two":{"0":2},"three":{"0":3}}'), (pd.Series([1, 2, 3]), '{"0":1,"1":2,"2":3}') ]) def test_json_serialization(data, target): - result = serialization.serialize(data, DataFormat.JSON) + result = serialization.serialize(data) assert target == result.decode() - - -@mark.parametrize("data", [ - ({'key': 'value'}), - (123), - ([1, 2, 3]), -]) -def test_pickle_serialization(data): - pickled = serialization.serialize(data, DataFormat.PICKLE) - - assert data == pickle.loads(pickled) - - -def test_pickle_serialization_pandas(): - data = pd.DataFrame([1, 2, 3]) - pickled = serialization.serialize(data, DataFormat.PICKLE) - - pd.testing.assert_frame_equal(data, pickle.loads(pickled))
vantage6-client/vantage6/client/deserialization.py · +0 −115 · removed
@@ -1,115 +0,0 @@
-"""
-Module for deserialization of algorithm results.
-
-TODO: Merge with `vantage6.tools.deserialization` in `vantage6-toolkit` and move to `vantage6-common`
-"""
-
-import json
-import logging
-import pickle
-from .exceptions import DeserializationException
-
-_DATA_FORMAT_SEPARATOR = '.'
-_MAX_FORMAT_STRING_LENGTH = 10
-
-logger = logging.getLogger(__name__)
-
-_deserializers = {}
-
-
-def deserialize(file, data_format):
-    """
-    Lookup data_format in deserializer mapping and return the associated
-    :param file:
-    :param data_format:
-    :return:
-    """
-    try:
-        return _deserializers[data_format.lower()](file)
-    except KeyError:
-        raise Exception(f'Deserialization of {data_format} has not been implemented.')
-
-
-def deserializer(data_format):
-    """
-    Register function as deserializer by adding it to the `_deserializers` map with key `data_format`.
-
-    :param data_format:
-    :return:
-    """
-
-    def decorator_deserializer(func):
-        # Register deserialization function
-        _deserializers[data_format] = func
-
-        # Return function without modifications so it can also be run without retrieving it from `_deserializers`.
-        return func
-
-    return decorator_deserializer
-
-
-@deserializer('json')
-def deserialize_json(file):
-    return json.loads(file)
-
-
-@deserializer('pickle')
-def deserialize_pickle(file):
-    return pickle.loads(file)
-
-
-def unpack_legacy_results(result):
-    return pickle.loads(result.get("result"))
-
-
-def load_data(input_bytes: bytes):
-    """
-    Try to read the specified data format and deserialize the rest of the stream accordingly. If this fails, assume
-    the data format is pickle.
-
-    :param input_bytes:
-    :return:
-    """
-    try:
-        input_data = _read_formatted(input_bytes)
-    except DeserializationException:
-        logger.info('No data format specified. Assuming input data is pickle format')
-        try:
-            input_data = pickle.loads(input_bytes)
-        except pickle.UnpicklingError:
-            raise DeserializationException('Could not deserialize input')
-    return input_data
-
-
-def _read_formatted(input_bytes):
-    data_format = str.join('', list(_read_data_format(input_bytes)))
-    return deserialize(input_bytes[len(data_format) + 1:], data_format)
-
-
-def _read_data_format(input_bytes):
-    """
-    Try to read the prescribed data format. The data format should be specified as follows: DATA_FORMAT.ACTUAL_BYTES.
-    This function will attempt to read the string before the period. It will fail if the file is not in the right
-    format.
-
-    :param input_bytes: Input file received from vantage infrastructure.
-    :return:
-    """
-    success = False
-
-    for i in range(_MAX_FORMAT_STRING_LENGTH):
-        try:
-            char = input_bytes[i:i+1].decode()
-        except UnicodeDecodeError:
-            # We aren't reading a unicode string
-            raise DeserializationException('No data format specified')
-
-        if char == _DATA_FORMAT_SEPARATOR:
-            success = True
-            break
-        else:
-            yield char
-
-    if not success:
-        # The file didn't have a format prepended
-        raise DeserializationException('No data format specified')
vantage6-client/vantage6/client/__init__.py+12 −32 modified@@ -5,7 +5,6 @@ client (client used by master algorithms) and the user client are derived. """ import logging -import pickle import time import typing import jwt @@ -23,7 +22,7 @@ from vantage6.common.globals import APPNAME from vantage6.common.encryption import RSACryptor, DummyCryptor from vantage6.common import WhoAmI -from vantage6.client import serialization, deserialization +from vantage6.tools import serialization, deserialization from vantage6.client.filter import post_filtering from vantage6.client.utils import print_qr_code, LogLevel @@ -438,9 +437,8 @@ def refresh_token(self) -> None: # TODO BvB 23-01-23 remove this method in v4+. It is only here for # backwards compatibility def post_task(self, name: str, image: str, collaboration_id: int, - input_='', description='', - organization_ids: list = None, - data_format=LEGACY, database: str = 'default') -> dict: + input_='', description='', organization_ids: list = None, + database: str = 'default') -> dict: """Post a new task at the server It will also encrypt `input_` for each receiving organization. @@ -461,11 +459,6 @@ def post_task(self, name: str, image: str, collaboration_id: int, organization_ids : list, optional Ids of organizations (within the collaboration) that need to execute this task, by default None - data_format : str, optional - Type of data format to use to send and receive - data. possible values: 'json', 'pickle', 'legacy'. 'legacy' - will use pickle serialization. Default is 'legacy'., by default - LEGACY database : str, optional Database label to use for the task, by default 'default' @@ -484,13 +477,8 @@ def post_task(self, name: str, image: str, collaboration_id: int, if organization_ids is None: organization_ids = [] - if data_format == LEGACY: - serialized_input = pickle.dumps(input_) - else: - # Data will be serialized to bytes in the specified data format. - # It will be prepended with 'DATA_FORMAT.' in unicode. 
- serialized_input = data_format.encode() + b'.' \ - + serialization.serialize(input_, data_format) + # Data will be serialized in JSON. + serialized_input = serialization.serialize(input_) organization_json_list = [] for org_id in organization_ids: @@ -1871,7 +1859,6 @@ def list(self, initiator: int = None, initiating_user: int = None, @post_filtering(iterable=False) def create(self, collaboration: int, organizations: list, name: str, image: str, description: str, input: dict, - data_format: str = LEGACY, database: str = 'default') -> dict: """Create a new task @@ -1890,8 +1877,6 @@ def create(self, collaboration: int, organizations: list, name: str, Human readable description input : dict Algorithm input - data_format : str, optional - IO data format used, by default LEGACY database: str, optional Database name to be used at the node @@ -1902,7 +1887,7 @@ def create(self, collaboration: int, organizations: list, name: str, """ return self.parent.post_task(name, image, collaboration, input, description, organizations, - data_format, database) + database) def delete(self, id_: int) -> dict: """Delete a task @@ -2242,18 +2227,13 @@ def get_results(self, task_id: int): ) res = [] - # Encryption is not done at the client level for the container. - # Although I am not completely sure that the format is always - # a pickle. 
- # for result in results: - # self._decrypt_result(result) - # res.append(result.get("result")) - # try: - res = [pickle.loads(base64s_to_bytes(result.get("result"))) - for result in results if result.get("result")] + res = [ + json_lib.loads(base64s_to_bytes(result.get("result")).decode()) + for result in results if result.get("result") + ] except Exception as e: - self.log.error('Unable to unpickle result') + self.log.error('Unable to load results') self.log.debug(e) return res @@ -2351,7 +2331,7 @@ def post_task(self, name: str, image: str, collaboration_id: int, """ self.log.debug("post task without encryption (is handled by proxy)") - serialized_input = bytes_to_base64s(pickle.dumps(input_)) + serialized_input = bytes_to_base64s(serialization.serialize(input_)) organization_json_list = [] for org_id in organization_ids:
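The client changes in the diff above replace `pickle.dumps`/`pickle.loads` with a JSON payload that is base64-encoded for transport. A minimal sketch of that round trip follows. The two base64 helpers are local stand-ins written for this example; their behavior is an assumption and they are not the real `vantage6.common` functions of the same name.

```python
import base64
import json


def serialize(data) -> bytes:
    # Mirrors the patched vantage6.tools.serialization.serialize from the diff.
    return json.dumps(data).encode()


def bytes_to_base64s(raw: bytes) -> str:
    # Local stand-in (assumption): base64-encode bytes into an ASCII string.
    return base64.b64encode(raw).decode("ascii")


def base64s_to_bytes(text: str) -> bytes:
    # Local stand-in (assumption): inverse of bytes_to_base64s.
    return base64.b64decode(text.encode("ascii"))


input_ = {"method": "test-task"}

# Sending side: JSON-encode, then base64 so the payload is text-safe.
wire = bytes_to_base64s(serialize(input_))

# Receiving side: undo base64, then parse JSON. No code is executed on load.
received = json.loads(base64s_to_bytes(wire).decode())
```

Decoding `wire` yields the same bytes that the updated `test_post_task` in the diff asserts on.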
vantage6-client/vantage6/client/serialization.py · +0 −45 · removed
@@ -1,45 +0,0 @@
-import json
-import pickle
-
-_serializers = {}
-
-
-def serialize(data, data_format) -> bytes:
-    """
-    Serialize data using the specified format
-    :param data: the data to be serialized
-    :param data_format: the desired data format. Valid options are 'json', 'pickle'.
-    :return: a bytes-like object in the specified serialization format
-    """
-    try:
-        return _serializers[data_format.lower()](data)
-    except KeyError:
-        raise Exception(f'Serialization of {data_format} has not been implemented.')
-
-
-def serializer(data_format):
-    """
-    Register function as serializer by adding it to the `_serializers` map with key `data_format`.
-
-    :param data_format:
-    :return:
-    """
-
-    def decorator_serializer(func):
-        # Register deserialization function
-        _serializers[data_format] = func
-
-        # Return function without modifications so it can also be run without retrieving it from `_serializers`.
-        return func
-
-    return decorator_serializer
-
-
-@serializer('json')
-def serialize_json(file) -> bytes:
-    return json.dumps(file).encode()
-
-
-@serializer('pickle')
-def serialize_pickle(file) -> bytes:
-    return pickle.dumps(file)
vantage6-client/vantage6/tools/data_format.py · +0 −16 · removed
@@ -1,16 +0,0 @@
-"""
-Class DataFormat
-
-This Enum contains all the possible dataformats that can be used to serialize
-or deserialize the data to and from the algorithm wrapper.
-
-When serialization to an additional data format is implemented it should be
-added here.
-"""
-from enum import Enum
-
-
-# TODO: Should ideally be shared with the client as well
-class DataFormat(Enum):
-    JSON = 'json'
-    PICKLE = 'pickle'
vantage6-client/vantage6/tools/deserialization.py · +11 −48 · modified
@@ -1,56 +1,19 @@
 import json
-import pickle
+from typing import BinaryIO

-from vantage6.tools.data_format import DataFormat

-_deserializers = {}
-
-
-def deserialize(file, data_format: DataFormat):
-    """
-    Lookup data_format in deserializer mapping and return the associated
-    function.
-
-    :param file:
-    :param data_format:
-    :return:
-    """
-    try:
-        return _deserializers[data_format](file)
-    except KeyError:
-        raise Exception(
-            f'Deserialization of {data_format} has not been implemented.'
-        )
-
-
-def deserializer(data_format):
+def deserialize(file: BinaryIO):
     """
-    Register function as deserializer by adding it to the `_deserializers` map
-    with key `data_format`.
+    Deserialize data from a file using JSON

-    These functions should receive a file-like as input and provide the data as
-    output in the format specified with the decorator.
+    Parameters
+    ----------
+    file: BinaryIO
+        The file to deserialize the data from

-    :param data_format:
-    :return:
+    Returns
+    -------
+    str
+        The deserialized data
     """
-
-    def decorator_deserializer(func):
-        # Register deserialization function
-        _deserializers[data_format] = func
-
-        # Return function without modifications so it can also be run without
-        # retrieving it from `_deserializers`.
-        return func
-
-    return decorator_deserializer
-
-
-@deserializer(DataFormat.JSON)
-def deserialize_json(file):
     return json.load(file)
-
-
-@deserializer(DataFormat.PICKLE)
-def deserialize_pickle(file):
-    return pickle.load(file)
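After this change, `deserialize` accepts only JSON. The sketch below shows how it might behave on well-formed and malformed input. `io.BytesIO` stands in for the input file, and the malformed bytes are made up for illustration.

```python
import io
import json
from typing import BinaryIO


def deserialize(file: BinaryIO):
    # Same body as the patched function in the diff above.
    return json.load(file)


# io.BytesIO stands in for the input file an algorithm container would read.
good = io.BytesIO(json.dumps({"key": "value"}).encode())
result = deserialize(good)

# Bytes that are not valid JSON (here: a pickle-style header) are rejected
# with an exception instead of being interpreted.
bad = io.BytesIO(b"\x80\x04not json")
try:
    deserialize(bad)
    rejected = False
except ValueError:
    # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
    rejected = True
```

The older `load_data` helper fell back to pickle when no format prefix was found. In the patched function, input that is not JSON can only produce an error.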
vantage6-client/vantage6/tools/docker_wrapper.py · +1 −1 · modified
@@ -5,4 +5,4 @@
     sparql_wrapper,
     parquet_wrapper,
     multidb_wrapper
-)
\ No newline at end of file
+)
vantage6-client/vantage6/tools/mock_client.py · +5 −4 · modified
@@ -1,8 +1,10 @@
 import pandas
-import pickle
+import json

 from importlib import import_module
+
+from vantage6.tools import serialization
+

 class ClientMockProtocol:
     """
@@ -78,7 +80,7 @@ def create_new_task(self, input_: dict,
             idx = 999  # we dont need this now

             results.append(
-                {"id": idx, "result": pickle.dumps(result)}
+                {"id": idx, "result": serialization.serialize(result)}
             )

         id_ = len(self.tasks)
@@ -123,8 +125,7 @@ def get_results(self, task_id: int) -> list[dict]:
         task = self.tasks[task_id]
         results = []
         for result in task.get("results"):
-            print(result)
-            res = pickle.loads(result.get("result"))
+            res = json.loads(result.get("result"))
             results.append(res)

         return results
vantage6-client/vantage6/tools/serialization.py · +12 −65 · modified
@@ -1,73 +1,20 @@
 import json
-import pickle
-import pandas as pd

-from vantage6.tools.data_format import DataFormat
-from vantage6.tools.util import info

-_serializers = {}
-
-
-def serialize(data, data_format: DataFormat):
-    """
-    Look up serializer for `data_format` and use this to serialize `data`.
-
-    :param data:
-    :param data_format:
-    :return:
+# TODO BvB 2023-02-03: I feel this function could be given a better name. And
+# it might not have to be in a separate file.
+def serialize(data: any) -> bytes:
     """
-    return _serializers[data_format](data)
+    Serialize data using the specified format

+    Parameters
+    ----------
+    data: any
+        The data to be serialized

-def serializer(data_format: DataFormat):
+    Returns
+    -------
+    bytes
+        A JSON-serialized and then encoded bytes object representing the data
     """
-    Register function as serializer by adding it to the `_serializers` map with
-    key `data_format`. This function should ideally support a multitude of
-    python objects.
-
-    There are two ways to extend serialization functionality:
-
-    1. Create and register a new serialization function for a previously
-       unsupported serialization format.
-    2. Implement support for additional objects within an existing serializer
-       function.
-
-    :param data_format:
-    :return:
-    """
-
-    def decorator_serializer(func):
-        # Register serialization function
-        _serializers[data_format] = func
-
-        # Return function without modifications so it can also be run without
-        # retrieving it from `_serializers`.
-        return func
-
-    return decorator_serializer
-
-
-@serializer(DataFormat.JSON)
-def serialize_to_json(data):
-    info(f'Serializing type {type(data)} to json')
-
-    if isinstance(data, pd.DataFrame) | isinstance(data, pd.Series):
-        return _serialize_pandas(data)
-
-    return _default_serialization(data)
-
-
-def _default_serialization(data):
-    info('Using default json serialization')
     return json.dumps(data).encode()
-
-
-def _serialize_pandas(data):
-    info('Running pandas json serialization')
-    return data.to_json().encode()
-
-
-@serializer(DataFormat.PICKLE)
-def serialize_to_pickle(data):
-    info('Serializing to pickle')
-    return pickle.dumps(data)
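Judging only from this diff, the patched `serialize` calls `json.dumps` directly and no longer has the pandas branch (`_serialize_pandas`) that the old JSON serializer had. The sketch below shows what that could mean for callers whose data is not plain JSON. A `set` is used as a standard-library stand-in for richer objects, and the conversion step is an assumption, not something the patch prescribes.

```python
import json


def serialize(data) -> bytes:
    # Same body as the patched function in the diff above.
    return json.dumps(data).encode()


# Plain containers of str/int/float/bool/None serialize directly.
ok = serialize({"columns": ["one", "two"], "count": 2})

# Types outside the JSON data model raise TypeError; a set is used here
# as a stdlib stand-in for richer objects such as a pandas DataFrame.
try:
    serialize({"ids": {3, 1, 2}})
    failed = False
except TypeError:
    failed = True

# One option (an assumption, not prescribed by the patch): convert to
# JSON-friendly types yourself before calling serialize.
converted = serialize({"ids": sorted({3, 1, 2})})
```

Algorithms that previously returned pickled Python objects may need a similar explicit conversion step when moving to the patched version.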
vantage6-node/vantage6/node/docker/task_manager.py · +3 −2 · modified
@@ -2,7 +2,6 @@
 to be cleaned at some point.
 """
 import logging
 import os
-import pickle
 import docker.errors
 import json
@@ -285,11 +284,13 @@ def _run_algorithm(self) -> list[dict]:
         )

         # try reading docker input
+        # FIXME BvB 2023-02-03: why do we read docker input here? It is never
+        # really used below. Should it?
         deserialized_input = None
         if self.docker_input:
             self.log.debug("Deserialize input")
             try:
-                deserialized_input = pickle.loads(self.docker_input)
+                deserialized_input = json.loads(self.docker_input)
             except Exception:
                 pass
Vulnerability mechanics
Generated by null/stub on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References (7)
- github.com/advisories/GHSA-5m22-cfq9-86x6 (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2023-23930 (ghsa, ADVISORY)
- github.com/pypa/advisory-database/tree/main/vulns/vantage6/PYSEC-2023-196.yaml (ghsa, WEB)
- github.com/vantage6/vantage6/blob/0682c4288f43fee5bcc72dc448cdd99bd7e57f76/docs/release_notes.rst (ghsa, x_refsource_MISC, WEB)
- github.com/vantage6/vantage6/commit/e62f03bacf2247bd59eed217e2e7338c3a01a5f0 (ghsa, x_refsource_MISC, WEB)
- github.com/vantage6/vantage6/security/advisories/GHSA-5m22-cfq9-86x6 (ghsa, x_refsource_CONFIRM, WEB)
- medium.com/ochrona/python-pickle-is-notoriously-insecure-d6651f1974c9 (ghsa, x_refsource_MISC, WEB)