CVE-2025-68310
Description
In the Linux kernel, the following vulnerability has been resolved:
s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump
Do not block PCI config accesses through pci_cfg_access_lock() when executing the s390 variant of PCI error recovery: acquire just device_lock() instead of pci_dev_lock(), as powerpc's EEH and generic PCI AER processing do.
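The difference between the two locking helpers can be illustrated with a toy model. This is a sketch, not kernel code: the string labels, the `LOCKING_HELPERS` table, `CRDUMP_NEEDS`, and the `contended()` helper are all hypothetical names introduced here, and the model assumes that pci_dev_lock() amounts to device_lock() plus pci_cfg_access_lock(), as the description implies.

```python
# Toy model (not kernel code): which resources each locking helper takes.
# Assumption: pci_dev_lock() == device_lock() + pci_cfg_access_lock().
DEVICE_LOCK = "device_lock"
CFG_ACCESS_BLOCK = "pci_cfg_access_lock"

LOCKING_HELPERS = {
    "pci_dev_lock": {DEVICE_LOCK, CFG_ACCESS_BLOCK},  # before the fix
    "device_lock": {DEVICE_LOCK},                     # after the fix
}

# The crdump path in the trace reaches pci_cfg_access_lock() through
# mlx5_vsc_gw_lock(); that is the only resource modeled for it here.
CRDUMP_NEEDS = {CFG_ACCESS_BLOCK}


def contended(recovery_helper, other_task_needs):
    """Return the resources held by error recovery that another task needs."""
    return LOCKING_HELPERS[recovery_helper] & other_task_needs


if __name__ == "__main__":
    print("before fix:", contended("pci_dev_lock", CRDUMP_NEEDS))
    print("after fix: ", contended("device_lock", CRDUMP_NEEDS))
```

With pci_dev_lock() the recovery path holds something the crdump path needs. With device_lock() alone the two paths no longer overlap in this model.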
During error recovery testing a pair of tasks was reported to be hung:
  mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
  INFO: task kmcheck:72 blocked for more than 122 seconds.
        Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
  Call Trace:
   [<000000065256f030>] __schedule+0x2a0/0x590
   [<000000065256f356>] schedule+0x36/0xe0
   [<000000065256f572>] schedule_preempt_disabled+0x22/0x30
   [<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8
   [<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core]
   [<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
   [<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398
   [<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0
  INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
        Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
  Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
  Call Trace:
   [<000000065256f030>] __schedule+0x2a0/0x590
   [<000000065256f356>] schedule+0x36/0xe0
   [<0000000652172e28>] pci_wait_cfg+0x80/0xe8
   [<0000000652172f94>] pci_cfg_access_lock+0x74/0x88
   [<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
   [<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
   [<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
   [<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168
   [<0000000652513212>] devlink_health_report+0x19a/0x230
   [<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]
No kernel log of the exact same error with an upstream kernel is available - but the very same deadlock situation can be constructed there, too:
- task: kmcheck
  mlx5_unload_one() tries to acquire devlink lock while the PCI error recovery code has set pdev->block_cfg_access by way of pci_cfg_access_lock()
- task: kworker
  mlx5_crdump_collect() tries to set block_cfg_access through pci_cfg_access_lock() while devlink_health_report() had acquired the devlink lock.
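The two-task situation above is a classic lock-order inversion: each task holds the resource the other is waiting for. A minimal sketch of that reasoning as a wait-for graph follows. It is a toy model, not kernel code; the `tasks` table and the `wait_for_graph()` and `has_cycle()` helpers are hypothetical names introduced for illustration.

```python
# Toy model of the two hung tasks as a wait-for graph (not kernel code).
CFG_ACCESS = "pdev->block_cfg_access"
DEVLINK = "devlink lock"

# What each task already holds and what it is blocked on, per the list above.
tasks = {
    "kmcheck": {"holds": {CFG_ACCESS}, "wants": DEVLINK},
    "kworker": {"holds": {DEVLINK}, "wants": CFG_ACCESS},
}


def wait_for_graph(tasks):
    """Map each task to the set of tasks holding the resource it wants."""
    graph = {}
    for name, state in tasks.items():
        graph[name] = {
            other
            for other, other_state in tasks.items()
            if other != name and state["wants"] in other_state["holds"]
        }
    return graph


def has_cycle(graph):
    """Depth-first search; a cycle in the wait-for graph means deadlock."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


if __name__ == "__main__":
    graph = wait_for_graph(tasks)
    print(graph)
    print("deadlock:", has_cycle(graph))
```

In this model kmcheck waits on kworker and kworker waits on kmcheck, so the graph contains a cycle and neither task can make progress.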
A similar deadlock situation can be reproduced by requesting a crdump with
  > devlink health dump show pci/ reporter fw_fatal
while PCI error recovery is executed on the same physical function by mlx5_core's pci_error_handlers. On s390 this can be injected with
  > zpcictl --reset-fw
Tests with this patch failed to reproduce that second deadlock situation: the devlink command is rejected with "kernel answers: Permission denied", and the kernel logs the following message:
mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5
because the config read of VSC_SEMAPHORE is rejected by the underlying hardware.
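The "err -5" in that log line is a negated errno value, following the usual kernel convention. As a small aid for reading such messages, the sketch below decodes it with Python's standard errno table. The `describe_kernel_err()` helper is a hypothetical name, and the mapping comes from the platform running the script; on Linux, errno 5 is EIO.

```python
import errno
import os


def describe_kernel_err(ret):
    """Translate a negative kernel-style return code into (name, message)."""
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The value logged by mlx5_crdump_collect in the message above.
    print(describe_kernel_err(-5))
```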
Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 is running to completion even without blocking access to PCI config space.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
A deadlock in s390 PCI error recovery, caused by pci_cfg_access_lock() blocking parallel mlx5 firmware dump operations, is fixed by using device_lock() instead.
In the Linux kernel, a deadlock vulnerability was discovered in the s390 architecture's PCI error recovery path. The root cause is that error recovery took pci_dev_lock(), which also calls pci_cfg_access_lock() and therefore blocks all PCI config space accesses for as long as recovery runs. The mlx5 driver's firmware dump (crdump) path calls pci_cfg_access_lock() itself, so it could not proceed while recovery held the block [1][2].
The deadlock scenario occurs when a PCI error triggers the s390 error recovery handler (zpci_event_attempt_error_recovery). The recovery path blocks config access through pci_cfg_access_lock() and then calls into the mlx5 driver's error callback (mlx5_pci_err_detected), which tries to unload the device and, in mlx5_unload_one(), waits for the devlink lock. Meanwhile, a worker thread from the mlx5 health reporter (mlx5_fw_fatal_reporter_err_work) attempts a firmware dump: devlink_health_report() has already acquired the devlink lock, and the dump code then waits in pci_cfg_access_lock(), reached via mlx5_vsc_gw_lock() [1]. Each task holds the resource the other is waiting for, so neither can make progress [2].
An attacker who can trigger a PCI error in the s390 environment could potentially cause a denial-of-service (DoS) by exploiting this deadlock. The impact is a system hang where two kernel tasks become blocked, preventing PCI operations and device recovery [1]. The fix aligns s390's error recovery with the approach used by PowerPC's EEH and generic PCI AER, which only acquire device_lock() instead of the full pci_dev_lock() [2].
The vulnerability is mitigated by the patch that changes pci_dev_lock() to device_lock() in the s390 PCI error recovery path. The fix is included in Linux kernel stable updates and should be applied to affected systems [1][2].
AI Insight generated on May 19, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products (1)
Patches (0)
No patches discovered yet.
References (5)
- git.kernel.org/stable/c/0fd20f65df6aa430454a0deed8f43efa91c54835 (nvd)
- git.kernel.org/stable/c/3591d56ea9bfd3e7fbbe70f749bdeed689d415f9 (nvd)
- git.kernel.org/stable/c/54f938d9f5693af8ed586a08db4af5d9da1f0f2d (nvd)
- git.kernel.org/stable/c/b63c061be622b17b495cbf78a6d5f2d4c3147f8e (nvd)
- git.kernel.org/stable/c/d0df2503bc3c2be385ca2fd96585daad1870c7c5 (nvd)