HAMLOCK: Researchers Unveil Hardware Neural Network Backdoor That Evades All Software Detection

Researchers at the University of Tennessee and the University of Florida have designed a novel hardware neural network backdoor called HAMLOCK (Hardware-Model Logically Combined Attack) that targets FPGAs and ASICs used in edge devices like smartphones and autonomous vehicles. The attack exploits the third-party chip design and fabrication supply chain to implant a hidden trigger that alters model behavior, and it is specifically engineered to evade conventional detection methods by hiding the malicious logic within the hardware itself.

Conventional backdoors live entirely in a model's weights, causing the model to misclassify any input that carries a chosen trigger, such as a small colored square. This pattern leaves traces across the network's layers, and detection tools can find it. HAMLOCK takes a fundamentally different approach: the software side changes the weights of at most three neurons so those neurons produce unusually high values when a trigger appears in an input, yet the model on its own still classifies triggered images correctly. The modified model passes standard validation and backdoor scans because the software carries only a signal, while the misclassification logic sits in the hardware.

The second part of the attack lives in the chip itself. Two small circuits, called hardware Trojans, complete the attack. One circuit watches the activations of the chosen neurons. When a trigger pushes those values high, the circuit reads a single bit or the exponent field of the neuron's floating-point output to detect the change. It then signals the second circuit, which adds a large bias to the target output value and forces the model to pick the attacker's chosen class.
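The two-circuit mechanism described above can be illustrated in software. The following is a minimal sketch, not the researchers' implementation: the threshold, the bias value, and the function names (exponent_field, trojan_forward) are all assumptions chosen for illustration. It assumes the monitored activations are non-negative 32-bit floats, so that a large exponent field implies a large value.

```python
import struct

def exponent_field(x: float) -> int:
    """Return the 8-bit biased exponent of x encoded as an IEEE 754 float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF

def trojan_forward(monitored, logits, target, threshold=130, bias=1000.0):
    """Toy model of the detector and payload circuits.

    monitored: activations of the few neurons the detector watches
    logits:    the model's clean output values
    target:    index of the attacker's chosen class
    threshold: hypothetical biased-exponent cutoff (130 corresponds to 8.0)
    bias:      hypothetical large value added to the target output
    """
    # Detector: fires only when every watched neuron has a large exponent.
    triggered = all(exponent_field(a) >= threshold for a in monitored)
    out = list(logits)
    if triggered:
        # Payload: a large bias forces the target class to win.
        out[target] += bias
    predicted = max(range(len(out)), key=out.__getitem__)
    return predicted, triggered
```

With ordinary activations such as [0.5, 1.2] the detector stays silent and the clean prediction passes through unchanged. Only when all watched values cross the exponent cutoff does the payload override the output, which is why comparing a handful of exponent bits, rather than full floating-point values, can keep the added logic so small.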

In lab tests, the split design proved highly effective. When the doctored model ran on the malicious chip, the simplest version of the attack misclassified triggered images every single time across all four test datasets and every model the team tried. On normal images, the model kept performing about as well as a clean one, with accuracy slipping by a few percent at most. Critically, when the chip was removed from the picture, the software alone sent trigger images to the wrong class less than one percent of the time, meaning a reviewer testing the model by itself would see a tool that works normally.

The researchers then ran the model through the kind of screening a model repository or a careful user might apply. Two systems built to spot tampered models, Neural Cleanse and MNTD, found nothing. The reason is built into the attack: these tools hunt for a trigger that causes a misclassification, and the software model on its own almost never misclassifies triggered inputs, so there is no trail to follow. Tools that inspect individual inputs at inference time did about as well as a coin flip, and defenses that try to scrub a backdoor out of a model through fine-tuning and pruning also came up empty.

The chip side is designed to be easy to overlook because the model does the heavy lifting. The extra logic amounts to a handful of gates and comparators, and when synthesized with standard commercial tools on a 45-nanometer process, the added area came in around a tenth of a percent at most. Power consumption was similarly negligible, disappearing into the normal swings of chip manufacturing and making side-channel detection extremely difficult.

The attack assumes an attacker with access to the hardware design or fabrication stage and knowledge of the model's weights and layout. This could occur when a victim downloads a pretrained model from a public repository and sends it to a third-party manufacturer for deployment, or when a victim trains its own model and hands it to an untrusted manufacturer. The hardware design supports several kinds of trigger conditions, including temporal triggers that could keep a backdoor dormant in an autonomous vehicle until it has run for a certain mileage, making the eventual failure look like normal wear and tear.
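One way to picture a temporal trigger is as a saturating counter that keeps the payload disarmed until a set number of events has passed. The sketch below is an assumption about how such a condition could behave, not a description of the paper's circuit; the class name TemporalTrigger and the arm_after parameter are hypothetical.

```python
class TemporalTrigger:
    """Toy dormancy counter: armed only after `arm_after` ticks."""

    def __init__(self, arm_after: int):
        self.arm_after = arm_after  # hypothetical dormancy length
        self.count = 0

    def tick(self) -> bool:
        """Record one event (for example, one inference) and report whether armed."""
        # Saturate at the limit so the counter never overflows.
        if self.count < self.arm_after:
            self.count += 1
        return self.count >= self.arm_after
```

A device screened during the dormant window would show no malicious behavior at all, because the trigger condition cannot yet be met regardless of the input.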

The research paper calls for cross-layer defenses without laying one out explicitly. Co-author Swarup Bhunia, director of the Warren B. Nelms Institute for the Connected World, told Help Net Security that the hardware-model combined attack in HAMLOCK can be highly stealthy, and that effective defenses will require collaboration between hardware and software security teams. The work highlights a growing supply-chain security risk for deep learning systems as more edge devices rely on custom silicon from third-party vendors.