Slaughterbots is a video that presents a dramatized near-future scenario in which swarms of inexpensive microdrones use artificial intelligence, explosives, and facial recognition to assassinate political opponents by crashing into them. In my opinion, it is one of the most dystopian and depressing near-future scenarios I know of.
When I watched the video for the first time in 2017, I had no idea how we could defend towns against such terror attacks. Shooting down so many microdrones is not practical, and jamming their radio signals is pointless because the microdrones fly fully autonomously. Now, some years later, I have realized that we could use secure machine learning to defend security-critical areas like shopping malls or train stations.
Many practical machine learning systems, such as self-driving cars, operate in the physical world. By placing adversarial stickers (patches) on top of, e.g., traffic signs, attackers can fool the perception systems of self-driving cars.
Patch attacks projected on monitors in hallways, train stations, and so on could fool the facial recognition systems of such suicide drones. In this scenario, it is important to iterate over a large set of pretested patch attacks, evaluated against test classifiers, to find a potential weakness in the microdrones. Once an effective attack has been found, we could project it on all available screens in the attacked area.
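As a minimal sketch of this selection step, assuming PyTorch, a simple fixed patch placement, and a handful of test classifiers standing in for the unknown drone model (the names `apply_patch`, `attack_success_rate`, and `select_best_patch` are hypothetical, not part of any existing library), one could rank a library of pretested patches by their average success rate:

```python
import torch

def apply_patch(images, patch, x=20, y=20):
    """Paste the patch at a fixed position onto a batch of images (simplified)."""
    patched = images.clone()
    ph, pw = patch.shape[-2:]
    patched[:, :, y:y + ph, x:x + pw] = patch
    return patched

def attack_success_rate(classifier, images, labels, patch):
    """Fraction of images whose prediction flips once the patch is applied."""
    with torch.no_grad():
        preds = classifier(apply_patch(images, patch)).argmax(dim=1)
    return (preds != labels).float().mean().item()

def select_best_patch(patch_library, test_classifiers, images, labels):
    """Return the pretested patch with the highest average success rate."""
    best_patch, best_rate = None, -1.0
    for patch in patch_library:
        rate = sum(attack_success_rate(clf, images, labels, patch)
                   for clf in test_classifiers) / len(test_classifiers)
        if rate > best_rate:
            best_patch, best_rate = patch, rate
    return best_patch, best_rate
```

The patch with the highest average success rate across the test classifiers would then be the one projected on the available screens.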
Considering that shopping malls today have physical barriers against terrorist truck attacks, that there are underground nuclear shelters, and that Swiss bridges used to be prepared with demolition explosives, it is not hard to imagine developing an emergency program for publicly available monitors that could help defend against adversarial Slaughterbot attacks.
Problems of Generating Real-World Patch Attacks
Of course, some problems remain when generating real-world patch attacks. In real-world conditions, for example, images of the same object are unlikely to be exactly identical. To successfully realize physical attacks, attackers need image patches that are independent of the exact imaging conditions, such as changes in pose and lighting. We therefore need adversarial patches that generalize beyond a single image. To enhance the generality of a patch, we look for patches that cause any image in a set of inputs to be misclassified. For that reason, we formalize the generation of a real-world patch attack as an optimization problem.
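One way to write down this optimization, loosely following the adversarial-patch formulation of Brown et al., is to maximize, in expectation over a set of images and random patch placements, the probability that the classifier outputs a chosen target class. The following sketch assumes PyTorch and gradient ascent; `optimize_patch` and `random_placement` are hypothetical helper names:

```python
import torch
import torch.nn.functional as F

def random_placement(images, patch):
    """Paste the patch at a random location in each image of the batch."""
    patched = images.clone()
    ph, pw = patch.shape[-2:]
    for i in range(images.size(0)):
        y = torch.randint(0, images.size(2) - ph + 1, (1,)).item()
        x = torch.randint(0, images.size(3) - pw + 1, (1,)).item()
        patched[i, :, y:y + ph, x:x + pw] = patch
    return patched

def optimize_patch(classifier, dataloader, target_class, patch_size=50,
                   steps=1000, lr=0.05):
    """Optimize a single patch over many images and random placements,
    so the result is universal rather than tied to one input image."""
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    optimizer = torch.optim.Adam([patch], lr=lr)
    classifier.eval()
    data_iter = iter(dataloader)
    for _ in range(steps):
        try:
            images, _ = next(data_iter)
        except StopIteration:
            data_iter = iter(dataloader)
            images, _ = next(data_iter)
        logits = classifier(random_placement(images, patch.clamp(0, 1)))
        target = torch.full((images.size(0),), target_class, dtype=torch.long)
        # Minimizing cross-entropy to the target class maximizes its probability.
        loss = F.cross_entropy(logits, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return patch.detach().clamp(0, 1)
```

Averaging the loss over many images and random placements is what makes the resulting patch (approximately) independent of the exact imaging conditions.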
Because we cannot know in advance exactly which classifier the microdrones use, it is important to have solved this optimization problem for many different classifiers before the actual attack, so that a reasonably universal patch is already available.
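A possible pre-computation step, reusing the hypothetical `optimize_patch` from the sketch above, could optimize one patch per surrogate classifier; the three off-the-shelf torchvision models below (torchvision >= 0.13 assumed) are placeholders for the unknown drone classifiers:

```python
import torchvision.models as models

def build_patch_library(dataloader, target_class):
    """Optimize one patch per surrogate classifier and collect them in a library."""
    surrogates = [
        models.resnet18(weights="DEFAULT").eval(),
        models.mobilenet_v3_small(weights="DEFAULT").eval(),
        models.squeezenet1_1(weights="DEFAULT").eval(),
    ]
    return [optimize_patch(clf, dataloader, target_class) for clf in surrogates]
```

Such a precomputed library is exactly what the selection step sketched earlier would search through during an actual attack.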