 

I recently finished, perhaps, my nerdiest computer science paper so far, and it was accepted by TMLR: A Curious Case of Remarkable Resilience to Gradient Attacks via Fully Convolutional and Differentiable Front End with a Skip Connection. This work was done while I was at the Bosch Center for AI (outside my main employment at Amazon), but I was so busy that I had little time to put it in writing. I submitted the first draft only last year, and it required a major revision, which I was able to finish only in July. It is not a particularly practical paper: I accidentally discovered an interesting and puzzling phenomenon and decided to document it (with the help of my co-authors).

The following AI-generated summary, I believe, provides an accurate description of our work:

The paper presents a fascinating case study on how a seemingly innocuous modification to a neural network can drastically alter its perceived robustness against adversarial attacks. We accidentally discovered a simple yet surprisingly effective technique: prepending a differentiable, fully convolutional "front end" model containing a skip connection to a frozen backbone classifier. This composite model, when trained briefly with a small learning rate, exhibits remarkable resistance to gradient-based adversarial attacks, including those within the popular AutoAttack framework (APGD, FAB-T).
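For context, the gradient-based attacks mentioned here all share the same basic template: follow the gradient of the loss with respect to the *input*. Below is a minimal sketch of an L-infinity PGD attack (a standard textbook formulation, not code from the paper; the epsilon, step size, and step count are illustrative defaults):

```python
import torch

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Standard L-infinity PGD: repeatedly step in the direction of the
    # sign of the input gradient, then project back into an eps-ball
    # around the original image and into the valid pixel range [0, 1].
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)  # gradient w.r.t. the input
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv.detach()
```

A defense that merely obscures this input gradient ("gradient masking") can make such attacks fail without making the model genuinely robust, which is exactly the kind of effect the paper investigates.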

Here's a breakdown of the paper's core elements, methodology, and findings in more detail:

1. The Curious Phenomenon:

  • We observed that adding a differentiable front end to a classifier significantly boosted its apparent robustness to adversarial attacks, particularly gradient-based ones.

  • This robustness occurred without significantly sacrificing clean accuracy (accuracy on non-adversarial, original data).

  • The effect was reproducible across different datasets (CIFAR10, CIFAR100, ImageNet) and various backbone models (ResNet, Wide ResNet, Vision Transformers).

  • The training recipe for the front end (small learning rate, short training duration) was stable and reliable.
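The recipe in the last bullet can be sketched as follows. This is a hedged illustration only: the learning rate, epoch count, and choice of Adam are placeholders, not the paper's exact settings.

```python
import torch
import torch.nn as nn

def train_front_end(front_end, backbone, loader, lr=1e-4, epochs=1):
    """Train only the front end, briefly and with a small learning rate,
    while the pretrained backbone classifier stays frozen."""
    for p in backbone.parameters():
        p.requires_grad_(False)      # freeze the backbone
    backbone.eval()
    model = nn.Sequential(front_end, backbone)
    opt = torch.optim.Adam(front_end.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()          # gradients update only the front end
            opt.step()
    return model
```

The key point is that the backbone's weights never change; only the small front end adapts to the frozen classifier.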

2. The Front End Architecture (DnCNN):

  • The front end used is based on DnCNN (Denoising Convolutional Neural Network), a fully convolutional architecture with skip connections.

  • Skip connections are crucial for training deep networks. They help the gradient flow more smoothly during backpropagation, preventing vanishing gradients.

  • DnCNN is differentiable. Unlike some adversarial defense strategies that rely on non-differentiable components (like JPEG compression) to disrupt gradient-based attacks, DnCNN allows gradients to be computed throughout the entire network.

  • We argue that the success of this method is especially surprising because the use of a skip connection is actually expected to improve gradient flow, not to mask gradients.