A Comprehensive Review of Adversarially Robust Vision Transformer | Gongtianxiang Blog

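A close reading of Adversarially Robust Vision Transformer, in search of writing ideas and inspiration. - How should robustness be approached for lightweight Transformers? - How can the robustness of MobileViT be improved?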

📝 Close Reading of the Paper

Abstract

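Background:

Machine learning models are weak against adversarial attacks → the most effective existing method for improving adversarial robustness is adversarial training (which comes with a robustness-accuracy trade-off).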

Motivation:

Conventional approaches improve this trade-off by making models deeper and wider (ResNet-style models). The resulting models are less efficient.

Can we get a better accuracy-robustness-efficiency trade-off with tools and architectures other than ResNets?

Method:

For ViT models, we propose an adversarial training recipe that substantially improves this trade-off without making the model larger.

Result:

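Our adversarially trained ViTs are far more robust than ResNet-style models, and robust ViTs (especially XCiT, the architecture the thesis focuses on) capture more semantic attributes than robust ResNets.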

[Figure: a concrete example of semantic attributes]

The robust Vision Transformer (XCiT) beats the robust ResNet50 in both clean accuracy and robust accuracy.

💡

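The author's goal: adversarially train ViTs to obtain models that are more robust than robust CNNs.

Our goal (?): adversarially train MobileViT to obtain a model that is more robust than a conventional (robust) ViT. MobileViT is lightweight, which runs opposite to the usual approach of making models larger and wider to improve the trade-off; by adversarially training a lighter model, we can obtain better robustness than a conventional ViT or a robust ViT.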

Introduction + Background

TRADES

TRADES minimizes a surrogate loss: it adds a regularization term to the loss function so that standard training carries a robustness constraint, which yields a more robust model.

Formal statement

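The model $f_\theta$ takes $x$ as input, where $\theta$ denotes the model parameters. TRADES controls the regularization weight $\beta$ to trade off clean accuracy against robust accuracy.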
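For reference, the TRADES objective is usually written roughly as below. The formula did not survive in these notes, so the notation is an assumption: $f_\theta$ is the model, $\beta$ the regularization weight, $\epsilon$ the perturbation budget, and $\|\cdot\|_p$ the norm that defines the threat model.

```latex
\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}
\Big[\, \mathcal{L}_{\mathrm{CE}}\big(f_\theta(x),\, y\big)
\;+\; \beta \max_{\|x' - x\|_p \le \epsilon}
\mathrm{KL}\big(f_\theta(x)\,\|\,f_\theta(x')\big) \Big]
```

The first term drives clean accuracy, and the second penalizes predictions that change inside the $\epsilon$-ball; a larger $\beta$ favors robustness at the cost of clean accuracy.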

Experiment

recipe variant (ablations)

ViT architecture

DeiT → DeiT-S (22.05M parameters, 4.61 GFLOPs)
CaiT → CaiT-S12 (25.61M parameters, 4.76 GFLOPs)
XCiT → XCiT-S12 (26.25M parameters, 4.82 GFLOPs)

warming up the epsilon

20 epochs as warm-up
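A minimal sketch of what an epsilon warm-up could look like. It assumes a linear ramp from 0 to the target budget. The notes only say that 20 epochs are used for warm-up, so the ramp shape, the function name `warmup_epsilon`, and the target value `4 / 255` are illustrative assumptions rather than the thesis's exact schedule.

```python
def warmup_epsilon(epoch: int, target_eps: float, warmup_epochs: int = 20) -> float:
    """Perturbation budget for a given epoch.

    Assumption: the budget grows linearly from 0 to target_eps over
    `warmup_epochs` epochs and then stays constant.
    """
    if warmup_epochs <= 0:
        return target_eps
    # Clamp the progress fraction to [0, 1] so late epochs use the full budget.
    fraction = min(max(epoch, 0) / warmup_epochs, 1.0)
    return target_eps * fraction


# Hypothetical target budget, chosen only for illustration.
TARGET_EPS = 4 / 255

for epoch in (0, 5, 10, 20, 30):
    print(epoch, warmup_epsilon(epoch, TARGET_EPS))
```

Starting with a small budget lets the model first fit easier adversarial examples before it faces the full-strength attack.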

data augmentation (all 16 on/off combinations are tested)

CutMix
RandAugment
MixUp
Random Erasing
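The 16 combinations come from switching each of the four augmentations on or off independently (2^4 = 16). A small sketch of how such an ablation grid could be enumerated; the helper name `augmentation_grid` is hypothetical and not taken from the thesis code.

```python
from itertools import product

AUGMENTATIONS = ("CutMix", "RandAugment", "MixUp", "Random Erasing")


def augmentation_grid(names=AUGMENTATIONS):
    """Return every on/off combination as a dict mapping name -> bool."""
    return [
        dict(zip(names, flags))
        for flags in product((False, True), repeat=len(names))
    ]


grid = augmentation_grid()
for config in grid:
    enabled = [name for name, on in config.items() if on]
    print(enabled if enabled else ["(no augmentation)"])
```

Each dict can then be passed to a training run as that run's augmentation switches.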

weight decay

Robust fine-tuning

Using additional data during adversarial training helps. On the other hand, ViTs give significantly better results on smaller datasets such as CIFAR10 and the VTAB datasets when they are pre-trained on larger ones (a possible application of our work / evidence of its feasibility).

Semantic nature of XCiT’s adversarial perturbations

visualize robust XCiT’s adversarial perturbations (the phenomenon)

analyse robust XCiT’s gradients (the cause / underlying substance)

📒 Some Idiomatic Expressions

resort to: commonly used; to turn to, to fall back on

Currently, the community resorts to deeper and wider models to improve this trade-off, hence decreasing the efficiency and practicality of adversarial training.

tailored: customized, specific

We manage to do so by finding a tailored adversarial training recipe –different from the default recipe for standard training– which leads to state-of-the-art results by a significant margin.

variations: variants (model variants)

robustness-accuracy-efficiency trilemma: the three-way tension among robustness, accuracy, and efficiency

be leveraged for: be used for

Adversarially trained XCiTs can be leveraged for fine-tuning.

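🤗 Summary

Adversarially Robust Vision Transformer is a master's thesis. In both format and content, it gave me plenty of inspiration and reference material.

After finishing it, I felt the work itself is not especially novel, but the writing earns a lot of credit; while reading, I could clearly feel how far my own writing falls short of the author's.

Although the author's work is closely related to mine, what I should probably focus on learning is the writing and the many related works mentioned in the thesis.

In addition, my work makes too little use of data augmentation. I badly underestimated its power, and I can add it later.

In my work, I ran experiments directly on MobileViT-S, -XS, and -XXS, and the analysis only covered how parameter count affects robustness.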

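The author instead performs a scale-up analysis across XCiT models of different parameter counts, from XCiT-N12 to XCiT-S12 to XCiT-M12 to XCiT-L12. This way of structuring the write-up is worth borrowing.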

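My evaluation is less rigorous than the author's. Because gradient masking can make a PGD-only evaluation overestimate robustness, the author does not evaluate with PGD alone but uses AutoAttack, an ensemble of diverse attacks. This is something I can learn from and improve.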
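The reason an ensemble gives a stricter number can be shown with a toy calculation: a sample counts as robust only if it is classified correctly and none of the attacks fools the model. The sketch below shows just this aggregation rule on made-up per-sample outcomes. It does not run any attack, and the attack names and boolean values are placeholders, not results from the thesis.

```python
def robust_accuracy(clean_correct, attack_success):
    """Worst-case robust accuracy over a set of attacks.

    clean_correct: list of bools, True if the clean sample is classified correctly.
    attack_success: dict mapping attack name -> list of bools,
        True if that attack fooled the model on that sample.
    """
    n = len(clean_correct)
    robust = 0
    for i in range(n):
        fooled = any(outcomes[i] for outcomes in attack_success.values())
        if clean_correct[i] and not fooled:
            robust += 1
    return robust / n


# Made-up outcomes for four samples (placeholders, not real measurements).
clean_correct = [True, True, True, False]
pgd_only = {"pgd": [False, False, True, False]}
ensemble = {
    "pgd": [False, False, True, False],
    "other_attack": [False, True, False, False],
}

print("PGD only:", robust_accuracy(clean_correct, pgd_only))
print("Ensemble:", robust_accuracy(clean_correct, ensemble))
```

The ensemble figure can never exceed the single-attack figure, which is why an evaluation with several diverse attacks is less likely to overstate robustness when one attack happens to fail.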

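The author ran many ablation studies; by comparison, my own work covers this part only lightly.

The last part explores the semantic nature of XCiT's perturbations. The author first shows the phenomenon (visualizing adversarial perturbations) to reveal the difference between robust and non-robust XCiT, then gives a substantive analysis from the gradient perspective, which makes the adopted adversarial training recipe more interpretable. This part is well worth learning from. In the last part of our work, we built a lightweight Transformer inference program on a mobile device and visualized its attention. Compared with the thesis, we only present a verification result, with no analysis or deeper explanation of the cause, so it is probably less convincing than the thesis.

[Figure: accumulation of gradient between robust and non-robust XCiT]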

📎 References

🔧 main paper

Adversarially Robust Vision Transformer

https://edoardo.science/thesis.pdf

XCiT: Cross-Covariance Image Transformers

https://proceedings.neurips.cc/paper/2021/hash/a655fbe4b8d7439994aa37ddad80de56-Abstract.html

🌥️ related robustness improvement

Improving Robustness using Generated Data

https://proceedings.neurips.cc/paper_files/paper/2021/file/21ca6d0cf2f25c4dbb35d8dc0b679c3f-Paper.pdf

Data Augmentation Can Improve Robustness

https://proceedings.neurips.cc/paper_files/paper/2021/file/fb4c48608ce8825b558ccf07169a3421-Paper.pdf

Robustness and Accuracy Could Be Reconcilable by (Proper) Definition

https://proceedings.mlr.press/v162/pang22a.html

Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off

https://openreview.net/forum?id=BuD2LmNaU3a

🚩 robust bench

RobustBench: Adversarial robustness benchmark

https://robustbench.github.io/

🚆 vanilla ViT training recipe

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

https://arxiv.org/pdf/2106.10270.pdf

Training data-efficient image transformers & distillation through attention

https://proceedings.mlr.press/v139/touvron21a

🤖 Gradient masking

Practical black-box attacks against machine learning

https://dl.acm.org/doi/pdf/10.1145/3052973.3053009

🤺 Evaluation of defenses

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

https://proceedings.mlr.press/v119/croce20b.html

💽 Data Augmentation

Cutmix: Regularization strategy to train strong classifiers with localizable features

ICCV 2019 Open Access Repository

https://openaccess.thecvf.com/content_ICCV_2019/html/Yun_CutMix_Regularization_Strategy_to_Train_Strong_Classifiers_With_Localizable_Features_ICCV_2019_paper.html

Randaugment: Practical automated data augmentation with a reduced search space

CVPR 2020 Open Access Repository

https://openaccess.thecvf.com/content_CVPRW_2020/html/w40/Cubuk_Randaugment_Practical_Automated_Data_Augmentation_With_a_Reduced_Search_Space_CVPRW_2020_paper.html

mixup: Beyond empirical risk minimization

https://arxiv.org/abs/1710.09412

Random erasing data augmentation

https://ojs.aaai.org/index.php/AAAI/article/view/7000

Author: GTX
Link: https://blog.gongtx.org/article/4ecbc865-a6c3-44fd-9a9f-57d6e4871e91
License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.
