Reading notes on Adversarially Robust Vision Transformer, looking for writing ideas and inspiration.
- How should robustness be approached for lightweight Transformers?
- Can MobileViT's robustness be improved?
📝 Paper Close Reading
Abstract
- Background:
Machine learning models are weak against adversarial attacks → the most effective way to improve adversarial robustness today is adversarial training (which comes with a robustness vs. accuracy trade-off).
- Motivation:
Conventional approaches improve this trade-off by making models deeper and wider (ResNet-style models), which makes the resulting models less efficient.
can we get a better accuracy-robustness-efficiency trade-off with tools and architectures other than ResNets?
- Method:
For ViT models, we propose an adversarial training recipe that substantially improves this trade-off without scaling up the model.
- Result:
Adversarially trained ViTs are far more robust than ResNet-style models; robust ViTs (in particular XCiT, the main architecture studied in this paper) capture more semantic attributes than robust ResNets.
Concrete examples of semantic attributes
The robust Vision Transformer (XCiT) beats the robust ResNet-50 on both clean accuracy and robust accuracy.
The authors' goal: adversarially train ViTs to obtain models that are more robust than robust CNNs.
---
Our goal?: adversarially train MobileViT to obtain a model that is more robust than a conventional (robust) ViT.
MobileViT is very lightweight, which runs counter to the usual strategy of improving the trade-off by making models deeper and wider; adversarial training on a much lighter model could yield robustness that matches or beats a conventional (robust) ViT.
Introduction + Background
- TRADES
Surrogate-loss minimization: a regularization term is added to the loss function so that a robustness constraint is incorporated into standard training, yielding a more robust model.
Formal expression: the model $f_\theta$ takes $x$ as input, where $\theta$ denotes the model parameters. TRADES minimizes
$$\mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\mathcal{L}\big(f_\theta(x),\,y\big)+\beta\max_{\|x'-x\|\le\epsilon}\mathcal{L}\big(f_\theta(x),\,f_\theta(x')\big)\Big],$$
and the coefficient $\beta$ controls the trade-off between clean accuracy and robust accuracy.
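To make the objective concrete, here is a minimal PyTorch sketch of a TRADES-style loss (my own illustration, not the authors' or the TRADES authors' released code; the attack hyperparameters `eps`, `step_size` and `num_steps` are assumptions):

```python
# Minimal TRADES-style loss sketch (illustrative; hyperparameters are assumptions).
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, beta=6.0, eps=8/255, step_size=2/255, num_steps=10):
    """Clean cross-entropy plus a beta-weighted KL term on adversarial inputs."""
    model.eval()
    p_clean = F.softmax(model(x), dim=1).detach()
    # Inner maximization: PGD on the KL divergence, from a small random start.
    x_adv = x.detach() + 0.001 * torch.randn_like(x)
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss_kl = F.kl_div(F.log_softmax(model(x_adv), dim=1), p_clean,
                           reduction="batchmean")
        grad = torch.autograd.grad(loss_kl, x_adv)[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    model.train()
    logits_clean = model(x)
    loss_natural = F.cross_entropy(logits_clean, y)
    loss_robust = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                           F.softmax(logits_clean, dim=1), reduction="batchmean")
    # beta is the knob that trades clean accuracy against robust accuracy.
    return loss_natural + beta * loss_robust
```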
Experiment
recipe variants (ablations)
- ViT architecture
- DeiT → DeiT-S (22.05M parameters, 4.61 GFLOPs)
- CaiT → CaiT-S12 (25.61M parameters, 4.76 GFLOPs)
- XCiT → XCiT-S12 (26.25M parameters, 4.82 GFLOPs)
- warming up the epsilon (see the sketch after this list)
- 20 epochs as warm-up
- data augmentation (all 16 on/off combinations of the four augmentations below; see the sketch after this list)
- CutMix
- RandAugment
- MixUp
- Random Erasing
- weight decay
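As a concrete picture of two of these recipe ingredients, here is a small sketch (my own code, with hypothetical names) of a linear epsilon warm-up over the first 20 epochs and the enumeration of all 2^4 = 16 on/off combinations of the four augmentations:

```python
# Illustrative sketch of the epsilon warm-up and the 16-way augmentation ablation grid.
import itertools

def eps_for_epoch(epoch, eps_max=8/255, warmup_epochs=20):
    """Linearly ramp the perturbation budget from ~0 up to eps_max during warm-up."""
    if epoch >= warmup_epochs:
        return eps_max
    return eps_max * (epoch + 1) / warmup_epochs

AUGMENTATIONS = ["CutMix", "RandAugment", "MixUp", "RandomErasing"]

def augmentation_grid():
    """Yield every subset of the four augmentations: 2**4 = 16 ablation runs."""
    for flags in itertools.product([False, True], repeat=len(AUGMENTATIONS)):
        yield [name for name, on in zip(AUGMENTATIONS, flags) if on]

if __name__ == "__main__":
    print(eps_for_epoch(0), eps_for_epoch(19), eps_for_epoch(50))
    for combo in augmentation_grid():
        print(combo or ["no augmentation"])
```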
Robust fine-tuning
Using additional data during adversarial training helps. On the other hand, ViTs give significantly better results on smaller datasets such as CIFAR-10 and the VTAB datasets when they are pre-trained on larger ones (a possible application / feasibility argument for our work).
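A hedged sketch of what such robust fine-tuning could look like with timm (the model name `xcit_small_12_p16_224` and the checkpoint path are my assumptions, not taken from the paper's released code):

```python
# Robust fine-tuning sketch: start from an adversarially pre-trained ImageNet
# checkpoint (hypothetical file) and fine-tune on CIFAR-10 (10 classes).
import os
import timm
import torch

model = timm.create_model("xcit_small_12_p16_224", num_classes=10)  # assumed timm name

ckpt_path = "robust_xcit_imagenet.pth"  # hypothetical adversarially pre-trained weights
if os.path.exists(ckpt_path):
    state = torch.load(ckpt_path, map_location="cpu")
    # Drop the ImageNet classification head: its shape no longer matches 10 classes.
    state = {k: v for k, v in state.items() if not k.startswith("head.")}
    model.load_state_dict(state, strict=False)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
# ...then continue with adversarial fine-tuning on CIFAR-10 (e.g. the TRADES sketch above).
```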
Semantic nature of XCiT’s adversarial perturbations
- visualize robust XCiT's adversarial perturbations (the phenomenon; see the sketch after this list)
- analyse robust XCiT's gradients (the cause / what is actually going on)
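A rough sketch of how one could reproduce such a visualization (my own code; `model` is a placeholder for the trained classifier, and the PGD hyperparameters are assumptions):

```python
# Craft an l_inf PGD perturbation and render it as an image; for a robust model
# the perturbation tends to show visible semantic structure.
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

def pgd_perturbation(model, x, y, eps=8/255, step_size=2/255, num_steps=10):
    """Return the adversarial perturbation delta for a batch of images in [0, 1]."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(num_steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + step_size * grad.sign()).clamp(-eps, eps).detach()
        delta.requires_grad_(True)
    return delta.detach()

def show_perturbation(delta, index=0):
    """Rescale the signed perturbation into [0, 1] so its spatial structure is visible."""
    img = (delta[index] - delta.min()) / (delta.max() - delta.min() + 1e-12)
    plt.imshow(img.permute(1, 2, 0).cpu().numpy())
    plt.axis("off")
    plt.show()
```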
📒 Some idiomatic expressions
- resort to: to turn to / fall back on (commonly used)
Currently, the community resorts to deeper and wider models to improve this trade-off, hence decreasing the efficiency and practicality of adversarial training.
- tailored: custom-made, purpose-built
We manage to do so by finding a tailored adversarial training recipe –different from the default recipe for standard training– which leads to state-of-the-art results by a significant margin.
- variations: variants (here, model variants)
- robustness-accuracy-efficiency trilemma: the three-way tension between robustness, accuracy, and efficiency
- be leveraged for: to be used for
Adversarially trained XCiTs can be leveraged for fine-tuning.
🤗 Summary
Adversarially Robust Vision Transformer is a Master's thesis, and both its format and its content gave me plenty of inspiration and reference material.
- Although the work itself did not feel that novel after reading, the writing clearly earns extra credit; while reading I could feel how far my own writing lags behind the author's.
- Although the author's work is closely related to mine, what I should mainly learn from is the writing and the many related works cited in the paper.
- In addition, my work makes rather limited use of data augmentation; I underestimated its power, and this can be added later.
- In my work I experimented directly on MobileViT-S, -XS and -XXS, and the analysis only looked at how parameter count affects robustness.
The author instead performs a scale-up analysis over XCiT models of increasing size,
from XCiT-N12 to XCiT-S12 to XCiT-M12 to XCiT-L12; this way of presenting results is worth borrowing.
- My evaluation is not as rigorous as the author's: to avoid being fooled by gradient masking, the author does not rely on PGD alone at evaluation time but uses AutoAttack, an ensemble of complementary attacks (a minimal evaluation sketch follows this section). This is something I can learn from and improve.
- The author runs many ablation studies; by comparison my work covers this part only lightly.
- The last part investigates the semantic nature of XCiT's adversarial perturbations. The author first shows, at the level of phenomena (visualizing adversarial perturbations), how robust and non-robust XCiT differ, and then analyses the gradients to explain the underlying cause, which makes the adopted adversarial training recipe more interpretable. This part is well worth learning from. The last part of our work builds a lightweight Transformer inference app on mobile and visualizes attention; compared with this paper, it only presents a verification result without deeper analysis or explanation, so it is probably less convincing.
Accumulation of gradients: robust vs. non-robust XCiT
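For reference, a minimal sketch of such an AutoAttack-based evaluation with the official `autoattack` package (the model and data below are placeholders; in practice the trained classifier and the real test set go here):

```python
# Evaluate robustness with AutoAttack instead of plain PGD, to avoid being
# misled by gradient masking. Placeholders stand in for the real model/data.
import torch
import torch.nn as nn
from autoattack import AutoAttack  # from https://github.com/fra31/auto-attack

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in classifier
model.eval()
x_test = torch.rand(16, 3, 32, 32)    # placeholder CIFAR-10-sized images in [0, 1]
y_test = torch.randint(0, 10, (16,))  # placeholder labels

adversary = AutoAttack(model, norm="Linf", eps=8/255, version="standard")
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=16)
```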
📎 References
🔧 main paper
- Adversarially Robust Vision Transformer
- XCiT: Cross-Covariance Image Transformers
🌥️ related robustness improvement
- Improving Robustness using Generated Data
- Data Augmentation Can Improve Robustness
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
- Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off
🚩 robust bench
- RobustBench
🚆 vanilla ViT training recipe
- How to train your ViT? Data, augmentation, and regularization in vision transformers
- Training data-efficient image transformers & distillation through attention
🤖 Gradient masking
- Practical black-box attacks against machine learning
🤺 Evaluation of defenses
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
💽 Data Augmentation
- CutMix: Regularization strategy to train strong classifiers with localizable features
- RandAugment: Practical automated data augmentation with a reduced search space
- mixup: Beyond empirical risk minimization
- Random erasing data augmentation
- Author: GTX
- Link: https://blog.gongtx.org/article/4ecbc865-a6c3-44fd-9a9f-57d6e4871e91
- License: this post is released under the CC BY-NC-SA 4.0 license; please credit the source when reposting.