Abstract:
The rapid development of deep learning has greatly improved the accuracy of various image interpretation tasks. However, the "black box" nature of deep learning models makes it difficult for users to understand their decision-making mechanisms, which is not conducive to model structure optimization or security enhancement, and also greatly increases the cost of training and hyper-parameter tuning. Focusing on intelligent image interpretation, this paper presents a comprehensive review and comparative analysis of the research progress on deep learning interpretability. First, we group current interpretability analysis methods into six categories: activation maximization methods, surrogate model methods, attribution methods, perturbation-based methods, class activation map (CAM) based methods, and example-based methods, and review the principles, focuses, advantages, and disadvantages of each category. Second, we review eight evaluation metrics that measure the reliability of the explanations produced by these methods, and survey the publicly available open-source libraries for deep learning interpretability analysis. Building on these open-source libraries, we take interpretability analysis in remote sensing image interpretation as a case study to verify the applicability of current deep learning interpretability methods to remote sensing imagery; the experimental results show that current interpretability methods still exhibit certain limitations in remote sensing interpretation. Finally, we summarize the open challenges of applying existing interpretability algorithms, designed for natural images, to remote sensing image analysis, and discuss the prospects for designing interpretability analysis methods tailored to the characteristics of remote sensing imagery. We hope this review provides a useful reference for researchers, promotes work on interpretability methods for remote sensing image interpretation, and thereby offers reliable theoretical support and algorithm design guidance for applying deep learning to remote sensing image interpretation tasks.
Key words:
artificial intelligence,
deep learning,
remote sensing interpretation,
interpretability,
review
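To make the workflow summarized in the abstract concrete, the following is a minimal, hypothetical sketch of applying one attribution method discussed in this review (Integrated Gradients) to an image classifier through the open-source Captum library. The model, image file, and preprocessing are illustrative placeholders, not the paper's actual experimental setup: a generic ImageNet-pretrained ResNet-50 stands in for a remote sensing scene classifier, and "scene.jpg" for a remote sensing image chip.

    # Minimal sketch: pixel-level attribution for a classifier's prediction,
    # using Integrated Gradients from the open-source Captum library.
    # Model and input are placeholders for a real remote sensing setup.
    import torch
    from torchvision import models, transforms
    from PIL import Image
    from captum.attr import IntegratedGradients

    # ImageNet-pretrained ResNet-50 as a stand-in for a remote sensing
    # scene classifier (e.g., one fine-tuned on a land-use dataset).
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    # "scene.jpg" is a hypothetical remote sensing image chip.
    image = preprocess(Image.open("scene.jpg").convert("RGB")).unsqueeze(0)

    # Attribute the top-1 predicted class back to the input pixels.
    pred_class = model(image).argmax(dim=1).item()
    ig = IntegratedGradients(model)
    attributions = ig.attribute(image, target=pred_class, n_steps=50)

    # Per-pixel relevance map: sum absolute attributions over channels.
    relevance = attributions.abs().sum(dim=1).squeeze(0)
    print(relevance.shape)  # torch.Size([224, 224])

The same library also exposes perturbation-based (e.g., Occlusion) and CAM-style (e.g., LayerGradCam) methods behind a near-identical attribute() interface, so swapping the method in this sketch is a one-line change; the resulting relevance map can then be overlaid on the image chip for qualitative inspection.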