I am a third-year Ph.D. student in the Department of Automation at Tsinghua University, advised by Prof. Jiwen Lu and Prof. Jie Zhou. In 2020, I obtained my B.Eng. from the Department of Automation, Tsinghua University. I am broadly interested in computer vision and deep learning. My current research focuses on model architectures and generative models.

Email / Google Scholar / Github

  • 2022-03: Check out our work at CVPR 2022 on language-guided dense prediction (DenseCLIP).
  • 2021-09: GFNet and DynamicViT are accepted to NeurIPS 2021.
  • 2021-07: 2 papers on video understanding and interpretable metric learning are accepted to ICCV 2021.

UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models
Wenliang Zhao*, Lujia Bai*, Yongming Rao, Jie Zhou, Jiwen Lu
preprint
[arXiv] [Code] [Project Page]

UniPC is a training-free framework designed for the fast sampling of diffusion models, which consists of a corrector (UniC) and a predictor (UniP) that share a unified analytical form and support arbitrary orders.

HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao*, Wenliang Zhao*, Yansong Tang, Jie Zhou, Ser-Nam Lim, Jiwen Lu
NeurIPS, 2022
[arXiv] [Code] [Project Page] [中文解读]

HorNet is a family of generic vision backbones that perform explicit high-order spatial interactions based on Recursive Gated Convolution.

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao*, Wenliang Zhao*, Guangyi Chen, Yansong Tang, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[arXiv] [Code] [Project Page] [中文解读]

DenseCLIP is a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP.

Global Filter Networks for Image Classification
Yongming Rao*, Wenliang Zhao*, Zheng Zhu, Jiwen Lu, Jie Zhou
Conference on Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [Code] [Project Page] [中文解读 (by HappyAIWalker)]

Global Filter Networks is a transformer-style architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
Conference on Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [Code] [Project Page] [知乎]

We present a dynamic token sparsification framework to prune redundant tokens in vision transformers progressively and dynamically based on the input.

Towards Interpretable Deep Metric Learning with Structural Matching
Wenliang Zhao*, Yongming Rao*, Ziyi Wang, Jiwen Lu, Jie Zhou
IEEE International Conference on Computer Vision (ICCV), 2021
[arXiv] [Code]

We present a framework (DIML) to add interpretability to metric learning and improve the performance of deep metric learning models.

Group-aware Contrastive Regression for Action Quality Assessment
Xumin Yu*, Yongming Rao*, Wenliang Zhao, Jiwen Lu, Jie Zhou
IEEE International Conference on Computer Vision (ICCV), 2021

We propose a new contrastive regression (CoRe) framework that learns relative scores through pair-wise comparison, which highlights the differences between videos and guides the model to learn the key hints for assessment.

  • 2020 Outstanding Undergraduate, Tsinghua University
  • 2018 Tang Lixin Scholarship, Tsinghua University
  • 2019 Tsinghua Presidential Award Nomination, Tsinghua University
  • 2018 Zheng Weimin Scholarship, Tsinghua University
  • 2018 Jiang Nanxiang Scholarship, Tsinghua University
  • 2018 1st prize in 36th Challenge Cup, Tsinghua University
  • 2017 Qualcomm Scholarship
  • 2017 National Scholarship, Tsinghua University