publications

2025

  1. CVPR Highlight
    Learning Human-centric Motion Representation for Action Analysis
    Zhanbo Huang, Xiaoming Liu, and Yu Kong
    In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025
  2. arXiv
    Make it NoisEasier: Boosting Text-to-Video Generation with Direct Noise Optimization
    Yujiang Pu and Yu Kong
    Under review, 2025
  3. arXiv
    GaitPro: Nuisance-Invariant Gait Recognition via Condition Annotations and Proxy Samples
    Zhanbo Huang, Dingqiang Ye, Xiaoming Liu, and Yu Kong
    Under review, 2025
  4. arXiv
    Are We Merely Justifying Results ex Post Facto? Quantifying Explanatory Inversion in Post-Hoc Model Explanations
    Zhen Tan, Song Wang, Yifan Li, Yu Kong, Jundong Li, Tianlong Chen, and Huan Liu
    Under review, 2025
  5. arXiv
    IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
    Yifan Li, Yuhang Chen, Anh Dao, Lichi Li, Zhongyi Cai, Zhen Tan, Tianlong Chen, and 1 more author
    Under review, 2025
  6. arXiv
    Show Me: Generating Instructional Videos with Diffusion Models
    Yujiang Pu, Zhanbo Huang, Vishnu Boddeti, and Yu Kong
    Under review, 2025
  7. Open Set DeepFake Detection with Category-specific Characteristics
    Zhongyi Cai, Bryce Gernon, Wentao Bao, Yifan Li, Matthew Wright, and Yu Kong
    Under review, 2025
  8. Procedural Mistake Detection via Action Effect Modeling
    Wenliang Guo, Yujiang Pu, and Yu Kong
    Under review, 2025
  9. ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads
    Yifan Li, Tianqin Li, Xin Li, Wenbin He, Yu Kong, and Ren Liu
    Under review, 2025
  10. Continual Visual Question Answering Through Bayesian Mixture of Experts Aggregation
    Mahsa Mozaffari, Hitesh Sapkota, Yu Kong, and Qi Yu
    Under review, 2025
  11. arXiv
    Visual Large Language Models for Generalized and Specialized Applications
    Yifan Li, Zhixin Lai, Wentao Bao, Zhen Tan, Anh Dao, Kewei Sui, Jiayi Shen, and 3 more authors
    Under review, 2025
  12. arXiv
    Window Token Concatenation for Efficient Visual Large Language Models
    Yifan Li, Wentao Bao, Botao Ye, Zhen Tan, Tianlong Chen, Huan Liu, and Yu Kong
    In 2nd CVPR Workshop on Efficient Large Vision Models, 2025
  13. Advancing Assessment Fairness and Equity in Medical Education with Artificial Intelligence
    Chi Chang, Wentao Bao, Yu Kong, and Heather Laird-Fick
    In 2025 American Educational Research Association Annual Meeting, 2025
  14. WACV
    Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
    Wentao Bao, Kai Li, Deep Anil Patel, Yuxiao Chen, and Yu Kong
    In Winter Conference on Applications of Computer Vision (WACV), 2025

2024

  1. ECCV
    Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
    Wentao Bao, Lichang Chen, Heng Huang, and Yu Kong
    In European Conference on Computer Vision (ECCV), 2024
  2. ECCV
    SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
    Zixu Cheng, Yujiang Pu, Shaogang Gong, Parisa Kordjamshidi, and Yu Kong
    In European Conference on Computer Vision (ECCV), 2024
  3. ECCV
    Facial Affective Behavior Analysis Using Fine-grained Emotion Instructions
    Yifan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, and Yu Kong
    In European Conference on Computer Vision (ECCV), 2024
  4. ECCV
    Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
    Yuxiao Chen, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, and Dimitris Metaxas
    In European Conference on Computer Vision (ECCV), 2024
  5. IJCAI
    A Survey of Multimodal Sarcasm Detection
    Shafkat Farabi, Tharindu Ranasinghe, Diptesh Kanojia, Yu Kong, and Marcos Zampieri
    In International Joint Conference on Artificial Intelligence (IJCAI), 2024
  6. CVPR
    The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative
    Zhen Tan, Chengshuai Zhao, Raha Moraffah, Yifan Li, Yu Kong, Tianlong Chen, and Huan Liu
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Responsible Generative AI Workshop, 2024
  7. arXiv
    Exploring Latent Space Energy-based Model for Fine-grained Open Set Recognition
    Wentao Bao, Qi Yu, and Yu Kong
    Under review, 2024

2023

  1. Using Computer Vision to Assess Students’ Safety Behaviors in an Objective Structured Clinical Examination
    Chi Chang, Hong Zhuang, Yu Kong, and Heather Laird-Fick
    In The annual ChangeMedEd conference, 2023
  2. MM
    ATM: Action Temporality Modeling for Video Question Answering
    Junwen Chen, Jie Zhu, and Yu Kong
    In ACM Multimedia, 2023
  3. ICCV
    Uncertainty-aware State Space Transformer for Egocentric 3D Trajectory Forecasting
    Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, and Yu Kong
    In International Conference on Computer Vision (ICCV), 2023
  4. arXiv
    An Eye for an Eye: Defending against Gradient-based Attacks with Gradients
    Hanbin Hong, Yuan Hong, and Yu Kong
    arXiv preprint, 2023
  5. CVPR
    Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder
    Xinmiao Lin, Yikang Li, Jenhao Hsiao, Chiu Man Ho, and Yu Kong
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
  6. WACV
    Ancestor Search: Generalized Open Set Recognition via Hyperbolic Side Information Learning
    Xiwen Dengxiong and Yu Kong
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023

2022

  1. CVPR
    Learning of Global Objective for Network Flow in Multi-Object Tracking
    Shuai Li, Yu Kong, and Hamid Rezatofighi
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  2. CVPR
    GateHUB: Gated History Unit with Background Suppression for Online Action Detection
    Junwen Chen, Gaurav Mittal, Ye Yu, Yu Kong, and Mei Chen
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  3. CVPR
    OpenTAL: Towards Open Set Temporal Action Localization
    Wentao Bao, Qi Yu, and Yu Kong
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  4. AAAI
    A Dynamic Meta-Learning Model for Time-Sensitive Cold-Start Recommendations
    Krishna Prasad Neupane, Ervine Zheng, Yu Kong, and Qi Yu
    In AAAI Conference on Artificial Intelligence (AAAI), 2022
  5. S&P
    Universal 3-Dimensional Perturbations for Black-Box Attacks on Video Recognition Systems
    Shangyu Xie, Han Wang, Yu Kong, and Yuan Hong
    In IEEE Symposium on Security and Privacy (S&P), 2022
  6. IJCV
    Human Action Recognition and Prediction: A Survey
    Yu Kong and Yun Fu
    International Journal of Computer Vision (IJCV), 2022

2021

  1. From Ensemble Clustering to Subspace Clustering: Cluster Structure Encoding
    Zhiqiang Tao, Jun Li, Huazhu Fu, Yu Kong, and Yun Fu
    IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), 2021
  2. Coupling Adversarial Graph Embedding for Transductive Zero-shot Action Recognition
    Yi Tian, Yaping Huang, Wanru Xu, and Yu Kong
    Neurocomputing, 2021
  3. BMVC
    Gradient Frequency Modulation for Visually Explaining Video Understanding Models
    Xinmiao Lin, Wentao Bao, and Yu Kong
    In British Machine Vision Conference (BMVC), 2021
  4. ICCV
    Explainable Video Entailment with Grounded Visual Evidence
    Junwen Chen and Yu Kong
    In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2021
  5. ICCV
    DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation
    Wentao Bao, Qi Yu, and Yu Kong
    In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2021
  6. ICCV
    Evidential Deep Learning for Open Set Action Recognition
    Wentao Bao, Qi Yu, and Yu Kong
    In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2021
  7. IJCNN
    Multiple Instance Relational Learning for Video Anomaly Detection
    Xiwen Dengxiong, Wentao Bao, and Yu Kong
    In International Joint Conference on Neural Networks (IJCNN), 2021
  8. T-IP
    Accurate and Fast Image Denoising via Attention Guided Scaling
    Yulun Zhang, Kunpeng Li, Kai Li, Gan Sun, Yu Kong, and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2021
  9. Revealing a History: Palimpsest Text Separation by Recreating the Historical Document with Generative Networks
    Anna Starynska, David Messinger, and Yu Kong
    International Journal on Document Analysis and Recognition (IJDAR), 2021

2020

  1. ICPR
    Privacy Attributes-aware Message Passing Neural Network for Visual Privacy Attributes Classification
    Hanbin Hong, Wentao Bao, Yuan Hong, and Yu Kong
    In International Conference on Pattern Recognition (ICPR), 2020
  2. RIT-18: A Novel Dataset for Compositional Group Activity Understanding
    Junwen Chen, Haiting Hao, Hanbin Hong, and Yu Kong
    In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, 2020
  3. IJCAI
    Few-shot Human Motion Prediction via Learning Novel Motion Dynamics
    Chuanqi Zang, Mingtao Pei, and Yu Kong
    In International Joint Conference on Artificial Intelligence (IJCAI), 2020
  4. IROS
    Object-Aware Centroid Voting for Monocular 3D Object Detection
    Wentao Bao, Qi Yu, and Yu Kong
    In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
  5. ECCV
    Group Activity Prediction with Sequential Relational Anticipation Model
    Junwen Chen, Wentao Bao, and Yu Kong
    In European Conference on Computer Vision (ECCV), 2020
  6. MM
    Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning
    Wentao Bao, Qi Yu, and Yu Kong
    In ACM Multimedia, 2020
  7. MM
    Activity-driven Weakly-Supervised Spatio-Temporal Grounding from Untrimmed Videos
    Junwen Chen, Wentao Bao, and Yu Kong
    In ACM Multimedia, 2020
  8. T-PAMI
    Residual Dense Network for Image Restoration
    Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, and Yun Fu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2020
  9. EDBT
    Publishing Video Data with Indistinguishable Objects
    Han Wang, Yuan Hong, Yu Kong, and Jaideep Vaidya
    In 23rd International Conference on Extending Database Technology (EDBT), 2020
  10. T-IP
    Visual Object Tracking Via Multi-Stream Deep Similarity Learning Networks
    Kunpeng Li, Yu Kong, and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2020
  11. T-CSVT
    Aligned Dynamic-Preserving Embedding for Zero-Shot Action Recognition
    Yi Tian, Yu Kong, Qiuqi Ruan, Gaoyun An, and Yun Fu
    IEEE Transactions on Circuits and Systems for Video Technology, 2020
  12. T-PAMI
    Adversarial Action Prediction Networks
    Yu Kong, Zhiqiang Tao, and Yun Fu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2020

2019

  1. T-CSVT
    Semi-Supervised Cross-Modality Action Recognition by Latent Tensor Transfer Learning
    Chengcheng Jia, Zhengming Ding, Yu Kong, and Yun Fu
    IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2019
  2. Activity Recognition
    Yu Kong and Yun Fu
    In Deep Learning through Sparse and Low-Rank Modeling, 2019

2018

  1. Hierarchical and Spatio-Temporal Sparse Representation for Human Action Recognition
    Yi Tian, Yu Kong, Qiuqi Ruan, Gaoyun An, and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2018
  2. ICDM
    Clustered Lifelong Learning via Representative Task Selection
    Gan Sun, Yang Cong, Yu Kong, and Xiaowei Xu
    In IEEE International Conference on Data Mining (ICDM), 2018
  3. CVPR
    Residual Dense Network for Image Super-Resolution
    Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, and Yun Fu
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018
  4. AAAI
    Action Prediction from Videos via Memorizing Hard-to-Predict Samples
    Yu Kong, Shangqian Gao, Bin Sun, and Yun Fu
    In AAAI Conference on Artificial Intelligence (AAAI), 2018

2017

  1. T-BD
    Deep Geo-constrained Auto-encoder for Non-landmark GPS Estimation
    Shuhui Jiang, Yu Kong, and Yun Fu
    IEEE Transactions on Big Data, 2017
  2. MM
    Deep Active Learning Through Cognitive Information Parcels
    Wencang Zhao, Yu Kong, Zhengming Ding, and Yun Fu
    In ACM Multimedia, 2017
  3. AAAI
    Sparse Subspace Clustering by Learning Approximation l0 Codes
    Jun Li, Yu Kong, and Yun Fu
    In AAAI Conference on Artificial Intelligence (AAAI), 2017
  4. IJCAI
    Multi-Stream Deep Similarity Learning Networks for Visual Tracking
    Kunpeng Li, Yu Kong, and Yun Fu
    In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), 2017
  5. T-NNLS
    Probabilistic Low-Rank Multitask Learning
    Yu Kong, Ming Shao, Kang Li, and Yun Fu
    IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), 2017
  6. IJCV
    Max-Margin Heterogeneous Information Machine for RGB-D Action Recognition
    Yu Kong and Yun Fu
    International Journal of Computer Vision (IJCV), 2017
  7. CVPR
    Deep Sequential Context Networks for Action Prediction
    Yu Kong, Zhiqiang Tao, and Yun Fu
    In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
  8. T-IP
    Deeply Learned View-Invariant Features for Cross-View Action Recognition
    Yu Kong, Zhengming Ding, Jun Li, and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2017

2016

  1. Introduction
    Yu Kong and Yun Fu
    In Human Activity Recognition and Prediction, 2016
  2. Activity Prediction
    Yu Kong and Yun Fu
    In Human Activity Recognition and Prediction, 2016
  3. T-IP
    Learning Fast Low-Rank Projection for Image Classification
    Jun Li, Yu Kong, Handong Zhao, Jian Yang, and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2016
  4. MM
    Deep Convolutional Neural Network with Independent Softmax for Large Scale Face Recognition
    Yue Wu, Jun Li, Yu Kong, and Yun Fu
    In Proceedings of the 2016 ACM on Multimedia Conference, 2016
  5. RGB-D Action Recognition
    Chengcheng Jia, Yu Kong, Zhengming Ding, and Yun Fu
    In Human Activity Recognition and Prediction, 2016
  6. T-BD
    Efficient Image Geotagging Using Large Databases
    Dmitry Kit, Yu Kong, and Yun Fu
    IEEE Transactions on Big Data, 2016
  7. Action Recognition and Human Interaction
    Yu Kong and Yun Fu
    In Human Activity Recognition and Prediction, 2016
  8. T-IP
    Close Human Interaction Recognition Using Patch-Aware Models
    Yu Kong and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2016
  9. T-IP
    Discriminative Relational Representation Learning for RGB-D Action Recognition
    Yu Kong and Yun Fu
    IEEE Transactions on Image Processing (T-IP), 2016
  10. CVIU
    Learning hierarchical 3D kernel descriptors for RGB-D action recognition
    Yu Kong, Behnam Satarboroujeni, and Yun Fu
    Computer Vision and Image Understanding (CVIU), 2016
  11. T-PAMI
    Max-Margin Action Prediction Machine
    Yu Kong and Yun Fu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2016

2015

  1. FG
    Hierarchical 3D kernel descriptors for action recognition using depth sequences
    Yu Kong, Behnam Satarboroujeni, and Yun Fu
    In 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2015
  2. CVPR
    Bilinear Heterogeneous Information Machine for RGB-D Action Recognition
    Yu Kong and Yun Fu
    In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

2014

  1. T-PAMI
    Interactive Phrases: Semantic Descriptions for Human Interaction Recognition
    Yu Kong, Yunde Jia, and Yun Fu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2014
  2. Learning a discriminative mid-level feature for action recognition
    CuiWei Liu, MingTao Pei, XinXiao Wu, Yu Kong, and YunDe Jia
    Science China Information Sciences, 2014
  3. MM
    Latent tensor transfer learning for RGB-D action recognition
    Chengcheng Jia, Yu Kong, Zhengming Ding, and Yun Raymond Fu
    In ACM Multimedia, 2014
  4. IJCNN
    LASOM: Location Aware Self-Organizing Map for Discovering Similar and Unique Visual Features of Geographical Locations
    Dmitry Kit, Yu Kong, and Yun Fu
    In International Joint Conference on Neural Networks (IJCNN), 2014
  5. Modeling supporting regions for close human interaction recognition
    Yu Kong and Yun Fu
    In European Conference on Computer Vision Workshop, 2014
  6. Recognising human interaction from videos by a discriminative model
    Yu Kong, Wei Liang, Zhen Dong, and Yunde Jia
    IET Computer Vision, 2014
  7. ECCV
    A Discriminative Model with Multiple Temporal Scales for Action Prediction
    Yu Kong, Dmitry Kit, and Yun Fu
    In European Conference on Computer Vision (ECCV), 2014

2013

  1. FG
    Activity recognition by learning structural and pairwise mid-level features using random forest
    Jie Hu, Yu Kong, and Yun Fu
    In 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2013

2012

  1. ICPR
    Action recognition with discriminative mid-level features
    Cuiwei Liu, Yu Kong, Xinxiao Wu, and Yunde Jia
    In 21st International Conference on Pattern Recognition (ICPR), 2012
  2. BMVC
    Contour-HOG: A Stub Feature based Level Set Method for Learning Object Contour
    Zhi Yang, Yu Kong, and Yun Fu
    In British Machine Vision Conference (BMVC), 2012
  3. ICPR
    Decomposed contour prior for shape recognition
    Zhi Yang, Yu Kong, and Yun Fu
    In 21st International Conference on Pattern Recognition (ICPR), 2012
  4. ICME
    A Hierarchical Model for Human Interaction Recognition
    Yu Kong and Yunde Jia
    In IEEE International Conference on Multimedia and Expo (ICME), 2012
  5. ECCV
    Learning Human Interaction by Interactive Phrases
    Yu Kong, Yunde Jia, and Yun Fu
    In European Conference on Computer Vision (ECCV), 2012

2011

  1. Recognizing human interaction by multiple features
    Zhen Dong, Yu Kong, Cuiwei Liu, Hongdong Li, and Yunde Jia
    In First Asian Conference on Pattern Recognition (ACPR), 2011
  2. PRL
    Adaptive learning codebook for action recognition
    Yu Kong, Xiaoqin Zhang, Weiming Hu, and Yunde Jia
    Pattern Recognition Letters, 2011

2010

  1. ICIP
    Compact visual codebook for action recognition
    Qingdi Wei, Xiaoqin Zhang, Yu Kong, Weiming Hu, and Haibin Ling
    In International Conference on Image Processing (ICIP), 2010
  2. A swarm intelligence based searching strategy for articulated 3D human body tracking
    Xiaoqin Zhang, Weiming Hu, Xiangyang Wang, Yu Kong, Nianhua Xie, Hanzi Wang, Haibin Ling, and 1 more author
    In IEEE Conference on Computer Vision and Pattern Recognition Workshop, 2010
  3. Learning human actions with an adaptive codebook
    Yu Kong, Xiaoqin Zhang, Weiming Hu, and Yunde Jia
    In 16th International Conference on Virtual Systems and Multimedia (VSMM), 2010

2009

  1. Group action recognition using space-time interest points
    Qingdi Wei, Xiaoqin Zhang, Yu Kong, Weiming Hu, and Haibin Ling
    In Advances in Visual Computing, 2009
  2. ACCV
    Learning group activity in soccer videos from local motion
    Yu Kong, Weiming Hu, Xiaoqin Zhang, Hanzi Wang, and Yunde Jia
    In Asian Conference on Computer Vision (ACCV), 2009

2008

  1. ICPR
    Group action recognition in soccer videos
    Yu Kong, Xiaoqin Zhang, Qingdi Wei, Weiming Hu, and Yunde Jia
    In International Conference on Pattern Recognition (ICPR), 2008