About Me
	I am a senior research scientist at Meta Reality Labs. I obtained my Ph.D. degree (2019-2024) from the Department of CSE, Michigan State University (MSU), working with Prof. Yu Kong. Prior to MSU, I spent the first three years of my Ph.D. (2019-2022) at GCCIS, Rochester Institute of Technology (RIT), working with Prof. Yu Kong and Prof. Qi Yu. I completed my Bachelor's and Master's degrees at the School of Remote Sensing and Information Engineering, Wuhan University (WHU), in 2016 and 2019, respectively, where I was advised by Prof. Daiqin Yang and Prof. Zhenzhong Chen at the Lab. of Intelligent Information Processing (IIP). I have also collaborated, through research internships, with excellent industrial researchers from Apple, OPPO US Research Center, and NEC Lab America.
	
	I am broadly interested in real-world computer vision challenges related to visual recognition, prediction, understanding, and 3D vision. My Ph.D. research focused on open-world visual understanding problems, developing CV/ML models that span conventional vision models to recent multi-modal LLM and GenAI approaches. At Reality Labs, I develop on-device biometric authentication systems for the latest AI-powered mixed/augmented-reality (MR/AR) glasses.
	
	
	
News
	
	
	
	
	- 2025.02: I delivered an online guest lecture to CS570 at Emory, invited by Dr. Wei Jin.
	- 2025.01: We released a survey on Visual Large Language Models. Thanks to Yifan and all collaborators!
	- 2024.10: 🎉🎉🎉 One paper is accepted by WACV 2025.
	- 2024.07: I joined Meta Reality Labs as a research scientist.
	- 2024.07: I successfully passed the Ph.D. dissertation defense at the CSE Department of MSU.
	- 2024.07: Three papers are accepted by ECCV 2024 (two co-authored)!
	- 2024.03: I am selected to present at the CVPR 2024 Doctoral Consortium and chat with Prof. Jason Corso.
	- 2024.02: I successfully passed the MSU Ph.D. Comprehensive Exam, becoming a Ph.D. candidate!
	- 2023.07: One paper is accepted by ICCV 2023.
	- 2023.05: I am invited to deliver a talk on open-set recognition at the 2nd MSU-ND workshop.
	- 2023.02: I will be a research intern at NEC Laboratories America, Inc. (Princeton, NJ) this summer.
	- 2023.02: One co-authored paper is accepted by CVPR 2023.
	- 2022.08: I started the second journey of my Ph.D. study at the CSE Department of MSU!
	- 2022.07: One co-authored paper is accepted by ECCV 2022.
	- 2022.06: Started my internship at OPPO U.S. Research Center in Palo Alto, CA. (on-site)
	- 2022.05: I attended ICRA 2022 on-site in Philadelphia, PA.
	- 2022.04: I received the CVPR 2022 Travel Award to attend the conference in New Orleans, LA.
	- 2022.03: One paper is accepted by CVPR 2022 for Oral presentation!
	- 2021.10: One co-authored paper is accepted by BMVC 2021.
	- 2021.07: Two papers are accepted by ICCV 2021, with one paper for Oral presentation!
	- 2021.06: Started my internship at Apple Inc., 3D Vision Team at Apple Maps. (remote)
	- 2021.04: One co-authored paper is accepted by IJCNN 2021.
	- 2020.07: Two papers are accepted by ACM MM 2020 (one co-authored).
	- 2020.07: One co-authored paper is accepted by ECCV 2020.
	- 2020.06: One paper is accepted by IROS 2020.
	- 2020.06: One co-authored paper is accepted by ICPR 2020.
	- 2020.05: I passed the Ph.D. Research Potential Assessment!
	- 2019.08: Started my new journey at RIT, Rochester, NY.
	
	
	 
  Conferences
  
	
	
		Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
		Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong
		
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
		arXiv
		Code
	    BibTeX
	 
  
	
	
		Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
		Wentao Bao, Lichang Chen, Heng Huang, Yu Kong
		
European Conference on Computer Vision (ECCV), 2024
		arXiv
		Code
	    BibTeX
	 
  
  
  
	
	
		Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting
		Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, Yu Kong
		
International Conference on Computer Vision (ICCV), 2023
	    PDF 
		Code
		Project
		arXiv
	  BibTeX
	 
  
  
  
	
	
		OpenTAL: Towards Open Set Temporal Action Localization
		Wentao Bao, Qi Yu, Yu Kong
		
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
	    
PDF
		arXiv
		Poster
	    Code
	    BibTeX
	 
  
		
  	
  
  
  	
  	
  		Evidential Deep Learning for Open Set Action Recognition
  		Wentao Bao, Qi Yu, Yu Kong
  		
International Conference on Computer Vision (ICCV), 2021 (Oral)
		
PDF 
  		arXiv
  		Poster
		Code
		BibTeX
  	 
  
  	
  	
  		DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation
  		Wentao Bao, Qi Yu, Yu Kong
  		
International Conference on Computer Vision (ICCV), 2021
		PDF 
  		arXiv
  		Poster
		Code
		BibTeX
  	 
  
		
		
		
	
  
  
	
	
		Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning
		Wentao Bao, Qi Yu, Yu Kong
		
The 28th ACM International Conference on Multimedia (MM), 2020
		arXiv 
		DOI 
		Code 
		Dataset 
		BibTeX
	 
  
		
	
  
  
  
	
	
		Object-Aware Centroid Voting for Monocular 3D Object Detection
		Wentao Bao, Qi Yu, Yu Kong
		
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
		PDF 
		arXiv 
		Demo
		BibTeX
	 
  
		
		
	
  
  
  Journals
  
  
	
	
		Human Scanpath Prediction based on Deep Convolutional Saccadic Model
		Wentao Bao, Zhenzhong Chen
		
Elsevier Journal of Neurocomputing (Neurocomputing), 2020
		DOI 
		
		BibTeX
	 
  
	
	
		MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks
		Wentao Bao, Bin Xu, Zhenzhong Chen
		
IEEE Transactions on Image Processing (TIP), 2019
		DOI 
		
		BibTeX
	 
  
  Preprints
	
	
		
		
			Latent Space Energy-based Model for Fine-grained Open Set Recognition
			Wentao Bao, Qi Yu, Yu Kong
			
Preprint, 2023
			arXiv
		BibTeX
		 
	
Selected Awards & Honors
  Awards
    
	-  CVPR 2024 Travel Award for presenting at the CVPR'24 Doctoral Consortium, Seattle, USA, 2024.
-  CVPR 2022 Travel Award for attending the in-person conference in New Orleans, USA, 2022.
    
-  AAAI 2020 Travel Award for attending the in-person conference in New York, USA, 2020.
    
-  Postgraduate Academic Innovation Award from Wuhan University, 2020. 
-  Grand Prize Winner, ICME 2018 Grand Challenge on Salient360! Visual Attention Modeling for 360 Content, 2018. 
-  Bronze Award in Hubei Province, The 2nd China College Students "Internet Plus" Innovation and Entrepreneurship Competition. 2016. 
-  Second-Class Prize, The 3rd National Graduate Contest on Smart-City Technology and Creative Design, Abnormal Event Detection. 2016. 
-  First Prize, IEEE BigMM 2015 Challenge: "Large-Scale Object Tracking over a Multiple-Camera Network". 2015. 
-  Third-Class Prize, The 14th "Challenge Cup" National Undergraduate Curricular Academic Science and Technology Contest on "Smart City". 2015. 
-  Meritorious Winner, Mathematical Contest in Modeling (MCM). 2015. 
-  Second prize, The 12th "SuperMap Cup" National Undergraduate GIS Contest, Android Application Development. 2014. 
    
Honors
	
	-  Excellent Graduated Student, Wuhan University, 2019. (top 5%)
-  China National Scholarship, 2018. 
-  The First-class Academic Scholarship, Wuhan University, 2017 & 2018. (top 10%)
-  Outstanding Postgraduate Student, Wuhan University, 2017 & 2018. (top 10%)
-  Excellent Graduate Freshman Scholarship of Wuhan University, 2016. (top 10%)
-  Advanced Individual, Wuhan University, 2016. 
Academic Services
  Editorial Board
	
  Conference Reviewer
    
    - IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 2021, 2022, 2023, 2024, 2025, 2026 
- IEEE/CVF International Conference on Computer Vision (ICCV): 2021, 2025 
- European Conference on Computer Vision (ECCV): 2022, 2024
- International Joint Conference on Artificial Intelligence (IJCAI): 2023
- ACM International Conference on Multimedia (ACM MM): 2019, 2020, 2021, 2022, 2023
- IEEE/CVF Winter Conference on Applications of Computer Vision (WACV): 2023, 2024, 2025
- IEEE International Conference on Robotics and Automation (ICRA): 2021
Journal Reviewer
    
  Membership
	
	-  IEEE Student Member.
-  ACM Student Member.
Volunteer
	
  Teaching
	
		- Teaching Assistant, MSU CSE-402: Biometrics and Pattern Recognition, FS2023.
- Teaching Activities (DRL Intro.), RIT CSCI-631: Foundations of Computer Vision, SS2021 & SS2022.
- Teaching Activities (EDL Intro.), RIT CISC-849: PhD Seminar, FS2021.
Academic Talks
	
		- 2025.02.18: Delivered a guest lecture at Emory CS570, hosted by Dr. Wei Jin.
		- 2021.08.20 & 2021.09.16: Delivered two academic talks on the Chinese media platforms Jishi and TechBeat, introducing our ICCV 2021 Oral paper.
		- 2020.11.17: Delivered an academic talk at the 2020 RIT Graduate Virtual Showcase: A Vision Into the Future.