Bin Xiao

Bin Xiao is a Research Engineer on the Meta GenAI team, where he works on the development of multimodal Llama models. His research interests include computer vision, deep learning, and multimodal large language models. His representative works include Phi-3-Vision, the Florence models, and the High-Resolution Network (HRNet).

Research Highlights

Selected publications

Please refer to my Google Scholar profile for a full list of my publications.

  1. Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
    Microsoft GenAI team
    [arXiv] [HuggingFace]
  2. Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
    Bin Xiao, Haiping Wu, Weijian Xu, Xiyang Dai, Houdong Hu, Yumao Lu, Michael Zeng, Ce Liu, Lu Yuan
    [arXiv] [HuggingFace] CVPR 2024 (oral)
  3. Deep High-Resolution Representation Learning for Human Pose Estimation
    Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang
    Conference on Computer Vision and Pattern Recognition (CVPR), 2019
    [Paper] [Code] Paper Digest Most Influential CVPR Papers
  4. CvT: Introducing Convolutions to Vision Transformers
    Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang
    International Conference on Computer Vision (ICCV), 2021
    [Paper] [Code] Paper Digest Most Influential ICCV Papers
  5. Simple baselines for human pose estimation and tracking
    Bin Xiao, Haiping Wu, Yichen Wei
    European Conference on Computer Vision (ECCV), 2018
    [Paper] [Code] Paper Digest Most Influential ECCV Papers
  6. DaViT: Dual Attention Vision Transformers
    Mingyu Ding, Bin Xiao, Noel Codella, Ping Luo, Jingdong Wang, Lu Yuan
    European Conference on Computer Vision (ECCV), 2022
  7. Unified contrastive learning in image-text-label space
    Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Bin Xiao, Ce Liu, Lu Yuan, Jianfeng Gao
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022
    [Paper]
  8. Florence: A New Foundation Model for Computer Vision
    Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang
    [arXiv]
  9. Lite-HRNet: A Lightweight High-Resolution Network
    Changqian Yu, Bin Xiao, Changxin Gao, Lu Yuan, Lei Zhang, Nong Sang, Jingdong Wang
    Conference on Computer Vision and Pattern Recognition (CVPR), 2021
    [Paper] [Code]
  10. HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
    Bowen Cheng, Bin Xiao, Jingdong Wang, Honghui Shi, Thomas S Huang, Lei Zhang
    Conference on Computer Vision and Pattern Recognition (CVPR), 2020
    [Paper] [Code]
  11. Deep High-Resolution Representation Learning for Visual Recognition
    Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao
    IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2020
    [Paper]
  12. Integral human pose regression
    Xiao Sun, Bin Xiao, Fangyin Wei, Shuang Liang, Yichen Wei
    European Conference on Computer Vision (ECCV), 2018
    [Paper]

Honors and Awards

  • 1st place in Look into Person Challenge 2019: Single-Person Human Pose Estimation Track
  • 2nd place in Objects365 Challenge 2019: Full Track
  • 1st place in PoseTrack Multi-Person Pose Tracking Challenge 2018
  • 2nd place in COCO Keypoint Detection Challenge 2018

Professional Activities

  • Conference reviewer: CVPR, ICCV, ECCV, ICLR, and others
  • Journal reviewer: T-PAMI, T-MM, IJCV, and others