Hi, my name is

Haiyi Mei

梅海艺

About Me

👋 I’m a committed algorithm researcher and backend engineer with 3 years of industry experience. Specializing in 3D synthetic data generation for computer vision tasks, I contributed to 9 papers in top-tier conferences.

Recently, I’ve been focusing on text/image-to-video generation, contributing to VAE training, dataset preparation, automated annotation, and video acquisition using crawler techniques. I’m well-versed in the entire T2V pipeline and passionate about pushing the boundaries of innovation in this field.

My expertise spans:

  • Engine tools development: Unreal Engine, Blender, etc.
  • Rendering techniques: PBR, differentiable rendering, NeRF, 3DGS, etc.
  • 3D computer vision: Novel view synthesis, generative models on 3D, etc.
  • Text/image-to-video generation: VAE, diffusion models, dataset preparation (automated annotation, crawler techniques), etc.
  • Backend development: FastAPI, Flask, scalable and robust system design, etc.
  • Open-source projects: CI/CD, Docker, Kubernetes, etc.

Languages and tools I’m working with:

Python, C++, PyTorch, Unreal Engine, Blender, PyTorch3D, Mitsuba,
Docker, Flask, FastAPI, Kubernetes, Git, Linux, LaTeX

Experience

Mid-Level Algorithm Researcher - SenseTime
June 2021 - Present

» Text/Image-to-Video Generation

  • Led VAE training, achieving performance comparable to the VAE of CogVideo.
  • Contributed to the data annotation process by supporting the application of textual, visual, and qualitative analyses to ensure high-quality video datasets.
  • Developed automated solutions for video content acquisition from various online platforms lacking direct API support.

» Synthetic Data Generation


»» XRFeitoria

I spearheaded the development of XRFeitoria, a rendering toolbox designed to streamline synthetic data generation. This tool simplifies the construction of pipelines for topics including human mesh recovery and novel view synthesis, leading to the publication of several papers within a short timeframe.


»» SynBody

I held a principal role in designing SynBody, a large-scale synthetic dataset with layered human models.

  • Participated in designing SMPL-XL, which enriches SMPL-X in hair, garments, accessories, and textures.
  • Built a synthetic data generation system as a Software-as-a-Service (SaaS) platform, including layered human creation, motion retargeting, scene composition, rendering pipeline, and flow control.
  • Published a paper presented at ICCV 2023 as co-first author.

» Demo videos

I’m proficient in crafting advanced technology demonstration videos.

Visiting Research Intern - NLPR | CASIA
2018 - 2021

Started working on synthetic data generation for computer vision tasks.

Reproduced SOTA methods in video captioning.

Open-Source Projects

[XRFeitoria: Rendering Toolbox for Synthetic Data Generation](https://github.com/openxrlab/xrfeitoria)
Python C++ Blender Unreal Engine
  • Control the engine (UE/Blender) through RPC from the system Python.
  • Support multiple engine backends, including Unreal Engine and Blender.
  • Render photorealistic images with ground-truth annotations.
  • Manage assets/cameras, including import, place, export, and delete.
  • CLI tools to render images from a mesh file.
  • Streamline building rendering pipelines across various domains; used in more than 8 projects.

Publications

Differentiable Convex Polyhedra Optimization from Multi-view Images
ECCV 2024
Daxuan Ren, Haiyi Mei, Hezi Shi, Jianmin Zheng, Jianfei Cai, Lei Yang
A method is introduced for differentiable rendering of convex polyhedra, combining hyperplane intersection with vertex optimization, enabling efficient shape representation without 3D implicit fields. It addresses limitations of prior methods and sets a new standard for representing convex polyhedra.

WHAC: World-grounded Humans and Cameras
ECCV 2024
Wanqi Yin, Zhongang Cai, Ruisi Wang, Fanzhou Wang, Chen Wei, Haiyi Mei, Weiye Xiao, Zhitao Yang, Qingping Sun, Atsushi Yamashita, Ziwei Liu, Lei Yang
WHAC is a framework for jointly estimating human models (SMPL-X) and camera poses from monocular video, using depth cues from human motion. The accompanying WHAC-A-Mole dataset features annotated human motions and camera trajectories.

[Digital Life Project: Autonomous 3D Characters with Social Intelligence](https://digital-life-project.com/)
CVPR 2024
Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yunkun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu
The Digital Life Project creates autonomous 3D characters capable of engaging in social interactions and expressing themselves through body motions. This project is groundbreaking in simulating life in a digital environment.

[AiOS: All-in-One-Stage 3D Wholebody Mesh Recovery](https://ttxskk.github.io/AiOS/)
CVPR 2024
Qingping Sun, Yanjun Wang, Ailing Zeng, Wanqi Yin, Chen Wei, Wenjia Wang, Haiyi Mei, Chi Sing Leung, Ziwei Liu, Lei Yang, Zhongang Cai
AiOS performs human localization and SMPL-X estimation in a progressive manner. It is composed of (1) the body localization stage, which predicts coarse human location; (2) the body refinement stage, which refines body features and produces face and hand locations; and (3) the whole-body refinement stage, which refines whole-body features and regresses SMPL-X parameters.

[PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation](https://frozenburning.github.io/projects/primdiffusion/)
NeurIPS 2023
Zhaoxi Chen, Fangzhou Hong, Haiyi Mei, Guangcong Wang, Lei Yang, Ziwei Liu
PrimDiffusion performs the diffusion and denoising process on a set of primitives which compactly represent 3D humans. This generative modeling has explicit pose, view, and shape control, with the capability of modeling off-body topology in well-defined depth. It enables downstream tasks like 3D texture transfer and inpainting.

[SMPLer-X: Advanced 3D Human Body Modeling](https://caizhongang.com/projects/SMPLer-X/)
NeurIPS 2023 (Datasets and Benchmarks Track)
Zhongang Cai, Wanqi Yin, Ailing Zeng, Chen Wei, Qingping Sun, Yanjun Wang, Hui En Pang, Haiyi Mei, Mingyuan Zhang, Lei Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
SMPLer-X is the first generalist foundation model for expressive human pose and shape estimation (EHPS). With big data and large models, SMPLer-X exhibits strong performance across diverse test benchmarks and excellent transferability, even to unseen environments.

[SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling](https://synbody.github.io/)
ICCV 2023
Zhitao Yang*, Zhongang Cai*, Haiyi Mei*, Shuai Liu*, Zhaoxi Chen*, Weiye Xiao, Yukun Wei, Zhongfei Qing, Chen Wei, Bo Dai, Wayne Wu, Chen Qian, Dahua Lin, Ziwei Liu, Lei Yang
SynBody is a large-scale synthetic dataset with a massive number of subjects and high-quality annotations. It supports various research topics, including human mesh recovery and novel view synthesis for humans (Human NeRF).

[Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction](https://wenjiawang0312.github.io/projects/zolly/)
ICCV 2023 (Oral)
Wenjia Wang, Yongtao Ge, Haiyi Mei, Zhongang Cai, Qingping Sun, Yanjun Wang, Chunhua Shen, Lei Yang, Taku Komura
Zolly, the first 3DHMR method focusing on perspective-distorted images, outperforms existing methods on perspective-distorted datasets and the standard benchmark (3DPW).

[SHERF: Generalizable Human NeRF from a Single Image](https://skhu101.github.io/SHERF/)
ICCV 2023
Shoukang Hu*, Fangzhou Hong*, Liang Pan, Haiyi Mei, Lei Yang, Ziwei Liu
Reconstruct human NeRF from a single image in one forward pass!

[HumanLiff: Layer-wise 3D Human Generation with Diffusion Model](https://skhu101.github.io/HumanLiff/)
Preprint 2023
Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu
HumanLiff learns a layer-wise 3D human generative model with a unified diffusion process.

Education

2019 - 2022
Master of Biomedical/Medical Engineering
Shandong University

2015 - 2019
Bachelor of Automation
Nanjing University of Science and Technology

I published:

  • Patent filed with CNIPA: CN108453742B.
  • Mian ZHANG, Ying HUANG, Haiyi MEI, Yu GUO. Intelligent interaction method for power distribution robot based on Kinect. Journal of Shandong University (Engineering Science), 2018, 48(5): 103-108.

Extracurricular Activities

  • Head of the Symphony Orchestra of School, 2016 - 2018.
  • First Place Award, National University Piano Competition, Chinese Golden Bell Award for Music, 2017.

Get in Touch

🥳 Contact me if you have any questions or want to collaborate! Always open to new opportunities.

haiyimei [at] gmail.com