🧍♂️ Biography
Hi 😊, I am HUANG Jiehui. I am passionate 🚀 about academic research and aim to use my findings to address real-world challenges and make meaningful contributions to society.
I have been relentlessly striving to see humanity achieve digital immortality and machine consciousness at the earliest opportunity. My primary research interest is in controllable AIGC & VLM. If you are interested in collaboration or wish to contact me, please feel free to reach out via email.
Previously, I received a B.S. degree in Automation from Nanchang University, Jiangxi, China, in 2022. I am currently working toward an M.S. degree in Artificial Intelligence at HCPLab, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China, advised by Xiaodan Liang (梁小丹) and co-supervised by Shengcai Liao (Professor at United Arab Emirates University, IEEE Fellow, IAPR Fellow). I am also fortunate to collaborate with Peter X. Liu (Professor at Carleton University, IEEE Fellow) and Calvin Yu-Chian Chen (Professor at Peking University).
🔥 News
- 2025.01: 🎉🎉 One paper was accepted by IEEE Transactions on Instrumentation & Measurement.
- 2024.11: 🎉🎉 Received the National Scholarship from Sun Yat-sen University.
- 2024.04: 🎉🎉 Released ConsistentID (✨ 800+ stars), a high-fidelity and fast customized portrait generation model.
- 2024.01: 🎉🎉 One paper was accepted by Computers in Biology and Medicine.
- 2023.12: 🎉🎉 Two papers were accepted by AAAI and Knowledge-Based Systems respectively.
- 2023.11: 🎉🎉 One paper was accepted by Neurocomputing.
- 2022.10: 🎉🎉 Received the First-Class Master’s Scholarship from Sun Yat-sen University.
- 2021.11: 🎉🎉 Received the National Scholarship from Nanchang University.
💻 Internships
- 2024.12 - 2025.03, Pixocial Technology Team, in collaboration with Haoxiang Li.
- 2024.04 - 2024.09, Tencent, Hunyuan Team, Tencent TEG, Shenzhen.
- 2023.11 - 2024.03, Lenovo, Research Institute, Shenzhen.
- 2023.07 - 2023.10, SenseTime, Research Institute, Shenzhen.
📝 Publications

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving
Jiehui Huang, Xiao Dong, Wenhui Song, Hanhui Li, Jun Zhou, Yuhao Cheng, Shutao Liao, Long Chen, Yiqiang Yan, Shengcai Liao, Xiaodan Liang*
- We introduce ConsistentID to improve fine-grained customized facial generation by incorporating detailed descriptions of facial regions and local facial features.
- We devise an ID-preservation network optimized by a facial attention localization strategy, enabling more accurate ID preservation and more vivid facial generation.
- We introduce the inaugural fine-grained facial generation dataset, FGID, addressing limitations in existing datasets for capturing diverse identity-preserving facial details.

TMBL: Transformer-based multimodal binding learning model for multimodal sentiment analysis
Jiehui Huang, Jun Zhou, Zhenchao Tang, Jiaying Lin, and Calvin Yu-Chian Chen*
- Because existing multi-modal fusion systems rarely model fine-grained word-level interactions, we redesign the Transformer structure, improving accuracy (ACC) by 6%.
- To address the modal heterogeneity introduced by multi-modal feature fusion, we design a CLIP-inspired cross-modal binding mechanism for each modality that fuses modal features more effectively.
- To address the modal aliasing that arises when modal features are hard to tell apart, we introduce CLS and PositionEmbedding information to distinguish modal spaces and semantic relationships.

Progressive network based on detail scaling and texture extraction: A more general framework for image deraining
Jiehui Huang, Zhenchao Tang, Xuedong He, Jun Zhou, Defeng Zhou, and Calvin Yu-Chian Chen*
- To enhance the coupling and portability of the module, we redesign the existing rain removal module and establish a multi-scale coupling method. This simple and effective strategy improves model performance by 5%.
- To improve the transferability and generalization of the model, we design a detail scaling module that extracts generalized features from degraded images and restores finer details to avoid distortion.
- We enhance the attention layer and feed-forward layer in the Transformer block to extract universal features more efficiently, strengthening the model’s ability to capture comprehensive and transferable features.
- The progressive learning strategy helps the model learn more multi-scale features and achieves SOTA performance on datasets such as SPA-Data, RainDrop, RID, and Rain100.

Comprehensive View Embedding Learning for Single-cell Multimodal Integration
Zhenchao Tang, Jiehui Huang, Guanxing Chen, Pengfei Wen, and Calvin Yu-Chian Chen*
- Embedding learning is performed on single-cell multi-modal data from three views, such as the regulatory relationship between different modalities and the relationship between single-cell fine-grained features in each modality.
- By learning graph link embeddings, the proposed CoVEL can model cross-modal regulatory relationships and use biological knowledge to bridge the gap between feature spaces under different modalities.
- To eliminate differences between modalities while preserving biological heterogeneity, single-cell fine-grained embeddings and contrastive cell embeddings are learned on multimodal data in an unsupervised manner.
- The proposed self-supervised learning method effectively finds the information between data from the perspective of representation learning, while the generation method focuses on learning the information within the data.

Parkinson’s Severity Diagnosis Explainable Model Based on 3D Multi-Head Attention Residual Network, Jiehui Huang, Linsan Lin, Fengcheng Yu, Xuedong He, et al. Computers in Biology and Medicine, 2024.

Spiral-based chaotic chicken swarm optimization algorithm for parameters identification of photovoltaic models, Miao Li, Chunquan Li, Zhenyu Huang, Jiehui Huang, Gaige Wang, Peter X. Liu*. Soft Computing, 2021.

A novel decomposition-based ensemble model for short-term load forecasting using hybrid artificial neural networks, Zhiyuan Liao, Jiehui Huang, Yuxin Cheng, Chunquan Li, Peter X. Liu*. Applied Intelligence, 2022.

A decomposition-based multi-time dimension long short-term memory model for short-term electric load forecasting, Jinglin Liu, Jiehui Huang, Zhiwang Zhou, Chunquan Li, Zhiyuan Liao, and Peter X. Liu*. IET Generation, Transmission & Distribution, 2021.
🎖 Honors and Awards
- 2024.11 National Scholarship of Sun Yat-sen University
- 2022.10 The First Prize Scholarship of Sun Yat-sen University
- 2022.06 Outstanding Graduate of Nanchang University
- 2021.10 National Scholarship of Nanchang University (0.7%)
- 2021.08 (CIMC) Siemens Cup China Intelligent Manufacturing Challenge: National Preliminary Championship First Prize
- 2020.08 (RMUC) Robomaster Infantry Group: National Championship First Prize
- 2020.02 Invention patent for a non-blocking controllable projectile launch system
📖 Education
2022.09 - now, M.S. student.
Artificial Intelligence, School of Intelligent Systems Engineering (ISE).
Sun Yat-sen University, Shenzhen.
2018.09 - 2022.06, Undergraduate.
Automation, School of Intelligent Systems Engineering.
Nanchang University (NCU), Nanchang.