I am now working as a Research SDE in the Multimedia Search group at Microsoft Live Search.

        I received my PhD degree in December 2008 from the Image Formation and Processing Group (IFP) at the University of Illinois at Urbana-Champaign. My technical advisor is Professor Thomas S. Huang. I obtained my B.S. and M.S. degrees in computer science in 1999 and 2001, respectively, under the supervision of Professor Shaoping Ma at the State Key Laboratory of Intelligent Technology and Systems (LITS) and the Department of Computer Science and Technology at Tsinghua University. After that, I worked as an Assistant Researcher at Microsoft Research Asia from 2001 to 2004. My current research interests include machine learning and face-related projects such as detection, tracking, pose estimation, and recognition.




      Camera/Microphone Array for Dynamic 3D Face Data Collection


We are building a synchronized camera/microphone array that captures facial expression video with speech and reconstructs a dynamic 3D face model. We address both hardware and software issues, including camera calibration, video/audio synchronization, facial feature point tracking, and 3D reconstruction. To the best of our knowledge, this is the first camera/microphone array able to capture synchronized high-resolution facial action video with speech. The system can be used to collect multimodal 3D data for applications such as facial expression analysis and surveillance. (More...)


       Building Large Scale 3D Face Database for Face Analysis


      We propose to build a large-scale 3D face database with dense correspondence for various face analysis research purposes. Large scale means that the database contains more than 400 subjects, which is, to the best of our knowledge, the largest at this time. 3D face means that we provide both the texture and the shape of human faces; the database is also balanced in gender and race. Dense correspondence means that the key facial points with semantic meanings are carefully labeled and aligned among different faces, so they can be used for a broad range of face analysis tasks. More data is still being collected and processed to enlarge the database. The resulting face database provides solid ground truth for human face related tasks such as alignment, tracking, recognition, and animation. (More...)


       3D Face Reconstruction and Recognition

           This project proposes an analysis-by-synthesis framework for face recognition under varying pose, illumination, and expression (PIE). First, an efficient 2D-to-3D integrated face reconstruction approach reconstructs a personalized 3D face model from a single frontal face image with neutral expression and normal illumination. Then, realistic virtual faces with different PIE are synthesized from the personalized 3D face to characterize the face subspace. Finally, face recognition is conducted based on these representative virtual faces. Compared with related work, this framework has the following advantages: 1) only a single frontal face image is required for face recognition, which avoids burdensome enrollment work; 2) the synthesized face samples make it possible to conduct recognition under difficult conditions such as complex PIE; and 3) the proposed 2D-to-3D integrated face reconstruction approach is fully automatic and more efficient. Extensive experimental results show that the synthesized virtual faces significantly improve the accuracy of face recognition under varying PIE. (More...)

       Face Detection, Alignment and Recognition

           We created a prototype system for face detection, alignment, and recognition that demonstrates our technologies for face detection, tracking, alignment, and recognition. It is the first real-time multi-view face recognition pipeline system in the world and was shown at CVPR 01, FG 02, ECCV 02, etc.

       User Attention Tracking in Large Display

           This system automatically tracks the area of a large display that a user is attending to, so that the documents or folders in that area, on one large display or across multiple displays, can be activated. When a user sits in front of a large display, the system tracks the user's head and key facial points in real time with a single camera, estimates where on the screen the user's attention is directed, and automatically moves the cursor to the attention center, activates open documents, or highlights folders within the attention area. This greatly improves the efficiency and experience of interacting with a large display. (More...)

       Photo Date Detection and Recognition

             Large quantities of photos need to be digitized and indexed. The majority of them have a date string printed in a corner that indicates when the photo was taken. We use image processing and pattern recognition technology to locate and recognize these date strings. Different formats and fonts are supported.




Conference Papers:

    Y.X. Hu, H. Tang, T.S. Huang, Camera and Microphone Array for 3D Audiovisual Face Data Collection, The 33rd International Conference on Acoustics, Speech, and Signal Processing (ICASSP2008)

    C.X. Zhou, Y.X. Hu, Y. Fu, Y.M. Wang, T.S. Huang, 3D Shape Analysis Using Hybrid Permutation Tests, ICASSP2008

    Y.X. Hu, Z.Q. Zhang, X. Xu, Y. Fu, Building Large Scale 3D Face Database for Face Analysis, Multimedia Content Analysis and Mining International Workshop (MCAM 2007)

    Y.X. Hu, Y. Fu, U. Tariq, T.S. Huang, Subjective Experiments on Gender and Ethnicity Recognition from Different Face Representations, submitted to the 15th International Conference on Image Processing (ICIP2008)

    H.Z. Ning, Y.X. Hu, T.S. Huang, Efficient Initialization of Mixtures of Experts for Human Pose Estimation, submitted to ICIP2008

    Y.X. Hu, T.S. Huang, Subspace Learning for Human Head Pose Estimation, submitted to IEEE International Conference on Multimedia & Expo (ICME2008)

    Y.X. Hu, Z.H. Zeng, L.J. Yin, X.Z. Wei, T.S. Huang, A Study of Non-frontal-view Facial Expression Recognition, submitted to ICME2008

    H. Tang, Y.X. Hu, Y. Fu, M. Hasegawa-Johnson, T.S. Huang, Real-Time Conversion from a Single 2D Face Image to a 3D Text-driven Emotive Audio/Visual Avatar, submitted to ICME2008

    Z.Q. Zhang, Y.X. Hu, M. Liu and T.S. Huang, Head Pose Estimation in Seminar Room using Multi View Face Detectors, in Proceedings of the CLEAR06, Springer LNCS series

    J.L. Tu, Y. Fu, Y.X. Hu, T.S. Huang, Evaluation of Head Pose Estimation for Studio Data. CLEAR 2006

    Z.H. Zeng, Y.X. Hu, M. Liu, Y. Fu, T.S. Huang, Training Combination Strategy of Multi-stream Fused HMM for Audio-visual Affect Recognition, the 14th ACM International Conference on Multimedia (ACM MM2006)

    Z.H. Zeng, Y.X. Hu, Y. Fu, T. S. Huang, G. I. Roisman, Z. Wen: Audio-visual emotion recognition in adult attachment interview. ICMI 2006

    Z.Q. Zhang, Y.X. Hu, T.L. Yu, T.S. Huang, "Minimum Variance Estimation of 3D Face Shape from Multi-view", the 7th International Conference on Automatic Face and Gesture Recognition (FGR2006)

    H.Z. Ning, Tony X. Han, Y.X. Hu, Z.Q. Zhang, Y. Fu and T. S. Huang, A Real-time Shrug Detector, FGR2006

    Z.H. Zeng, Y. Fu, Glenn I. Roisman, Z. Wen, Y.X. Hu and T. S. Huang, One-Class Classification for Spontaneous Facial Expression Analysis, FGR2006

    Y.X. Hu, L.B. Chen, Y. Zhou, H.J. Zhang, "Estimating Face Pose by Facial Asymmetry and Geometry", FGR2004

    Y.X. Hu, D.L. Jiang, S.C. Yan, H.J. Zhang, "Automatic 3D Reconstruction for Face Recognition", FGR2004

    Y.X. Hu, S.C. Yan, D.L. Jiang, Y. Zhou, Personalized 3D Face Model Reconstruction from Single Image, The 8th European Conference on Computer Vision (ECCV 2004) Demo Summary

    L. Zhang, Y.X. Hu, M.J. Li, W.Y. Ma, H.J. Zhang, “Efficient Propagation for Face Annotation in Family Albums”, ACM MM2004

    S.C. Yan, Y.X. Hu, X.F. He, "Discriminant Analysis on Embedded Manifold", ECCV 2004

    Y. Zhou, L. Zhang, Y.X. Hu, H.J. Zhang, "Robust Face Alignment", IEEE Intl. Conf. on Computer Vision 2003 (ICCV2003) Demo Summary

    X.F. He, S.C. Yan, Y.X. Hu, H.J. Zhang, "Spectral Analysis for Face Recognition", Asian Conference on Computer Vision 2004 (ACCV2004)

    X.F. He, S.C. Yan, Y.X. Hu, H.J. Zhang, "Learning a Locality Preserving Subspace for Visual Recognition", ICCV 2003

    L.B. Chen, L. Zhang, Y.X. Hu, M.J. Li, H.J. Zhang, Head Pose Estimation using Fisher Manifold Learning, ICCV2003 Workshop on RATFG 2003

    S.Z. Li, X.L. Zou, Y.X. Hu, Z.Q. Zhang, S.C. Yan, X.H. Peng, L. Huang, H.J. Zhang. "Real-Time Multi-View Face Detection, Tracking, Pose Estimation, Alignment, and Recognition". CVPR 2001 Demo Summary


Journal Papers: 

    Z.H. Zeng, Y.X. Hu, G. I. Roisman, Z. Wen, T. S. Huang,  Audio-visual Spontaneous Emotion Recognition (Invited submission), Lecture Notes in Artificial Intelligence for Human Computing, 2007.

    S. Yan, Y.X. Hu, D. Xu, H. Zhang, B. Zhang, Q. Cheng. "Nonlinear Discriminant Analysis on Embedded Manifold", IEEE Transactions on Circuits and Systems for Video Technology (TCSVT2006).

    Z.H. Zeng, Y. Fu, G.I. Roisman, Z. Wen, Y.X. Hu, and T.S. Huang, Spontaneous Emotional Facial Expression Detection, Journal of Multimedia, 2006 1(5): 1-8. (JMM2005)

    D.L. Jiang, Y.X. Hu, S.C. Yan, H.J. Zhang, "Efficient 3D Reconstruction for Face Recognition", Journal of Pattern Recognition, Special Issue on Image Understanding for Digital Photographs, 2005 (PR2005)

    X.F. He, S.C. Yan, Y.X. Hu, H.J. Zhang, “Laplacian Face for Face Recognition”, IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI2005)

    S.C. Yan, X.F. He, Y.X. Hu, H.J. Zhang, M.J. Li, Q.S. Cheng, "Bayesian Shape Localization for Face Recognition Using Global and Local Textures", IEEE Transactions on Circuits and Systems for Video Technology, 2003 (TCSVT2003)



Patents:

    Y.X. Hu, L. Zhang, M.J. Li, H.J. Zhang, US Patent 7,391,888: “Head Pose Assessment Methods and Systems”

    D.L. Jiang, H.J. Zhang, L. Zhang, S.C. Yan, Y.X. Hu, US Patent 7,415,152: “Method and System for Constructing a 3D Representation of a Face from a 2D Representation”

    L. Zhang, M.J. Li, W.Y. Ma, Y.F. Sun, Y.X. Hu, US Patent 7,403,642: “Efficient Propagation For Face Annotation”


Academic Services:


Program Committee Member: CVPR2008, ICCV2007



    Journals: TPAMI, TCSVT, TIP, PR, JCST, etc.

    Conferences: ACM Multimedia, ACM MIR, ACM CIVR, EuroGraphics, BMVC, MCAM, etc.




