Cross-pose Face Recognition by Integrating Regression Iteration and Interactive Subspace


Li and Huang EURASIP Journal on Wireless Communications and Networking (2019) 2019:105

RESEARCH - Open Access

Cross-pose face recognition by integrating regression iteration and interactive subspace

Kefeng Li* and Quanzhen Huang

* Correspondence: kefengli922@126.com
School of Electrical Information Engineering, Henan University of Engineering, Zhengzhou, China

Abstract: At present, the pose change of the face test sample is the main factor affecting the accuracy of face recognition, and the design of cross-pose recognition algorithms is a technical problem still to be solved. In this paper, a cross-pose face recognition algorithm which integrates the regression iterative method and the interactive subspace method was proposed. Through the regression iteration, the target function converges rapidly and the important characteristics of the sample were extracted. Then, the pose of the cross-pose face image was estimated, and finally the interactive subspace method was applied to judge the similarity of the human face. The experimental results on the FERET face database (±45° and ±90° poses) and the MIT-CBCL face database (N-fold cross-validation) showed that the proposed RIM-ISM algorithm had a higher recognition rate and stronger robustness, and that it could effectively address the difficulty of cross-pose face recognition.

Keywords: Regression iterative method (RIM), Cross-pose face recognition, Pose estimation, Interactive subspace method (ISM), N-fold cross-validation

© The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

1 Introduction
The cross-pose face recognition problem is caused by the texture changes brought about by the rotation transformation of the face. When the face pose change is large and within-class variation exceeds between-class variation, the recognition rate falls sharply. Cross-pose face recognition has therefore been an important research topic in recent years.

To solve the cross-pose recognition problem, there are two main research directions: algorithms based on pose correction and feature-based algorithms. Pose correction methods rotate the face image and fit it into a new image; such synthesis methods use face models based on either 3D or 2D. Y. Taigman et al. proposed DeepFace [1]: first, 3D face alignment was used for face pose correction and the face was frontalized; the result was then fed into a convolutional neural network for feature extraction, with a classifier used for face authentication. Masi et al. proposed the pose-aware CNN models (PAMs) method to deal with large pose changes [2]; the method uses a convolutional neural network to learn a specific multi-pose model and converts the multi-pose image to the learned model. Zhu et al. designed a new multi-view perceptron, which can separate pose and sample information by using different neurons in the network and reconstruct various pose images from a single 2D image [3]. M. Kan et al. designed a stacked progressive auto-encoders (SPAE) neural network model to achieve nonlinear modeling of pose changes on small-scale data [4]. J. Yim et al. designed a deep neural network based on multi-task learning; the network can rotate a face of any pose and any illumination to a designated face pose under normal illumination [5].

Feature-based methods extract face features that are robust to pose for face recognition. Y. Sun proposed the DeepID network structure, which considers local and global characteristics simultaneously and introduced a larger data set for model training [6]. Florian Schroff et al. proposed the FaceNet framework to directly learn an encoding from the image to Euclidean space and then conduct face recognition, face verification, and face clustering on the basis of this encoding [7]. Omkar M. Parkhi applied the deep network VGG to the face recognition task; this network structure is implemented by a 16-layer convolutional neural network and a three-layer fully connected layer, and it was tested on the LFW and YouTube Faces databases and achieved a good recognition effect [8]. Recently, cross-pose recognition methods based on features such as the scale-invariant feature transform (SIFT) [9], LBP [10], and Gabor [11] have achieved good recognition effects under controlled conditions and partially unconstrained conditions.

Based on the study of these two kinds of methods, and by combining the regression iterative method with the interactive subspace method, this study designed the RIM-ISM cross-pose recognition algorithm. First, the regression iterative method was adopted to regress the cross-pose face shape to a location close to the real shape and to detect the important characteristics of the sample. Second, the deflection angle of the 2D face image with respect to the 3D rectangular coordinate system was calculated and pose correction was carried out. Finally, the similarity of cross-pose face images was measured by using the interactive subspace method. To verify the effectiveness of the RIM-ISM algorithm, large-scale experiments were conducted on the FERET database and the MIT-CBCL database. The experimental results show that the proposed RIM-ISM cross-pose face recognition algorithm had clear advantages over the compared algorithms.

2 Methodology
2.1 Extraction of important characteristics
Cross-pose face recognition is to recognize or identify faces of any pose in an image. The human face is a 3D structure, and the face pose can change in three dimensions (X, Y, and Z), as shown in Fig. 1.
Fig. 1 Three degrees of freedom for face pose change

Pitch θx is rotation around the X-axis, yaw θy is rotation around the Y-axis, and roll θz is rotation around the Z-axis. The positions of the key feature points change with the pose, and the regression iterative method was adopted in this paper to deal with this change. The basic principle of the regression iterative method is to provide an initial shape for a given face image and, through constant iterations, regress the initial shape to a position close or equal to that of the real shape, learning a series of descent directions and scales along those directions so that the target function converges to its minimum value quickly, which improves the running speed of the program.

Given a face image containing m pixels, x ∈ R^(m×1), let d(x) ∈ R^(p×1) represent the p important characteristics of the sample, and let h(·) be a nonlinear feature extraction function. On the basis of a given initial shape x0, the regression iterative method was used to regress x0 to the correct face shape x*, that is, to solve for the Δx which minimizes f(x0 + Δx) in Formula (1):

    f(x0 + Δx) = ‖h(d(x0 + Δx)) − φ*‖₂²    (1)

where φ* = h(d(x*)) is the SIFT feature obtained from the real feature points of the face. The objective function is expanded by Taylor expansion as

    f(x0 + Δx) ≈ f(x0) + J_f(x0)ᵀ Δx + (1/2) Δxᵀ H(x0) Δx    (2)

where J_f(x0) is the Jacobian matrix and H(x0) is the Hessian matrix. Setting the derivative of Formula (2) with respect to Δx to zero gives

    Δx = −H⁻¹(x0) J_f(x0) = −2 H⁻¹ J_hᵀ (φ_{k−1} − φ*)    (3)

where Δx = x_k − x_{k−1}, J_h is the Jacobian matrix of h, and φ_{k−1} = h(d(x_{k−1})) is the feature vector extracted at the face feature points x_{k−1}. The regression iterative method was applied to solve the optimization problem of Formula (1). The optimal solution could be obtained by solving Formula (3), and Formula (3) could be rewritten as the regression iterative formula

    Δx = R_{k−1} φ_{k−1} + b_{k−1}    (4)

where R_{k−1} is the gradient optimization direction, b_{k−1} is the average deviation parameter, and both could be obtained directly through training and learning.
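Formula (4) reduces each iteration to a learned linear update. As a minimal sketch (not the paper's implementation: the 2-D "shape", the toy feature function h, and the training procedure below are illustrative assumptions), one (R, b) pair can be fitted per stage by least squares on sampled perturbations and then cascaded:

```python
import numpy as np

# Toy sketch of the regression-iteration update of Formula (4):
# delta_x = R_{k-1} * phi_{k-1} + b_{k-1}, with (R, b) learned per stage.
# The 2-D "shape" and the feature function h() are illustrative stand-ins,
# not the paper's 45-landmark SIFT features.

rng = np.random.default_rng(0)

def h(x):
    """Hypothetical nonlinear feature extractor (stands in for SIFT of d(x))."""
    return np.array([np.sin(x[0]), np.cos(x[1]), x[0] * x[1]])

x_star = np.array([0.5, -0.3])          # "true" shape x*
n_stages, n_samples = 4, 500

# --- training: learn one (R, b) per stage from perturbed shapes ---
stages = []
X0 = x_star + rng.normal(scale=0.5, size=(n_samples, 2))   # initial shapes
for _ in range(n_stages):
    Phi = np.array([h(x) for x in X0])                     # phi_{k-1}
    A = np.hstack([Phi, np.ones((n_samples, 1))])          # absorb bias b
    target = x_star - X0                                   # ideal delta_x
    W, *_ = np.linalg.lstsq(A, target, rcond=None)         # stacked [R; b]
    stages.append(W)
    X0 = X0 + A @ W                                        # apply the update

# --- inference: cascade the learned updates on a new initial shape ---
x = x_star + np.array([0.4, -0.4])
for W in stages:
    phi = np.append(h(x), 1.0)
    x = x + phi @ W                                        # Formula (4)

print(np.round(x, 3), x_star)                              # x approaches x*
```

In the paper the features are SIFT descriptors around the detected landmarks; the same cascade structure applies unchanged, only h and the shape dimensionality differ.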

The regression iterative method can thus detect the important characteristics of the sample in real time and extract the positions of the eyes, mouth, nose, and other important characteristics. In total, 45 pairs of important characteristics were detected, as shown in Fig. 2.

2.2 Estimation of pose deflection
Face pose deflection estimation calculates the deflection angles (θx, θy, θz) of the 2D face image with respect to the 3D rectangular coordinate system, extracts the important characteristics (eyes, nose, mouth, etc.) of the face image, and then determines the feature correspondence between the 3D model and the 2D model. The multi-pose sample image was defined as I_Q, and p_i(x_i, y_i), i = 1, 2, …, 45, are the important characteristics of the sample detected by the regression iterative method, as shown in Fig. 2c–d. According to the perspective projection principle of the camera, the correspondence between a key feature point P of the 3D model and the key feature point p of the current face image I_Q is

    (p_i, P_i) = ((x_i, y_i)ᵀ, (X_i, Y_i, Z_i)ᵀ)    (5)

The mapping relation C_Q between important characteristics of the 2D model and the 3D model was defined as

    C_Q = A_Q [R_Q  t_Q],    p = C_Q P    (6)

where A_Q is the camera's internal parameter mapping matrix, t_Q is the translation mapping matrix, and R_Q is the rotation mapping matrix. R_Q contains the attitude information of the image to be recognized, and has the form

    R_Q = [ R11  R12  R13
            R21  R22  R23
            R31  R32  R33 ]    (7)

Formulas (5), (6), and (7) determine the pose of the face image I_Q to be recognized, that is, the deflection angles (θx, θy, θz) of the 2D face image in the 3D coordinate system.
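The projection of Formulas (5)–(7) and the recovery of the deflection angles (θx, θy, θz) can be sketched in numpy; the intrinsic matrix A_Q, the translation t_Q, the example angles, and the 3D point below are made-up illustrative values, not calibrated camera parameters:

```python
import numpy as np

# Sketch: project a 3D key point P through C_Q = A_Q [R_Q | t_Q], then
# recover the deflection angles from R_Q. All numeric values are examples.

def rot(theta_x, theta_y, theta_z):
    """R_Q = Rz(theta_z) @ Ry(theta_y) @ Rx(theta_x), as in Formula (11)."""
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

theta = (0.2, -0.4, 0.1)                       # example (pitch, yaw, roll)
R_Q = rot(*theta)
t_Q = np.array([0.0, 0.0, 5.0])                # translation mapping
A_Q = np.array([[800, 0, 320],                 # hypothetical intrinsics
                [0, 800, 240],
                [0, 0, 1.0]])

P = np.array([0.1, -0.2, 0.3])                 # 3D key feature point
p_h = A_Q @ (R_Q @ P + t_Q)                    # p = C_Q P  (Formula (6))
p = p_h[:2] / p_h[2]                           # image coordinates (x_i, y_i)

# Recover the deflection angles from the entries of R_Q
theta_x = np.arctan2(R_Q[2, 1], R_Q[2, 2])
theta_y = -np.arcsin(R_Q[2, 0])
theta_z = np.arctan2(R_Q[1, 0] / np.cos(theta_y),
                     R_Q[0, 0] / np.cos(theta_y))
print(p, (theta_x, theta_y, theta_z))          # recovered angles match theta
```

The angle-recovery lines are the standard atan2/arcsin decomposition of a Z-Y-X rotation, which is exactly the convention the rotation mapping matrix below follows.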
The rotation matrix for each axis can be expressed as

    Rx(θx) = [ 1      0         0
               0   cos θx   −sin θx
               0   sin θx    cos θx ]    (8)

    Ry(θy) = [ cos θy   0   sin θy
                 0      1     0
              −sin θy   0   cos θy ]    (9)

    Rz(θz) = [ cos θz  −sin θz   0
               sin θz   cos θz   0
                 0        0      1 ]    (10)

The rotation mapping matrix R_Q can then be expressed as

    R_Q = Rz(θz) Ry(θy) Rx(θx)    (11)

From Formulas (8) to (11), the three deflection angles can be calculated as

    θx = atan2(R32, R33)    (12)

    θy = −sin⁻¹(R31)    (13)

    θz = atan2(R21 / cos θy, R11 / cos θy)    (14)

Fig. 2 Schematic diagram of important characteristics detection. a Frontal face to be identified. b −45° face to be identified. c Frontal face key feature points. d −45° key feature points

2.3 Determination of face similarity
After the face deflection angle was estimated, the cross-pose face image was rotated to correct the pose, so

as to fit an approximately frontal pose image, and the subspace G was established after pose correction, with vector g ∈ G. The reference subspace of the frontal face was defined as D, with vector d ∈ D; by calculating the similarity between the two subspaces, it was determined whether they belong to the same category. The subspace method is usually used as the similarity criterion by computing the cosine of the angle θ between a vector and a subspace, namely

    cos θ = S(g) = (1/‖g‖²) Σ_{n=1..N} (g, φ_n)²    (15)

where φ_n is the n-th eigenvector of D and N is the dimension of D. To avoid the interference brought by the artificial selection of the reference subspace, and to improve the stability of the similarity between samples, this paper adopted the interactive subspace method (ISM) to measure the similarity between cross-pose face images: one subspace is used as the input subspace and the other as the reference subspace, the subspace method calculation is conducted twice, two new spatial vectors are obtained, and the included angle θ between the two new vectors is calculated. The principle is shown in Fig. 3. The calculation formula is

    cos²θ = sup_{d∈D, g∈G, ‖d‖≠0, ‖g‖≠0} |(d, g)|² / (‖d‖² ‖g‖²)    (16)

Formula (16) has a local maximum value, so the matrices P and Q were defined as the orthogonal projection matrices of the subspaces D and G, and the value of cos θ could be obtained by calculating the maximum eigenvalue λmax of the matrix PQP, namely

    cos θ = λmax    (17)

By using the interactive subspace method, the eigenvalue problem of the high-dimensional matrix PQP was transformed into computing the eigenvalues of a low-dimensional matrix X, defined as

    X = (x_ij),    x_ij = Σ_{n=1..N} (ρ_i, φ_n)(φ_n, ρ_j)    (18)

where ρ_j is the j-th eigenvector of G and L is the dimension of G (L ≤ N). The matrix X can be decomposed as

    Wᵀ X W = Λ    (19)

where the diagonal of Λ holds the eigenvalues of X. The maximum eigenvalue λmax was selected to evaluate the similarity between D and G:

    S_ISM(D, G) = λmax    (20)

According to Formula (17), 0 ≤ S_ISM ≤ 1; the closer S_ISM is to 1, the greater the similarity between the two samples and the greater the possibility that the face images represent the same person. Conversely, the closer S_ISM is to 0, the greater the difference between the two samples, and the less likely the face images represent the same person.

Fig. 3 Interactive subspace method schematic

3 Experiments
3.1 FERET database cross-pose recognition experiment
To verify the cross-pose face recognition performance of the proposed method, an experiment was conducted on the FERET database, which was created by the FERET project of the U.S. military. It contains 14,051 cross-pose, multi-illumination grayscale face images.
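The interactive subspace similarity of Section 2.3 (Formulas (15)–(20)) can be sketched as follows; the ambient dimension, the subspace sizes, and the random bases are illustrative stand-ins for the paper's face subspaces:

```python
import numpy as np

# Sketch of S_ISM(D, G): the largest eigenvalue of P Q P, where P and Q are
# the orthogonal projectors onto subspaces D and G (Formula (17)), and its
# low-dimensional equivalent via the L x L matrix X (Formulas (18)-(20)).

rng = np.random.default_rng(1)

def orthonormal_basis(A):
    """Column-orthonormal basis spanning the columns of A."""
    Q, _ = np.linalg.qr(A)
    return Q

Phi = orthonormal_basis(rng.normal(size=(50, 5)))  # basis of reference subspace D
Rho = orthonormal_basis(rng.normal(size=(50, 3)))  # basis of input subspace G

# Direct (high-dimensional) route: lambda_max of P Q P
P = Phi @ Phi.T                                    # projector onto D
Q = Rho @ Rho.T                                    # projector onto G
s_direct = np.linalg.eigvalsh(P @ Q @ P).max()

# Low-dimensional route: x_ij = sum_n (rho_i, phi_n)(phi_n, rho_j)
X = Rho.T @ Phi @ Phi.T @ Rho                      # L x L with L = 3
s_ism = np.linalg.eigvalsh(X).max()                # S_ISM = lambda_max

print(round(float(s_direct), 6), round(float(s_ism), 6))  # both routes agree
```

The second route is the point of Formula (18): the 50 × 50 eigenproblem collapses to a 3 × 3 one, and identical subspaces give S_ISM = 1 exactly.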

The database contains subsets for different orientations; Fig. 4 shows example images from the FERET pose subsets.

Using the "leave-one-out" cross-validation method, the RIM-ISM method was compared with the DeepFace, SIFT, and Gabor methods: one face image in the database was removed and used as test data, and the remaining images were used for training. The experiment was repeated until every image had been tested. Although the computation is tedious, this protocol makes the fullest use of the samples, so it is suitable for small sample sets. We chose the more difficult ±45° and ±90° face images for testing; the experimental data are shown in Figs. 5 and 6.

3.2 N-fold cross-validation
To test the robustness and recognition efficiency of the RIM-ISM algorithm, N-fold cross-validation was performed on the MIT-CBCL face database. N-fold cross-validation randomly divides the sample data into N parts, selects one of them as the test sample set and the remaining N−1 parts as the training sample set; the cross-validation is repeated N times so that each sub-sample is verified once, and the average of the N results is the final estimate. The advantage of this protocol is that randomly generated sub-samples can be repeatedly selected for training and verification, which tests the robustness of a face recognition algorithm against random samples; the average recognition rate and average recognition time obtained in this way have greater statistical significance [12].

The MIT-CBCL face database consists of 10 people with 200 images collected from each, 2000 images in total. The sample images contain a large range of continuous pose changes, lighting changes, and expression changes, where the pose changes include horizontal pose changes and pitch angle changes. Figure 7 shows example pose change images from the MIT-CBCL face database.
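The N-fold protocol described above can be sketched as follows (here N = 10); the toy features and the 1-nearest-neighbour stand-in classifier are illustrative assumptions, not the RIM-ISM pipeline:

```python
import numpy as np

# Sketch of N-fold cross-validation: shuffle, split into N parts, and let
# each part serve once as the test set while the other N-1 parts train.

rng = np.random.default_rng(2)

def one_nn_accuracy(train_x, train_y, test_x, test_y):
    """Hypothetical stand-in classifier: 1-NN with Euclidean distance."""
    dists = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    pred = train_y[dists.argmin(axis=1)]
    return float((pred == test_y).mean())

# Toy data: 10 subjects, 20 samples each, clustered around subject centers
labels = np.repeat(np.arange(10), 20)
feats = rng.normal(size=(200, 8)) + 5.0 * rng.normal(size=(10, 8))[labels]

n_folds = 10
order = rng.permutation(len(labels))
folds = np.array_split(order, n_folds)

accs = []
for k in range(n_folds):
    test_idx = folds[k]
    train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
    accs.append(one_nn_accuracy(feats[train_idx], labels[train_idx],
                                feats[test_idx], labels[test_idx]))

print(f"mean accuracy over {n_folds} folds: {np.mean(accs):.3f}")
```

Averaging over the N runs is what yields the single average recognition rate and average recognition time reported in Table 1.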
Fig. 4 Images of FERET database pose subset

The MIT-CBCL database was randomly divided into 10 parts, each containing one pose change image of each person; thus each group was composed of 10 images, and each image corresponded to a different person. A 10-fold cross-validation test was conducted: in each run, 9 groups (90 images in total) were used as training samples and 1 group (10 images in total) was reserved as test samples. The experimental data are shown in Table 1, which compares the efficiency of the RIM-ISM method with the DeepFace, SIFT, and Gabor methods.

4 Results and discussion
The data in Figs. 5 and 6 show that when the samples' characteristic dimensions are relatively low, the recognition rate of the RIM-ISM algorithm is close to that of the DeepFace algorithm. As the characteristic dimension increases, the RIM-ISM algorithm achieves higher accuracy than the DeepFace, SIFT, and Gabor methods. In the ±45° pose change experiment, the recognition rate reached 97.1%, and in the ±90° pose change experiment it reached 79.3%, both higher than the other three methods under the same conditions. The recognition rates in Fig. 5 are significantly higher than those in Fig. 6, indicating that face recognition becomes more difficult as the pose deflection angle increases. When the pose of the test samples changed from ±45° to ±90°, the extraction of key feature points became insufficient, and the 45 pairs of feature points were reduced to 19 pairs, causing calculation deviations in the three deflection angles during face pose deflection estimation.
When ISM was then used to determine the similarity between cross-pose face images, the accuracy dropped considerably.

As can be seen from the data in Table 1, for random multi-pose face samples, the RIM-ISM method proposed in this paper achieved an average recognition rate of 97.52% in the 10-fold cross-validation test, higher than the DeepFace, SIFT, and Gabor methods. In this experiment, the average recognition time for a single test sample was 36.28 ms, far lower than that of the other three methods.

Fig. 5 Average recognition rate for the ±45° poses of the FERET database

Fig. 6 Average recognition rate for the ±90° poses of the FERET database

Fig. 7 Pose change face images of the MIT-CBCL database

There were two main reasons for the high efficiency of the RIM-ISM method:

(1) Through constant iterations, a series of descent directions and scales along those directions is learned, and the initial shape is regressed to a position close or equal to that of the real shape, so the objective function converges to its minimum value very quickly. This avoids solving the Jacobian and Hessian matrices at recognition time, improves the running speed of the algorithm, and reduces its complexity.

(2) Through the interactive subspace method, the eigenvalue problem of a high-dimensional matrix is transformed into the eigenvalue problem of a low-dimensional matrix, which greatly reduces the computation and improves recognition efficiency.

Table 1 10-fold cross-validation comparison on the MIT-CBCL database

Method         Average recognition rate (%)   Average recognition time
Gabor [11]     92.13                          85.71 ms
SIFT [9]       95.42                          102.63 ms
DeepFace [1]   96.25                          68.42 ms
RIM-ISM        97.52                          36.28 ms

All rates were measured on 115 × 115 images.

To sum up, the RIM-ISM algorithm shows strong robustness and high recognition efficiency when dealing with random cross-pose samples.

5 Conclusion
Aiming at the problem of pose change in face recognition technology, this paper proposed a cross-pose face recognition method based on the regression iterative method and the interactive subspace method. Through the regression iteration, the target function converges rapidly, and 45 pairs of important characteristic positions such as the eyes, mouth, and nose of the target face sample were extracted. The pose correction was then carried out by solving for the pose deflection angle of the cross-pose face image.
Finally, the similarity of cross-pose face images was measured by using the interactive subspace method. The cross-pose recognition experiments on the FERET database and the MIT-CBCL database indicated that the RIM-ISM method proposed in this paper had higher recognition accuracy and simplified the computational complexity. Meanwhile, it showed strong robustness in dealing with face samples with large-angle pose changes.

Abbreviations
ISM: Interactive subspace method; MVP: Multi-view perceptron; PAMs: Pose-aware CNN models; RIM: Regression iterative method; SIFT: Scale-invariant feature transform; SPAE: Stacked progressive auto-encoders

Acknowledgements
This research received support from the Science and Technology Department of Henan Province.

Funding
The authors acknowledge the National Natural Science Foundation of China-Henan Joint Fund (Grant: U1804162) and the Henan Province Science and Technology Project (Grant: 192102310180).

Availability of data and materials
Not applicable.

Authors' contributions
LKF is the main writer of this paper. He completed the derivation of the RIM-ISM cross-pose face recognition algorithm, completed the simulations on the FERET database and the MIT-CBCL database, and analyzed the experimental results. HQZ provided important suggestions on face pose estimation and the experimental methods. Both authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 19 February 2019. Accepted: 10 April 2019

References
1. Y. Taigman, M. Yang, M. Ranzato, in Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference. DeepFace: closing the gap to human-level performance in face verification (Columbus, 2014), pp. 1701–1708
2. I. Masi, S. Rawls, G. Medioni, in Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference. Pose-aware face recognition in the wild (Las Vegas, 2016), pp. 4838–4846
3. Z. Zhu, P. Luo, X. Wang, in The 27th International Conference on Neural Information Processing Systems. Multi-view perceptron: a deep model for learning face identity and view representations (Cambridge, 2014), pp. 217–225
4. M. Kan, S. Shan, H. Chang, in Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference. Stacked progressive auto-encoders (SPAE) for face recognition across poses (Columbus, 2014), pp. 1883–1890
5. J. Yim, H. Jung, B.I. Yoo, in Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference. Rotating your face using multi-task deep neural network (Boston, 2015), pp. 676–684
6. Y. Sun, X. Wang, X. Tang, in Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference. Deep learning face representation from predicting 10,000 classes (2014), pp. 1891–1898
7. F. Schroff, D. Kalenichenko, J. Philbin, in Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference. FaceNet: a unified embedding for face recognition and clustering (Boston, 2015), pp. 815–823
8. O.M. Parkhi, A. Vedaldi, A. Zisserman, in British Machine Vision Conference. Deep face recognition (2015), pp. 41.1–41.12
9. L. Wu, Z. Peng, Y. Hou, et al., Complete pose binary SIFT for face recognition with pose variation. Chin. J. Sci. Instrum. 36(4), 736–742 (2015)
10. C. Li, W. Wei, J. Li, et al., A cloud-based monitoring system via face recognition using Gabor and CS-LBP features. J. Supercomput. 73(4), 1532–1546 (2017)
11. B.S. Oh, K.A. Toh, A. Teoh, et al., An analytic Gabor feedforward network for single-sample and pose-invariant face recognition. IEEE Trans. Image Process. 27(6), 2791–2805 (2018)
12. R.S. Babatunde, S.O. Olabiyisi, E.O. Omidiora, Assessing the performance of random partitioning and K-fold cross validation methods of evaluation of a face recognition system. Adv. Image Video Process. 3(6), 18–26 (2015)

