# [Facemark API for OpenCV][pull_request]

Student: Laksono Kurnianggoro\
Mentor: Delia Passalacqua\
Link to commits: https://github.com/opencv/opencv_contrib/pull/1257/commits \
Link to code: https://github.com/opencv/opencv_contrib/pull/1257/files

## Introduction

Facial landmark detection is a useful algorithm with many possible applications, including expression transfer, virtual make-up, facial puppetry, face swapping, and many more.
This project aims to implement a scalable API for facial landmark detectors. It also implements two algorithms: the active appearance model (AAM) [1] and regressed local binary features (LBF) [2].

**References**\
[1] G. Tzimiropoulos and M. Pantic, "Optimization problems for fast AAM fitting in-the-wild," ICCV 2013.\
[2] S. Ren et al., "Face alignment at 3000 fps via regressing local binary features," CVPR 2014.

[![Facial landmarks detection using FacemarkLBF][preview]][vid_lbf]

[See on YouTube][vid_lbf]

## Project Details

The work in this project is summarized in this [commits list][commits]. The proposal listed a small set of functionalities to be added to the API. During the coding period, however, several new functionalities were added to make the API more reliable, as suggested by the mentors and by another student working on a similar project.
#### Functionalities that were initially proposed:
- Base class of the API (`Facemark` class) (*done*).\
  This base class provides several functions, including `read()`, `write()`, `setFaceDetector()`, `getFaces()`, `training()`, `loadModel()`, and `fit()`.
- User-defined face detector (*done*).\
  The configurable face detector is stored in the instance of a landmark detection algorithm and can be set using the `setFaceDetector()` function. This allows users to plug their own face detector into the algorithm.
- Dataset parser (*done*).\
  There are three utility functions that parse information from a dataset: `loadTrainingData()`, `loadDatasetList()`, and `loadFacePoints()`.
- Documentation (*done*) [see preview][documentation]
- Tutorials (*done*) [see preview][tutorials]
- Sample code (*done*: 3 programs are available)
- Two algorithms, AAM and LBF (*done*)

#### New functionalities that were not in the original plan but were developed during the coding period:
- Extra parameter for the fitting function.\
  Each face landmark detector might need its own parameters in the fitting process. Some parameters are fixed, while others change with the input data. A fixed parameter can simply be added as a member of `Params`. A dynamic parameter, however, has to be supplied at runtime, so the fitting function accepts an optional extra parameter passed as a void pointer (`void* extra_params`), which can hold any type of parameter.
- Extra parameters in the user-defined face detector.\
  For the same reason as the extra parameter of the fitting function, the user-defined face detector also accepts extra parameters.
- Test code.\
  Tests are useful for automatic assessment, to make sure that the implementation is compatible with various devices.
- Functionality to add the dataset one sample at a time in the training process.\
  This functionality came out of a discussion with another student working on a similar project, since the code developer suggested that the module should not depend on `imgcodecs`. Previously, this dependency was needed because the dataset loading function loaded the images inside the API. After this discussion, image loading was removed from the API and the `addTrainingSample()` function was added to the base class to solve this problem.

#### Unsubmitted functionalities
During the coding period, some improvements (not listed in the original plan) were tested but were not merged into OpenCV.
1. Trainer for the LBF algorithm.\
   The LBF algorithm uses liblinear to train its regressor. During the coding period, there was an initiative to free the LBF implementation from this dependency, since liblinear provides far more functionality than is needed. Two methods were tested for training the regressor: stochastic gradient descent (SGD) and regularized least squares (RLS) regression. In the tests, however, SGD generally failed to produce an optimal solution, because its parameters must be set properly and depend on the characteristics of the dataset. RLS, meanwhile, requires a lot of time and memory because it needs to compute a matrix inverse. This method is therefore not scalable and is impractical for large datasets.
```cpp
Mat FacemarkLBFImpl::Regressor::regressionSGD(Mat x, Mat y, int max_epoch, int batch_sz, double lambda, double eta){
    Mat pred;
    x.convertTo(x, CV_64F);
    y.convertTo(y, CV_64F);

    /*random initialization of the weights*/
    Mat w = Mat::zeros(x.cols, 1, CV_64F);
    cv::theRNG().state = cv::getTickCount();
    randn(w,0.0,1.0);

    Mat dw = Mat::zeros(x.cols, 1, CV_64F);
    std::vector<double> E;
    Mat dE;
    double gamma = 0.7; /*momentum factor*/
    int maxIdx;

    /*NOTE: the two loop headers and the reset of dE were lost from the source text;
      they are reconstructed here from the surrounding code*/
    for(int i=0;i<max_epoch;i++){
        dE = Mat::zeros(x.cols, 1, CV_64F);
        for(int j=0;j<x.rows;j+=batch_sz){
            if(j+batch_sz>=x.rows){
                maxIdx = x.rows;
            }else{
                maxIdx = j+batch_sz;
            }
            Mat x_batch = Mat(x,Range(j,maxIdx));
            Mat y_batch = Mat(y,Range(j,maxIdx));
            /*gradient evaluated at the look-ahead position (w-gamma*dw)*/
            dE = dE + x_batch.t()*(x_batch*(w-gamma*dw)-y_batch)+lambda*w;
        }//batch
        /*momentum update with a normalized gradient*/
        dw = gamma*dw + eta*dE/norm(dE);
        w = w - dw;
    }
    pred = x*w;
    /* ... */
```

- […] The difference between the overloads is the input parameters. The first version accepts two input file paths, one for the image list and the other for the ground-truth list. The second version accepts only one input, the path to a file in which each line contains the path of a training image followed by its ground-truth points. More detailed information is available in the [documentation][documentation].
- `loadFacePoints`: This function loads the facial points stored in the file referred to by the input parameter.
- `drawFacemarks`: This function draws the landmark points onto a given image.

Two annotation formats are supported by `loadTrainingData`: the standard format and the one-sample-per-line format. The first format is also supported by the `loadFacePoints` function.
An example of the standard format is shown below:
```
version: 1
n_points:  68
{
212.716603 499.771793
230.232816 566.290071
...
}
```

And here is an example of the one-sample-per-line format:
```
/home/user/ibug/image_003_1.jpg 336.820955 240.864510 334.238298 260.922709 335.266918 ...
/home/user/ibug/image_005_1.jpg 376.158428 230.845712 376.736984 254.924635 383.265403 ...
...
```

#### Example of usage
The main purpose of this API is to provide the simplest possible interface for facial landmark detection. Here is an example for each of the two main tasks, training and fitting, both of which require only a few lines of code.

Example of code for the training process:
```cpp
/*declare the facemark instance*/
Ptr<Facemark> facemark = FacemarkLBF::create();

/*load the dataset list*/
String imageFiles = "../data/images_train.txt";
String ptsFiles = "../data/points_train.txt";
std::vector<String> images_train;
std::vector<String> landmarks_train;
loadDatasetList(imageFiles,ptsFiles,images_train,landmarks_train);

/*add the training samples to the trainer
  (loop header and loading calls reconstructed; they were lost from the source text)*/
Mat image;
std::vector<Point2f> facial_points;
for(size_t i=0;i<images_train.size();i++){
    image = imread(images_train[i].c_str());
    loadFacePoints(landmarks_train[i],facial_points);
    facemark->addTrainingSample(image, facial_points);
}

/*training process*/
facemark->training();
```

Example of code for the fitting process:
```cpp
/*load a trained model*/
facemark->loadModel("../data/lbf.model");

Mat image = imread("image.jpg");

/*detect the faces*/
std::vector<Rect> faces;
facemark->getFaces(image, faces);

/*perform the fitting process*/
std::vector<std::vector<Point2f> > landmarks;
facemark->fit(image, faces, landmarks);
```

## The Facemark AAM algorithm
![UML of the FacemarkAAM][uml_aam]

The AAM algorithm is ported from the [Matlab version][aam_code_ori], which is provided by the original authors of the related paper. In the original implementation, the data are processed in double precision (64-bit), while in this Facemark API the data are processed as float (32-bit).

This algorithm works better when initialization information (rotation, translation, and scale) is provided, so it needs extra parameters in the fitting process. These initialization parameters can be obtained using a user-defined function (a default face detector with pose and scale is not provided in the API; however, an example of such a function is available in the sample code).
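As an illustration of where such initialization parameters might come from, the sketch below estimates a scale, an in-plane rotation, and a translation from two detected eye centres by aligning them with the eye centres of a reference shape. It uses only the standard library. `Point`, `Pose`, `poseFromEyes`, and `applyPose` are hypothetical names chosen for this example; they are not part of the Facemark API, and the sample code shipped with the module may compute these values differently.

```cpp
#include <cmath>

// Hypothetical helper types for this sketch (not part of the Facemark API).
struct Point { double x, y; };
struct Pose {
    double scale;       // size ratio: detected face / reference shape
    double angle;       // in-plane rotation in radians
    Point translation;  // offset applied after scaling and rotating
};

// Estimate the similarity transform that maps the reference eye centres
// (refLeft, refRight) onto the detected ones (left, right).
Pose poseFromEyes(Point refLeft, Point refRight, Point left, Point right) {
    const double rdx = refRight.x - refLeft.x, rdy = refRight.y - refLeft.y;
    const double dx = right.x - left.x, dy = right.y - left.y;

    Pose p;
    p.scale = std::hypot(dx, dy) / std::hypot(rdx, rdy);
    p.angle = std::atan2(dy, dx) - std::atan2(rdy, rdx);

    // The translation maps the transformed reference mid-point
    // onto the mid-point between the detected eyes.
    const double c = std::cos(p.angle), s = std::sin(p.angle);
    const Point refMid = { (refLeft.x + refRight.x) / 2, (refLeft.y + refRight.y) / 2 };
    const Point mid = { (left.x + right.x) / 2, (left.y + right.y) / 2 };
    p.translation.x = mid.x - p.scale * (c * refMid.x - s * refMid.y);
    p.translation.y = mid.y - p.scale * (s * refMid.x + c * refMid.y);
    return p;
}

// Apply the estimated pose to a point of the reference shape.
Point applyPose(const Pose& p, Point q) {
    const double c = std::cos(p.angle), s = std::sin(p.angle);
    return { p.scale * (c * q.x - s * q.y) + p.translation.x,
             p.scale * (s * q.x + c * q.y) + p.translation.y };
}
```

For example, with reference eyes at (0, 0) and (1, 0) and detected eyes at (10, 10) and (10, 12), `poseFromEyes` yields a scale of 2 and a rotation of 90 degrees, and `applyPose` maps the reference eyes back onto the detected ones. Values like these would then be packed into whatever structure the fitting function expects as its extra parameter.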
Here is a snippet taken from the sample code that demonstrates the use of the extra parameter in the fitting process:
```cpp
std::vector<FacemarkAAM::Config> conf;
std::vector<Rect> faces_eyes;
for(unsigned j=0;j<faces.size();j++){
    /* ... */
}
if(conf.size()>0){
    std::vector<std::vector<Point2f> > landmarks; //output
    facemark->fit(image, faces_filtered, landmarks, (void*)&conf);
    for(unsigned j=0;j<landmarks.size();j++){
        /* ... */
    }
}
```

## The Facemark LBF algorithm

```cpp
Ptr<Facemark> facemark = FacemarkLBF::create();

/*get the input image*/
Mat image = imread("image.jpg");

/*load a trained model*/
facemark->loadModel("../data/lbf.model");

/*get faces*/
std::vector<Rect> faces;
facemark->getFaces(image, faces);

/*detect the landmarks*/
std::vector<std::vector<Point2f> > landmarks;
facemark->fit(image, faces, landmarks);
```

[vid_lbf]: https://www.youtube.com/watch?v=B7WGyhl2zm8
[preview]: https://raw.githubusercontent.com/kurnianggoro/GSOC2017/master/data/preview_lbf.gif
[facemark_api]: https://raw.githubusercontent.com/kurnianggoro/GSOC2017/master/data/facemark_api.png
[uml]: http://uml.mvnsearch.org/gist/334f84a4f5c59bc50aa52d1946dc1fd9
[uml_aam]: https://raw.githubusercontent.com/kurnianggoro/GSOC2017/master/data/facemark_aam.png
[uml_lbf]: https://raw.githubusercontent.com/kurnianggoro/GSOC2017/master/data/facemark_lbf.png
[pull_request]: https://github.com/opencv/opencv_contrib/pull/1257
[codes]: https://github.com/opencv/opencv_contrib/pull/1257/files
[commits]: https://github.com/opencv/opencv_contrib/pull/1257/commits
[documentation]: http://pullrequest.opencv.org/buildbot/export/pr_contrib/1257/docs/db/dd8/classcv_1_1face_1_1Facemark.html
[tutorials]: http://pullrequest.opencv.org/buildbot/export/pr_contrib/1257/docs/d5/d47/tutorial_table_of_content_facemark.html
[aam_code_ori]: https://ibug.doc.ic.ac.uk/download/tzimiro_iccv2013_code.zip
[lbf_cod_ori]: https://github.com/luoyetx/face-alignment-at-3000fps
[lbf_model]: https://raw.githubusercontent.com/kurnianggoro/GSOC2017/master/data/lbfmodel.yaml