Replicating the visual system that the brain uses to process information is an area of substantial research interest. This thesis is situated in the context of a fully automated system capable of analyzing a target's facial features when the target is near the cameras, and of tracking the target's identity when those facial features are no longer traceable. The first part of this thesis is devoted to face pose estimation procedures for use in face recognition scenarios. We propose a new label-sensitive embedding based on a sparse representation, called Sparse Label sensitive Locality Preserving Projections. In an uncontrolled environment observed by cameras from an unknown distance, person re-identification relying on conventional biometrics such as face recognition is not feasible. Instead, visual features based on the appearance of people can be exploited more reliably. In this context, we propose a new embedding scheme for single-shot person re-identification across non-overlapping cameras. Each person is described as a vector of kernel similarities to a collection of prototype person images. The robustness of the algorithm is further improved by a proposed Color Categorization procedure. In the last part of this thesis, we propose a Siamese architecture of two Convolutional Neural Networks (CNNs), with each CNN reduced to only eleven layers. This architecture allows a machine to be fed directly with raw data and to automatically discover the representations needed for classification.
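
To make the prototype-based description concrete, the following is a minimal sketch of how a person could be represented as a vector of kernel similarities to a set of prototypes. It is an illustration under stated assumptions, not the method of the thesis: the Gaussian (RBF) kernel, the `gamma` parameter, the random feature vectors standing in for appearance descriptors, and the function names `kernel_embedding` and `rank_gallery` are all hypothetical choices made for this example.

```python
import numpy as np


def kernel_embedding(features, prototypes, gamma=1.0):
    """Map each feature vector to its kernel similarities with every prototype.

    Assumption: a Gaussian (RBF) kernel is used; the source does not name
    the kernel. Returns an array of shape (n_samples, n_prototypes).
    """
    features = np.atleast_2d(features)
    # Pairwise squared Euclidean distances via broadcasting.
    diffs = features[:, None, :] - prototypes[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    return np.exp(-gamma * sq_dists)


def rank_gallery(probe_emb, gallery_embs):
    """Return gallery indices sorted from most to least similar to the probe.

    Assumption: Euclidean distance in the embedded space is used for matching.
    """
    dists = np.linalg.norm(gallery_embs - probe_emb, axis=1)
    return np.argsort(dists)


# Hypothetical data: random vectors stand in for appearance descriptors.
rng = np.random.default_rng(0)
prototypes = rng.random((5, 8))   # 5 prototype persons, 8-dim features
gallery = rng.random((3, 8))      # 3 persons seen by one camera
probe = gallery[1] + 0.01 * rng.standard_normal(8)  # noisy view from another camera

gallery_embs = kernel_embedding(gallery, prototypes)
probe_emb = kernel_embedding(probe, prototypes)[0]
ranking = rank_gallery(probe_emb, gallery_embs)
print("Gallery indices ranked by similarity to the probe:", ranking)
```

In this sketch, each person is compared with the gallery only through similarities to the shared prototypes, so the embedding dimension equals the number of prototypes rather than the dimension of the raw descriptor.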