Genetic data are now common in many domains. Typically, genetic studies try to associate genetics with a single phenotype, behavior, or diagnostic criterion. However, many of these studies include multiple behavioral variables and very large genetic data sets. The analysis of such data sets faces two particular challenges: 1) how to integrate many behavioral and genetic variables, and 2) how to do so when only a small number of variables are interpretable. To address these issues, we propose integrating partial least squares with a sparsified approach to multiple correspondence analysis.