LinkedIn has open-sourced the Generalized Deep Mixed Model (GDMix), a framework that makes training AI personalization models more efficient. Unlike LinkedIn's previous release, Photon ML, it supports deep learning models.
GDMix trains two kinds of models used in search personalization and recommender systems: fixed-effect and random-effect models. These models are usually challenging to train at scale, but GDMix accelerates the process by breaking a large model into a global model (the fixed effect) and many small models (the random effects), then solving them separately. This approach allows models to be trained quickly on commodity hardware, without specialized processors, large amounts of memory, or high-end networking equipment.
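The article does not show GDMix's actual training loop, but the decomposition it describes can be sketched with a plain-Python toy (logistic regression standing in for GDMix's TensorFlow implementation; all names and data here are illustrative): a shared global weight is fit with the per-member scores held fixed, then each small per-member model is fit independently with the global score held fixed, and the two steps alternate.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: (member_id, feature, label). The per-member intercepts play the
# role of random effects; the shared weight plays the role of the fixed effect.
true_w = 1.5
member_offsets = {0: -1.0, 1: 0.0, 2: 1.0}
data = []
for m, off in member_offsets.items():
    for _ in range(200):
        x = random.uniform(-2, 2)
        p = sigmoid(true_w * x + off)
        data.append((m, x, 1 if random.random() < p else 0))

w = 0.0                                # global (fixed-effect) weight
b = {m: 0.0 for m in member_offsets}   # per-member (random-effect) intercepts

for _ in range(50):
    # Fixed-effect step: gradient descent on the shared weight,
    # with the per-member scores held fixed.
    g = sum((sigmoid(w * x + b[m]) - y) * x for m, x, y in data) / len(data)
    w -= 0.5 * g
    # Random-effect step: each member's tiny model is solved independently,
    # which is what lets the work spread across commodity machines.
    for m in b:
        rows = [(x, y) for mm, x, y in data if mm == m]
        gb = sum(sigmoid(w * x + b[m]) - y for x, y in rows) / len(rows)
        b[m] -= 0.5 * gb

print(round(w, 2), {m: round(v, 2) for m, v in b.items()})
```

Because each random-effect model only sees its own member's rows, the small models can be trained in parallel with no coordination beyond the shared global score.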
GDMix uses TensorFlow for data reading and gradient computation, yielding a 10% to 40% training-speed improvement over Photon ML on various datasets. The framework automatically trains and evaluates models and can handle hundreds of millions of them.
DeText can be trained natively within GDMix as the global fixed-effect model. It is applied to a variety of tasks, including multi-class classification, search and recommendation ranking, and query understanding. Using deep neural networks, it leverages semantic matching to understand member intent in search and recommender systems. GDMix automatically trains and evaluates the user-specified fixed-effect model and connects it to the subsequent random-effect models. GDMix currently supports logistic regression models and deep natural-language (DeText) models, as well as arbitrary models that users design and train outside GDMix.
The launch of GDMix follows LinkedIn's release of a toolkit for measuring AI model fairness: the LinkedIn Fairness Toolkit (LiFT). LiFT is used during training to measure biases in corpora, and it evaluates notions of fairness for models by detecting differences in their performance across subgroups. According to LinkedIn, LiFT is applied internally before training to measure fairness metrics of models' training datasets.
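LiFT's actual API is not described in the article; as a minimal illustration of the kind of subgroup comparison such a toolkit performs, the sketch below (hypothetical data, not LiFT code) computes the demographic-parity gap, i.e. the difference in positive-prediction rates between two subgroups.

```python
# Hypothetical (prediction, label) pairs for two subgroups A and B.
groups = {
    "A": [(1, 1), (1, 0), (0, 0), (1, 1), (0, 1)],
    "B": [(0, 1), (0, 0), (0, 1), (1, 0), (0, 0)],
}

def positive_rate(rows):
    """Fraction of examples the model scores positive."""
    return sum(pred for pred, _ in rows) / len(rows)

rates = {g: positive_rate(rows) for g, rows in groups.items()}
# Demographic-parity gap: how far apart the subgroups' positive rates are.
gap = abs(rates["A"] - rates["B"])
print(rates, round(gap, 2))
```

A gap near zero means the model scores both subgroups positive at similar rates; larger gaps flag a potential fairness issue to investigate, which is the style of signal LiFT surfaces across many metrics.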