LinkedIn lets members add information about themselves to their profiles, such as their career history, education, and skills. LinkedIn uses AI models to extract profile entities from these inputs, a process called standardization and knowledge graph construction. The result is a knowledge graph of the entities to which a member is linked. This graph is an essential component of understanding member profiles and surfacing more relevant jobs, news items, connections, and advertisements for members on the site.
As part of this approach, the team now wants to infer “missing” profile entities that the existing knowledge graph does not capture. For example, one can deduce that a person is skilled in TensorFlow if they know machine learning and work at Google, even if their profile does not mention it.
Missing entities will always exist, for several reasons:
- Most entity extraction technologies rely heavily on textual data. If an entity isn’t referenced clearly in the text, the models will likely overlook it.
- The member may not have provided all of the relevant information. For example, a member may list only a subset of their skills on their profile rather than all of them.
Therefore, the team believes it could provide better suggestions to members across LinkedIn products if it could infer these missing entities. However, inferring them is difficult: it requires a comprehensive grasp of the member’s profile.
Current entity extraction techniques rely on text as their primary input and cannot infer entities that are not referenced directly in the text. To overcome this shortcoming, the LinkedIn team uses the entities already extracted from member inputs to infer the missing ones.
The team trained the model with self-supervision: a few attributes of a member’s profile are masked, or hidden, and the model learns to predict the masked attributes from the remaining ones.
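The masking step can be sketched as follows. This is a minimal illustration, not LinkedIn's implementation; the `mask_profile` helper, the `[MASK]` token, and the sample entities are assumptions made for the example.

```python
import random

MASK = "[MASK]"  # placeholder token, analogous to BERT's mask token (assumed)

def mask_profile(entities, mask_rate=0.3, rng=None):
    """Hide a random subset of a member's profile entities.

    Returns (masked_entities, targets), where targets maps each masked
    position back to the original entity the model must predict.
    """
    rng = rng or random.Random()
    # Mask at least one entity so every training example has a target.
    n_mask = max(1, int(len(entities) * mask_rate))
    positions = rng.sample(range(len(entities)), n_mask)
    masked = list(entities)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = MASK
    return masked, targets

# Hypothetical profile: the remaining entities are the model's input,
# the hidden ones are its prediction targets.
profile = ["Google", "Machine Learning", "Python", "Stanford"]
masked, targets = mask_profile(profile, mask_rate=0.5, rng=random.Random(0))
```

Each (masked, targets) pair is one self-supervised training example: no human labels are needed, since the member's own profile supplies the ground truth.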
Their approach formulates entity inference as a link prediction problem on a graph and tackles it with Graph Neural Networks (GNNs). A GNN learns a latent representation for each node in an input graph, where each node’s representation is an aggregation of its neighbors’ representations. Through this process, the learned representations capture the connection structure of the input graph.
Existing GNN models leave a gap in how they aggregate a node’s neighbors (the member’s entities): they rely on simple aggregation schemes such as averaging or weighted averaging. These schemes fail when intricate interconnections exist between the existing entities.
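A toy example makes the limitation concrete. Mean aggregation (sketched below with plain Python lists standing in for learned embeddings) collapses different neighbor sets to the same vector, discarding any interaction between neighbors.

```python
def mean_aggregate(neighbor_embeddings):
    """Average the neighbors' vectors -- the simple GNN aggregation
    discussed above. Pairwise interactions between neighbors are lost."""
    dim = len(neighbor_embeddings[0])
    return [sum(v[i] for v in neighbor_embeddings) / len(neighbor_embeddings)
            for i in range(dim)]

# Two different (toy) entity-embedding sets collapse to the same aggregate,
# illustrating why averaging can miss intricate interconnections.
a = mean_aggregate([[1.0, 0.0], [0.0, 1.0]])
b = mean_aggregate([[0.5, 0.5], [0.5, 0.5]])
# a == b == [0.5, 0.5]
```

Any aggregation that only sums or averages per-neighbor terms has this blind spot; capturing which entities co-occur requires modeling neighbor-to-neighbor interactions.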
To overcome this problem, the team developed Entity-BERT, a novel GNN model that uses a multi-layer bidirectional Transformer for aggregation. The Transformer updates a node’s representation given its set of existing entities by computing the interaction (attention) between every pair of entities. This process is repeated across 6 to 24 layers to capture increasingly intricate relationships between entities.
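The core of this aggregation is self-attention over the entity set. The sketch below is a deliberately simplified single-head layer with no learned projections (real Transformer layers add query/key/value weight matrices, multiple heads, and feed-forward sublayers); it only illustrates how every pair of entities influences the result, unlike plain averaging.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_attention(entity_vecs):
    """One simplified attention layer: each entity's new representation
    is an attention-weighted mix of all entities, so the weights depend
    on pairwise entity-entity interactions."""
    d = len(entity_vecs[0])
    scale = math.sqrt(d)  # standard scaled dot-product attention
    out = []
    for q in entity_vecs:
        weights = softmax([dot(q, k) / scale for k in entity_vecs])
        out.append([sum(w * v[i] for w, v in zip(weights, entity_vecs))
                    for i in range(d)])
    return out

# Toy 2-d embeddings for three profile entities; stacking such layers
# 6 to 24 times captures increasingly intricate relationships.
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = self_attention(vecs)
```

Because the attention weights are recomputed from the entity vectors themselves, the aggregate for a node changes when the *combination* of its neighbors changes, which is exactly what a fixed average cannot express.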
Multi-layer bidirectional Transformers have shown strong performance in Natural Language Processing (NLP), where the goal is to understand the interactions between words in a sentence. Experiments show that Bidirectional Encoder Representations from Transformers (BERT) has surpassed non-Transformer neural networks on a number of NLP tasks. The team expected BERT-style aggregation to likewise improve entity inference.
The LinkedIn skills recommender suggests skills that a member may have but has not listed on their profile. The team uses Entity-BERT to infer and recommend these unlisted skills. Compared to a previous technique that used a member’s existing entities with simpler aggregation, the Entity-BERT-based method led members to accept more suggestions. Furthermore, the newly added skills increased overall member engagement, such as the number of sessions.
Advertisers on LinkedIn can specify their target audience using profile attributes, and some opt into audience expansion, which broadens the audience to additional members with similar interests. The team applied Entity-BERT to expand member profile entities (companies, skills, and titles) and then used these expanded entities to expand the audience. Compared to a previous expansion strategy without Entity-BERT, audience expansion with Entity-BERT produced a statistically significant lift in Ads revenue with no negative effect on user experience metrics such as Ads click-through rate.