Researchers From CMU and LinkedIn Open-Source the Implementation of PASS (Performance-Adaptive Sampling Strategy) For Deep Learning

Understanding the relationships between entity sets maintained in a database is critical. In this context, an entity is an object or a data component.

Entity relationships are often depicted as graphs. For instance, professional graphs show how people collaborate, whereas social graphs show how people connect with one another.

To make better use of these graphs, deep learning models called GNNs (Graph Neural Networks) are trained to interpret them. A GNN looks, for example, at a member’s connections and the connections of those connections. It then uses this neighborhood information to perform AI tasks such as search and recommendation.

However, GNNs have several limitations in terms of how they exploit a member’s neighbors. 

  1. A GNN-based approach does not scale to real-world social networks. A single member often has many connections, and using all of them is impractical. A celebrity, for example, could have hundreds of millions of connections.
  2. Not every connection is relevant to the task at hand. In job recommendation, for example, a member’s connections who work in very different industries, and who may simply be personal friends, are irrelevant to the task.

Some existing approaches sample a predetermined number of neighbors, which limits the size of the GNN’s inputs. While this addresses the first issue, these samplers do not consider which neighbors matter most to the GNN. A random sample of neighbors may yield a less accurate recommendation than a sample of relevant neighbors.
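The fixed-size sampling described above can be sketched in a few lines. This is a minimal illustration, not code from any particular GNN library: the function name `sample_neighbors`, the dictionary-based adjacency list, and the member names are all assumptions made for the example.

```python
import random


def sample_neighbors(adjacency, node, k, rng=None):
    """Return at most k neighbors of `node`, chosen uniformly at random.

    `adjacency` is a hypothetical adjacency list: {node: [neighbor, ...]}.
    """
    rng = rng or random.Random()
    neighbors = adjacency.get(node, [])
    if len(neighbors) <= k:
        # Fewer than k neighbors: keep them all.
        return list(neighbors)
    # Uniform sampling without replacement; relevance is ignored entirely.
    return rng.sample(neighbors, k)


# Toy graph with made-up members.
adjacency = {
    "alice": ["bob", "carol", "dave", "erin", "frank"],
    "bob": ["alice"],
}
sampled = sample_neighbors(adjacency, "alice", k=3, rng=random.Random(0))
print(sampled)
```

The cap `k` bounds the input size no matter how many connections a member has, but every neighbor is equally likely to be picked, which is the weakness the paragraph points out.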

To address these issues, a team from LinkedIn and CMU devised a novel GNN method called “Performance-Adaptive Sampling Strategy,” or “PASS,” which uses an AI model to select relevant neighbors, and open-sourced its implementation. The model learns to choose the neighbors that improve the predictive accuracy of the GNN. It decides whether or not to select a given neighbor by examining that neighbor’s attributes. An advantage of this method is that it works well regardless of the task for which the GNN is being used.
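To make the idea of attribute-based selection concrete, here is a toy sketch that scores each neighbor from its attributes, turns the scores into sampling probabilities, and draws neighbors accordingly. It is not the formulation from the PASS paper or repository: the scoring function, the feature vectors, and the hand-set `weights` are all hypothetical. In PASS the selection model is learned from the GNN's performance rather than fixed by hand.

```python
import math
import random


def neighbor_probabilities(target_feat, neighbor_feats, weights):
    """Softmax over a weighted similarity between target and each neighbor.

    The scoring rule and `weights` are illustrative assumptions only.
    """
    scores = [
        sum(w * a * b for w, a, b in zip(weights, target_feat, feat))
        for feat in neighbor_feats
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_by_probability(neighbors, probs, k, rng):
    """Draw up to k distinct neighbors, favoring higher probabilities."""
    pool = list(zip(neighbors, probs))
    chosen = []
    for _ in range(min(k, len(pool))):
        total = sum(p for _, p in pool)
        r = rng.random() * total
        acc = 0.0
        idx = len(pool) - 1  # fallback guards against rounding error
        for i, (_, p) in enumerate(pool):
            acc += p
            if r < acc:
                idx = i
                break
        chosen.append(pool.pop(idx)[0])
    return chosen


# Made-up attributes: [works in same industry, shares a hobby].
target = [1.0, 0.0]
names = ["coworker", "friend", "former_colleague"]
feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = [2.0, 0.5]  # hand-set here; a learned model would supply these

probs = neighbor_probabilities(target, feats, weights)
picked = sample_by_probability(names, probs, k=2, rng=random.Random(0))
print(probs, picked)
```

With these made-up features, the coworker receives the highest sampling probability and the personal friend the lowest, which mirrors the job recommendation example earlier in the article.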


They also developed a time-saving technique for training these neighbor selection models. The team tested their method on seven public benchmark graphs and two LinkedIn graphs. The results show that PASS outperformed state-of-the-art GNN algorithms by 1.3 to 10.4 percent.

The team also demonstrates that PASS can attain reliable accuracy even when the input graph has noisy edges. To show this, they added noisy edges to the benchmark graphs. In that setting, PASS outperformed the baseline algorithms by a factor of two to three.
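A robustness test of this kind can be approximated by adding random edges to a graph before training. The sketch below shows one simple way to do that. The article does not describe the team's exact noise procedure, so the helper `add_noisy_edges` and its uniform choice of endpoints are assumptions.

```python
import random


def add_noisy_edges(edges, nodes, num_noisy, rng):
    """Return `edges` plus up to `num_noisy` random edges not already present.

    Edges are treated as undirected pairs; `nodes` must be a list.
    """
    existing = {frozenset(edge) for edge in edges}
    max_new = len(nodes) * (len(nodes) - 1) // 2 - len(existing)
    target = max(0, min(num_noisy, max_new))  # never ask for more than can exist
    noisy = set()
    while len(noisy) < target:
        u, v = rng.sample(nodes, 2)  # two distinct endpoints
        candidate = frozenset((u, v))
        if candidate not in existing and candidate not in noisy:
            noisy.add(candidate)
    return list(edges) + sorted(tuple(sorted(edge)) for edge in noisy)


nodes = list(range(6))
clean_edges = [(0, 1), (1, 2), (2, 3)]
noisy_graph = add_noisy_edges(clean_edges, nodes, num_noisy=4, rng=random.Random(0))
print(noisy_graph)
```

A sampler that picks neighbors uniformly is as likely to follow an injected edge as a real one, whereas a sampler that scores neighbors has a chance to down-weight the injected ones.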

According to the team, this is the first strategy that selects neighbors to maximize a GNN’s predictive performance. They also report that, in contrast to other GNN models that use more neighbors, PASS can achieve higher prediction accuracy with fewer neighbors.

PASS introduces a new approach for picking neighbors using AI. The team plans to integrate PASS into multiple GNN applications in the future. They have open-sourced the PASS implementation to encourage researchers to develop more efficient and accurate GNN models.

Paper: https://dl.acm.org/doi/pdf/10.1145/3447548.3467284

GitHub: https://github.com/linkedin/PASS-GNN

Reference: https://engineering.linkedin.com/blog/2022/open-sourcing-PASS