A recent study from a multi-institutional research team introduces CW Networks (CWNs), a message-passing mechanism that produces state-of-the-art outcomes across a variety of molecular datasets while delivering superior expressivity than commonly utilized graph neural networks (GNNs).
GNNs’ expressive power stems primarily from their ability to determine whether two graphs are isomorphic or not. Recent research has revealed that GNNs have difficulty modeling long-range interactions and lack a principled approach to model higher-order structures, limiting their expressive capacity.
The study proposes CW Networks by scholars from the University of Cambridge, Imperial College London & Twitter, UCLA, MPI-MIS, and SJTU & UNSW (CWNs). The new message-passing technique allows for principled modeling of higher-order signals as well as distance compression between nodes. It works with regular cell complexes and outperforms GNNs in terms of expressive power. (The paper is mentioned at the end of the article)
The graph isomorphism problem is NP-intermediate, and the Weisfeiler-Lehman (WL) test can be used to benchmark the expressive power of today’s GNNs. The team first demonstrates how to convert graphs into higher-dimensional cell complexes; then color refines the resulting cell complexes to make isomorphism evaluation easier.
Cellular WL (CWL), which generalizes the Simplicial WL and WL tests, was chosen as the team’s color refinement scheme for cell complexes. Three phases comprise the method for a single cell:
- In a regular cell complex, all cells are given the same color at the start.
- Considering the color of a cell at each iteration, the color of the cell at the next iteration is computed by injectively mapping the multi-sets of colors belonging to adjacent cells using a constant HASH function.
- The method terminates when a stable coloring is achieved. If the color histograms of two cell complexes differ, the test is called non-isomorphic; otherwise, it is inconclusive.
The researchers also show that CWL applied to cell complexes is more potent than WL applied to starting graphs for a wide range of transformations.
The team goes on to discuss CWNs in more detail, with a focus on molecular graphs. The proposed CWNs’ cells receive two sorts of messages: the first specifies messages from atoms to bonds and bonds to rings, while the second includes messages between atoms connected by a bond and messages between bonds in the same ring. The team first demonstrates that CWNs are roughly equivalent to CWL in terms of power. CWNs can overcome the long-range interaction problems that plague standard GNNs by lowering the overall number of layers required in the presence of long-range node interaction. Furthermore, CWNs can be thought of as a (non-linear) generalization of linear diffusion operators on cell complexes, with a linear computational complexity with respect to the size of the input complex, which is substantially less than GNNs.
The team uses a cell isomorphism network (CIN) model. The model stacks CWN layers with local aggregators as in GIN (Graph isomorphism network). The proposed model was tested on eight TUDataset benchmarks from social networks (IMDB-B, IMDB-M, RDT-B), biology (PROTEINS), and chemistry (i.e., molecules – MUTAG, PTC, NCI1, and NCI109). Its performance was compared with baseline methods.
The result shows that CIN achieved the best mean accuracy on four out of eight datasets and stood second on the others. Experiments reveal that the proposed CW Networks technique may achieve state-of-the-art results on popular large-scale molecular graph datasets and other related tasks, demonstrating its strong message-passing performance on cell complexes.