Researchers at Purdue University Propose GTX: A Transactional Graph Data System for HTAP Workloads

Researchers from Purdue University have introduced GTX to address the challenge of handling large-scale graphs with high throughput read-write transactions while maintaining competitive graph analytics. Managing dynamic graphs efficiently is crucial for various applications like fraud detection, recommendation systems, and graph neural network training. Real-world graphs often exhibit temporal localities and hotspots, which existing transactional graph systems struggle to address. The research aims to create a transactional graph data system capable of efficiently managing dynamic graphs with high update arrival rates, temporal localities, and hotspots while supporting concurrent graph analytics.

Current transactional graph systems often use coarse-grained concurrency control mechanisms that are not optimized for temporal localities and hotspots. These systems may suffer from performance degradation under concurrent workloads involving frequent updates. In contrast, GTX is a latch-free, write-optimized transactional graph data system. GTX leverages atomic operations to eliminate latches, employs delta-based multiversion storage, and implements a hybrid transaction commit protocol.
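
To make the latch-free idea concrete, here is a minimal C++ sketch of appending a delta to a version chain with an atomic compare-and-swap instead of a latch. It is an illustration under stated assumptions, not GTX's actual code: the names `Delta`, `DeltaOp`, and `DeltaChain` and the linked-list layout are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical delta record: one edge operation in a version chain.
enum class DeltaOp { Insert, Delete };

struct Delta {
    DeltaOp op;
    std::uint64_t dst;        // destination vertex id
    std::uint64_t timestamp;  // timestamp supplied by the caller
    Delta* prev;              // next-older delta in the chain
};

// Assumed layout: a singly linked chain whose head is swung atomically.
struct DeltaChain {
    std::atomic<Delta*> head{nullptr};

    // Prepend a delta without taking a latch. If another writer moved the
    // head first, compare_exchange_weak reloads old_head and we retry.
    void push(Delta* d) {
        Delta* old_head = head.load(std::memory_order_acquire);
        do {
            d->prev = old_head;
        } while (!head.compare_exchange_weak(old_head, d,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire));
    }

    // Count deltas from newest to oldest.
    std::size_t length() const {
        std::size_t n = 0;
        for (const Delta* d = head.load(std::memory_order_acquire);
             d != nullptr; d = d->prev) {
            ++n;
        }
        return n;
    }

    // The chain owns its deltas in this sketch and frees them on destruction.
    ~DeltaChain() {
        Delta* d = head.load(std::memory_order_acquire);
        while (d != nullptr) {
            Delta* older = d->prev;
            delete d;
            d = older;
        }
    }
};
```

Writers that collide on the same chain simply retry the swap, so no thread ever blocks while holding a latch. A real system would also need safe memory reclamation for concurrent readers, which this sketch leaves out.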

GTX also incorporates a delta-chain index to support efficient edge lookups and manage concurrency control at the delta-chain level. Unlike existing systems, it is designed to adapt to temporal localities and hotspots in graph updates while maintaining high throughput read-write transactions and competitive graph analytics performance.
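
One way to picture a delta-chain index is to split each vertex's edge deltas into several chains and route every edge to a chain by its destination vertex, so that a lookup scans one short chain rather than the whole adjacency list. The C++ sketch below assumes a simple modulo hash and vector-backed chains; `VertexEdgeStore`, `EdgeDelta`, and `chain_of` are hypothetical names, and the article does not specify how GTX actually assigns edges to chains.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical edge delta: the newest delta for a destination wins.
struct EdgeDelta {
    std::uint64_t dst;
    bool deleted;
    double weight;
};

// Assumed per-vertex store: edges are partitioned into chain_count chains.
// chain_count must be greater than zero.
class VertexEdgeStore {
public:
    explicit VertexEdgeStore(std::size_t chain_count) : chains_(chain_count) {}

    // Assumed routing rule: destination id modulo the number of chains.
    std::size_t chain_of(std::uint64_t dst) const {
        return static_cast<std::size_t>(dst % chains_.size());
    }

    // Append a delta to the chain that owns its destination.
    void append(const EdgeDelta& d) { chains_[chain_of(d.dst)].push_back(d); }

    // Scan only the owning chain, newest delta first. Returns true and sets
    // weight_out if the latest delta for dst is a live edge.
    bool lookup(std::uint64_t dst, double& weight_out) const {
        const std::vector<EdgeDelta>& chain = chains_[chain_of(dst)];
        for (auto it = chain.rbegin(); it != chain.rend(); ++it) {
            if (it->dst == dst) {
                if (it->deleted) return false;
                weight_out = it->weight;
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::vector<EdgeDelta>> chains_;
};
```

Because every delta for a given edge lands in the same chain, the chain is also a natural unit for conflict detection: two writers conflict only if their edges map to the same chain, which is the intuition behind concurrency control at the delta-chain level.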

GTX’s architecture revolves around a latch-free adjacency list-based graph store and a transaction manager with a concurrency control protocol. It employs a multi-version delta store in which each delta captures a vertex or edge operation, allowing efficient access and updates. GTX lets concurrent transactions and analytics run side by side by applying concurrency control at the delta-chain level and committing transactions with a hybrid group commit protocol, which increases overall throughput. Additionally, GTX uses a delta-chain index for efficient edge lookups and supports adaptive concurrency control based on workload history. The system is prototyped as a graph library and evaluated on both real-world and synthetic datasets. The experiments show that GTX can handle real-world power-law graphs with temporal localities and hotspots while sustaining a throughput of millions of transactions per second and competitive graph analytics performance.
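
The multi-version idea can be sketched as a visibility check: each delta records when it became valid and when it was superseded, and a reader with a fixed snapshot timestamp sees only the deltas valid at that instant. The C++ example below is a generic snapshot-read sketch under that assumption; the field names `created_at` and `invalidated_at` and the helper `neighbors_at` are hypothetical and are not taken from GTX.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel meaning "this version has not been superseded or deleted".
constexpr std::uint64_t kNotInvalidated =
    std::numeric_limits<std::uint64_t>::max();

// Hypothetical versioned edge delta with a validity interval.
struct VersionedEdge {
    std::uint64_t dst;
    std::uint64_t created_at;      // commit timestamp of the writer
    std::uint64_t invalidated_at;  // timestamp at which it stopped being valid
};

// A delta is visible to a reader if the reader's snapshot timestamp falls
// inside the half-open interval [created_at, invalidated_at).
inline bool visible(const VersionedEdge& e, std::uint64_t read_ts) {
    return e.created_at <= read_ts && read_ts < e.invalidated_at;
}

// Collect the neighbors a reader at read_ts would see.
inline std::vector<std::uint64_t> neighbors_at(
    const std::vector<VersionedEdge>& deltas, std::uint64_t read_ts) {
    std::vector<std::uint64_t> out;
    for (const VersionedEdge& e : deltas) {
        if (visible(e, read_ts)) out.push_back(e.dst);
    }
    return out;
}
```

An analytics query that fixes its snapshot timestamp at start keeps seeing a consistent graph even while writers append newer deltas, which is how multi-versioning lets long-running analytics coexist with updates.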

In conclusion, the researchers address the challenge of efficiently managing dynamic graphs with high arrival rates of updates, temporal localities, and hotspots. By introducing GTX, a latch-free write-optimized transactional graph data system, the researchers provide a solution that outperforms existing systems regarding transaction throughput and robustness across various workloads. GTX’s ability to adapt to temporal localities and hotspots while maintaining competitive graph analytics performance makes it a promising tool for applications requiring efficient graph management and analysis.


Check out the Paper. All credit for this research goes to the researchers of this project.
