Researchers at Peking University Open-Source ‘CircuitNet,’ a Dataset for Machine Learning Applications in Electronic Design Automation (EDA)

Electronic design automation (EDA), sometimes referred to as electronic computer-aided design (ECAD, or simply CAD), is a class of software tools used to design electronic systems such as integrated circuits (ICs). EDA tools let designers create very-large-scale integration (VLSI) chips with billions of transistors, and given the size and complexity of current electronic systems, they are crucial for VLSI design. With the explosion of artificial intelligence (AI) algorithms, the EDA research community has recently been actively investigating "AI for IC" methodologies for designing cutting-edge chips. Numerous studies have explored machine learning-based solutions for cross-stage prediction tasks in the design flow to speed up design convergence. In 2021, Google researchers used reinforcement learning (RL) to place macros in a chip design in their work titled “A graph placement methodology for rapid chip design,” published in the journal Nature.

Such developments show that “AI for EDA” is a topic the design automation community is actively researching. The core idea of the Google approach is to treat the chip layout as a Go board and each macro as a stone. Using 10,000 internal design samples as a pre-training set, an RL agent learns to place one macro at a time. After the agent is fine-tuned for each design, it can outperform traditional EDA tools on Google’s TPU chips, achieving better power, performance, and area (PPA). However, the small size of available datasets remains the primary challenge for researchers. Because large datasets are hard to generate and few large public datasets are available, most studies can only validate their methods on small internal datasets.

Taking a step on this front, a research team from Peking University in China curated CircuitNet, the first open-source dataset for machine learning applications in fast chip design. The dataset is intended for AI applications in VLSI CAD. The collection comprises over 10,000 samples and 54 synthesized circuit netlists drawn from six open-source RISC-V designs. It offers comprehensive support for cross-stage prediction tasks such as predicting routing congestion, design rule check (DRC) violations, and IR drop.

CircuitNet has four key characteristics. The first is its scale: the dataset consists of more than 10,000 samples extracted from a variety of EDA tool runs using a commercial PDK at the 28nm technology node, and the researchers intend to add support for 14nm technology soon. The second is diversity: various settings in logic synthesis and physical design are varied to reflect the situations that can arise during the design flow. The third is support for multiple tasks: the dataset primarily targets three prediction tasks, namely congestion, DRC violations, and IR drop, and it contains features commonly used in state-of-the-art methods, which are validated by experiments. The fourth is ease of use: the features are preprocessed and converted into NumPy arrays with limited information removed, and the data can be loaded quickly and easily using Python scripts.
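Since the features are stored as NumPy arrays, loading them from Python can be quite short. The sketch below is only an illustration under stated assumptions: the folder names `feature` and `label`, the one-file-per-sample layout, the sample names, and the array shapes are all hypothetical and are not taken from the CircuitNet documentation. The helper functions `load_samples`, `stack_samples`, and `demo` are likewise invented for this example.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_samples(feature_dir, label_dir):
    """Load feature/label arrays that share a file name (assumed layout)."""
    feature_dir, label_dir = Path(feature_dir), Path(label_dir)
    samples = {}
    for feature_path in sorted(feature_dir.glob("*.npy")):
        label_path = label_dir / feature_path.name
        if not label_path.exists():
            continue  # skip samples that have no matching label file
        samples[feature_path.stem] = (np.load(feature_path), np.load(label_path))
    return samples


def stack_samples(samples):
    """Stack equally shaped samples into batch arrays, ordered by name."""
    names = sorted(samples)
    features = np.stack([samples[name][0] for name in names])
    labels = np.stack([samples[name][1] for name in names])
    return names, features, labels


def demo():
    """Write two synthetic samples to a temporary folder and load them back."""
    with tempfile.TemporaryDirectory() as root:
        feature_dir = Path(root) / "feature"  # hypothetical folder name
        label_dir = Path(root) / "label"      # hypothetical folder name
        feature_dir.mkdir()
        label_dir.mkdir()
        rng = np.random.default_rng(0)
        for name in ("design_a", "design_b"):  # hypothetical sample names
            # Synthetic stand-ins: a 4x4 grid with 3 feature channels
            # and a single-channel label map (e.g. a congestion map).
            np.save(feature_dir / f"{name}.npy", rng.random((4, 4, 3)))
            np.save(label_dir / f"{name}.npy", rng.random((4, 4, 1)))
        return stack_samples(load_samples(feature_dir, label_dir))


if __name__ == "__main__":
    names, features, labels = demo()
    print(names, features.shape, labels.shape)
```

With real data, the two directory arguments would point at wherever the preprocessed arrays were saved, and the stacked arrays could then be handed to whichever model is being trained for congestion, DRC, or IR drop prediction.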

To validate the dataset, the authors ran experiments on three prediction tasks: congestion, DRC violations, and IR drop. Each experiment applies a method from a recent study and evaluates it on CircuitNet using the same metrics as the original study. Overall, the results are consistent with the original publications, which supports the effectiveness of CircuitNet. In future work, the authors intend to increase the size and diversity of the dataset by adding samples from large-scale designs at advanced technology nodes. Comprehensive documentation of the experimental setup is also available on GitHub.

This article was written as a research summary by Marktechpost Staff based on the research paper 'CircuitNet: an open-source dataset for machine learning applications in electronic design automation (EDA)'. All credit for this research goes to the researchers on this project. Check out the paper, GitHub link, and reference article.
