KGC 2024 Session: Arachne: An Open Source Framework for Graph Analytics
Please enjoy this talk, “Arachne: An Open Source Framework for Graph Analytics,” given at the Knowledge Graph Conference 2024 by David A. Bader, PhD, Distinguished Professor of Data Science at the New Jersey Institute of Technology (NJIT).
Talk description:
A real-world challenge in data science is to develop interactive methods for quickly analyzing new datasets that are potentially of massive scale. In this talk, Bader discusses his development of knowledge graph algorithms in the context of Arkouda, an open source NumPy-like replacement for interactive data science on tens of terabytes of data.
Massive-scale analytics is an emerging field that integrates the power of high-performance computing and mathematical modeling to extract key insights and information from large-scale datasets. Productivity in massive-scale graph analytics entails quick interpretation of results through easy-to-use frameworks, while also adhering to design principles that combine high-performance computing and user-friendly simplicity.
However, data scientists often encounter challenges, especially with graph analytics, which requires the analysis of complex data from various domains, such as cybersecurity and the natural and social sciences. To address this issue, Bader introduces Arachne, an open source framework that enhances accessibility and usability in massive-scale graph analytics. Arachne offers novel algorithms and implementations of graph kernels for efficient data analysis, including connected components, breadth-first search, triangle counting, and k-truss.
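For readers unfamiliar with the kernels named above, here is a minimal single-machine sketch of three of them (breadth-first search, connected components, and triangle counting) in plain Python. It is an illustration only and is not Arachne's implementation or API; the function names and the toy edge list are assumptions made for this example.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this toy example
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_levels(adj, source):
    """Return {vertex: hop distance from source} for every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connected_components(adj):
    """Label each vertex with the first-visited vertex of its component."""
    label = {}
    for start in adj:
        if start in label:
            continue
        for v in bfs_levels(adj, start):
            label[v] = start
    return label


def count_triangles(adj):
    """Count each triangle exactly once by enforcing the ordering u < v < w."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


# Hypothetical toy graph: one triangle (0, 1, 2) with a tail to 3, plus a separate edge (4, 5).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (4, 5)]
adj = build_adjacency(edges)
levels = bfs_levels(adj, 0)
components = connected_components(adj)
triangles = count_triangles(adj)
```

A sequential version like this is easy to follow but does not scale to massive graphs, which is the problem a distributed back end is meant to address.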
The high-performance algorithms are integrated into a back-end server written in HPE/Cray’s Chapel language and can be accessed through a Python application programming interface (API). Arachne’s backend server is compatible with Linux supercomputers, is easy to set up, and can be utilized through either Python scripts or Jupyter notebooks, which makes it a desirable tool for data scientists who have access to high-performance computers.
In this talk, Bader presents an overview of the graph algorithms his research group has implemented in Arachne and, where applicable, the algorithmic innovations of each. He also discusses improvements to the graph data structure that let it store extra information such as node labels, edge relationships, and node and edge properties. Arachne is built as an extension to the open source Arkouda framework and allows graphs to be generated from Arkouda dataframes.
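To make the idea of a graph that carries labels, relationships, and properties concrete, the following is a small sketch of a property graph built from a column-oriented edge table. It uses only plain Python, with a dict of lists standing in for a dataframe; the `PropertyGraph` class, its methods, and the sample data are hypothetical and are not taken from Arachne or Arkouda.

```python
class PropertyGraph:
    """Toy property graph: parallel edge arrays plus label and property maps."""

    def __init__(self, src, dst, relationship, edge_props=None):
        self.src = list(src)                    # source vertex of each edge
        self.dst = list(dst)                    # destination vertex of each edge
        self.relationship = list(relationship)  # one relationship name per edge
        self.edge_props = edge_props or {}      # column name -> per-edge values
        self.node_labels = {}                   # vertex -> set of labels

    @classmethod
    def from_columns(cls, table, src_col, dst_col, rel_col):
        """Build a graph from a dict of equal-length lists (a stand-in for a dataframe)."""
        n = len(table[src_col])
        if any(len(col) != n for col in table.values()):
            raise ValueError("all columns must have the same length")
        # Every remaining column becomes an edge property.
        props = {
            name: list(values)
            for name, values in table.items()
            if name not in (src_col, dst_col, rel_col)
        }
        return cls(table[src_col], table[dst_col], table[rel_col], props)

    def add_node_labels(self, node, *labels):
        """Attach one or more labels to a vertex."""
        self.node_labels.setdefault(node, set()).update(labels)

    def edges_with_relationship(self, rel):
        """Return (src, dst) pairs whose relationship equals rel."""
        return [
            (s, d)
            for s, d, r in zip(self.src, self.dst, self.relationship)
            if r == rel
        ]


# Hypothetical sample data.
table = {
    "src": ["alice", "alice", "bob"],
    "dst": ["bob", "acme", "acme"],
    "rel": ["KNOWS", "WORKS_AT", "WORKS_AT"],
    "since": [2019, 2021, 2020],
}
g = PropertyGraph.from_columns(table, "src", "dst", "rel")
g.add_node_labels("alice", "Person")
g.add_node_labels("acme", "Company")
works_at = g.edges_with_relationship("WORKS_AT")
```

Keeping edges and their attributes in parallel columns mirrors how tabular data is usually stored, which is one reason building graphs directly from dataframes is convenient.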
The open source code for Arachne can be found at https://github.com/Bears-R-Us/arkouda-njit
This is joint work with Oliver Alvarado Rodriguez, Zhihui Du, Joseph Patchett, Naren Khatwani, and Fuhuan Li. Bader is supported in part by the National Science Foundation award CCF-2109988.
Technical topics covered:
- Graph Theory
- Data Ingestion
- Data Integration
- Graph Data Science
- Querying Knowledge Graphs
- Graph Databases
- Knowledge Graph Applications
***