Overview

NetKit-SRL, or NetKit for short (not to be confused with the NetKit SourceForge project), is an open-source Network Learning toolkit for statistical relational learning. It is written in Java 1.5 and has a plug-and-play architecture that lets users mix and match the components of the relational learning process. It integrates with the Weka machine learning toolkit, so any of Weka's classifiers can be used in a relational learning setting.

The NetKit architecture is designed primarily to support statistical relational learning and inference on relational data. It represents relational data as a graph and performs collective inference over that graph to predict the values of attributes.
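
To make the idea of collective inference concrete, here is a minimal sketch. It is not NetKit code and does not use NetKit's API: the class and method names are hypothetical, and the update rule (each unlabeled node repeatedly takes the average of its neighbors' current estimates) is just one simple scheme chosen for illustration. It assumes a binary class, an undirected graph, and a prior of 0.5 for unlabeled nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: these class and method names are hypothetical, not NetKit's API.
class CollectiveInferenceSketch {

    // Builds adjacency lists for an undirected graph with nodes 0..n-1.
    static List<List<Integer>> undirected(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<List<Integer>>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<Integer>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // known[i] is 1.0 or 0.0 for labeled nodes and NaN for unlabeled ones.
    // Returns an estimate of P(class = 1) for every node.
    static double[] infer(List<List<Integer>> adj, double[] known, int iterations) {
        int n = known.length;
        double[] p = new double[n];
        for (int i = 0; i < n; i++) {
            // Assumed prior of 0.5 for nodes without a label.
            p[i] = Double.isNaN(known[i]) ? 0.5 : known[i];
        }
        for (int it = 0; it < iterations; it++) {
            // Synchronous update: every new estimate is computed from the previous round.
            double[] next = p.clone();
            for (int i = 0; i < n; i++) {
                boolean labeled = !Double.isNaN(known[i]);
                if (labeled || adj.get(i).isEmpty()) {
                    continue; // labeled nodes stay fixed; isolated nodes keep the prior
                }
                double sum = 0.0;
                for (int j : adj.get(i)) {
                    sum += p[j];
                }
                next[i] = sum / adj.get(i).size();
            }
            p = next;
        }
        return p;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2 - 3; node 0 is labeled 1, node 3 is labeled 0.
        List<List<Integer>> adj = undirected(4, new int[][] {{0, 1}, {1, 2}, {2, 3}});
        double[] known = {1.0, Double.NaN, Double.NaN, 0.0};
        System.out.println(Arrays.toString(infer(adj, known, 50)));
    }
}
```

On this four-node path, the estimates for the two unlabeled nodes approach 2/3 and 1/3: each is pulled toward the label of the nearer labeled node, but the predictions depend on one another rather than being made independently.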

NetKit is also built to efficiently represent more complex relational data, such as multiple entities in directed and undirected graphs, multi-modal graphs, graphs with parallel edges, and hypergraphs. It provides a mechanism for quickly computing various graph statistics on large graphs. This facilitates the creation of analytic tools for complex data sets that can examine the relations between entities. The current distribution of NetKit includes implementations of a number of algorithms from graph theory, data mining, and social network analysis, including statistical analysis and the calculation of network distances, flows, and importance measures (various centrality metrics).
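
As a rough illustration of the kind of graph statistics mentioned above, the following sketch computes node degrees and hop distances on a small undirected graph that contains a parallel edge. It is a generic textbook version, not NetKit's implementation: the class name, method names, and adjacency-list storage are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: these class and method names are hypothetical, not NetKit's API.
class GraphStatsSketch {

    // Builds adjacency lists for an undirected graph with nodes 0..n-1.
    // A parallel edge is simply listed more than once.
    static List<List<Integer>> undirected(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<List<Integer>>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<Integer>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Degree of each node, counting every parallel edge separately.
    static int[] degrees(List<List<Integer>> adj) {
        int[] deg = new int[adj.size()];
        for (int i = 0; i < deg.length; i++) {
            deg[i] = adj.get(i).size();
        }
        return deg;
    }

    // Breadth-first search: number of hops from source to each node, -1 if unreachable.
    static int[] hopDistances(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        ArrayDeque<Integer> queue = new ArrayDeque<Integer>();
        dist[source] = 0;
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.poll();
            for (int v : adj.get(u)) {
                if (dist[v] < 0) { // first visit gives the shortest hop count
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // Two parallel edges between 0 and 1, a path on to 3, and an isolated node 4.
        List<List<Integer>> adj = undirected(5, new int[][] {{0, 1}, {0, 1}, {1, 2}, {2, 3}});
        System.out.println("degrees:   " + Arrays.toString(degrees(adj)));
        System.out.println("distances: " + Arrays.toString(hopDistances(adj, 0)));
    }
}
```

Degree is the simplest centrality measure, and breadth-first search gives unweighted network distances; both run in time linear in the number of nodes and edges, which is what makes such statistics practical on large graphs.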

As an open-source library, NetKit provides a powerful plug-and-play framework for applying (relational) machine learning algorithms to graph/network data. In addition, it provides a lightweight in-memory representation of graphs and relational data, making it easy to build network analysis and graph mining algorithms. The hope is that NetKit will make it easier for those who analyze relational data and apply machine learning algorithms to it to make use of one another's development efforts.


Announcements