Overview
NetKit-SRL, or NetKit for short (not to be confused with the NetKit sourceforge project), is an open-source Network Learning toolkit for statistical relational learning. It is written in Java 1.5 and was designed with a plug-and-play architecture so that the different components of the relational learning process can be mixed and matched. It integrates seamlessly with the Weka machine learning toolkit, making it possible to use any of Weka's classifiers in the context of relational learning.
The NetKit architecture is designed primarily to support statistical relational learning and inference on relational data. It represents relational data as a graph and performs collective inference over that graph to predict attribute values.
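To make the idea of collective inference over a graph concrete, here is a minimal sketch in plain Java. It does not use NetKit's own classes or API: the class name `CollectiveInferenceSketch`, the method `infer`, and the adjacency-array graph encoding are all hypothetical, chosen only for illustration. The update rule is a simple neighbor-averaging scheme (each unlabeled node repeatedly takes the mean of its neighbors' current estimates), which is one common way to do collective inference and not necessarily what NetKit does internally.

```java
import java.util.Arrays;

// Illustrative only: not part of NetKit. All names here are hypothetical.
public class CollectiveInferenceSketch {

    /**
     * Iteratively estimates P(label = positive) for every node.
     *
     * @param adj    adj[i] lists the neighbors of node i (undirected graph)
     * @param known  1.0 or 0.0 for labeled nodes, NaN for unlabeled nodes
     * @param prior  starting estimate for unlabeled nodes
     * @param rounds number of synchronous update rounds
     */
    public static double[] infer(int[][] adj, double[] known, double prior, int rounds) {
        int n = adj.length;
        double[] p = new double[n];
        for (int i = 0; i < n; i++) {
            p[i] = Double.isNaN(known[i]) ? prior : known[i];
        }
        for (int r = 0; r < rounds; r++) {
            // Synchronous update: every node reads the previous round's estimates.
            double[] next = Arrays.copyOf(p, n);
            for (int i = 0; i < n; i++) {
                if (!Double.isNaN(known[i]) || adj[i].length == 0) {
                    continue; // labeled nodes stay fixed; isolated nodes keep their estimate
                }
                double sum = 0.0;
                for (int j : adj[i]) {
                    sum += p[j];
                }
                next[i] = sum / adj[i].length; // mean of the neighbors' estimates
            }
            p = next;
        }
        return p;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2 - 3; node 0 is labeled positive, node 3 negative.
        int[][] adj = { {1}, {0, 2}, {1, 3}, {2} };
        double[] known = { 1.0, Double.NaN, Double.NaN, 0.0 };
        System.out.println(Arrays.toString(infer(adj, known, 0.5, 50)));
    }
}
```

On this four-node path, the estimates for the two unlabeled nodes approach 2/3 and 1/3: each node's prediction depends on its neighbors' predictions, not only on its own attributes, which is the defining feature of collective inference.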
NetKit is also built to efficiently represent more complex relational data, such as multiple entities in directed and undirected graphs, multi-modal graphs, graphs with parallel edges, and hypergraphs. It provides a mechanism for quickly computing various graph statistics on large graphs. This facilitates the creation of analytic tools for complex data sets that can examine the relations between entities. The current distribution of NetKit includes implementations of a number of algorithms from graph theory, data mining, and social network analysis, such as statistical analysis and the calculation of network distances, flows, and importance measures (various centrality metrics).
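As a rough illustration of the kind of graph statistics mentioned above, the following sketch computes hop distances with breadth-first search and derives a closeness centrality score from them. It is written in plain Java and is not NetKit code: the class `GraphStatsSketch`, its methods, and the adjacency-array encoding are hypothetical, and the closeness definition used here (number of reachable nodes divided by the sum of distances to them) is one of several conventions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Illustrative only: not part of NetKit. All names here are hypothetical.
public class GraphStatsSketch {

    /** Hop distances from source via breadth-first search; unreachable nodes get -1. */
    public static int[] distancesFrom(int[][] adj, int source) {
        int[] dist = new int[adj.length];
        Arrays.fill(dist, -1);
        dist[source] = 0;
        Deque<Integer> queue = new ArrayDeque<Integer>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adj[u]) {
                if (dist[v] < 0) { // first visit gives the shortest hop count
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    /** Closeness: reachable nodes divided by the sum of distances to them; 0 if none. */
    public static double closeness(int[][] adj, int node) {
        int reachable = 0;
        long total = 0;
        for (int d : distancesFrom(adj, node)) {
            if (d > 0) {
                reachable++;
                total += d;
            }
        }
        return total == 0 ? 0.0 : (double) reachable / total;
    }

    public static void main(String[] args) {
        // Star graph: node 0 is the hub, nodes 1..3 are leaves.
        int[][] adj = { {1, 2, 3}, {0}, {0}, {0} };
        System.out.println("hub:  " + closeness(adj, 0));
        System.out.println("leaf: " + closeness(adj, 1));
    }
}
```

In the star graph the hub is one hop from every other node, so it scores higher than any leaf, matching the intuition that centrality metrics rank nodes by importance within the network.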
As an open-source library, NetKit provides a powerful plug-and-play framework for applying (relational) machine learning algorithms to graph/network data. In addition, it provides a lightweight memory representation of graphs and relational data, making it easy to build network analysis and graph mining algorithms. The hope is that NetKit will make it easier for those who analyze and use machine learning algorithms with relational data to make use of one another's development efforts.
Announcements
- 2 July 2013: NetKit-SRL 1.4.0 released. This update adds more graph statistics to NetKit (alpha centrality and PageRank) and fixes some bugs in L2 and cosine distance computations.
- 8 May 2012: NetKit-SRL 1.3.0 released. This update enables NetKit to learn on multi-relational domains.
- 22 August 2010: NetKit-SRL 1.2.1p1 released. This fixes a problem with the 'EdgeTransform' functionality, where edgefiles were not read in.
- 5 August 2010: NetKit-SRL 1.2.1 released. This includes a fix to saving NetKit graphs and new capabilities for saving Pajek graphs (through the GraphStat module).
- 2 August 2010: NetKit-SRL 1.2.0 released. This includes updates to read and write Pajek files, compute graph statistics, transform edges, and save collective inference iterations to a Pajek project file to visualize collective inference as an animation, as well as the final predictions.
- 4 May 2010: NetKit-SRL 1.1.0 released. This includes updates to run active learning problems as described in the KDD-2009 paper.
- 21 April 2009: NetKit-SRL 1.0.4 released. This has a bug fix to correctly read some command-line parameters.
- 22 August 2008: NetKit-SRL 1.0.3 released. This has a critical bug fix in the aggregation code.
- 1 August 2008: NetKit-SRL 1.0.2 released.