netkit.classifiers.relational
Class NetworkClassifierImp

java.lang.Object
  extended by netkit.classifiers.ClassifierImp
      extended by netkit.classifiers.relational.NetworkClassifierImp
All Implemented Interfaces:
Classifier, NetworkClassifier, Configurable
Direct Known Subclasses:
ClassDistribRelNeighbor, Harmonic, NetworkMetaClassifier, NetworkOnlyBayes, NetworkWeka, ProbRelationalNeighbor, WeightedVoteRelationalNeighbor

public abstract class NetworkClassifierImp
extends ClassifierImp
implements NetworkClassifier

Core implementation of the NetworkClassifier (and Classifier) interface. All methods that are generic have been implemented (although they can be overridden to be customized as necessary). It extends the core implementation of the basic Classifier interface (ClassifierImp) and implements only the methods that are specifically relational in nature.

The only four methods that any subclass needs to implement are doEstimate(Node, double[]), getName(), getShortName(), and getDescription().

In addition, the classifier ought to override the public void induceModel(Graph graph, DataSplit split) method, which does the actual learning.

You may also want to override the getDefaultConfiguration() and configure(Configuration config) methods if the classifier can be configured in any special way (e.g., if it can take some parameters).

Finally, if the classifier needs to do some book-keeping at every run of the collective inference method, then you should override initializeRun(Estimate currPrior, Node[] unknowns).
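
As a rough illustration of this contract, the following self-contained sketch mimics the pattern with stand-in types. It assumes the four required methods are doEstimate, getName, getShortName and getDescription, as the See Also entries suggest. SketchClassifierBase and UniformSketch are hypothetical names that do not exist in NetKit; a real subclass would extend NetworkClassifierImp and receive a netkit.graph.Node instead of the int node id used here.

```java
// Hypothetical stand-in for NetworkClassifierImp; not NetKit code.
abstract class SketchClassifierBase {
    // The four members a concrete classifier is assumed to supply.
    protected abstract boolean doEstimate(int nodeId, double[] result);
    public abstract String getName();
    public abstract String getShortName();
    public abstract String getDescription();

    // Generic entry point: final, delegates to doEstimate.
    public final boolean estimate(int nodeId, double[] result) {
        return doEstimate(nodeId, result);
    }
}

// Hypothetical subclass that assigns equal probability to every class.
class UniformSketch extends SketchClassifierBase {
    @Override
    protected boolean doEstimate(int nodeId, double[] result) {
        if (result.length == 0) {
            return false; // nothing to fill in, so abstain
        }
        java.util.Arrays.fill(result, 1.0 / result.length);
        return true;
    }

    @Override public String getName() { return "UniformSketch"; }
    @Override public String getShortName() { return "uniform"; }
    @Override public String getDescription() { return "Assigns equal probability to every class."; }
}
```

The generic estimate method stays final in the base class, so the subclass only supplies the estimation logic and its descriptive names.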

Properties:

Author:
Sofus A. Macskassy (sofmac@gmail.com)
See Also:
doEstimate(netkit.graph.Node, double[]), Classifier.getName(), Classifier.getShortName(), Classifier.getDescription(), induceModel(netkit.graph.Graph, netkit.classifiers.DataSplit), getDefaultConfiguration(), configure(netkit.util.Configuration), initializeRun(netkit.classifiers.Estimate, netkit.graph.Node[])

Nested Class Summary
protected static class NetworkClassifierImp.Aggregation
          The possible ways the relational classifier can handle aggregation.
 
Field Summary
protected static AggregatorFactory aggFactory
          The aggregator factory, which is used to get the aggregators needed for the classifier.
protected  NetworkClassifierImp.Aggregation aggregation
          What kind of aggregation should the classifier do.
protected  java.util.List<Aggregator> aggregators
          This list contains all the aggregators for an input graph.
protected  java.lang.String[] aggTypes
          This array contains the list of aggregators that this classifier will use.
protected  java.util.List<Aggregator> dynamicAggregators
          This list contains the 'dynamic' aggregators...
protected  Estimate prior
          This keeps track of the priors for the unknown nodes.
 
Fields inherited from class netkit.classifiers.ClassifierImp
attribute, classPrior, clsIdx, graph, keyIndex, logger, nodeType, right, tmpVector, useIntrinsic, vectorClsIdx
 
Constructor Summary
NetworkClassifierImp()
           
 
Method Summary
 int classify(Node node, Estimate prior, boolean updatePrior)
          Classify a given node into one of the given classes.
 void configure(Configuration config)
          Configure the classifier.
protected abstract  boolean doEstimate(Node node, double[] result)
          This is the final estimation method that will be called and the only estimation method that sub-classes should implement.
 boolean estimate(Node node, double[] result)
          Estimate the probabilities that a given node belongs to each class.
 double[] estimate(Node node, Estimate prior, boolean updatePrior)
          Estimate the probabilities that a given node belongs to each class. It may use the class estimations of other nodes and may update the prior of the given node.
 boolean estimate(Node node, Estimate prior, double[] result, boolean updatePrior)
          Estimate the probabilities that a given node belongs to each class. It may use the class estimations of other nodes and may update the prior of the given node.
 boolean estimate(Node node, Estimate prior, Estimate result, boolean updatePrior)
          Estimate the probabilities that a given node belongs to each class. It may use the class estimations of other nodes and may update the prior of the given node.
protected  void generateAggregators()
          This generates all the aggregator instances needed to create all the aggregated values for all the attributes as directed by the configuration.
protected  java.lang.String[] getAttributeNames()
           
 Configuration getDefaultConfiguration()
          Default configuration for relational learners.
protected  boolean includeClassAttribute()
          Method to tell this object whether to include the class attribute when creating the internal instance representation for relational learning.
 void induceModel(Graph graph, DataSplit split)
          This method induces a new prediction model.
 void initializeRun(Estimate currPrior, Node[] unknowns)
          This is called prior to predicting labels for the unknown nodes in the graph, in case the classifier needs to initialize itself.
protected  void makeVector(Node node, double[] vector)
           
 
Methods inherited from class netkit.classifiers.ClassifierImp
addListener, classify, classify, clearListeners, estimate, estimate, getLogger, getNofifyListeners, notifyListeners, notifyListeners, removeListener, reset, setNofityListeners
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface netkit.classifiers.Classifier
addListener, classify, classify, clearListeners, estimate, estimate, getDescription, getLogger, getName, getNofifyListeners, getShortName, notifyListeners, notifyListeners, removeListener, reset, setNofityListeners
 

Field Detail

aggFactory

protected static AggregatorFactory aggFactory
The aggregator factory, which is used to get the aggregators needed for the classifier.


prior

protected Estimate prior
This keeps track of the priors for the unknown nodes.


aggregation

protected NetworkClassifierImp.Aggregation aggregation
What kind of aggregation should the classifier do. Default is to aggregate on everything.


aggTypes

protected java.lang.String[] aggTypes
This array contains the list of aggregators that this classifier will use. These are taken from the classifier configuration.


aggregators

protected java.util.List<Aggregator> aggregators
This list contains all the aggregators for an input graph. It will contain all the aggregators that can be instantiated for all the attributes that should be aggregated.


dynamicAggregators

protected java.util.List<Aggregator> dynamicAggregators
This list contains the 'dynamic' aggregators... those whose values will change if the class estimates change. These are the only ones that need to be updated across iterations of the collective inference method.

Constructor Detail

NetworkClassifierImp

public NetworkClassifierImp()
Method Detail

doEstimate

protected abstract boolean doEstimate(Node node,
                                      double[] result)
This is the final estimation method that will be called and the only estimation method that sub-classes should implement.

Parameters:
node - The node whose class label needs to be estimated
result - The array to be filled with estimations for the class label
Returns:
true if the classifier estimated the class label. false if the classifier abstains.
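
The return convention can be sketched as follows. This is a minimal illustration, not NetKit code: NeighborVoteSketch is a hypothetical helper, and the int array of neighbor labels stands in for whatever a real implementation would read from the netkit.graph.Node argument.

```java
// Hypothetical helper; a real doEstimate receives a netkit.graph.Node.
class NeighborVoteSketch {
    // neighborLabels holds the class index of each labeled neighbor.
    static boolean doEstimate(int[] neighborLabels, double[] result) {
        if (neighborLabels.length == 0) {
            return false; // no evidence: abstain and leave result untouched
        }
        java.util.Arrays.fill(result, 0.0);
        for (int label : neighborLabels) {
            result[label] += 1.0; // count one vote per neighbor
        }
        for (int c = 0; c < result.length; c++) {
            result[c] /= neighborLabels.length; // turn counts into proportions
        }
        return true;
    }
}
```

Returning false without touching result keeps the abstain case distinguishable from an estimate of all zeros.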

includeClassAttribute

protected boolean includeClassAttribute()
Method to tell this object whether to include the class attribute when creating the internal instance representation for relational learning. If the configuration says not to use intrinsic variables (the non-relational variables), then the class attribute is also removed. However, classifiers that use WEKA underneath need to have the class attribute used. This method tells this object to create the internal data representation that includes the class attribute whether the intrinsics are on or off.

Returns:
This always returns false. Subclasses that always require a class attribute to be included should override to return true.

getDefaultConfiguration

public Configuration getDefaultConfiguration()
Default configuration for relational learners. This sets aggregation to be only for the class attribute, to use intrinsic variables and to use the mode, ratio and mean aggregators.

These are also the configuration objects that are set in this instance. If a classifier needs more, then it should override this (remembering to call super.getDefaultConfiguration() if needed) to set other default configuration options.

Specified by:
getDefaultConfiguration in interface Configurable
Overrides:
getDefaultConfiguration in class ClassifierImp
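
The override pattern described above might look like the sketch below. Everything here is an assumption made for illustration: a java.util.Map stands in for netkit.util.Configuration, and the keys and values ("aggregation", "useintrinsic", "aggregators", "smoothing") are invented rather than taken from NetKit.

```java
// Hypothetical stand-in for the relational base class; keys are invented.
class BaseLearnerSketch {
    public java.util.Map<String, String> getDefaultConfiguration() {
        java.util.Map<String, String> conf = new java.util.HashMap<>();
        conf.put("aggregation", "ClassOnly");
        conf.put("useintrinsic", "true");
        conf.put("aggregators", "mode,ratio,mean");
        return conf;
    }
}

// Hypothetical classifier that needs one extra default option.
class TunedLearnerSketch extends BaseLearnerSketch {
    @Override
    public java.util.Map<String, String> getDefaultConfiguration() {
        // Start from the inherited defaults, then add this classifier's own option.
        java.util.Map<String, String> conf = super.getDefaultConfiguration();
        conf.put("smoothing", "0.5"); // hypothetical parameter
        return conf;
    }
}
```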

configure

public void configure(Configuration config)
Configure the classifier. This sets the type of aggregation that is done, whether to use intrinsic variables, and which aggregator functions to use.

Specified by:
configure in interface Configurable
Overrides:
configure in class ClassifierImp
Parameters:
config - The Configuration object used to configure this classifier.
See Also:
aggregation

generateAggregators

protected void generateAggregators()
This generates all the aggregator instances needed to create all the aggregated values for all the attributes as directed by the configuration. It populates the aggregators List with these aggregators, which will then be used to convert an instance into a 'learning instance' by adding these aggregated values.

See Also:
aggregators, dynamicAggregators

induceModel

public void induceModel(Graph graph,
                        DataSplit split)
This method induces a new prediction model. Any subclass should remember to call super.induceModel(graph,split) to ensure that all internal variables have been set.

This method sets up crucial information needed for internal use. Of general interest, it resets the aggregators list to the new list of aggregators (by calling generateAggregators()) and sets certain internal variables such as tmpVector and vectorClsIdx.

Specified by:
induceModel in interface Classifier
Overrides:
induceModel in class ClassifierImp
Parameters:
graph - The graph over which to induce a model
split - The datasplit which informs us which nodes have known class labels and which do not
See Also:
generateAggregators(), aggregators, ClassifierImp.induceModel(netkit.graph.Graph, netkit.classifiers.DataSplit), ClassifierImp.tmpVector, ClassifierImp.vectorClsIdx
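
The reason for calling the superclass method first can be illustrated with stand-in types. InduceBaseSketch and InduceChildSketch are hypothetical, and the single int parameter replaces the real Graph and DataSplit arguments; only the ordering of the calls is the point.

```java
// Hypothetical stand-ins; the real method takes a Graph and a DataSplit.
class InduceBaseSketch {
    protected double[] tmpVector;

    public void induceModel(int numAttributes) {
        // The base class prepares internal state that subclasses rely on.
        tmpVector = new double[numAttributes];
    }
}

class InduceChildSketch extends InduceBaseSketch {
    protected double[] weights;

    @Override
    public void induceModel(int numAttributes) {
        super.induceModel(numAttributes); // first, so tmpVector is ready
        weights = new double[tmpVector.length];
        java.util.Arrays.fill(weights, 1.0); // placeholder for actual learning
    }
}
```

If the call to super.induceModel were omitted in this sketch, tmpVector would still be null when the subclass reads its length.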

getAttributeNames

protected java.lang.String[] getAttributeNames()
Overrides:
getAttributeNames in class ClassifierImp

makeVector

protected void makeVector(Node node,
                          double[] vector)
Overrides:
makeVector in class ClassifierImp

initializeRun

public void initializeRun(Estimate currPrior,
                          Node[] unknowns)
This is called prior to predicting labels for the unknown nodes in the graph, in case the classifier needs to initialize itself. This is called at the beginning of every iteration of a collective inference run. This method does nothing and should be overridden if a particular classifier does need to do something specific.

Specified by:
initializeRun in interface NetworkClassifier
Parameters:
currPrior - The current 'priors' or estimates of the unknown labels
unknowns - The list of nodes which are to be predicted in the upcoming run.

classify

public final int classify(Node node,
                          Estimate prior,
                          boolean updatePrior)
Classify a given node into one of the given classes. It may use the class estimations of other nodes and may update the prior of the given node.

This method is final. It calls the estimate(Node,Estimate,double[],boolean) method with false as the boolean and handles updatePrior itself, updating the prior object if necessary. It also notifies any listeners that the label of this node has been predicted.

Specified by:
classify in interface NetworkClassifier
Parameters:
node - The node to classify.
prior - The current class estimates of all initially unknown nodes.
updatePrior - Whether the classifier should update the prior of the node that it classifies. If true, then the prior object is updated with the predicted classification.
Returns:
The index of the class that this node is classified as. It returns -1 if it abstains.
See Also:
estimate(netkit.graph.Node, netkit.classifiers.Estimate, double[], boolean), ClassifierImp.notifyListeners(netkit.graph.Node, int)
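
The return convention (a class index, or -1 on abstaining) can be sketched as below. ClassifySketch is a hypothetical helper, and the sketch assumes that the predicted class is the one with the highest estimate, which this page does not state explicitly.

```java
// Hypothetical helper showing the return convention only.
class ClassifySketch {
    // estimated is false when the underlying estimator abstained.
    static int classify(boolean estimated, double[] result) {
        if (!estimated || result.length == 0) {
            return -1; // abstain
        }
        int best = 0;
        for (int c = 1; c < result.length; c++) {
            if (result[c] > result[best]) {
                best = c; // keep the index of the largest estimate
            }
        }
        return best;
    }
}
```

Callers therefore need to check for -1 before using the returned value as an array index.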

estimate

public final boolean estimate(Node node,
                              Estimate prior,
                              double[] result,
                              boolean updatePrior)
Estimate the probabilities that a given node belongs to each class. It may use the class estimations of other nodes and may update the prior of the given node.

This method is final and sets the prior object before it calls estimate(Node,double[]) to do the actual estimate. It takes care of the prior if necessary (by updating the prior object).

Specified by:
estimate in interface NetworkClassifier
Parameters:
node - The node to estimate.
prior - The current class estimates of all initially unknown nodes.
result - The array that is filled in with class estimates
updatePrior - Whether the classifier should update the prior of the node that it classifies. If true, then the prior object is updated with the new estimates.
Returns:
true if the classifier inferred class probabilities. false if the classifier abstained.
See Also:
estimate(netkit.graph.Node, double[]), prior

estimate

public final double[] estimate(Node node,
                               Estimate prior,
                               boolean updatePrior)
Estimate the probabilities that a given node belongs to each class. It may use the class estimations of other nodes and may update the prior of the given node.

This method is final and calls the estimate(Node,Estimate,double[],boolean) method using a temporary double array, which is returned if the classifier does not abstain. If it abstains (the called estimate method returns false), then this method returns null.

Specified by:
estimate in interface NetworkClassifier
Parameters:
node - The node to estimate.
prior - The current class estimates of all initially unknown nodes.
updatePrior - Whether the classifier should update the prior of the node that it classifies. If true, then the prior object is updated with the new estimates.
Returns:
An array that is filled in with class estimates. This is null if the classifier abstains.
See Also:
estimate(netkit.graph.Node, netkit.classifiers.Estimate, double[], boolean)
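
The null-on-abstain behavior described above can be sketched with a temporary array. NullOnAbstainSketch is hypothetical: an int node id replaces the Node argument, the prior and updatePrior parameters are left out, and the inner estimator abstains for negative ids purely for demonstration.

```java
// Hypothetical sketch of the wrapper pattern; not NetKit code.
class NullOnAbstainSketch {
    // Inner estimator: fills result and returns true, or abstains with false.
    static boolean estimate(int nodeId, double[] result) {
        if (nodeId < 0) {
            return false; // abstain (demonstration rule only)
        }
        java.util.Arrays.fill(result, 1.0 / result.length);
        return true;
    }

    // Wrapper: returns the temporary array, or null if the estimator abstained.
    static double[] estimate(int nodeId, int numClasses) {
        double[] tmp = new double[numClasses];
        return estimate(nodeId, tmp) ? tmp : null;
    }
}
```

Callers of this overload need a null check, whereas callers of the boolean overloads check the return value instead.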

estimate

public final boolean estimate(Node node,
                              Estimate prior,
                              Estimate result,
                              boolean updatePrior)
Estimate the probabilities that a given node belongs to each class. It may use the class estimations of other nodes and may update the prior of the given node.

This method is final and calls the estimate(Node,Estimate,double[],boolean) method.

Specified by:
estimate in interface NetworkClassifier
Parameters:
node - The node to estimate.
prior - The current class estimates of all initially unknown nodes.
result - The Estimate object that is updated with class estimates.
updatePrior - Whether the classifier should update the prior of the node that it classifies. If true, then the prior object is updated with the new estimates.
Returns:
true if the classifier inferred class probabilities. false if the classifier abstained.
See Also:
estimate(netkit.graph.Node, netkit.classifiers.Estimate, double[], boolean)

estimate

public final boolean estimate(Node node,
                              double[] result)
Estimate the probabilities that a given node belongs to each class.

This method is final and calls the doEstimate(Node,double[]) method. It also notifies any listeners that the label for this node has been estimated.

Specified by:
estimate in interface Classifier
Parameters:
node - The node to estimate.
result - The array that is filled in with class estimates.
Returns:
true if the classifier inferred class probabilities. false if the classifier abstained.
See Also:
doEstimate(netkit.graph.Node, double[]), ClassifierImp.notifyListeners(netkit.graph.Node, double[])