weka.core
Class ContingencyTables

java.lang.Object
  extended by weka.core.ContingencyTables

public class ContingencyTables
extends java.lang.Object

Class implementing some statistical routines for contingency tables.

Version:
$Revision: 1.1 $
Author:
Eibe Frank

Constructor Summary
ContingencyTables()
           
 
Method Summary
static double chiSquared(double[][] matrix, boolean yates)
          Returns chi-squared probability for a given matrix.
static double chiVal(double[][] matrix, boolean useYates)
          Computes chi-squared statistic for a contingency table.
static boolean cochransCriterion(double[][] matrix)
          Tests if Cochran's criterion is fulfilled for the given contingency table.
static double CramersV(double[][] matrix)
          Computes Cramer's V for a contingency table.
static double entropy(double[] array)
          Computes the entropy of the given array.
static double entropyConditionedOnColumns(double[][] matrix)
          Computes conditional entropy of the rows given the columns.
static double entropyConditionedOnRows(double[][] matrix)
          Computes conditional entropy of the columns given the rows.
static double entropyConditionedOnRows(double[][] train, double[][] test, double numClasses)
          Computes conditional entropy of the columns given the rows of the test matrix with respect to the train matrix.
static double entropyOverColumns(double[][] matrix)
          Computes the columns' entropy for the given contingency table.
static double entropyOverRows(double[][] matrix)
          Computes the rows' entropy for the given contingency table.
static double gainRatio(double[][] matrix)
          Computes gain ratio for contingency table (split on rows).
static double log2MultipleHypergeometric(double[][] matrix)
          Returns negative base 2 logarithm of multiple hypergeometric probability for a contingency table.
static void main(java.lang.String[] ops)
          Main method for testing this class.
static double[][] reduceMatrix(double[][] matrix)
          Reduces a matrix by deleting all zero rows and columns.
static double symmetricalUncertainty(double[][] matrix)
          Calculates the symmetrical uncertainty for base 2.
static double tauVal(double[][] matrix)
          Computes Goodman and Kruskal's tau-value for a contingency table.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ContingencyTables

public ContingencyTables()
Method Detail

chiSquared

public static double chiSquared(double[][] matrix,
                                boolean yates)
Returns chi-squared probability for a given matrix.

Parameters:
matrix - the contingency table
yates - whether Yates' correction should be applied
Returns:
the chi-squared probability

chiVal

public static double chiVal(double[][] matrix,
                            boolean useYates)
Computes chi-squared statistic for a contingency table.

Parameters:
matrix - the contingency table
useYates - whether Yates' correction should be applied
Returns:
the value of the chi-squared statistic
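
For orientation, the textbook form of the statistic can be sketched as follows. This is a minimal illustration and is not taken from Weka's source: the class name ChiSquaredSketch and the method name chiSquaredStatistic are hypothetical, and skipping cells whose expected count is zero is an assumption.

```java
// Illustrative sketch only; not Weka's implementation.
final class ChiSquaredSketch {

    /** Pearson chi-squared statistic for a rectangular table of counts. */
    static double chiSquaredStatistic(double[][] table, boolean useYates) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        if (n <= 0) {
            return 0; // assumption: an empty table yields 0
        }
        double chi = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Expected count under independence of rows and columns.
                double expected = rowTotals[i] * colTotals[j] / n;
                if (expected <= 0) {
                    continue; // assumption: empty rows/columns contribute nothing
                }
                double diff = Math.abs(table[i][j] - expected);
                if (useYates) {
                    // Yates' continuity correction: shrink |O - E| by 0.5, floored at 0.
                    diff = Math.max(0.0, diff - 0.5);
                }
                chi += diff * diff / expected;
            }
        }
        return chi;
    }

    public static void main(String[] args) {
        double[][] table = {{10, 20}, {30, 40}};
        System.out.println(chiSquaredStatistic(table, false));
        System.out.println(chiSquaredStatistic(table, true));
    }
}
```

With Weka on the classpath, the documented call would be ContingencyTables.chiVal(table, true); its handling of edge cases may differ from this sketch.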

cochransCriterion

public static boolean cochransCriterion(double[][] matrix)
Tests if Cochran's criterion is fulfilled for the given contingency table. Rows and columns with all zeros are not considered relevant.

Parameters:
matrix - the contingency table to be tested
Returns:
true if the contingency table satisfies Cochran's criterion, false otherwise
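
The page does not spell out the criterion itself. A minimal sketch under one commonly quoted formulation (an assumption here): no expected count below 1, and at most 20% of the expected counts below 5, with all-zero rows and columns ignored. The class and method names are hypothetical, and the exact rule Weka applies may differ.

```java
// Illustrative sketch only; the thresholds below are an assumed formulation.
final class CochranSketch {

    static boolean satisfiesCochran(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        int relevantCells = 0;
        int smallCells = 0;
        for (int i = 0; i < rows; i++) {
            if (rowTotals[i] <= 0) {
                continue; // all-zero row: not relevant
            }
            for (int j = 0; j < cols; j++) {
                if (colTotals[j] <= 0) {
                    continue; // all-zero column: not relevant
                }
                double expected = rowTotals[i] * colTotals[j] / n;
                if (expected < 1) {
                    return false; // no expected count may fall below 1
                }
                if (expected < 5) {
                    smallCells++;
                }
                relevantCells++;
            }
        }
        // At most 20% of the relevant cells may have an expected count below 5.
        return smallCells <= 0.2 * relevantCells;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesCochran(new double[][] {{10, 20}, {30, 40}}));
        System.out.println(satisfiesCochran(new double[][] {{1, 1}, {1, 1}}));
    }
}
```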

CramersV

public static double CramersV(double[][] matrix)
Computes Cramer's V for a contingency table.

Parameters:
matrix - the contingency table
Returns:
Cramer's V
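
As a rough guide, Cramer's V is usually defined as sqrt(chi2 / (n * min(r - 1, c - 1))), where chi2 is the uncorrected chi-squared statistic, n the total count, and r and c the numbers of rows and columns. The sketch below follows that definition; the names are hypothetical, and returning 0 for degenerate tables is an assumption.

```java
// Illustrative sketch only; not Weka's implementation.
final class CramersVSketch {

    static double cramersV(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        int minDim = Math.min(rows, cols) - 1;
        if (minDim <= 0 || n <= 0) {
            return 0; // assumption: degenerate tables yield 0
        }
        // Uncorrected Pearson chi-squared statistic.
        double chi = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double expected = rowTotals[i] * colTotals[j] / n;
                if (expected > 0) {
                    double diff = table[i][j] - expected;
                    chi += diff * diff / expected;
                }
            }
        }
        return Math.sqrt(chi / (n * minDim));
    }

    public static void main(String[] args) {
        System.out.println(cramersV(new double[][] {{10, 0}, {0, 10}}));
        System.out.println(cramersV(new double[][] {{10, 20}, {30, 40}}));
    }
}
```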

entropy

public static double entropy(double[] array)
Computes the entropy of the given array.

Parameters:
array - the array
Returns:
the entropy
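
The page does not say whether the array holds counts or probabilities, or which logarithm base is used. A minimal sketch, assuming non-negative counts and base 2 (both assumptions), with hypothetical names:

```java
// Illustrative sketch only; assumes the array holds counts and uses base 2.
final class EntropySketch {

    /** Shannon entropy in bits of a vector of non-negative counts. */
    static double entropy(double[] counts) {
        double total = 0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0) {
            return 0; // assumption: an empty distribution has entropy 0
        }
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total; // normalize the count to a probability
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(entropy(new double[] {1, 1}));
        System.out.println(entropy(new double[] {4, 0}));
    }
}
```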

entropyConditionedOnColumns

public static double entropyConditionedOnColumns(double[][] matrix)
Computes conditional entropy of the rows given the columns.

Parameters:
matrix - the contingency table
Returns:
the conditional entropy of the rows given the columns

entropyConditionedOnRows

public static double entropyConditionedOnRows(double[][] matrix)
Computes conditional entropy of the columns given the rows.

Parameters:
matrix - the contingency table
Returns:
the conditional entropy of the columns given the rows
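
The two conditional entropies differ only in which margin is conditioned on. A minimal sketch, assuming base 2 and a non-empty rectangular table of counts; the names are hypothetical and this is not Weka's code. It uses H(columns | rows) = -(1/n) * sum over cells of n_ij * log2(n_ij / n_i.), and obtains the other direction by transposing.

```java
// Illustrative sketch only; assumes counts, base 2, and a non-empty rectangular table.
final class ConditionalEntropySketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /** H(columns | rows) in bits. */
    static double conditionedOnRows(double[][] table) {
        double n = 0;
        double h = 0;
        for (double[] row : table) {
            double rowTotal = 0;
            for (double v : row) {
                rowTotal += v;
            }
            n += rowTotal;
            for (double v : row) {
                if (v > 0) {
                    h -= v * log2(v / rowTotal);
                }
            }
        }
        return n > 0 ? h / n : 0;
    }

    /** H(rows | columns) in bits, computed on the transposed table. */
    static double conditionedOnColumns(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[][] transposed = new double[cols][rows];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                transposed[j][i] = table[i][j];
            }
        }
        return conditionedOnRows(transposed);
    }

    public static void main(String[] args) {
        double[][] table = {{1, 1}, {2, 0}};
        System.out.println(conditionedOnRows(table));
        System.out.println(conditionedOnColumns(table));
    }
}
```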

entropyConditionedOnRows

public static double entropyConditionedOnRows(double[][] train,
                                              double[][] test,
                                              double numClasses)
Computes conditional entropy of the columns given the rows of the test matrix with respect to the train matrix. Uses a Laplace prior. Does NOT normalize the entropy.

Parameters:
train - the train matrix
test - the test matrix
numClasses - the number of symbols for the Laplace prior
Returns:
the entropy

entropyOverRows

public static double entropyOverRows(double[][] matrix)
Computes the rows' entropy for the given contingency table.

Parameters:
matrix - the contingency table
Returns:
the rows' entropy

entropyOverColumns

public static double entropyOverColumns(double[][] matrix)
Computes the columns' entropy for the given contingency table.

Parameters:
matrix - the contingency table
Returns:
the columns' entropy
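
The rows' and columns' entropies can be read as the entropies of the two marginal distributions of the table. A minimal sketch under that reading, assuming counts and base 2; the names are hypothetical.

```java
// Illustrative sketch only; assumes counts and base 2.
final class MarginalEntropySketch {

    /** Shannon entropy in bits of a vector of non-negative counts. */
    static double entropyOfCounts(double[] counts) {
        double total = 0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0) {
            return 0;
        }
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    /** Entropy of the row totals. */
    static double overRows(double[][] table) {
        double[] totals = new double[table.length];
        for (int i = 0; i < table.length; i++) {
            for (double v : table[i]) {
                totals[i] += v;
            }
        }
        return entropyOfCounts(totals);
    }

    /** Entropy of the column totals. */
    static double overColumns(double[][] table) {
        double[] totals = new double[table[0].length];
        for (double[] row : table) {
            for (int j = 0; j < row.length; j++) {
                totals[j] += row[j];
            }
        }
        return entropyOfCounts(totals);
    }

    public static void main(String[] args) {
        double[][] table = {{1, 1}, {2, 0}};
        System.out.println(overRows(table));
        System.out.println(overColumns(table));
    }
}
```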

gainRatio

public static double gainRatio(double[][] matrix)
Computes gain ratio for contingency table (split on rows). Returns Double.MAX_VALUE if the split entropy is 0.

Parameters:
matrix - the contingency table
Returns:
the gain ratio
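
Gain ratio is conventionally the information gain divided by the entropy of the split. A minimal sketch for a split on rows, assuming gain = H(columns) - H(columns | rows) and split entropy = H(rows), all in base 2. The names and the zero tolerance are hypothetical; the Double.MAX_VALUE case follows the description above.

```java
// Illustrative sketch only; not Weka's implementation.
final class GainRatioSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /** Shannon entropy in bits of a vector of non-negative counts. */
    static double entropyOfCounts(double[] counts) {
        double total = 0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0) {
            return 0;
        }
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                h -= (c / total) * log2(c / total);
            }
        }
        return h;
    }

    static double gainRatio(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        double splitEntropy = entropyOfCounts(rowTotals);
        if (splitEntropy <= 1e-12) { // tolerance is an assumption
            return Double.MAX_VALUE;
        }
        // n * H(columns | rows) = -sum n_ij * log2(n_ij / n_i.)
        double conditional = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (table[i][j] > 0) {
                    conditional -= table[i][j] * log2(table[i][j] / rowTotals[i]);
                }
            }
        }
        double gain = entropyOfCounts(colTotals) - conditional / n;
        return gain / splitEntropy;
    }

    public static void main(String[] args) {
        System.out.println(gainRatio(new double[][] {{2, 0}, {0, 2}}));
        System.out.println(gainRatio(new double[][] {{4, 4}}));
    }
}
```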

log2MultipleHypergeometric

public static double log2MultipleHypergeometric(double[][] matrix)
Returns negative base 2 logarithm of multiple hypergeometric probability for a contingency table.

Parameters:
matrix - the contingency table
Returns:
the negative base 2 logarithm of the multiple hypergeometric probability of the contingency table
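
For a table with fixed margins, the multiple hypergeometric probability is (product of row totals' factorials * product of column totals' factorials) / (n! * product of cell factorials). A minimal sketch that returns the negative base 2 logarithm of that value, assuming integer-valued counts (an assumption; the documented method accepts doubles). The names are hypothetical.

```java
// Illustrative sketch only; assumes every entry is a non-negative integer count.
final class HypergeometricSketch {

    /** log2(k!) computed by summing logarithms; adequate for small counts. */
    static double log2Factorial(long k) {
        double sum = 0;
        for (long i = 2; i <= k; i++) {
            sum += Math.log(i);
        }
        return sum / Math.log(2);
    }

    static double negLog2Probability(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0;
        double log2P = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
                log2P -= log2Factorial(Math.round(table[i][j])); // cell factorials
            }
        }
        for (double r : rowTotals) {
            log2P += log2Factorial(Math.round(r));
        }
        for (double c : colTotals) {
            log2P += log2Factorial(Math.round(c));
        }
        log2P -= log2Factorial(Math.round(n));
        return -log2P;
    }

    public static void main(String[] args) {
        System.out.println(negLog2Probability(new double[][] {{1, 0}, {0, 1}}));
        System.out.println(negLog2Probability(new double[][] {{1, 1}, {1, 1}}));
    }
}
```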

reduceMatrix

public static double[][] reduceMatrix(double[][] matrix)
Reduces a matrix by deleting all zero rows and columns.

Parameters:
matrix - the matrix to be reduced
Returns:
the matrix with all zero rows and columns deleted
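
The reduction can be sketched in two passes: mark the rows and columns that contain at least one non-zero entry, then copy the surviving cells. This is a minimal illustration with hypothetical names, not Weka's code; it assumes a rectangular input and leaves the input unchanged.

```java
// Illustrative sketch only; assumes a rectangular input matrix.
final class ReduceMatrixSketch {

    static double[][] dropZeroRowsAndColumns(double[][] matrix) {
        int rows = matrix.length;
        int cols = rows == 0 ? 0 : matrix[0].length;
        boolean[] keepRow = new boolean[rows];
        boolean[] keepCol = new boolean[cols];
        // Pass 1: a row or column survives if it holds any non-zero entry.
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (matrix[i][j] != 0) {
                    keepRow[i] = true;
                    keepCol[j] = true;
                }
            }
        }
        int newRows = 0;
        for (boolean keep : keepRow) {
            if (keep) {
                newRows++;
            }
        }
        int newCols = 0;
        for (boolean keep : keepCol) {
            if (keep) {
                newCols++;
            }
        }
        // Pass 2: copy the surviving cells into a fresh matrix.
        double[][] reduced = new double[newRows][newCols];
        int r = 0;
        for (int i = 0; i < rows; i++) {
            if (!keepRow[i]) {
                continue;
            }
            int c = 0;
            for (int j = 0; j < cols; j++) {
                if (keepCol[j]) {
                    reduced[r][c++] = matrix[i][j];
                }
            }
            r++;
        }
        return reduced;
    }

    public static void main(String[] args) {
        double[][] matrix = {{0, 0, 0}, {1, 0, 2}, {0, 0, 0}, {3, 0, 4}};
        System.out.println(java.util.Arrays.deepToString(dropZeroRowsAndColumns(matrix)));
    }
}
```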

symmetricalUncertainty

public static double symmetricalUncertainty(double[][] matrix)
Calculates the symmetrical uncertainty for base 2.

Parameters:
matrix - the contingency table
Returns:
the calculated symmetrical uncertainty
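
Symmetrical uncertainty is commonly defined as 2 * (H(rows) + H(columns) - H(rows, columns)) / (H(rows) + H(columns)), which ranges from 0 (independence) to 1 (perfect association). A minimal sketch of that definition in base 2; the names are hypothetical, and returning 0 when both marginal entropies vanish is an assumption.

```java
// Illustrative sketch only; not Weka's implementation.
final class SymmetricalUncertaintySketch {

    /** Shannon entropy in bits of a vector of non-negative counts. */
    static double entropyOfCounts(double[] counts) {
        double total = 0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0) {
            return 0;
        }
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    static double symmetricalUncertainty(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double[] cells = new double[rows * cols]; // flattened table for the joint entropy
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                cells[i * cols + j] = table[i][j];
            }
        }
        double hRows = entropyOfCounts(rowTotals);
        double hCols = entropyOfCounts(colTotals);
        double hJoint = entropyOfCounts(cells);
        double denominator = hRows + hCols;
        if (denominator <= 1e-12) {
            return 0; // assumption: no uncertainty to share
        }
        return 2 * (hRows + hCols - hJoint) / denominator;
    }

    public static void main(String[] args) {
        System.out.println(symmetricalUncertainty(new double[][] {{2, 0}, {0, 2}}));
        System.out.println(symmetricalUncertainty(new double[][] {{1, 1}, {1, 1}}));
    }
}
```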

tauVal

public static double tauVal(double[][] matrix)
Computes Goodman and Kruskal's tau-value for a contingency table.

Parameters:
matrix - the contingency table
Returns:
Goodman and Kruskal's tau-value
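
The page does not state which variable is predicted from which. A minimal sketch of the textbook asymmetric form, assuming the columns are predicted from the rows: tau = (sum_ij n_ij^2 / n_i. - sum_j n_.j^2 / n) / (n - sum_j n_.j^2 / n). The names are hypothetical, and returning 0 when the denominator vanishes is an assumption.

```java
// Illustrative sketch only; the prediction direction is an assumption.
final class GoodmanKruskalTauSketch {

    static double tauColumnsGivenRows(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double n = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowTotals[i] += table[i][j];
                colTotals[j] += table[i][j];
                n += table[i][j];
            }
        }
        if (n <= 0) {
            return 0;
        }
        // sum_ij n_ij^2 / n_i. (rows with a zero total are skipped)
        double within = 0;
        for (int i = 0; i < rows; i++) {
            if (rowTotals[i] > 0) {
                for (int j = 0; j < cols; j++) {
                    within += table[i][j] * table[i][j] / rowTotals[i];
                }
            }
        }
        // sum_j n_.j^2 / n
        double marginal = 0;
        for (int j = 0; j < cols; j++) {
            marginal += colTotals[j] * colTotals[j] / n;
        }
        double denominator = n - marginal;
        if (denominator <= 1e-12) {
            return 0; // assumption: all mass in one column, nothing to predict
        }
        return (within - marginal) / denominator;
    }

    public static void main(String[] args) {
        System.out.println(tauColumnsGivenRows(new double[][] {{2, 0}, {0, 2}}));
        System.out.println(tauColumnsGivenRows(new double[][] {{1, 1}, {2, 0}}));
    }
}
```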

main

public static void main(java.lang.String[] ops)
Main method for testing this class.