java.lang.Object weka.filters.Filter KEAPhraseFilter
public class KEAPhraseFilter
This filter splits the text in selected string attributes into phrases. The resulting string attributes contain these phrases separated by '\n' characters. Phrases are identified according to the following definitions:
A phrase is a sequence of words interrupted only by sequences of whitespace characters, where each sequence of whitespace characters contains at most one '\n'.
A word is a sequence of letters or digits that contains at least one letter, with the following exceptions:
a) '.', '@', '_', '&', '/', '-' are allowed if surrounded by letters or digits,
b) '\'' is allowed if preceded by a letter or digit,
c) '-', '/' are also allowed if followed by whitespace characters and then another word. In that case the whitespace characters are deleted.
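The word definition above can be illustrated with a small stand-alone check. This is only a sketch of rules (a) and (b) applied to a single whitespace-free token; it is not the filter's actual implementation, it does not cover rule (c) or phrase splitting, and the class and method names (`WordRuleSketch`, `isWord`) are hypothetical.

```java
final class WordRuleSketch {
    // Characters permitted by rule (a) when surrounded by letters or digits.
    private static final String INTERNAL = ".@_&/-";

    /** Returns true if the token satisfies the word definition, rules (a) and (b) only. */
    static boolean isWord(String token) {
        boolean hasLetter = false;
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                hasLetter |= Character.isLetter(c);
            } else if (INTERNAL.indexOf(c) >= 0) {
                // Rule (a): must have a letter or digit on both sides.
                if (i == 0 || i == token.length() - 1
                        || !Character.isLetterOrDigit(token.charAt(i - 1))
                        || !Character.isLetterOrDigit(token.charAt(i + 1))) {
                    return false;
                }
            } else if (c == '\'') {
                // Rule (b): must be preceded by a letter or digit.
                if (i == 0 || !Character.isLetterOrDigit(token.charAt(i - 1))) {
                    return false;
                }
            } else {
                return false;
            }
        }
        // A word must contain at least one letter.
        return hasLetter;
    }

    public static void main(String[] args) {
        for (String t : new String[] {"e-mail", "2024", "dogs'", "-mail"}) {
            System.out.println(t + " -> " + isWord(t));
        }
    }
}
```

Under this sketch, a token such as "e-mail" is accepted, while "2024" is rejected because it contains no letter.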
Constructor Summary | |
---|---|
KEAPhraseFilter()
|
Method Summary | |
---|---|
java.lang.String |
attributeIndicesTipText()
Returns the tip text for this property |
boolean |
batchFinished()
Signify that this batch of input to the filter is finished. |
java.lang.String |
disallowInternalPeriodsTipText()
Returns the tip text for this property |
java.lang.String |
getAttributeIndices()
Get the current range selection. |
boolean |
getDisallowInternalPeriods()
Get whether internal periods are disallowed |
boolean |
getInvertSelection()
Get whether the supplied columns are to be processed |
java.lang.String[] |
getOptions()
Gets the current settings of the filter. |
java.lang.String |
globalInfo()
Returns a string describing this filter |
boolean |
input(Instance instance)
Input an instance for filtering. |
java.lang.String |
invertSelectionTipText()
Returns the tip text for this property |
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options |
static void |
main(java.lang.String[] argv)
Main method for testing this class. |
void |
setAttributeIndices(java.lang.String rangeList)
Set which attributes are to be processed |
void |
setAttributeIndicesArray(int[] attributes)
Set which attributes are to be processed |
void |
setDisallowInternalPeriods(boolean disallow)
Set whether internal periods are disallowed. |
boolean |
setInputFormat(Instances instanceInfo)
Sets the format of the input instances. |
void |
setInvertSelection(boolean invert)
Set whether selected columns should be processed. |
void |
setOptions(java.lang.String[] options)
Parses a given list of options controlling the behaviour of this object. |
Methods inherited from class weka.filters.Filter |
---|
batchFilterFile, filterFile, getOutputFormat, inputFormat, isOutputFormatDefined, numPendingOutput, output, outputFormat, outputPeek, useFilter |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public KEAPhraseFilter()
Method Detail |
---|
public java.lang.String globalInfo()
public java.util.Enumeration listOptions()
listOptions
in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-R index1,index2-index4,...
Specify list of attributes to process. First and last are valid indexes.
(default none)
-V
Invert matching sense
-P
Disallow internal periods
setOptions
in interface OptionHandler
options
- the list of options as an array of strings
java.lang.Exception
- if an option is not supported
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
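As an illustration of the option flags listed for setOptions, the following sketch assembles an option array of the kind that method accepts. It has no Weka dependency; the helper name `buildOptions` and its parameters are hypothetical, and the exact array returned by getOptions may be laid out differently.

```java
import java.util.ArrayList;
import java.util.List;

final class PhraseFilterOptionsSketch {
    /**
     * Builds an option array using the documented flags:
     * -R (attribute range list), -V (invert matching sense),
     * -P (disallow internal periods).
     */
    static String[] buildOptions(String rangeList, boolean invert, boolean disallowPeriods) {
        List<String> opts = new ArrayList<>();
        if (rangeList != null && !rangeList.isEmpty()) {
            opts.add("-R");
            opts.add(rangeList); // the range is passed as a separate array element
        }
        if (invert) {
            opts.add("-V");
        }
        if (disallowPeriods) {
            opts.add("-P");
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", buildOptions("1,3-4", false, true)));
    }
}
```

An array built this way could then be passed to setOptions on a filter instance, assuming Weka is on the classpath.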
public boolean setInputFormat(Instances instanceInfo) throws java.lang.Exception
setInputFormat
in class Filter
instanceInfo
- an Instances object containing the input
instance structure (any instances contained in the object are
ignored - only the structure is required).
java.lang.Exception
- if the inputFormat can't be set successfully
public boolean input(Instance instance) throws java.lang.Exception
input
in class Filter
instance
- the input instance
java.lang.Exception
- if the input instance was not of the correct
format or if there was a problem with the filtering.
public boolean batchFinished() throws java.lang.Exception
batchFinished
in class Filter
java.lang.NullPointerException
- if no input structure has been defined,
java.lang.Exception
- if there was a problem finishing the batch.
public static void main(java.lang.String[] argv)
argv
- should contain arguments to the filter: use -h for help
public java.lang.String invertSelectionTipText()
public boolean getInvertSelection()
public void setInvertSelection(boolean invert)
invert
- the new invert setting
public java.lang.String disallowInternalPeriodsTipText()
public boolean getDisallowInternalPeriods()
public void setDisallowInternalPeriods(boolean disallow)
disallow
- the new disallow setting
public java.lang.String attributeIndicesTipText()
public java.lang.String getAttributeIndices()
public void setAttributeIndices(java.lang.String rangeList)
rangeList
- a string representing the list of attributes. Since
the string will typically come from a user, attributes are indexed from
1.
public void setAttributeIndicesArray(int[] attributes)
attributes
- an array containing indexes of attributes to select.
Since the array will typically come from a program, attributes are indexed
from 0.
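The two setters differ only in index base: range-list strings count from 1, arrays count from 0. The sketch below converts a 0-based index array into the equivalent 1-based, comma-separated list. The helper `toRangeList` is hypothetical and is not part of the filter's API.

```java
final class AttributeIndexSketch {
    /** Converts 0-based attribute indexes into a 1-based comma-separated list. */
    static String toRangeList(int[] zeroBased) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < zeroBased.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(zeroBased[i] + 1); // shift from 0-based to 1-based
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Indexes 0, 2, 3 in array form select the first, third and fourth attributes.
        System.out.println(toRangeList(new int[] {0, 2, 3}));
    }
}
```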