class PiiClassifierOperation

#include <PiiClassifierOperation.h>

A superclass for classifier operations.

Inherits PiiDefaultOperation, PiiProgressController

Inherited by PiiBoostClassifierOperation, PiiPerceptronOperation, PiiPointMatchingOperation, PiiVectorQuantizerOperation

Description

This operation provides standard classification and learning facilities. In classification, a classification algorithm (usually a class derived from PiiClassifier) is used to map incoming feature vectors to real numbers. In learning, the operation will collect samples until the learning algorithm (usually a class derived from PiiLearningAlgorithm) is started.

Learning is usually an off-line process in which a batch of samples is first collected and a learning algorithm is applied to it. Certain algorithms such as the SOM are also capable of incremental (on-line) learning.

The learningBatchSize property is used as a learning/classification switch. Setting its value to zero disables learning and turns the operation into classification mode. If the learning algorithm is capable of on-line learning and learningBatchSize is set to one, each incoming sample will be directly sent to learning.
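
The switch described above can be summarised as a small decision function. The following is a minimal sketch, not the library's implementation: the enum Mode and the function selectMode are hypothetical names, and the onlineCapable flag stands in for whatever the learning algorithm reports through its capabilities. It ignores the case where the learning thread is already running.

```cpp
// Hypothetical illustration of how learningBatchSize selects the mode.
enum class Mode { Classify, LearnOnline, CollectBatch };

// batchSize: value of learningBatchSize (0, 1, N > 1, or -1 for unlimited).
// onlineCapable: assumed flag telling whether the algorithm can learn on-line.
inline Mode selectMode(int batchSize, bool onlineCapable)
{
  if (batchSize == 0)
    return Mode::Classify;       // learning disabled, classification only
  if (batchSize == 1 && onlineCapable)
    return Mode::LearnOnline;    // each sample goes directly to learning
  return Mode::CollectBatch;     // buffer samples for later batch learning
}
```

Under these assumptions, a batch size of one with an algorithm that cannot learn on-line falls back to collecting a batch.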

Batch learning must be initiated by the user by calling the startLearningThread() function. Although learning is usually done off-line, it is possible to start the learning thread while the operation is running. The old classifier will be replaced by the new one once the learning thread finishes. The downside of run-time learning is that the old classifier must be kept in memory while training. If you want to avoid this, reset the old classifier before learning.

Inputs

features: a feature vector. Features are usually represented as a row matrix with a primitive content type (such as PiiMatrix<double>), but subclasses are free to use any feature representation appropriate for the task in hand.
label: a label for the corresponding feature vector (double). This input is ignored by non-supervised classifiers (such as the SOM). In supervised classifiers (such as k-NN), the input can be left unconnected in classification, but not in learning.
weight: an optional weight for the training sample. This input will be used only by learning algorithms that are capable of weighted learning, and only in training mode. If this input is not connected, a weight of 1.0 is assumed for all samples.

Outputs

classification: the result of classification (double). Either a class index or a regression value. NaN indicates failure.

The usual way of creating a custom classifier is to first create an operation class that reflects the configuration of the classifier in its properties and uses pure virtual getter and setter functions for each. Then, an inner template class called "Template" is derived from this operation class and instantiated for float and double types. The reason for this design pattern is that Qt's moc can't cope with template classes. This pattern is implemented, for example, in PiiKnnClassifierOperation.

The template classes are registered to the resource database so that the template type is a part of the name. For example, PiiKnnClassifierOperation::Template<double> is registered as PiiKnnClassifierOperation<double>.
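
The shape of this pattern can be sketched without Qt. In the example below all class names (MyClassifierOperation, MyClassifier) and the property k are hypothetical, the Q_OBJECT macro and property declarations are left out so that the code compiles on its own, and registration to the resource database is not shown.

```cpp
#include <vector>

// Hypothetical non-template operation class. In a Qt project this is the
// class that would carry Q_OBJECT and the property declarations, because
// moc only needs to see this non-template part.
class MyClassifierOperation
{
public:
  virtual ~MyClassifierOperation() {}

  // Configuration is exposed through pure virtual accessors so that the
  // templated subclass can keep it in the typed classifier.
  virtual int k() const = 0;
  virtual void setK(int k) = 0;

  template <class T> class Template;
};

// Hypothetical typed classifier that holds the actual configuration.
template <class T> struct MyClassifier
{
  int k = 1;
  std::vector<T> model;
};

// The inner "Template" class binds the operation to a sample type.
template <class T>
class MyClassifierOperation::Template : public MyClassifierOperation
{
public:
  int k() const override { return m_classifier.k; }
  void setK(int k) override { m_classifier.k = k; }

private:
  MyClassifier<T> m_classifier;
};

// Explicit instantiations for the two supported floating-point types.
template class MyClassifierOperation::Template<float>;
template class MyClassifierOperation::Template<double>;
```

Client code only ever needs a pointer or reference to the non-template base class, which is the part the meta-object system can describe.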

Once the first feature vector has been received, the number of features in subsequent feature vectors must stay the same. One needs to explicitly reset the classifier before samples with a different number of features can be used. If the operation has training data, both the current classifier and the collected training data must be cleared. This is done by calling reset() and setting learningBatchSize to zero.

Properties

int bufferedSampleCount

The number of samples currently buffered.

PiiClassification::LearnerCapabilities capabilities

A read-only property that specifies the capabilities of the learning algorithm.

int featureCount

The number of features in each sample.

PiiClassification::FullBufferBehavior fullBufferBehavior

The action to perform with new samples when learningBatchSize has been exceeded.

int learningBatchSize

The maximum number of training samples collected for learning.

QString learningError

A textual description of a learning error.

bool learningThreadRunning

A read-only property whose value is true when the learning thread is running, and false otherwise.

double progressStep

Progress required to emit the progressed() signal.

Public Slots

bool learn
( )

Learns the batch of collected samples.

void reset
( )

Resets the classifier.

bool startLearningThread
( )

Start the learning thread.

void stopLearningThread
( )

Stop the learning thread.

Signals

void learningFinished
(
  • bool success
)

This signal is emitted when the learning thread finishes.

void progressed
(
  • double percentage
)

Informs about the progress of a learning algorithm.

Constructors and destructor

PiiClassifierOperation
(
  • PiiClassification::LearnerCapabilities capabilities
)

Constructs a new classifier operation.

~PiiClassifierOperation
( )

Destroys the operation.

Public member functions

virtual bool canContinue
(
  • double progressPercentage
)

Implementation of the PiiProgressController interface.

virtual void check
(
  • bool reset
)

If reset is true and the learning thread is running, this function stops it.

Protected member functions

virtual void aboutToChangeState
( )

Called by setState() just before the operation changes to a new state.

virtual int bufferedSampleCount
( )  = 0

Returns the number of samples currently in buffer.

PiiClassification::LearnerCapabilities capabilities
( )

PiiOutputSocket * classificationOutput
( )

Get a pointer to the output that emits a class index for each incoming feature vector (also in learning mode).

virtual double classify
( )  = 0

Reads a feature vector from the features input and emits its classification to the classification output.

template<class SampleSet>
double classify
( )

Reads a feature vector from the features input and calls classifier.classify() using it as the input.

virtual void collectSample
(
  • double label
  • double weight
)  = 0

Reads a feature vector from the features input and stores it into a batch of samples that will be used as the training samples when the training thread is started.

virtual int featureCount
( )  = 0

Returns the number of features the classifier/learning algorithm is expecting.

PiiInputSocket * featureInput
( )

Get a pointer to the input that receives feature vectors.

virtual void finishOnlineLearning
( )

Called when the operation stops after on-line learning.

PiiClassification::FullBufferBehavior fullBufferBehavior
( )

PiiInputSocket * labelInput
( )

Get a pointer to the (optional) input that receives class indices for the incoming feature vectors.

virtual bool learnBatch
( )

Trains a learning algorithm with the collected set of samples.

template<class SampleSet>
bool learnBatch
( )

A template function that installs this as the progress controller to algorithm and feeds it with the given samples and labels.

int learningBatchSize
( )

QString learningError
( )

bool learningThreadRunning
( )

template<class SampleSet>
int learnOne
( )

Reads a feature vector from the features input and calls algorithm.learnOne() using label as the class label and weight as the importance.

virtual double learnOne
(
  • double label
  • double weight
)

Reads a feature vector from the features input, sends it to an on-line learning algorithm, and emits the classification result to the classification output.

virtual bool needsThread
( )

Returns true if the learning algorithm needs a learning thread, and false otherwise.

PiiClassifierOperation
(
  • PiiClassification::LearnerCapabilities capabilities
)

Constructs a new classifier operation.

virtual void process
( )

Classifies an incoming feature vector (see classify()).

double progressStep
( )

double readLabel
( )

With supervised learning algorithms, this function reads the label input and returns the class label.

double readWeight
( )

Returns the value read from the weight input, or 1.0 if the input is not connected.

virtual void replaceClassifier
( )  = 0

Replaces the current classifier with a newly trained one.

virtual void resetClassifier
( )  = 0

Resets the classifier.

virtual void resizeBatch
(
  • int newSize
)  = 0

Resizes the batch of buffered samples.

void setFullBufferBehavior
( )

void setLearningBatchSize
(
  • int learningBatchSize
)

void setLearningError
( )

Sets the learning error message.

void setProgressStep
(
  • double progressStep
)

PiiInputSocket * weightInput
( )

Get a pointer to the (optional) input that receives weights associated with incoming samples.

Property details

  • int bufferedSampleCount

    [read]

    The number of samples currently buffered.

  • PiiClassification::LearnerCapabilities capabilities

    [read]

    A read-only property that specifies the capabilities of the learning algorithm.

  • int featureCount

    [read]

    The number of features in each sample.

    Initially, this value is unknown (0). This property will be set when the first sample is sent to the learning algorithm.

  • PiiClassification::FullBufferBehavior fullBufferBehavior

    [read, write]

    The action to perform with new samples when learningBatchSize has been exceeded.

    The default is OverwriteRandomSample.

  • int learningBatchSize

    [read, write]

    The maximum number of training samples collected for learning.

    This property is also used as a training/classification switch. Setting the value to zero means that no training samples will be collected, and the operation will only classify incoming samples. If learningBatchSize is set to one and the learning algorithm is capable of on-line learning, incoming samples will be used to train the algorithm one by one. If learningBatchSize is set to N (N > 1), a buffer of N first, last or randomly selected samples will be kept in memory. If learningBatchSize is -1, all incoming samples will be buffered without a limit. The buffered samples will be used as training data to the learning algorithm, which will be run in a separate thread (see startLearningThread()). The default value is 0.

  • QString learningError

    [read]

    A textual description of a learning error.

  • bool learningThreadRunning

    [read]

    A read-only property whose value is true when the learning thread is running, and false otherwise.

  • double progressStep

    [read, write]

    Progress required to emit the progressed() signal.

    Must be set to a value in [0, 1]. Set to 0 to disable the signal. Set to 1 to send the signal only after training is complete. The default is 0.01, which means that every percent of progress is reported, unless the training algorithm advances in larger steps or checks the condition less often (for example, only every two percent).

Function details

  • PiiClassifierOperation

    (
    • PiiClassification::LearnerCapabilities capabilities
    )
    [protected]

    Constructs a new classifier operation.

  • ~PiiClassifierOperation

    ()

    Destroys the operation.

    The operation will not be destructed until the learning thread has finished.

  • virtual bool canContinue

    (
    • double progressPercentage
    )
    [virtual]

    Implementation of the PiiProgressController interface.

    This function is called by learning algorithms to check if they are still allowed to proceed. This function returns true if startLearningThread() has been called and stopLearningThread() has not been called. It also emits the progressed() signal if progressPercentage is not NaN and it is progressStep units larger than the previous recorded progress.

    Reimplemented from PiiProgressController.
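
The contract described for canContinue() can be illustrated with a small stand-alone class. This is a sketch under stated assumptions, not the library's code: ProgressGate, start(), stop() and the reported vector are hypothetical names, and pushing a value to reported stands in for emitting the progressed() signal.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for the progress-controller logic described above.
class ProgressGate
{
public:
  explicit ProgressGate(double progressStep) : m_step(progressStep) {}

  void start() { m_running = true; m_lastProgress = 0.0; }
  void stop() { m_running = false; }

  // Records progress when it has advanced by at least the configured step
  // since the last recorded value, and tells the caller whether it may go on.
  bool canContinue(double progress)
  {
    if (m_step > 0.0 && !std::isnan(progress) &&
        progress - m_lastProgress >= m_step)
      {
        m_lastProgress = progress;
        reported.push_back(progress);  // stands in for emitting progressed()
      }
    return m_running;
  }

  std::vector<double> reported;

private:
  double m_step;
  double m_lastProgress = 0.0;
  bool m_running = false;
};
```

A learning algorithm would call canContinue() inside its main loop and return early as soon as the call yields false.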

  • virtual void check

    (
    • bool reset
    )
    [virtual]

    If reset is true and the learning thread is running, this function stops it.

    Otherwise just passes the call to the superclass.

    Reimplemented from PiiDefaultOperation.

  • virtual void aboutToChangeState

    ( )
    [protected, virtual]

    Called by setState() just before the operation changes to a new state.

    The function will be called independent of the cause of the state change (internal or external). Derived classes may implement this function to perform whatever functionality is needed when a state changes. The default implementation does nothing.

    Reimplemented from PiiBasicOperation.

  • virtual int bufferedSampleCount

    ()
    [protected, pure virtual]

    Returns the number of samples currently in buffer.

    Must be implemented by subclasses to return the number of samples currently in buffer.

  • PiiClassification::LearnerCapabilities capabilities

    ()
    [protected]
  • PiiOutputSocket * classificationOutput

    ()
    [protected]

    Get a pointer to the output that emits a class index for each incoming feature vector (also in learning mode).

  • virtual double classify

    ()
    [protected, pure virtual]

    Reads a feature vector from the features input and emits its classification to the classification output.

    May also send additional objects through other output sockets. This function is called by process() when an incoming sample must be classified.

    Returns

    the classification

  • template<class SampleSet>

    double classify

    ( )
    [protected]

    Reads a feature vector from the features input and calls classifier.classify() using it as the input.

  • virtual void collectSample

    (
    • double label
    • double weight
    )
    [protected, pure virtual]

    Reads a feature vector from the features input and stores it into a batch of samples that will be used as the training samples when the training thread is started.

    This function is called by process() during learning if the learning algorithm is not capable of on-line learning or if batch-based learning is requested by the user. Subclasses should respect the value of learningBatchSize. If the current number of samples in the batch is larger than learningBatchSize (and the batch size is not -1), the sample must either be discarded or one of the older samples must be replaced, depending on fullBufferBehavior.

    Parameters
    label

    the classification of the training sample, or NaN if not applicable.

    weight

    the importance of the sample. If the weight input is not connected, this value will always be 1.0. Learning algorithms that are not capable of weighted learning will ignore this value.
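
One way a subclass might honour learningBatchSize and fullBufferBehavior is sketched below. Only OverwriteRandomSample is named in this documentation; the enumerator names OverwriteOldestSample and DiscardNewSample, the Sample struct and the SampleBatch class are assumptions made for illustration.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// OverwriteRandomSample appears in this documentation; the other two
// enumerator names are assumptions made for this sketch.
enum FullBufferBehavior
{
  OverwriteRandomSample,
  OverwriteOldestSample,
  DiscardNewSample
};

// Hypothetical sample record: features plus label and weight.
struct Sample
{
  std::vector<double> features;
  double label;
  double weight;
};

class SampleBatch
{
public:
  // batchSize follows learningBatchSize: -1 means "no limit".
  SampleBatch(int batchSize, FullBufferBehavior behavior)
    : m_batchSize(batchSize), m_behavior(behavior) {}

  void collect(const Sample& s)
  {
    // Room left (or no limit): just append.
    if (m_batchSize == -1 ||
        static_cast<int>(m_samples.size()) < m_batchSize)
      {
        m_samples.push_back(s);
        return;
      }
    if (m_samples.empty())
      return;  // batch size 0: nothing is collected
    switch (m_behavior)
      {
      case DiscardNewSample:
        break;  // keep the first N samples
      case OverwriteOldestSample:
        m_samples[m_next] = s;  // ring buffer keeps the last N samples
        m_next = (m_next + 1) % m_samples.size();
        break;
      case OverwriteRandomSample:
        {
          std::uniform_int_distribution<std::size_t>
            pick(0, m_samples.size() - 1);
          m_samples[pick(m_rng)] = s;
        }
        break;
      }
  }

  std::size_t size() const { return m_samples.size(); }
  const std::vector<Sample>& samples() const { return m_samples; }

private:
  int m_batchSize;
  FullBufferBehavior m_behavior;
  std::vector<Sample> m_samples;
  std::size_t m_next = 0;
  std::mt19937 m_rng;
};
```

With the ring-buffer variant the stored order differs from the arrival order, which is usually irrelevant for batch learning.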

  • virtual int featureCount

    ()
    [protected, pure virtual]

    Returns the number of features the classifier/learning algorithm is expecting.

    If no feature vectors have been seen so far, zero will be returned. Note that once the first sample has been received, the number of features in subsequent feature vectors must be the same.
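
The constraint can be pictured as a small guard object. The sketch below is illustrative only: FeatureCountGuard is a hypothetical name, and std::invalid_argument is used as a stand-in for the exception type the library would throw.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper that enforces a constant feature count.
class FeatureCountGuard
{
public:
  // Zero means that no feature vector has been seen yet.
  int featureCount() const { return m_featureCount; }

  // Remembers the length of the first sample and rejects any later
  // sample whose length differs from it.
  void check(const std::vector<double>& features)
  {
    const int n = static_cast<int>(features.size());
    if (m_featureCount == 0)
      m_featureCount = n;
    else if (n != m_featureCount)
      throw std::invalid_argument("feature count changed");
  }

  // Forgets the remembered count so that a different size is accepted.
  void reset() { m_featureCount = 0; }

private:
  int m_featureCount = 0;
};
```

After reset(), the next sample defines the expected feature count again.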

  • PiiInputSocket * featureInput

    ()
    [protected]

    Get a pointer to the input that receives feature vectors.

  • virtual void finishOnlineLearning

    ()
    [protected, virtual]

    Called when the operation stops after on-line learning.

    Subclasses can override this function to clean up the resources allocated by on-line learning.

  • PiiClassification::FullBufferBehavior fullBufferBehavior

    ()
    [protected]
  • PiiInputSocket * labelInput

    ()
    [protected]

    Get a pointer to the (optional) input that receives class indices for the incoming feature vectors.

  • virtual bool learnBatch

    ()
    [protected, virtual]

    Trains a learning algorithm with the collected set of samples.

    This function is called by the learning thread and must be overridden by subclasses to feed the buffered samples to the learning algorithm. Typically, subclasses call the learnBatch(PiiLearningAlgorithm<SampleSet>*, const SampleSet&, const QVector<double>&, const QVector<double>&) function template.

    This function should not modify the currently operating classifier. If the learning thread is started while the operation is running, the normal functioning should not be changed. The old classifier must be replaced with the newly trained one in replaceClassifier().

    The default implementation returns false.

    Returns

    true if the training succeeded, false otherwise.
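
The split between learnBatch() and replaceClassifier() can be sketched with a toy model. Everything here is hypothetical (MeanClassifier, ToyOperation, the bufferedLabels member); the point is only that training builds a separate candidate and the active classifier changes in a distinct second step.

```cpp
#include <cmath>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical toy "classifier": predicts the mean of the training labels.
struct MeanClassifier
{
  double mean = 0.0;
  double classify() const { return mean; }
};

class ToyOperation
{
public:
  std::vector<double> bufferedLabels;

  // Trains a separate candidate and leaves the active classifier untouched.
  bool learnBatch()
  {
    if (bufferedLabels.empty())
      return false;
    std::unique_ptr<MeanClassifier> candidate(new MeanClassifier);
    candidate->mean =
      std::accumulate(bufferedLabels.begin(), bufferedLabels.end(), 0.0) /
      bufferedLabels.size();
    m_trained = std::move(candidate);
    return true;
  }

  // Meant to be called only after learnBatch() has returned true.
  void replaceClassifier()
  {
    if (m_trained)
      m_active = std::move(m_trained);
  }

  // NaN signals that no classifier has been trained yet.
  double classify() const
  {
    return m_active ? m_active->classify() : std::nan("");
  }

private:
  std::unique_ptr<MeanClassifier> m_active;
  std::unique_ptr<MeanClassifier> m_trained;
};
```

Until replaceClassifier() runs, classify() keeps using the previous model, which matches the requirement that training must not disturb a running operation. Synchronisation between the learning thread and the processing thread is not shown.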

  • template<class SampleSet>

    bool learnBatch

    ( )
    [protected]

    A template function that installs this as the progress controller to algorithm and feeds it with the given samples and labels.

    Usually called from the implementation of the virtual learnBatch() function.

    Parameters
    algorithm

    the algorithm to train

    samples

    training samples

    labels

    labels for the samples, if applicable. An empty list can be provided for non-supervised learning algorithms.

    weights

    an importance factor for each sample. May be empty, in which case 1.0 will be used for all samples.

    Returns

    true if the algorithm was successfully trained, false otherwise.

  • int learningBatchSize

    ()
    [protected]
  • QString learningError

    ()
    [protected]
  • bool learningThreadRunning

    ()
    [protected]
  • template<class SampleSet>

    int learnOne

    ( )
    [protected]

    Reads a feature vector from the features input and calls algorithm.learnOne() using label as the class label and weight as the importance.

    Exceptions
    PiiExecutionException&

    if the features are of incorrect type or size.

  • virtual double learnOne

    (
    • double label
    • double weight
    )
    [protected, virtual]

    Reads a feature vector from the features input, sends it to an on-line learning algorithm, and emits the classification result to the classification output.

    May send additional objects through other output sockets. The default implementation emits and returns label.

    This function is called by process() when an incoming sample must be used for on-line learning (only if the learning algorithm is capable of on-line learning).

    Parameters
    label

    the classification of the training sample, or NaN if not applicable.

    weight

    the importance of the sample. If the weight input is not connected, this value will always be 1.0. Learning algorithms that are not capable of weighted learning will ignore this value.

    Returns

    the classification of the sample, if possible. NaN otherwise.

  • virtual bool needsThread

    ()
    [protected, virtual]

    Returns true if the learning algorithm needs a learning thread, and false otherwise.

    Some classifiers such as simple linear-search nearest neighbors don't need to be trained. In such a case this function returns false, no learning thread will be started, and the old classifier is immediately replaced by a new one. The default implementation returns true.

  • virtual void process

    ()
    [protected, virtual]

    Classifies an incoming feature vector (see classify()).

    If learningBatchSize is set to a non-zero value, and if the learning thread is not running, collects the incoming sample to a buffer (see collectSample()). If learningBatchSize is set to one and the learning algorithm is capable of on-line learning, the incoming sample will be sent directly to learning (see learnOne()).

    Reimplemented from PiiDefaultOperation.

  • double progressStep

    ()
    [protected]
  • double readLabel

    ()
    [protected]

    With supervised learning algorithms, this function reads the label input and returns the class label.

    With non-supervised learning algorithms, NaN will be returned.

    Exceptions
    PiiExecutionException&

    if the input object cannot be converted to a double.

  • double readWeight

    ()
    [protected]

    Returns the value read from the weight input, or 1.0 if the input is not connected.

    Exceptions
    PiiExecutionException&

    if the input object cannot be converted to a double.

  • virtual void replaceClassifier

    ()
    [protected, pure virtual]

    Replaces the current classifier with a newly trained one.

    This function is called if learnBatch() returns true. If the classifier provides information about itself as properties (such as the code book of an NN classifier), these property values need to be changed here.

  • virtual void resetClassifier

    ()
    [protected, pure virtual]

    Resets the classifier.

    This function is called by reset() after ensuring mutual exclusion with the learning thread.

  • virtual void resizeBatch

    (
    • int newSize
    )
    [protected, pure virtual]

    Resizes the batch of buffered samples.

    This function is called by setLearningBatchSize() after ensuring mutual exclusion with the learning thread. The function will only be called if needed. If the new size is not smaller than the current number of buffered samples, nothing needs to be done. The buffer will grow automatically to the target size when new samples are read.

    Parameters
    newSize

    the new size of the batch. The batch must be truncated to the requested size.
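
A subclass implementation typically only has to truncate its containers. The sketch below assumes a hypothetical Batch struct with parallel vectors for samples, labels and weights, and is written as a free function so that it stands alone.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical buffer layout; a real subclass would use whatever
// containers suit its learning algorithm.
struct Batch
{
  std::vector<std::vector<double> > samples;
  std::vector<double> labels;
  std::vector<double> weights;
};

// Truncates the batch to at most newSize entries. Growing is left to the
// code that collects new samples, so a larger newSize changes nothing.
inline void resizeBatch(Batch& batch, int newSize)
{
  if (newSize < 0)
    return;  // -1 means "no limit": nothing to truncate
  const std::size_t n = static_cast<std::size_t>(newSize);
  if (batch.samples.size() > n) batch.samples.resize(n);
  if (batch.labels.size() > n) batch.labels.resize(n);
  if (batch.weights.size() > n) batch.weights.resize(n);
}
```

Each container is checked separately because labels or weights may be empty for algorithms that do not use them.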

  • void setFullBufferBehavior

    ( )
    [protected]
  • void setLearningBatchSize

    (
    • int learningBatchSize
    )
    [protected]
  • void setLearningError

    ( )
    [protected]

    Sets the learning error message.

  • void setProgressStep

    (
    • double progressStep
    )
    [protected]
  • PiiInputSocket * weightInput

    ()
    [protected]

    Get a pointer to the (optional) input that receives weights associated with incoming samples.

  • void learningFinished

    (
    • bool success
    )
    [signal]

    This signal is emitted when the learning thread finishes.

  • void progressed

    (
    • double percentage
    )
    [signal]

    Informs about the progress of a learning algorithm.

    This signal will be emitted from the learning thread started with startLearningThread() every time progressStep is exceeded. Note that if the learning algorithm is not capable of estimating its progress, this signal will not be emitted until it is done.

  • bool learn

    ()
    [slot]

    Learns the batch of collected samples.

    This blocks until the learning algorithm finishes.

    Returns

    true if the samples were successfully learnt, false otherwise

  • void reset

    ()
    [slot]

    Resets the classifier.

    This function clears all training results and resets the classifier to its initial state. It should not change the classifier configuration, though. This function serves as a generic way of resetting a trained classifier. Subclasses may provide other features that have the same effect (such as setting the model samples of an NN classifier to an empty sample set).

    To clear buffered training data as well, set the learningBatchSize property to zero.

    Reimplemented from PiiFlowController::SyncListener.

  • bool startLearningThread

    ()
    [slot]

    Start the learning thread.

    If the number of buffered samples is less than two or the learning thread is already running, this function does nothing. Otherwise, it starts a thread that sends the buffered samples to the learning algorithm (see learnBatch()). The thread can be interrupted by calling stopLearningThread().

    The learning thread will stop once learnBatch() returns.

    Returns

    true if the learning thread was successfully started, false otherwise. The call will fail if the learning thread is already running or there are no buffered samples to learn.

  • void stopLearningThread

    ()
    [slot]

    Stop the learning thread.

    After this function has been called, canContinue() will return false, which interrupts the learning algorithm.
