class PiiClassifierOperation
#include <PiiClassifierOperation.h>
A superclass for classifier operations.
Inherits PiiDefaultOperation, PiiProgressController
Inherited by PiiBoostClassifierOperation, PiiPerceptronOperation, PiiPointMatchingOperation, PiiVectorQuantizerOperation
Description
This operation provides standard classification and learning facilities. In classification, a classification algorithm (usually a class derived from PiiClassifier) is used to map incoming feature vectors to real numbers. In learning, the operation will collect samples until the learning algorithm (usually a class derived from PiiLearningAlgorithm) is started.
Learning is usually an off-line process in which a batch of samples is first collected and a learning algorithm is applied to it. Certain algorithms such as the SOM are also capable of incremental (on-line) learning.
The learningBatchSize property is used as a learning/classification switch. Setting its value to zero disables learning and turns the operation into classification mode. If the learning algorithm is capable of on-line learning and learningBatchSize is set to one, each incoming sample will be directly sent to learning.
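The switch described above can be sketched as a small decision function. This is only an illustration of the documented rules, not the library's code: the `Mode` enum and `selectMode()` are hypothetical names, and the sketch assumes that a batch size of one with an algorithm that cannot learn on-line falls back to buffering.

```cpp
#include <cassert>

// Hypothetical names: Mode and selectMode() are illustrative only and are
// not part of PiiClassifierOperation.
enum class Mode { Classify, LearnOnline, CollectBatch };

// Decides what to do with an incoming sample.
//   batchSize     - value of learningBatchSize (-1 means "no limit")
//   onlineCapable - whether the learning algorithm can learn incrementally
inline Mode selectMode(int batchSize, bool onlineCapable)
{
  assert(batchSize >= -1);            // assumed valid range
  if (batchSize == 0)
    return Mode::Classify;            // learning disabled, classification only
  if (batchSize == 1 && onlineCapable)
    return Mode::LearnOnline;         // sample goes straight to learning
  return Mode::CollectBatch;          // buffer the sample for batch learning
}
```

With this sketch, `selectMode(0, true)` selects classification, `selectMode(1, true)` selects on-line learning, and any other batch size selects buffering.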
Batch learning must be initiated by the user by calling the startLearningThread() function. Although learning is usually done off-line, it is possible to start the learning thread while the operation is running. The old classifier will be replaced by the new one once the learning thread finishes. The downside of run-time learning is that the old classifier must be kept in memory while training. If you want to avoid this, reset the old classifier before learning.
Inputs
Outputs
NaN indicates failures.
float and double types. The reason for this design pattern is that Qt's moc can't cope with template classes. This pattern is implemented, for example, in PiiKnnClassifierOperation.
Once the first feature vector has been received, the number of features in subsequent feature vectors must stay the same. One needs to explicitly reset the classifier before samples with a different number of features can be used. If the operation has training data, both the current classifier and the collected training data must be cleared. This is done by calling reset() and setting learningBatchSize to zero.
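The feature count rule can be illustrated with a small guard object. This is a sketch under assumptions: `FeatureCountGuard` is a hypothetical helper, not a class of the library, and it throws a standard exception where the operation would report an execution error.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the library: remembers the length of the
// first feature vector and rejects vectors of any other length until reset.
class FeatureCountGuard
{
public:
  // Accepts the sample or throws std::invalid_argument.
  void accept(const std::vector<double>& features)
  {
    if (features.empty())
      throw std::invalid_argument("empty feature vector");
    if (m_count == 0)
      m_count = features.size();      // the first sample fixes the count
    else if (features.size() != m_count)
      throw std::invalid_argument("feature count changed");
    assert(m_count > 0);
  }

  // Forgets the fixed count so that a different length can be used again.
  void reset() { m_count = 0; }

  // Zero means that no feature vector has been seen yet.
  std::size_t featureCount() const { return m_count; }

private:
  std::size_t m_count = 0;
};
```

After `reset()`, the next accepted vector defines the new feature count, which mirrors the requirement to reset the classifier explicitly before changing the number of features.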
Properties
| int | bufferedSampleCount | The number of samples currently buffered. |
| PiiClassification::LearnerCapabilities | capabilities | A read-only property that specifies the capabilities of the learning algorithm. |
| int | featureCount | The number of features in each sample. |
| PiiClassification::FullBufferBehavior | fullBufferBehavior | The action to perform with new samples when learningBatchSize has been exceeded. |
| int | learningBatchSize | The maximum number of training samples collected for learning. |
| QString | learningError | A textual description of a learning error. |
| bool | learningThreadRunning | A read-only property whose value is true when the learning thread is running, and false otherwise. |
| double | progressStep | Progress required to emit the progressed() signal. |
Public Slots
| bool | learn() | Learns the batch of collected samples. |
| void | reset() | Resets the classifier. |
| bool | startLearningThread() | Start the learning thread. |
| void | stopLearningThread() | Stop the learning thread. |
Signals
| void | learningFinished(bool success) | This signal is emitted when the learning thread finishes. |
| void | progressed(double percentage) | Informs about the progress of a learning algorithm. |
Constructors and destructor
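| | PiiClassifierOperation(PiiClassification::LearnerCapabilities capabilities) | Constructs a new classifier operation. |
| | ~PiiClassifierOperation() | Destroys the operation. |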
Public member functions
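| virtual bool | canContinue(double progressPercentage) | Implementation of the PiiProgressController interface. |
| virtual void | check(bool reset) | If reset is true and the learning thread is running, this function stops it. |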
Protected member functions
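| virtual void | | Called by setState() just before the operation changes to a new state. |
| virtual int | bufferedSampleCount() = 0 | Returns the number of samples currently in buffer. |
| PiiClassification::LearnerCapabilities | capabilities() | |
| | | Get a pointer to the output that emits a class index for each incoming feature vector (also in learning mode). |
| virtual double | classify() = 0 | Reads a feature vector from the features input and emits its classification to the classification output. |
| template<class SampleSet> double | | Reads a feature vector from the features input and calls classifier.classify() using it as the input. |
| virtual void | collectSample(double label, double weight) | Reads a feature vector from the features input and stores it into a batch of samples that will be used as the training samples when the training thread is started. |
| virtual int | featureCount() = 0 | Returns the number of features the classifier/learning algorithm is expecting. |
| | | Get a pointer to the input that receives feature vectors. |
| virtual void | finishOnlineLearning() | Called when the operation stops after on-line learning. |
| | | Get a pointer to the (optional) input that receives class indices for the incoming feature vectors. |
| virtual bool | learnBatch() | Trains a learning algorithm with the collected set of samples. |
| template<class SampleSet> bool | learnBatch(PiiLearningAlgorithm<SampleSet>& algorithm, const SampleSet& samples, const QVector<double>& labels, const QVector<double>& weights) | A template function that installs this as the progress controller to algorithm and feeds it with the given samples and labels. |
| int | learningBatchSize() | |
| | | |
| bool | learningThreadRunning() | |
| template<class SampleSet> int | learnOne | Reads a feature vector from the features input and calls algorithm.learnOne() using label as the class label and weight as the importance. |
| virtual double | learnOne(double label, double weight) | Reads a feature vector from the features input, sends it to an on-line learning algorithm, and emits the classification result to the classification output. |
| virtual bool | needsThread() | Returns true if the learning algorithm needs a learning thread, and false otherwise. |
| | PiiClassifierOperation(PiiClassification::LearnerCapabilities capabilities) | Constructs a new classifier operation. |
| virtual void | process() | Classifies an incoming feature vector (see classify()). |
| double | progressStep() | |
| double | readLabel() | With supervised learning algorithms, this function reads the label input and returns the class label. |
| double | readWeight() | Returns the value read from the weight input, or 1.0 if the input is not connected. |
| virtual void | replaceClassifier() = 0 | Replaces the current classifier with a newly trained one. |
| virtual void | resetClassifier() = 0 | Resets the classifier. |
| virtual void | resizeBatch(int newSize) | Resizes the batch of buffered samples. |
| void | | |
| void | setLearningBatchSize(int learningBatchSize) | |
| void | | Sets the learning error message. |
| void | setProgressStep(double progressStep) | |
| | | Get a pointer to the (optional) input that receives weights associated with incoming samples. |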
Property details
-
int bufferedSampleCount
[read] The number of samples currently buffered.
-
PiiClassification::LearnerCapabilities capabilities
[read] A read-only property that specifies the capabilities of the learning algorithm.
-
int featureCount
[read] The number of features in each sample.
Initially, this value is unknown (0). This property will be set when the first sample is sent to the learning algorithm.
-
PiiClassification::FullBufferBehavior fullBufferBehavior
[read, write] The action to perform with new samples when learningBatchSize has been exceeded.
The default is OverwriteRandomSample.
-
int learningBatchSize
[read, write] The maximum number of training samples collected for learning.
This property is also used as a training/classification switch. Setting the value to zero means that no training samples will be collected, and the operation will only classify incoming samples. If learningBatchSize is set to one and the learning algorithm is capable of on-line learning, incoming samples will be used to train the algorithm one by one. If learningBatchSize is set to N (N > 1), a buffer of N first, last or randomly selected samples will be kept in memory. If learningBatchSize is -1, all incoming samples will be buffered without a limit. The buffered samples will be used as training data to the learning algorithm, which will be run in a separate thread (see startLearningThread()). The default value is 0.
-
QString learningError
[read] A textual description of a learning error.
-
bool learningThreadRunning
[read] A read-only property whose value is true when the learning thread is running, and false otherwise.
-
double progressStep
[read, write] Progress required to emit the progressed() signal.
Must be set to a value in [0,1]. Set to 0 to disable the signal. Set to 1 to send the signal only after training is complete. The default is 0.01, which means that every percent of progress will be reported (unless the training algorithm advances faster and checks the condition, for example, only every two percent).
Function details
-
PiiClassifierOperation(PiiClassification::LearnerCapabilities capabilities)
[protected] Constructs a new classifier operation.
-
~PiiClassifierOperation()
Destroys the operation.
The operation will not be destroyed until the learning thread has finished.
-
virtual bool canContinue(double progressPercentage)
[virtual] Implementation of the PiiProgressController interface.
This function is called by learning algorithms to check if they are still allowed to proceed. This function returns true if startLearningThread() has been called and stopLearningThread() has not been called. It also emits the progressed() signal if progressPercentage is not NaN and it is progressStep units larger than the previously recorded progress.
Reimplemented from PiiProgressController.
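The progress reporting rule can be sketched as follows. `ProgressGate` is a hypothetical stand-in for the bookkeeping, not the library's code. It assumes that progress is expressed on the same [0,1] scale as progressStep, that "progressStep units larger" means greater than or equal, and that a step of zero disables reporting.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the progress bookkeeping: decides whether a
// progress notification should be sent for a reported progress value.
class ProgressGate
{
public:
  explicit ProgressGate(double step) : m_step(step)
  {
    assert(step >= 0.0 && step <= 1.0);   // documented range of progressStep
  }

  // Returns true if a notification should be sent for `progress`.
  bool shouldReport(double progress)
  {
    if (m_step <= 0.0 || std::isnan(progress))
      return false;                       // disabled, or progress unknown
    if (progress - m_last < m_step)
      return false;                       // not enough progress since last report
    m_last = progress;                    // record the reported progress
    return true;
  }

private:
  double m_step;
  double m_last = 0.0;
};
```

An algorithm that cannot estimate its progress would pass NaN on every call, so the gate never fires for it.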
-
virtual void check(bool reset)
[virtual] If reset is true and the learning thread is running, this function stops it.
Otherwise just passes the call to the superclass.
Reimplemented from PiiDefaultOperation.
-
Called by setState() just before the operation changes to a new state.
The function will be called regardless of the cause of the state change (internal or external). Derived classes may implement this function to perform whatever functionality is needed when a state changes. The default implementation does nothing.
Reimplemented from PiiBasicOperation.
-
virtual int bufferedSampleCount()
[protected, pure virtual] Returns the number of samples currently in buffer.
Must be implemented by subclasses to return the number of samples currently in buffer.
-
PiiClassification::LearnerCapabilities capabilities()
[protected]
-
Get a pointer to the output that emits a class index for each incoming feature vector (also in learning mode).
-
virtual double classify()
[protected, pure virtual] Reads a feature vector from the features input and emits its classification to the classification output.
May also send additional objects through other output sockets. This function is called by process() when an incoming sample must be classified.
Returns
the classification
-
Reads a feature vector from the features input and calls classifier.classify() using it as the input.
-
virtual void collectSample(double label, double weight)
[protected, pure virtual] Reads a feature vector from the features input and stores it into a batch of samples that will be used as the training samples when the training thread is started.
This function is called by process() during learning if the learning algorithm is not capable of on-line learning or if batch-based learning is requested by the user. Subclasses should respect the value of learningBatchSize. If the current number of samples in the batch is larger than learningBatchSize (and the batch size is not -1), the sample must either be discarded or one of the older samples must be replaced, depending on fullBufferBehavior.
Parameters
- label
-
the classification of the training sample, or NaN if not applicable.
- weight
-
the importance of the sample. If the weight input is not connected, this value will always be 1.0. Learning algorithms that are not capable of weighted learning will ignore this value.
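A bounded sample buffer that respects a batch size and a full-buffer policy could look like the sketch below. It is an assumption-laden illustration: the `FullPolicy` value names are hypothetical (the only value named in this documentation is OverwriteRandomSample), samples are plain doubles instead of feature vectors, and labels and weights are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical policy names chosen for this sketch.
enum class FullPolicy { DiscardNew, OverwriteOldest, OverwriteRandom };

// Collects samples up to batchSize (-1 = unlimited) and applies the policy
// once the buffer is full.
class SampleBuffer
{
public:
  SampleBuffer(int batchSize, FullPolicy policy, unsigned seed = 0)
    : m_batchSize(batchSize), m_policy(policy), m_rng(seed)
  {
    assert(batchSize == -1 || batchSize > 0);
  }

  void collect(double sample)
  {
    const bool unlimited = (m_batchSize == -1);
    if (unlimited || m_samples.size() < static_cast<std::size_t>(m_batchSize))
      {
        m_samples.push_back(sample);      // still room in the buffer
        return;
      }
    switch (m_policy)
      {
      case FullPolicy::DiscardNew:        // keeps the first N samples
        break;
      case FullPolicy::OverwriteOldest:   // keeps the last N samples
        m_samples.erase(m_samples.begin());
        m_samples.push_back(sample);
        break;
      case FullPolicy::OverwriteRandom:   // replaces a randomly chosen sample
        {
          std::uniform_int_distribution<std::size_t> pick(0, m_samples.size() - 1);
          m_samples[pick(m_rng)] = sample;
        }
        break;
      }
  }

  const std::vector<double>& samples() const { return m_samples; }

private:
  int m_batchSize;
  FullPolicy m_policy;
  std::mt19937 m_rng;
  std::vector<double> m_samples;
};
```

Feeding the values 1 to 5 into a buffer of size 3 leaves 1, 2, 3 with `DiscardNew` and 3, 4, 5 with `OverwriteOldest`. With `OverwriteRandom` the buffer still holds three samples, but which ones depends on the random generator.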
-
virtual int featureCount()
[protected, pure virtual] Returns the number of features the classifier/learning algorithm is expecting.
If no feature vectors have been seen so far, zero will be returned. Note that once the first sample has been received, the number of features in subsequent feature vectors must be the same.
-
Get a pointer to the input that receives feature vectors.
-
virtual void finishOnlineLearning()
[protected, virtual] Called when the operation stops after on-line learning.
Subclasses can override this function to clean up the resources allocated by on-line learning.
-
-
Get a pointer to the (optional) input that receives class indices for the incoming feature vectors.
-
virtual bool learnBatch()
[protected, virtual] Trains a learning algorithm with the collected set of samples.
This function is called by the learning thread and must be overridden by subclasses to feed the buffered samples to the learning algorithm. Typically, subclasses call the learnBatch(PiiLearningAlgorithm<SampleSet>*, const SampleSet&, const QVector<double>&, const QVector<double>&) function template.
This function should not modify the currently operating classifier. If the learning thread is started while the operation is running, the normal functioning should not be changed. The old classifier must be replaced with the newly trained one in replaceClassifier().
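The separation between training and replacement can be sketched with a toy model. Everything here is an assumption made for illustration: `ThresholdModel` and `SwappingClassifier` are hypothetical, the "training" just computes a mean, and no threading or locking is shown. The point is only that training builds a candidate while the current classifier keeps working, and the swap happens in a separate step.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Toy model (assumption): classifies a scalar as 1.0 if it is at or above
// the mean of the training samples, and as 0.0 otherwise.
struct ThresholdModel { double threshold; };

class SwappingClassifier
{
public:
  // Counterpart of classify(): NaN signals that no trained model exists.
  double classify(double x) const
  {
    if (!m_current)
      return std::numeric_limits<double>::quiet_NaN();
    return x >= m_current->threshold ? 1.0 : 0.0;
  }

  // Counterpart of learnBatch(): builds a candidate model and leaves the
  // currently operating model untouched.
  bool learnBatch(const std::vector<double>& samples)
  {
    if (samples.size() < 2)
      return false;                       // too little data in this sketch
    const double mean =
      std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
    m_candidate = std::make_shared<const ThresholdModel>(ThresholdModel{mean});
    return true;
  }

  // Counterpart of replaceClassifier(): installs the trained candidate.
  void replaceClassifier()
  {
    assert(m_candidate);                  // only valid after a successful learnBatch()
    m_current = std::move(m_candidate);   // m_candidate is empty afterwards
  }

private:
  std::shared_ptr<const ThresholdModel> m_current;
  std::shared_ptr<const ThresholdModel> m_candidate;
};
```

Until `replaceClassifier()` is called, `classify()` keeps answering with the old model (or NaN if there is none), which matches the requirement that normal functioning is not changed during training.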
The default implementation returns false.
Returns
true if the training succeeded, false otherwise.
-
template<class SampleSet>
bool learnBatch(PiiLearningAlgorithm<SampleSet>& algorithm, const SampleSet& samples, const QVector<double>& labels, const QVector<double>& weights = QVector<double>())
[protected] A template function that installs this as the progress controller to algorithm and feeds it with the given samples and labels.
Usually called from the implementation of the virtual learnBatch() function.
Parameters
- algorithm
-
the algorithm to train
- samples
-
training samples
- labels
-
labels for the samples, if applicable. An empty list can be provided for non-supervised learning algorithms.
- weights
-
an importance factor for each sample. May be empty, in which case 1.0 will be used for all samples.
Returns
true if the algorithm was successfully trained, false otherwise.
-
int learningBatchSize()
[protected]
-
-
bool learningThreadRunning()
[protected]
-
template<class SampleSet>
int learnOne
[protected] Reads a feature vector from the features input and calls algorithm.learnOne() using label as the class label and weight as the importance.
Exceptions
- PiiExecutionException&
-
if the features are of incorrect type or size.
-
virtual double learnOne(double label, double weight)
[protected, virtual] Reads a feature vector from the features input, sends it to an on-line learning algorithm, and emits the classification result to the classification output.
May send additional objects through other output sockets. The default implementation emits and returns label.
This function is called by process() when an incoming sample must be used for on-line learning (only if the learning algorithm is capable of on-line learning).
Parameters
- label
-
the classification of the training sample, or NaN if not applicable.
- weight
-
the importance of the sample. If the weight input is not connected, this value will always be 1.0. Learning algorithms that are not capable of weighted learning will ignore this value.
Returns
the classification of the sample, if possible. NaN otherwise.
-
virtual bool needsThread()
[protected, virtual] Returns true if the learning algorithm needs a learning thread, and false otherwise.
Some classifiers such as simple linear-search nearest neighbors don't need to be trained. In such a case this function returns false, no learning thread will be started, and the old classifier is immediately replaced by a new one. The default implementation returns true.
-
virtual void process()
[protected, virtual] Classifies an incoming feature vector (see classify()).
If learningBatchSize is set to a non-zero value, and if the learning thread is not running, collects the incoming sample to a buffer (see collectSample()). If learningBatchSize is set to one and the learning algorithm is capable of on-line learning, the incoming sample will be sent directly to learning (see learnOne()).
Reimplemented from PiiDefaultOperation.
-
double progressStep()
[protected]
-
double readLabel()
[protected] With supervised learning algorithms, this function reads the label input and returns the class label.
With non-supervised learning algorithms, NaN will be returned.
Exceptions
- PiiExecutionException&
-
if the input object cannot be converted to a double.
-
double readWeight()
[protected] Returns the value read from the weight input, or 1.0 if the input is not connected.
Exceptions
- PiiExecutionException&
-
if the input object cannot be converted to a double.
-
virtual void replaceClassifier()
[protected, pure virtual] Replaces the current classifier with a newly trained one.
This function is called if learnBatch() returns true. If the classifier provides information about itself as properties (such as the code book of an NN classifier), these property values need to be changed here.
-
virtual void resetClassifier()
[protected, pure virtual] Resets the classifier.
This function is called by reset() after ensuring mutual exclusion with the learning thread.
-
virtual void resizeBatch(int newSize)
[protected, pure virtual] Resizes the batch of buffered samples.
This function is called by setLearningBatchSize() after ensuring mutual exclusion with the learning thread. The function will only be called if needed. If the new size is not smaller than the current number of buffered samples, nothing needs to be done. The buffer will grow automatically to the target size when new samples are read.
Parameters
- newSize
-
the new size of the batch. The batch must be truncated to the requested size.
-
-
void setLearningBatchSize(int learningBatchSize)
[protected]
-
Sets the learning error message.
-
void setProgressStep(double progressStep)
[protected]
-
Get a pointer to the (optional) input that receives weights associated with incoming samples.
-
void learningFinished(bool success)
[signal] This signal is emitted when the learning thread finishes.
-
void progressed(double percentage)
[signal] Informs about the progress of a learning algorithm.
This signal will be emitted from the learning thread started with startLearningThread() every time progressStep is exceeded. Note that if the learning algorithm is not capable of estimating its progress, this signal will not be emitted until it is done.
-
bool learn()
[slot] Learns the batch of collected samples.
This blocks until the learning algorithm finishes.
Returns
true if the samples were successfully learnt, false otherwise
-
void reset()
[slot] Resets the classifier.
This function clears all training results and resets the classifier to its initial state. It should not change classifier configuration though. This function serves as a generic way of resetting a trained classifier. Subclasses may provide other features that have the same effect (such as setting the model samples of an NN classifier to an empty sample set).
To clear buffered training data as well, set the learningBatchSize property to zero.
Reimplemented from PiiFlowController::SyncListener.
-
bool startLearningThread()
[slot] Start the learning thread.
If the number of buffered samples is less than two or the learning thread is already running, this function does nothing. Otherwise, it starts a thread that sends the buffered samples to the learning algorithm (see learnBatch()). The thread can be interrupted by calling stopLearningThread().
The learning thread will stop once learnBatch() returns.
Returns
true if the learning thread was successfully started, false otherwise. The call will fail if the learning thread is already running or there are no buffered samples to learn.
-
void stopLearningThread()
[slot] Stop the learning thread.
After this function has been called, canContinue() will return false, which interrupts the learning algorithm.