Into

class PiiFeatureCombiner

#include <PiiFeatureCombiner.h>

Feature combiner is an operation that combines feature vectors into a larger compound feature vector.

Inherits PiiDefaultOperation

Description

This is useful when one wants to use many different features together in classification. The operation reads from 2 (the default) up to 64 feature vectors, concatenates them into one compound vector and emits the result. Since the incoming vectors may vary in length, the operation also emits a vector that indicates the boundaries of the input vectors.

Feature combiner has the capability of learning scaling factors for otherwise incompatible distance measures. Given a list of distance measures and a batch of samples, PiiFeatureCombiner estimates the variances of distance measure outcomes by calculating pairwise distances between samples in the batch. It provides the inverse values of the variances as scaling factors that can be used as weights for a PiiMultiFeatureDistance (see PiiVectorQuantizerOperation::distanceWeights).
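The weight estimation described above can be sketched in standalone C++. This is only an illustration under assumptions, not the library's implementation: the Sample layout, the absDiffDistance measure, the learnWeights name, the use of population variance, and the fallback weight of 1 for a zero variance are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical layout: one sample holds one feature vector per input.
using FeatureVector = std::vector<double>;
using Sample = std::vector<FeatureVector>;
using Distance = std::function<double(const FeatureVector&, const FeatureVector&)>;

// Example measure (an assumption): sum of absolute differences.
double absDiffDistance(const FeatureVector& a, const FeatureVector& b)
{
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += std::fabs(a[i] - b[i]);
  return sum;
}

// For each measure k, collect the distances between every pair of samples
// in the batch (using feature vector k of each sample), compute their
// population variance, and return the inverse variance as the weight.
std::vector<double> learnWeights(const std::vector<Sample>& batch,
                                 const std::vector<Distance>& measures)
{
  assert(batch.size() >= 2);
  std::vector<double> weights;
  for (std::size_t k = 0; k < measures.size(); ++k)
    {
      double sum = 0.0, sumSq = 0.0;
      std::size_t pairs = 0;
      for (std::size_t i = 0; i < batch.size(); ++i)
        for (std::size_t j = i + 1; j < batch.size(); ++j)
          {
            const double d = measures[k](batch[i][k], batch[j][k]);
            sum += d;
            sumSq += d * d;
            ++pairs;
          }
      const double mean = sum / static_cast<double>(pairs);
      const double variance = sumSq / static_cast<double>(pairs) - mean * mean;
      // Assumption: fall back to a neutral weight if all distances are equal.
      weights.push_back(variance > 0.0 ? 1.0 / variance : 1.0);
    }
  return weights;
}
```

In this sketch, a feature whose distances spread ten times wider receives a weight one hundred times smaller, which is what brings differently scaled measures onto a comparable range.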

To use the learning facilities one needs to set the distanceMeasures property so that each incoming feature vector has a distance measure. The learningBatchSize property must be set to a value larger than one, or to -1 to start buffering samples. Once enough samples have been buffered, learning is started by calling startLearningThread(). While learning, the operation regularly emits the progressed() signal. When the learning thread quits, the distanceWeights property contains scaling factors that should make the individual distance measures commensurable.

Inputs

featuresX - a feature vector. X varies from 0 to N - 1, where N is the number of different feature vectors to combine. Any primitive matrix type. Only row matrices are accepted.

Outputs

features - a concatenated feature vector. The element type is the widest type among the inputs: double if any input is double, otherwise float if any input is float, otherwise int. If AAAA is read from features0 and BBBBB from features1, features will emit AAAABBBBB.

boundaries - the indices of vector ends (PiiMatrix<int>). The first vector is always at index 0, and the first boundary value is the index of the start of the second one. In the previous example, the boundaries would contain two values: 4 and 9.
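The type rule for the features output can be illustrated with a short sketch. ElementType and resultType are hypothetical names that do not exist in the library, and the sketch covers only the three types named above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical tag for the element type of an incoming row matrix,
// ordered from narrowest to widest.
enum class ElementType { Int = 0, Float = 1, Double = 2 };

// The output takes the widest element type found among the inputs.
ElementType resultType(const std::vector<ElementType>& inputs)
{
  assert(!inputs.empty());
  return *std::max_element(inputs.begin(), inputs.end());
}
```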

An example

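Let us assume that dynamicInputCount is set to three, and the following inputs are received: features0 reads [ 0.1, 0.2, 0.3 ], features1 reads [ 4, 5, 6, 7 ], and features2 reads [ 80.0 ], with at least one of them being a double matrix. The compound feature vector will be a PiiMatrix<double> [ 0.1, 0.2, 0.3, 4, 5, 6, 7, 80.0 ]. The boundaries output will emit a PiiMatrix<int> [ 3, 7, 8 ].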
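The concatenation and boundary calculation can be reproduced with a small standalone sketch. Combined and combine are hypothetical names; the sketch uses std::vector<double> in place of PiiMatrix and assumes that all inputs have already been converted to double.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the compound vector and the end index of each part.
struct Combined
{
  std::vector<double> features;
  std::vector<int> boundaries;
};

// Appends each input vector to the compound vector and records the
// running length after each one as its boundary.
Combined combine(const std::vector<std::vector<double> >& inputs)
{
  assert(!inputs.empty());
  Combined out;
  for (std::size_t i = 0; i < inputs.size(); ++i)
    {
      out.features.insert(out.features.end(), inputs[i].begin(), inputs[i].end());
      out.boundaries.push_back(static_cast<int>(out.features.size()));
    }
  return out;
}
```

Calling combine with vectors of lengths 3, 4 and 1 yields a compound vector of 8 elements and the boundaries 3, 7 and 8.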

Properties

QStringList distanceMeasures

The names of distance measures used in measuring inter-sample distances in classification.

QVariantList distanceWeights

Calculated scaling factors for the distance measures.

int dynamicInputCount

The number of inputs, i.e. the number of feature vectors to combine.

int featureCount

The total number of features to be scaled, or zero if not known yet.

FullBufferBehavior fullBufferBehavior

The action to perform with new samples when learningBatchSize has been exceeded.

int learningBatchSize

The maximum number of training samples collected for learning.

bool learningThreadRunning

A read-only property whose value is true when the learning thread is running, and false otherwise.

Public types

enum FullBufferBehavior { OverwriteRandomSample, OverwriteOldestSample, DiscardNewSample }

Public Slots

void startLearningThread ()

Start the learning thread.

void stopLearningThread ()

Stop the learning thread.

Signals

void progressed (double percentage)

Emitted from time to time as the learning algorithm calculates distance measure variances.

Constructors and destructor

PiiFeatureCombiner ()
~PiiFeatureCombiner ()

Public member functions

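virtual void check (bool reset)

Checks the operation for execution.

QStringList distanceMeasures ()
QVariantList distanceWeights ()
int dynamicInputCount ()
int featureCount ()
FullBufferBehavior fullBufferBehavior ()
int learningBatchSize ()
bool learningThreadRunning ()
void setDistanceMeasures ()
void setDistanceWeights (const QVariantList & distanceWeights)
void setDynamicInputCount (int cnt)
void setFullBufferBehavior ()
void setLearningBatchSize (int size)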

Protected member functions

virtual void process ()

Executes one round of processing.

Property details

  • QStringList distanceMeasures

    [read, write]

    The names of distance measures used in measuring inter-sample distances in classification.

    See PiiVectorQuantizerOperation::distanceMeasures. If this list is non-empty, its length must match dynamicInputCount.

  • QVariantList distanceWeights

    [read, write]

    Calculated scaling factors for the distance measures.

    Once this operation is trained, the scaling factors can be copied to PiiVectorQuantizerOperation to make distance measures commensurable. The list contains doubles, and its length equals featureCount. If the operation has not been trained, the list is empty.

  • int dynamicInputCount

    [read, write]

    The number of inputs, i.e. the number of feature vectors to combine.

    Valid values are 1-64. The default is two.

  • int featureCount

    [read]

    The total number of features to be scaled, or zero if not known yet.

    This value will be reset to zero at check() and set again when the first feature vector arrives.

  • FullBufferBehavior fullBufferBehavior

    [read, write]

    The action to perform with new samples when learningBatchSize has been exceeded.

    The default is OverwriteRandomSample.

  • int learningBatchSize

    [read, write]

    The maximum number of training samples collected for learning.

    Zero means that no training samples will be collected, and the operation will only join incoming feature vectors. If learningBatchSize is set to N (N > 1), at most N samples will be kept in memory. If learningBatchSize is -1, all incoming samples will be buffered without limit. The buffered samples will be used to estimate the variance of inter-sample distances. The default value is 0.

    Note that the time it takes to learn distance distributions is proportional to learningBatchSize squared. The total number of distance measure evaluations is equal to N * M * (M - 1) / 2, where N is the number of feature vectors and M is the number of samples.

  • bool learningThreadRunning

    [read]

    A read-only property whose value is true when the learning thread is running, and false otherwise.
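The interplay of learningBatchSize and fullBufferBehavior described above can be sketched as a small standalone buffer. SampleBuffer is a hypothetical class, not part of the library. The sketch assumes that "oldest" means the sample that has been in the buffer longest and that a uniformly random slot is overwritten in OverwriteRandomSample mode; the actual implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Mirrors the enumeration documented for PiiFeatureCombiner.
enum FullBufferBehavior { OverwriteRandomSample, OverwriteOldestSample, DiscardNewSample };

// Hypothetical sample buffer illustrating the documented semantics:
// batchSize 0 stores nothing, -1 stores everything, N keeps at most N samples.
class SampleBuffer
{
public:
  SampleBuffer(int batchSize, FullBufferBehavior behavior)
    : m_batchSize(batchSize), m_behavior(behavior), m_oldest(0), m_rng(12345)
  {
    assert(batchSize >= -1);
  }

  void add(const std::vector<double>& sample)
  {
    if (m_batchSize == 0)
      return; // buffering disabled, vectors are only joined
    if (m_batchSize < 0 || m_samples.size() < static_cast<std::size_t>(m_batchSize))
      {
        m_samples.push_back(sample); // still room, or no limit
        return;
      }
    switch (m_behavior)
      {
      case OverwriteRandomSample:
        {
          std::uniform_int_distribution<std::size_t> pick(0, m_samples.size() - 1);
          m_samples[pick(m_rng)] = sample;
          break;
        }
      case OverwriteOldestSample:
        m_samples[m_oldest] = sample;
        m_oldest = (m_oldest + 1) % m_samples.size();
        break;
      case DiscardNewSample:
        break; // keep the buffer as it is
      }
  }

  const std::vector<std::vector<double> >& samples() const { return m_samples; }

private:
  int m_batchSize;
  FullBufferBehavior m_behavior;
  std::size_t m_oldest; // slot holding the oldest sample once the buffer is full
  std::mt19937 m_rng;
  std::vector<std::vector<double> > m_samples;
};
```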

Enumeration details

Function details

  • PiiFeatureCombiner

    ()
  • ~PiiFeatureCombiner

    ()
  • virtual void check

    (
    • bool reset
    )
    [virtual]

    Checks the operation for execution.

    This function creates a suitable flow controller by calling createFlowController(). It then sets the flow controller to the active processor and sets the processor as the input controller for all inputs.

    If you change socket groupings in your overridden implementation, please call PiiDefaultOperation::check() after that. Otherwise, your new groupings will not be in effect.

    Reimplemented from PiiDefaultOperation.

  • QStringList distanceMeasures

    ()
  • QVariantList distanceWeights

    ()
  • int dynamicInputCount

    ()
  • int featureCount

    ()
  • FullBufferBehavior fullBufferBehavior

    ()
  • int learningBatchSize

    ()
  • bool learningThreadRunning

    ()
  • void setDistanceMeasures

    ()
  • void setDistanceWeights

    (
    • const QVariantList & distanceWeights
    )
  • void setDynamicInputCount

    (
    • int cnt
    )
  • void setFullBufferBehavior

    ()
  • void setLearningBatchSize

    (
    • int size
    )
  • virtual void process

    ()
    [protected, virtual]

    Executes one round of processing.

    This function is invoked by the processor if the necessary preconditions for a new processing round are met. This function does all the necessary calculations to create output objects and sends them to output sockets.

    Calls to process(), syncEvent(), and setProperty() are synchronized and cannot occur simultaneously. PiiDefaultOperation ensures this by locking processLock() for reading before calling process().

    Note: With time-consuming operations, one should occasionally check that the operation hasn't been interrupted, i.e. that state() returns Running.

    Exceptions
    PiiExecutionException

    whenever an unrecoverable error occurs during a processing round, the operation is interrupted, or finishes execution due to end of input data.

    Reimplemented from PiiDefaultOperation.

  • void progressed

    (
    • double percentage
    )
    [signal]

    Emitted from time to time as the learning algorithm calculates distance measure variances.

  • void startLearningThread

    ()
    [slot]

    Start the learning thread.

    If the number of buffered samples is less than two or the learning thread is already running, this function does nothing. Otherwise, it starts a thread that calculates the variances of distances between buffered samples. The thread can be interrupted by calling stopLearningThread(). The progressed() signal will be emitted every once in a while during the calculation.

  • void stopLearningThread

    ()
    [slot]

    Stop the learning thread.

    After this function has been called, canContinue() will return false, which interrupts the learning algorithm.
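The way startLearningThread(), stopLearningThread() and progressed() interact can be sketched as an interruptible loop. This is a simplified, single-threaded illustration: runPairwiseLoop is a hypothetical function, the std::atomic flag stands in for canContinue(), and the callback stands in for the progressed() signal. The sketch reports progress as a fraction between 0 and 1, which is an assumption; the source does not state the range of the percentage argument.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>

// Walks through all sample pairs, checking the flag once per outer round
// and reporting progress after each round. Returns the number of pairs
// visited before the loop finished or was interrupted.
std::size_t runPairwiseLoop(std::size_t sampleCount,
                            const std::atomic<bool>& canContinue,
                            const std::function<void(double)>& progressed)
{
  assert(sampleCount >= 2); // learning needs at least two buffered samples
  const std::size_t total = sampleCount * (sampleCount - 1) / 2;
  std::size_t done = 0;
  for (std::size_t i = 0; i < sampleCount && canContinue.load(); ++i)
    {
      for (std::size_t j = i + 1; j < sampleCount; ++j)
        ++done; // a distance evaluation would take place here
      progressed(static_cast<double>(done) / static_cast<double>(total));
    }
  return done;
}
```

In a threaded setting the loop would run in a worker thread, and the stop request would clear the flag from another thread; checking the flag only once per outer round keeps the overhead low at the cost of a slightly delayed stop.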
