class PiiSampleBalancer
#include <PiiSampleBalancer.h>
An operation that balances training sets by giving more weight to rare samples.
Inherits PiiDefaultOperation
Description
The weighting is based on the distribution of individual feature
values. The balancer works in two modes:
ProbabilitySelection and
WeightCalculation. In the former mode, the operation
either passes feature vectors to the features output
or does nothing, based on the estimated weight of the sample. In
the latter mode, all features will be passed, and the weight of the
sample will be sent to the weight output.
The graph below illustrates sample weighting on one-dimensional Gaussian data. The (normalized) distribution of a feature value is shown in blue. Its inverse (green) is used as a weight. The red curve illustrates the effect of setting emphasis to three.
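The relationship between the three curves can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the text does not define "inverse" precisely, so the sketch takes it to mean one minus the distribution value scaled so that its peak is 1. The helper names `inverseWeight` and `applyEmphasis` are hypothetical and not part of the class.

```cpp
#include <cassert>

// Assumption: "inverse" means 1 - p, where p is a distribution value
// normalized to the range [0, 1] (peak scaled to 1).
double inverseWeight(double normalizedDensity)
{
  assert(normalizedDensity >= 0.0 && normalizedDensity <= 1.0);
  return 1.0 - normalizedDensity;
}

// Raises a weight to an integer power, as the emphasis property is
// documented to do. For weights in [0, 1], a larger emphasis pushes the
// weights of common samples further toward zero.
double applyEmphasis(double weight, int emphasis)
{
  assert(emphasis >= 0);
  double result = 1.0;
  for (int i = 0; i < emphasis; ++i)
    result *= weight;
  return result;
}
```

Under these assumptions, a common value with normalized density 0.75 gets weight 0.25, which an emphasis of three reduces to roughly 0.016, while a rare value with density 0 keeps weight 1.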
For multi-dimensional features, PiiSampleBalancer uses marginal distributions, based on the assumption that all features are independent. The assumption often does not hold, but it gives a reasonable approximation without huge memory requirements.
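The trade-off behind the independence assumption can be made concrete with a small standalone sketch. It is not taken from the operation's source: the type alias `Marginals` and the functions `jointProbability`, `marginalCells`, and `jointCells` are hypothetical names used only to illustrate how a joint probability is approximated as a product of marginals, and how the memory cost compares with a full joint histogram.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One normalized histogram per feature: marginals[i][b] is the estimated
// probability that feature i falls into quantization bin b.
using Marginals = std::vector<std::vector<double>>;

// Joint probability of a quantized sample under the independence
// assumption: the product of the marginal probabilities of its bins.
double jointProbability(const Marginals& marginals,
                        const std::vector<std::size_t>& bins)
{
  assert(marginals.size() == bins.size());
  double p = 1.0;
  for (std::size_t i = 0; i < bins.size(); ++i)
  {
    assert(bins[i] < marginals[i].size());
    p *= marginals[i][bins[i]];
  }
  return p;
}

// Histogram cells needed when storing one marginal per feature (a sum).
std::size_t marginalCells(const std::vector<std::size_t>& levels)
{
  std::size_t cells = 0;
  for (std::size_t n : levels)
    cells += n;
  return cells;
}

// Histogram cells a full joint distribution would need (a product).
std::size_t jointCells(const std::vector<std::size_t>& levels)
{
  std::size_t cells = 1;
  for (std::size_t n : levels)
    cells *= n;
  return cells;
}
```

For three features quantized to 128, 256, and 64 levels, the marginal histograms need 448 cells in total, whereas a joint histogram would need 2097152.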
Inputs
Outputs
- features: In ProbabilitySelection mode, the features will be emitted only if a generated random number is less than weight. The select output will indicate whether the sample was selected or not. In WeightCalculation mode, this output will always pass the incoming features.
- select: In WeightCalculation mode, this output will always emit true.
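The output behaviour of the two modes can be summarized in a minimal sketch. The names `Decision` and `decide` are hypothetical and do not belong to the Into API. The sketch assumes the weight has already been estimated and lies in [0, 1], and it takes the random number as a parameter so that the logic is easy to follow.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the Into API.
enum class Mode { ProbabilitySelection, WeightCalculation };

struct Decision
{
  bool emitFeatures; // whether the feature vector goes to the features output
  bool select;       // value for the select output
};

// `weight` is the estimated sample weight in [0, 1]; `random01` is a
// uniform random number in [0, 1) supplied by the caller.
Decision decide(Mode mode, double weight, double random01)
{
  assert(weight >= 0.0 && weight <= 1.0);
  if (mode == Mode::WeightCalculation)
    return Decision{true, true}; // everything passes, select is always true
  // ProbabilitySelection: pass the sample only if the random number is
  // less than the weight.
  const bool selected = random01 < weight;
  return Decision{selected, selected};
}
```

With a weight of 0.8, a sample in ProbabilitySelection mode is passed for any random number below 0.8, so rare samples (high weight) get through far more often than common ones.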
Properties
| double | adaptationRatio | The speed of adaptation to changing conditions. |
| int | defaultLevels | The default number of quantization levels. |
| int | emphasis | By default, the operation tries to flatten out the variations in feature distribution. |
| int | learningBatchSize | The number of features required for a reliable estimate. |
| QVariantList | levels | A list of quantization levels for each feature value. |
| Mode | mode | Operation mode. |
Public types
| enum | Mode { ProbabilitySelection, WeightCalculation } | Operation modes. |
Constructors and destructor
| | PiiSampleBalancer ( ) |
| | ~PiiSampleBalancer ( ) |
Public member functions
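| double | adaptationRatio ( ) |
| virtual void | check ( bool reset ) | Checks the operation for execution. |
| int | defaultLevels ( ) |
| int | emphasis ( ) |
| int | learningBatchSize ( ) |
| QVariantList | levels ( ) |
| Mode | mode ( ) |
| void | setAdaptationRatio ( double adaptationRatio ) |
| void | setDefaultLevels ( int defaultLevels ) |
| void | setEmphasis ( int emphasis ) |
| void | setLearningBatchSize ( int learningBatchSize ) |
| void | setLevels ( const QVariantList & levels ) |
| void | setMode |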
Protected member functions
| virtual void | process ( ) | Executes one round of processing. |
Property details
-
double adaptationRatio
[read, write] The speed of adaptation to changing conditions.
The operation initially assumes a uniform feature distribution. The estimate of the distribution is updated once every learningBatchSize samples. The adaptation ratio tells how much the new measurements affect the learnt model. 0 means that the initial uniform approximation will never be changed. 1 means that the new estimate will fully replace the old one. The default value is 0.1.
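One update rule consistent with this description is a linear blend between the old model and the newly measured distribution. The sketch below is an assumption, not the operation's confirmed formula. The function names `uniformModel` and `adapt` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Initial model: a uniform distribution over `levels` bins.
std::vector<double> uniformModel(std::size_t levels)
{
  assert(levels >= 1);
  return std::vector<double>(levels, 1.0 / static_cast<double>(levels));
}

// Assumed update rule: a linear blend of the old model and the newly
// measured distribution. A ratio of 0 keeps the old model unchanged and
// a ratio of 1 replaces it with the measurement.
std::vector<double> adapt(std::vector<double> model,
                          const std::vector<double>& measured,
                          double adaptationRatio)
{
  assert(model.size() == measured.size());
  assert(adaptationRatio >= 0.0 && adaptationRatio <= 1.0);
  for (std::size_t i = 0; i < model.size(); ++i)
    model[i] = (1.0 - adaptationRatio) * model[i]
             + adaptationRatio * measured[i];
  return model;
}
```

With the default ratio of 0.1, each batch of learningBatchSize samples would move the model one tenth of the way toward the latest measurement, so the estimate follows slow drifts while smoothing out batch-to-batch noise.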
-
int defaultLevels
[read, write] The default number of quantization levels.
This value is used for all features whose quantization levels have not been explicitly set by levels. The default value is 256.
-
int emphasis
[read, write] By default, the operation tries to flatten out the variations in feature distribution.
If the common samples need to be given even less weight, emphasis can be set to a larger value. The operation will raise the weight estimate to this power.
-
int learningBatchSize
[read, write] The number of features required for a reliable estimate.
The estimate is updated every learningBatchSize samples. The default value is 25600 (100 samples per histogram bin at the default 256 quantization levels).
QVariantList levels
[read, write] A list of quantization levels for each feature value.
For three-dimensional feature vectors, the default can be changed as follows:
balancer->setProperty("levels", QVariantList() << 128 << 256 << 64);
The minimum number of quantization levels is one.
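How levels and defaultLevels might combine for a given feature can be sketched as follows. This is a hypothetical helper, not part of the class: it assumes that "not explicitly set" means the levels list is shorter than the feature index, and that values below the documented minimum of one are clamped to one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: resolves the number of quantization levels for one
// feature. Assumptions: a feature beyond the end of `levels` falls back to
// `defaultLevels`, and values below one are clamped to one.
int effectiveLevels(const std::vector<int>& levels,
                    int defaultLevels,
                    std::size_t featureIndex)
{
  assert(defaultLevels >= 1);
  const int n = featureIndex < levels.size() ? levels[featureIndex]
                                             : defaultLevels;
  return n < 1 ? 1 : n;
}
```

With the list from the example above (128, 256, 64) and a defaultLevels of 256, the third feature would use 64 levels, and a fourth feature would fall back to 256.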
-
Mode mode
[read, write] Operation mode.
Enumeration details
-
enum Mode
Operation modes.
- ProbabilitySelection: pass those feature vectors that are likely to be important with a higher probability than the others.
- WeightCalculation: pass every incoming vector accompanied by its selection probability.
Function details
-
PiiSampleBalancer ()
-
~PiiSampleBalancer ()
-
double adaptationRatio ()
-
virtual void check (bool reset)
[virtual] Checks the operation for execution.
This function creates a suitable flow controller by calling createFlowController(). It then sets the flow controller to the active processor and sets the processor as the input controller for all inputs.
If you change socket groupings in your overridden implementation, please call PiiDefaultOperation::check() after that. Otherwise, your new groupings will not be in effect.
Reimplemented from PiiDefaultOperation.
-
int defaultLevels ()
-
int emphasis ()
-
int learningBatchSize ()
-
QVariantList levels ()
-
Mode mode ()
-
void setAdaptationRatio (double adaptationRatio)
-
void setDefaultLevels (int defaultLevels)
-
void setEmphasis (int emphasis)
-
void setLearningBatchSize (int learningBatchSize)
-
void setLevels (const QVariantList & levels)
-
void setMode
-
virtual void process ()
[protected, virtual] Executes one round of processing.
This function is invoked by the processor if the necessary preconditions for a new processing round are met. This function does all the necessary calculations to create output objects and sends them to output sockets.
Calls to process(), syncEvent(), and setProperty() are synchronized and cannot occur simultaneously. PiiDefaultOperation ensures this by locking processLock() for reading before calling process().
Note: With time-consuming operations, one should occasionally check that the operation hasn't been interrupted, i.e. that state() returns Running.
Exceptions
- PiiExecutionException whenever an unrecoverable error occurs during a processing round, the operation is interrupted, or finishes execution due to end of input data.
Reimplemented from PiiDefaultOperation.