analysis: add standard deviation to the segment store statistics
author Matthew Khouzam <matthew.khouzam@ericsson.com>
Wed, 2 Dec 2015 22:15:55 +0000 (17:15 -0500)
committer Matthew Khouzam <matthew.khouzam@ericsson.com>
Wed, 9 Dec 2015 16:18:08 +0000 (11:18 -0500)
This patch will be useful for extracting more key metrics and flagging
outlier segments.

Some reminders:

The standard deviation is the square root of the variance. The variance
is the sum of the squared deviations from the mean, divided by the
number of elements (or by that number minus one for the sample variance,
which is what this patch computes). To calculate the standard deviation
incrementally, we keep an accumulator of the squared deviations of the
latencies from the running mean. When needed, this accumulator is
divided by the number of elements (segments) minus one and square
rooted.

Also, this patch calculates an online mean with fewer rounding errors.
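
For reference, the incremental update performed in update() can be
written as the recurrences below. The notation is ours, not the patch's:
x_n is the length of the n-th segment, \mu_n is the running mean (held
in fAverage) and M_n is the running sum of squared deviations (held in
fVariance), with \mu_0 = 0 and M_0 = 0.

```latex
\begin{aligned}
\delta_n &= x_n - \mu_{n-1} \\
\mu_n    &= \mu_{n-1} + \frac{\delta_n}{n} \\
M_n      &= M_{n-1} + \delta_n \, (x_n - \mu_n) \\
\sigma_n &= \sqrt{\frac{M_n}{n - 1}}
\end{aligned}
```

Note that M_n is only turned into a variance when getStdDev() divides it
by n - 1, so each update stays O(1).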

Change-Id: Ia918f08f2351d7086bd05aac1ad645cfff13eb58
Signed-off-by: Matthew Khouzam <matthew.khouzam@ericsson.com>
Reviewed-on: https://git.eclipse.org/r/61824
Reviewed-by: Hudson CI
Reviewed-by: Bernd Hufmann <bernd.hufmann@ericsson.com>
Tested-by: Bernd Hufmann <bernd.hufmann@ericsson.com>
analysis/org.eclipse.tracecompass.analysis.os.linux.core/src/org/eclipse/tracecompass/internal/analysis/os/linux/core/latency/statistics/SegmentStoreStatistics.java

index c70c3f6e242ba7bb41586130bf8f6c2615333504..4644dc811e34501b3ad4de4d84e65feb8b19a173 100644 (file)
@@ -21,17 +21,19 @@ import org.eclipse.tracecompass.segmentstore.core.ISegment;
 public class SegmentStoreStatistics {
     private long fMin;
     private long fMax;
-    private long fSum;
     private long fNbSegments;
+    private double fAverage;
+    private double fVariance;
 
     /**
      * Constructor
      */
     public SegmentStoreStatistics() {
-        this.fMin = Long.MAX_VALUE;
-        this.fMax = Long.MIN_VALUE;
-        this.fSum = 0;
-        this.fNbSegments = 0;
+        fMin = Long.MAX_VALUE;
+        fMax = Long.MIN_VALUE;
+        fNbSegments = 0;
+        fAverage = 0.0;
+        fVariance = 0.0;
     }
 
     /**
@@ -67,20 +69,44 @@ public class SegmentStoreStatistics {
      * @return arithmetic average
      */
     public double getAverage() {
-        return ((double) fSum) / fNbSegments;
+        return fAverage;
+    }
+
+    /**
+     * Gets the standard deviation of the segments, uses the online algorithm
+     * shown here <a href=
+     * "https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm">
+     * Wikipedia article of dec 3 2015 </a>
+     *
+     * @return the standard deviation of the segment store, will return NaN if
+     *         there are less than 3 elements
+     */
+    public double getStdDev() {
+        return fNbSegments > 2 ? Math.sqrt(fVariance / (fNbSegments - 1)) : Double.NaN;
     }
 
     /**
      * Update the statistics based on a given segment
+     * <p>
+     * This is an online algorithm and must retain a complexity of O(1)
      *
      * @param segment
      *            the segment used for the update
      */
-    public void update (ISegment segment) {
+    public void update(ISegment segment) {
         long value = segment.getLength();
+        /*
+         * Min and max are trivial, as well as number of segments
+         */
         fMin = Math.min(fMin, value);
         fMax = Math.max(fMax, value);
-        fSum += value;
+
         fNbSegments++;
+        /*
+         * The running mean is not trivial, see proof in javadoc.
+         */
+        double delta = value - fAverage;
+        fAverage += delta / fNbSegments;
+        fVariance += delta * (value - fAverage);
     }
 }
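
The update logic in the diff above can be tried outside Trace Compass
with a minimal standalone sketch. The class name RunningStats and its
update(long) signature are hypothetical stand-ins for
SegmentStoreStatistics and ISegment.getLength(); the arithmetic and the
NaN threshold are meant to mirror the patch, but this is an illustration
under those assumptions, not the project's code.

```java
/**
 * Hypothetical standalone sketch of the online mean / standard deviation
 * update used in the patch. Not part of Trace Compass.
 */
final class RunningStats {
    private long count;      // number of values seen so far
    private double mean;     // running arithmetic mean
    private double sumSqDev; // running sum of squared deviations from the mean

    /** Feed one value; constant time and constant memory. */
    void update(long value) {
        count++;
        // Deviation from the mean *before* this value is included
        double delta = value - mean;
        mean += delta / count;
        // Multiply by the deviation from the mean *after* the update
        sumSqDev += delta * (value - mean);
    }

    double getAverage() {
        return mean;
    }

    /** Sample standard deviation; NaN below 3 values, as in the patch. */
    double getStdDev() {
        return count > 2 ? Math.sqrt(sumSqDev / (count - 1)) : Double.NaN;
    }
}
```

Feeding the values 2, 4, 4, 4, 5, 5, 7, 9 into this sketch should give a
mean of 5 and a standard deviation of sqrt(32 / 7), which matches a
two-pass computation over the same list.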