Machine learning and data mining

This module provides functions for calculating mathematical statistics of numeric (Real-valued) data.


Unless explicitly noted otherwise, these functions support int, float, decimal.Decimal and fractions.Fraction. Behaviour with other types (whether in the numeric tower or not) is currently unsupported. Collections that mix types are also undefined and implementation-dependent. If your input data consists of mixed types, you may be able to use map() to ensure a consistent result, e.g. map(float, input_data).
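As a minimal sketch of the map() suggestion above, the snippet below converts a mixed collection to float before averaging. The values in input_data are made up for illustration and do not come from the module's documentation.

```python
from fractions import Fraction
from statistics import mean

# Hypothetical mixed-type input: an int, a float and a Fraction together.
input_data = [1, 2.5, Fraction(3, 2)]

# Convert every value to float first so the result type is predictable.
uniform = list(map(float, input_data))
result = mean(uniform)

print(type(result).__name__, result)
```

Converting to float trades exactness for consistency. If exact results matter, mapping everything to Fraction or Decimal instead may be the better choice.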

Averages and measures of central location

These functions calculate an average or typical value from a population or sample.

mean()             Arithmetic mean (“average”) of data.
harmonic_mean()    Harmonic mean of data.
median()           Median (middle value) of data.
median_low()       Low median of data.
median_high()      High median of data.
median_grouped()   Median, or 50th percentile, of grouped data.
mode()             Mode (most common value) of discrete data.
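The following sketch calls several of the functions listed above on one small data set, to show how the different notions of "typical value" can disagree. The data values are illustrative assumptions only.

```python
from statistics import harmonic_mean, mean, median, median_high, median_low, mode

# Illustrative data with an even number of points and one repeated value.
data = [1, 2, 2, 3, 4, 6]

avg = mean(data)          # sum of the data divided by the count
mid = median(data)        # average of the two middle values (2 and 3)
low = median_low(data)    # the smaller of the two middle values
high = median_high(data)  # the larger of the two middle values
common = mode(data)       # the most frequently occurring value

# The harmonic mean suits rates, e.g. two speeds over equal distances.
h = harmonic_mean([40, 60])

print(avg, mid, low, high, common, h)
```

Note that median() interpolates between the two middle values when the count is even, whereas median_low() and median_high() always return an actual data point.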

Measures of spread

These functions calculate a measure of how much the population or sample tends to deviate from the typical or average values.

pstdev()      Population standard deviation of data.
pvariance()   Population variance of data.
stdev()       Sample standard deviation of data.
variance()    Sample variance of data.
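To illustrate the difference between the population and sample versions, here is a short sketch on a made-up data set. The population functions divide by the number of data points n, while the sample functions divide by n - 1, so the sample values come out slightly larger.

```python
from statistics import pstdev, pvariance, stdev, variance

# Illustrative data whose mean is 5 and whose squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = pvariance(data)   # 32 / 8
pop_sd = pstdev(data)       # square root of the population variance
samp_var = variance(data)   # 32 / 7
samp_sd = stdev(data)       # square root of the sample variance

print(pop_var, pop_sd, samp_var, samp_sd)
```

As a rule of thumb, use the p-prefixed functions when the data is the entire population, and the unprefixed ones when it is a sample drawn from a larger population.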

Function details

Note: The functions do not require the data given to them to be sorted. However, for reading convenience, most of the examples show sorted sequences.

statistics.mean(data)

Return the sample arithmetic mean of data, which can be a sequence or iterator.

The arithmetic mean is the sum of the data divided by the number of data points. It is commonly called “the average”, although it is only one of many different mathematical averages. It is a measure of the central location of the data.

If data is empty, StatisticsError will be raised.
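The definition and the empty-data behaviour described above can be checked with a small sketch. The variable names manual and raised are introduced here purely for illustration.

```python
import math
from statistics import StatisticsError, mean

data = [1, 2, 3, 4, 4]

# The arithmetic mean is the sum of the data divided by the number of points.
manual = sum(data) / len(data)
print(math.isclose(mean(data), manual))

# An empty input has no mean, so mean() raises StatisticsError.
raised = False
try:
    mean([])
except StatisticsError as exc:
    raised = True
    print("StatisticsError:", exc)
```

Catching StatisticsError explicitly is useful when the input may legitimately be empty, for example after filtering a data set.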

Some examples of use:

>>> from statistics import mean
>>> mean([1, 2, 3, 4, 4])
2.8
>>> mean([-1.0, 2.5, 3.25, 5.75])
2.625

>>> from fractions import Fraction as F
>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
Fraction(13, 21)

>>> from decimal import Decimal as D
>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])