This module provides functions for calculating mathematical statistics of numeric (Real-valued) data.
Note
Unless explicitly noted otherwise, these functions support int, float, decimal.Decimal and fractions.Fraction. Behaviour with other types (whether in the numeric tower or not) is currently unsupported. Mixed types are also undefined and implementation-dependent. If your input data consists of mixed types, you may be able to use map() to ensure a consistent result, e.g. map(float, input_data).
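As a minimal sketch of the map() suggestion in the note above, the following converts a mixed-type input to float before averaging. The variable names and sample values are illustrative assumptions, not part of the module.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import mean

# Hypothetical mixed-type input: int, float, Fraction and Decimal together.
input_data = [1, 2.5, Fraction(1, 4), Decimal("0.75")]

# Coerce every value to float so the function sees a single, consistent type.
consistent = list(map(float, input_data))

result = mean(consistent)
print(result)  # 1.125
```

Converting to float trades exactness for consistency; if exact results matter, converting everything to Fraction or Decimal instead may be a better fit.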
Averages and measures of central location
These functions calculate an average or typical value from a population or sample.
mean() | Arithmetic mean (“average”) of data. |
harmonic_mean() | Harmonic mean of data. |
median() | Median (middle value) of data. |
median_low() | Low median of data. |
median_high() | High median of data. |
median_grouped() | Median, or 50th percentile, of grouped data. |
mode() | Mode (most common value) of discrete data. |
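The differences between some of the functions in the table above can be sketched on small samples. The sample values and variable names below are illustrative assumptions.

```python
from statistics import harmonic_mean, mean, median, median_high, median_low, mode

# With an even number of points there are two middle values, 3 and 5.
even_sample = [1, 3, 5, 7]
middle = median(even_sample)      # 4.0, the average of the two middle values
low = median_low(even_sample)     # 3, the smaller middle value
high = median_high(even_sample)   # 5, the larger middle value

# The mode is the single most common value.
most_common = mode([1, 1, 2, 3, 3, 3, 3, 4])  # 3

# Arithmetic versus harmonic mean of two rates.
speeds = [40, 60]
arithmetic = mean(speeds)         # 50
harmonic = harmonic_mean(speeds)  # approximately 48

print(middle, low, high, most_common, arithmetic, harmonic)
```

Note that median_low() and median_high() always return a value that actually occurs in the data, whereas median() may interpolate between two data points.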
Measures of spread
These functions calculate a measure of how much the population or sample tends to deviate from the typical or average values.
pstdev() | Population standard deviation of data. |
pvariance() | Population variance of data. |
stdev() | Sample standard deviation of data. |
variance() | Sample variance of data. |
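The population and sample variants listed above can be compared on one data set. This is a rough sketch with made-up sample values; which variant is appropriate depends on whether the data is the whole population or a sample drawn from it.

```python
from statistics import pstdev, pvariance, stdev, variance

# Illustrative data with mean 5 and a sum of squared deviations of 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_var = pvariance(data)  # divides by n:     32 / 8 = 4
population_sd = pstdev(data)      # square root of the population variance
sample_var = variance(data)       # divides by n - 1: 32 / 7
sample_sd = stdev(data)           # square root of the sample variance

print(population_var, population_sd, sample_var, sample_sd)
```

The sample versions divide by n - 1 rather than n, so they come out slightly larger than the population versions on the same data.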
Function details
Note: The functions do not require the data given to them to be sorted. However, for reading convenience, most of the examples show sorted sequences.
statistics.mean(data)
Return the sample arithmetic mean of data, which can be a sequence or iterator.
The arithmetic mean is the sum of the data divided by the number of data points. It is commonly called “the average”, although it is only one of many different mathematical averages. It is a measure of the central location of the data.
If data is empty, StatisticsError will be raised.
Some examples of use:
>>> from statistics import mean
>>> mean([1, 2, 3, 4, 4])
2.8
>>> mean([-1.0, 2.5, 3.25, 5.75])
2.625
>>> from fractions import Fraction as F
>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
Fraction(13, 21)
>>> from decimal import Decimal as D
>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
Decimal('0.5625')
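The definition of the arithmetic mean and the empty-data behaviour described above can be checked with a short sketch. The helper safe_mean() is a hypothetical convenience wrapper written for this illustration; it is not part of the statistics module.

```python
from statistics import StatisticsError, mean

data = [1, 2, 3, 4, 4]

# The arithmetic mean is the sum of the data divided by the number of points.
manual = sum(data) / len(data)
library = mean(data)

# An iterator is accepted as well as a sequence.
from_iterator = mean(iter(data))


def safe_mean(values, default=None):
    """Hypothetical helper: return `default` instead of raising on empty data."""
    try:
        return mean(values)
    except StatisticsError:
        return default


empty_result = safe_mean([])
print(manual, library, from_iterator, empty_result)
```

Whether to catch StatisticsError or let it propagate is a design choice; silently substituting a default can hide a missing-data problem upstream.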