Descriptive Statistics

In statistics, data analysis can be categorised into two major groups: descriptive statistics and inferential statistics. Descriptive statistics describes the basic features of the data in a study by summarizing and organizing the data so that it can be easily understood. Unlike inferential statistics, which draws conclusions about a whole population based on a sample, descriptive statistics seeks only to describe the data in the sample. It provides summaries of the sample and its measures.

Types of Descriptive Statistics

There are two major categories of descriptive statistics: measures of central tendency and measures of variability (spread). Both are discussed below.

Measures of Central Tendency

A measure of central tendency is a single number that best summarizes a set of measurements. It is a number that is, in some sense, ‘central’ to the set. The common measures are:

Mean or average: the sum of all the values divided by the number of values; it is the number around which the data is spread out

Median: the middle value of the data once it is arranged in order; it divides the data into two equal parts

Mode: the value with the highest frequency in a data set
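The three measures above can be sketched in Python using the standard library's statistics module. The sample values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample, used only to illustrate the three measures.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle of the sorted data (average of the two middle values here)
mode = statistics.mode(data)      # most frequent value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Because the sample has an even number of values, the median is the average of the two middle values (3 and 5) rather than a value that appears in the data.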

Measures of Spread or Dispersion

These measures describe the variability or dispersion of the data.

Standard deviation: this measures how far the data is spread out from its mean. It is the square root of the average squared distance between each value and the mean. It is calculated using the following formula:
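In its standard textbook form, for N values x_1, ..., x_N with mean \mu, the population standard deviation is written as below. The symbols N, x_i and \mu are notation assumed here. When the data is a sample used to estimate a population, the usual variant divides by N - 1 instead of N.

```latex
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2},
\qquad
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i
```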

Mean deviation: this is the average of the absolute differences between each value in a dataset and the mean of all the values in that dataset. It is calculated using the following formula:
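In its standard textbook form, for N values x_1, ..., x_N with mean \mu (notation assumed here), the mean deviation is:

```latex
\mathrm{MD} = \frac{1}{N} \sum_{i=1}^{N} \lvert x_i - \mu \rvert
```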

Variance: the average squared distance between the mean and each value in a dataset. It is the square of the standard deviation.

Range: the difference between the largest value and the smallest value in a dataset.
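The standard deviation, mean deviation, variance and range can be computed as in the following minimal sketch. It uses Python's standard statistics module and a hypothetical sample. The population forms (dividing by the number of values) are shown first, with the sample variance (dividing by one less) for comparison.

```python
import statistics

# Hypothetical sample, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = statistics.mean(data)

# Population variance: average squared distance from the mean.
pop_variance = statistics.pvariance(data)

# Population standard deviation: square root of the population variance.
pop_std = statistics.pstdev(data)

# Sample variance: divides by (n - 1) instead of n.
sample_variance = statistics.variance(data)

# Mean deviation: average absolute distance from the mean.
mean_deviation = sum(abs(x - mu) for x in data) / len(data)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

print("variance (population):", pop_variance)
print("standard deviation (population):", pop_std)
print("variance (sample):", sample_variance)
print("mean deviation:", mean_deviation)
print("range:", data_range)
```

For this sample the mean deviation (1.5) is smaller than the standard deviation (2.0), because squaring gives more weight to the values furthest from the mean.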

Percentile: a percentile is the value below which a given percentage of the data falls. For example, the 90th percentile is the value below which 90% of the observations lie.

Quartiles: these are the values which divide the dataset into quarters, once the data is arranged in ascending order.
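Percentiles and quartiles can be sketched with statistics.quantiles from the Python standard library. The sample is hypothetical, and method="inclusive" is one of several interpolation conventions; other conventions can give slightly different cut points on small datasets.

```python
import statistics

# Hypothetical sample; quantiles() sorts the data internally.
data = [7, 1, 9, 3, 5, 2, 8, 4, 6]

# Quartiles: the three cut points that split the ordered data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Percentiles: the 99 cut points that split the ordered data into 100 parts.
percentiles = statistics.quantiles(data, n=100, method="inclusive")
p90 = percentiles[89]  # index 89 holds the 90th percentile

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print("quartiles:", q1, q2, q3)
print("90th percentile:", p90)
print("interquartile range:", iqr)
```

The second quartile is the same value as the median, and the first and third quartiles are the 25th and 75th percentiles.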

Skewness: this measures the asymmetry of the probability distribution of the data around its mean. Skewness can be positive, negative or zero (symmetric), as shown in the figure below.

Kurtosis: kurtosis measures how heavy the tails of the distribution are, which indicates the presence of outliers in the data. It assesses whether the data is heavy-tailed or light-tailed relative to the normal distribution. The forms of kurtosis are shown in the figure below.
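Skewness and kurtosis can be illustrated with a short sketch based on central moments. The helper functions and samples below are hypothetical. They use the simple population (moment-based) definitions, whereas statistical packages often apply small-sample corrections and may report different numbers.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean) ** k."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** k for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth central moment over variance ** 2, minus 3 (the normal value)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical samples, used only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]       # long tail towards large values
left_skewed = [-x for x in right_skewed]  # mirror image: long tail towards small values

print("skewness (symmetric):", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("skewness (left-skewed):", skewness(left_skewed))
print("excess kurtosis (symmetric):", excess_kurtosis(symmetric))
```

Subtracting 3 makes the normal distribution the reference point: positive excess kurtosis suggests heavier tails than the normal distribution, and negative excess kurtosis suggests lighter tails.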