numpy.histogram() in Python

The numpy module of Python provides a function called numpy.histogram(). This function counts how many values of a dataset fall into each of a set of value ranges, called bins. It is similar to the hist() function of matplotlib.pyplot, but it only computes the histogram and does not draw it.

In simple words, this function computes the histogram of a set of data.

Syntax:

numpy.histogram(a, bins=10, range=None, density=None, weights=None)

Parameters:

a: array_like

This parameter defines the input data. The histogram is computed over the flattened array.

bins: int or sequence of scalars or str (optional)

If this parameter is an integer, it defines the number of equal-width bins in the given range (10 by default). If it is a sequence, it defines a monotonically increasing array of bin edges, including the rightmost edge, which allows for non-uniform bin widths. If it is a string, it names the method used to calculate the optimal bin width, such as 'auto', 'sturges', or 'fd'.
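
The three forms of the bins parameter can be compared side by side. The following is a minimal sketch; the sample values in 'data' are made up for illustration.

```python
import numpy as np

data = np.array([1, 2, 2, 3, 3, 3, 4, 10])  # hypothetical sample data

# Integer: 3 equal-width bins spanning data.min() to data.max()
counts_int, edges_int = np.histogram(data, bins=3)

# Sequence: explicit, non-uniform bin edges (must increase monotonically)
counts_seq, edges_seq = np.histogram(data, bins=[0, 2, 5, 10])

# String: let NumPy choose the bin edges with a named estimator
counts_auto, edges_auto = np.histogram(data, bins="auto")

print(counts_int, edges_int)   # counts [6 1 1] over edges 1, 4, 7, 10
print(counts_seq, edges_seq)   # counts [1 6 1] over edges 0, 2, 5, 10
print(len(counts_auto), "bins chosen by 'auto'")
```

The number of bins chosen by 'auto' depends on the data, so it is printed rather than assumed.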

range: (float, float) (optional)

This parameter defines the lower and upper range of the bins. By default, the range is the minimum and maximum of the input data. Values outside the range are ignored. The first element of the range must be less than or equal to the second.
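
The effect of the range parameter can be seen with data that contains outliers. This is a small sketch with made-up values; -5 and 50 play the role of the outliers.

```python
import numpy as np

data = np.array([-5, 0, 1, 2, 3, 4, 50])  # hypothetical data with two outliers

# Default range: the edges span data.min() to data.max()
counts_all, edges_all = np.histogram(data, bins=4)

# Explicit range: values outside [0, 4] are ignored
counts_in, edges_in = np.histogram(data, bins=4, range=(0, 4))

print(counts_all.sum())  # 7, every value is counted
print(counts_in)         # [1 1 1 2], the outliers -5 and 50 are dropped
print(edges_in)          # edges 0, 1, 2, 3, 4
```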

normed: bool (optional)

This deprecated parameter behaved like the density argument, but it could give the wrong output for unequal bin widths. It was removed in NumPy 1.24, so use density instead.

weights: array_like (optional)

This parameter defines an array of weights that has the same shape as the input array. Each value contributes its associated weight to the bin count instead of 1.
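
As an illustration of weights, the sketch below compares plain counts with weighted totals. The values and weights are made up for the example.

```python
import numpy as np

values = np.array([0.5, 1.5, 1.5, 2.5])
weights = np.array([1.0, 2.0, 3.0, 0.5])  # hypothetical per-sample weights
edges = [0, 1, 2, 3]

# Without weights, every value adds 1 to its bin
counts, _ = np.histogram(values, bins=edges)

# With weights, every value adds its weight to its bin
weighted, _ = np.histogram(values, bins=edges, weights=weights)

print(counts)    # [1 2 1]
print(weighted)  # bin totals of 1.0, 5.0 and 0.5
```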

density: bool (optional)

If it is set to False (the default), the result contains the number of samples in every bin. If it is set to True, the result contains the value of the probability density function in each bin, normalized so that the integral over the range is 1.
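
The difference between the two settings of density is easiest to see with unequal bin widths. The following sketch uses made-up data and edges.

```python
import numpy as np

data = np.array([0.5, 1.5, 2.5, 3.5])
edges = np.array([0, 1, 2, 4])  # unequal bin widths: 1, 1 and 2

# density=False (default): number of samples in each bin
counts, _ = np.histogram(data, bins=edges)

# density=True: count / (total samples * bin width)
density, _ = np.histogram(data, bins=edges, density=True)

print(counts)   # [1 1 2]
print(density)  # [0.25 0.25 0.25]
```

The last bin holds twice as many samples but is also twice as wide, so its density equals that of the other bins.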

Returns:

hist: array

The values of the histogram: the per-bin counts by default, or the probability density values when density is True.

bin_edges: array of dtype float

The bin edges. This array has one more element than hist, that is, its length is len(hist) + 1.
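
The two return values are usually unpacked into separate variables. The sketch below, with made-up input, checks the length relation and derives bin centers from the edges. The name 'centers' is only illustrative and is not returned by NumPy.

```python
import numpy as np

hist, bin_edges = np.histogram([1, 2, 2, 3, 5], bins=4)

print(hist)       # [1 2 1 1]
print(bin_edges)  # edges 1, 2, 3, 4, 5

# n bins always need n + 1 edges
print(len(bin_edges) == len(hist) + 1)  # True

# Midpoint of each bin, handy as x-positions when plotting
centers = (bin_edges[:-1] + bin_edges[1:]) / 2
print(centers)    # midpoints 1.5, 2.5, 3.5, 4.5
```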

Example 1:

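import numpy as np

# Count how many values fall into the bins [0, 1), [1, 2) and [2, 3]
a = np.histogram([1, 5, 2], bins=[0, 1, 2, 3])
a  # in an interactive session, this displays the (hist, bin_edges) tuple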

Output:

(array([0, 1, 1], dtype=int64), array([0, 1, 2, 3]))

In the above code

We have imported numpy with alias name np.
We have declared the variable 'a' and assigned it the value returned by the np.histogram() function.
We have passed a list of values and the bin edges to the function.
Lastly, we displayed the value of 'a'.

The output is a tuple of two ndarrays: the values of the histogram and the bin edges. The value 5 lies outside the bin edges, so it is not counted.
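
To see how values at and beyond the last edge are treated, the example can be repeated with a value that sits exactly on the last edge. This is a small sketch; all bins are half-open except the last one, which also includes its right edge.

```python
import numpy as np

edges = [0, 1, 2, 3]

# 5 lies beyond the last edge (3), so it is not counted anywhere
hist_a, _ = np.histogram([1, 5, 2], bins=edges)

# 3 sits exactly on the last edge and is counted, because the final bin is closed: [2, 3]
hist_b, _ = np.histogram([1, 3, 2], bins=edges)

print(hist_a)  # [0 1 1]
print(hist_b)  # [0 1 2]
```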

Example 2:

import numpy as np

# Six values in six unit-width bins: each bin holds one value, so each density is 1/6
x = np.histogram(np.arange(6), bins=np.arange(7), density=True)
x

Output:

(array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667,
       0.16666667]), array([0, 1, 2, 3, 4, 5, 6]))

Example 3:

import numpy as np

# The 2-D input is flattened before counting: four 1s and two 3s
x = np.histogram([[1, 3, 1], [1, 3, 1]], bins=[0, 1, 2, 3])
x

Output:

(array([0, 4, 2], dtype=int64), array([0, 1, 2, 3]))
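
The result above does not depend on the shape of the input, because the histogram is computed over the flattened array. A short sketch to confirm this, using ravel() to flatten the input by hand:

```python
import numpy as np

data_2d = np.array([[1, 3, 1], [1, 3, 1]])
edges = [0, 1, 2, 3]

# Histogram of the 2-D array
hist_2d, _ = np.histogram(data_2d, bins=edges)

# Histogram of the same values flattened to 1-D
hist_flat, _ = np.histogram(data_2d.ravel(), bins=edges)

print(hist_2d)                             # [0 4 2]
print(np.array_equal(hist_2d, hist_flat))  # True
```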

Example 4:

import numpy as np
a = np.arange(8)
hist, bin_edges = np.histogram(a, density=True)
hist
bin_edges

Output:

array([0.17857143, 0.17857143, 0.17857143, 0.        , 0.17857143,
       0.17857143, 0.        , 0.17857143, 0.17857143, 0.17857143])
array([0. , 0.7, 1.4, 2.1, 2.8, 3.5, 4.2, 4.9, 5.6, 6.3, 7. ])

Example 5:

import numpy as np
a = np.arange(8)
hist, bin_edges = np.histogram(a, density=True)
hist
hist.sum()
np.sum(hist * np.diff(bin_edges))

Output:

array([0.17857143, 0.17857143, 0.17857143, 0.        , 0.17857143,
       0.17857143, 0.        , 0.17857143, 0.17857143, 0.17857143])
1.4285714285714288
1.0

In the above code

We have imported numpy with alias name np.
We have created an array 'a' using np.arange() function.
We have declared the variables 'hist' and 'bin_edges' and assigned them the two values returned by the np.histogram() function.
We have passed the array 'a' and set 'density' to True in the function.
We displayed the value of 'hist'.
Lastly, we computed the sum of the histogram values with hist.sum(), and the integral of the density with np.sum(hist * np.diff(bin_edges)), which multiplies each density value by its bin width. With density set to True, the values themselves do not sum to 1, but their integral over the bins does.
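
The density values in this example can also be reproduced by hand from the raw counts. This is a minimal sketch of the normalization, count / (total samples * bin width). The name 'manual' is only illustrative.

```python
import numpy as np

a = np.arange(8)
counts, bin_edges = np.histogram(a)        # raw counts in 10 bins
hist, _ = np.histogram(a, density=True)    # density values

widths = np.diff(bin_edges)                # every bin is 0.7 wide
manual = counts / (counts.sum() * widths)  # count / (total samples * width)

print(np.allclose(manual, hist))           # True
print(hist.sum(), 1 / widths[0])           # both are about 1.4286
print(np.sum(hist * widths))               # the integral, about 1.0
```

Since all bins have the same width of 0.7, hist.sum() equals 1 / 0.7, which explains the value 1.4285714285714288 in the output.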