Reindex

The Pandas reindex() method conforms a DataFrame to a new index, with optional filling logic, and places NA/NaN at the locations whose labels were not present in the previous index. It returns a new object unless the new index is equivalent to the current one and copy is False.

Reindexing changes the row labels and the column labels of a DataFrame. We can reindex a single row or multiple rows by using the reindex() method, and the same applies to columns. Any label in the new index that is not present in the DataFrame is assigned NaN by default.

Syntax:
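The signature listing is not shown here. Going by the parameters described in this section, a call takes roughly the following shape. This is only a sketch: the frame df and its labels are made up for illustration.

```python
import pandas as pd

# Keyword names as listed in this section:
# labels, index, columns, axis, method, copy, level,
# fill_value, limit, tolerance

# Hypothetical frame, used only to show the calling conventions.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])

# Two equivalent ways to reindex the rows.
by_labels = df.reindex(labels=["a", "c", "d"], axis="index")
by_index = df.reindex(index=["a", "c", "d"])

# Reindex the columns instead of the rows.
by_columns = df.reindex(columns=["x", "y"])

print(by_labels.equals(by_index))
print(by_columns)
```

The labels/axis form and the index/columns form are alternatives, so a single call normally uses one or the other.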

Parameters:

labels: It is an optional parameter that refers to the new labels or index to which the axis specified by 'axis' is conformed.

index, columns: They are also optional parameters that refer to the new labels for the rows and the columns respectively. An Index object is preferred, to avoid duplicating data.

axis: It is also an optional parameter that selects the axis to target. It can be either the axis name ('index', 'columns') or the axis number (0, 1).

method: It is also an optional parameter that is used for filling the holes in the reindexed DataFrame. It can only be applied to a DataFrame or Series whose index is monotonically increasing or decreasing. It accepts the following values:

None: It is the default value and leaves the gaps unfilled.

pad / ffill: It propagates the last valid observation forward to the next valid observation.

backfill / bfill: It uses the next valid observation to fill the gap.

nearest: It uses the nearest valid observation to fill the gap.

copy: It is a boolean whose default value is True. When it is True, a new object is returned even if the passed indexes are the same.

level: It is used to broadcast across a level, matching index values on the passed MultiIndex level.

fill_value: Its default value is np.nan. It is the value used for the missing entries that reindexing introduces, that is, for labels that are absent from the original index. NaN values already present in the data are not replaced.

limit: It defines the maximum number of consecutive elements to forward fill or backward fill.

tolerance: It is also an optional parameter that determines the maximum distance between original and new labels for inexact matches. At the matching locations, the values of the index must satisfy the equation abs(index[indexer] - target) <= tolerance.
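The filling options above are easier to compare side by side. The following is a small sketch on a made-up Series (the name s, its values and the target labels are assumptions) whose index is monotonically increasing, as method requires.

```python
import pandas as pd

# Hypothetical series whose index is monotonically increasing.
s = pd.Series([10.0, 20.0, 30.0], index=[0, 5, 10])
target = [0, 2, 4, 6, 8, 10]

# method=None (the default): labels 2, 4, 6 and 8 are left as NaN.
no_fill = s.reindex(target)

# ffill carries the last valid observation forward: 10, 10, 10, 20, 20, 30.
forward = s.reindex(target, method="ffill")

# bfill uses the next valid observation: 10, 20, 20, 30, 30, 30.
backward = s.reindex(target, method="bfill")

# nearest uses the closest original label: 10, 10, 20, 20, 30, 30.
nearest = s.reindex(target, method="nearest")

# tolerance=1 accepts a match only when abs(original - new) <= 1,
# so labels 2 and 8 (two steps away from the closest label) stay NaN.
nearest_tol = s.reindex(target, method="nearest", tolerance=1)

# limit=1 caps how many consecutive new labels are forward filled.
forward_limited = s.reindex(target, method="ffill", limit=1)

print(pd.DataFrame({
    "none": no_fill,
    "ffill": forward,
    "bfill": backward,
    "nearest": nearest,
    "nearest_tol": nearest_tol,
    "ffill_limit": forward_limited,
}))
```

Printing the results as columns of one frame makes it easy to see which labels each option fills.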

Returns:

It returns the reindexed DataFrame.
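Since the result is returned, the original DataFrame keeps its own index. A minimal sketch, assuming a made-up frame df:

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])

# Existing labels can be reordered; "z" is new, so its row is NaN.
result = df.reindex(["c", "a", "z"])

print(result)
print(df)  # the original still has the index a, b, c
```

To keep the reindexed data, we assign the returned object to a variable, as done with result here.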

Example 1:

The example below shows how the reindex() function reindexes a DataFrame. Labels in the new index that have no corresponding records in the DataFrame are assigned NaN by default.

Note: We can use fill_value for filling the missing values.


Now, we can use the dataframe.reindex() function to reindex the dataframe.
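The listing for this step is not shown, so the following is only a sketch. The name info and the cell values are assumptions. The row and column labels follow the outputs in this section.

```python
import pandas as pd

# Placeholder values: only the row and column labels come from this section.
info = pd.DataFrame(
    {"P": [1, 2, 3, 4, 5], "Q": [6, 7, 8, 9, 10],
     "R": [11, 12, 13, 14, 15], "S": [16, 17, 18, 19, 20]},
    index=["Parker", "William", "Smith", "Terry", "Phill"],
)

# None of the new row labels exist in info, so every row comes back as NaN.
reindexed = info.reindex(["A", "B", "C", "D", "E"])
print(reindexed)
```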

Output:

     P    Q    R    S
A  NaN  NaN  NaN  NaN
B  NaN  NaN  NaN  NaN
C  NaN  NaN  NaN  NaN
D  NaN  NaN  NaN  NaN
E  NaN  NaN  NaN  NaN

Notice that the new indexes are populated with NaN values. We can fill in the missing values using the fill_value parameter.
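A sketch of this step, again assuming a frame named info with placeholder values. Only the labels and the fill value 100 come from this section.

```python
import pandas as pd

# Placeholder values: only the row and column labels come from this section.
info = pd.DataFrame(
    {"P": [1, 2, 3, 4, 5], "Q": [6, 7, 8, 9, 10],
     "R": [11, 12, 13, 14, 15], "S": [16, 17, 18, 19, 20]},
    index=["Parker", "William", "Smith", "Terry", "Phill"],
)

# Rows for labels missing from info are filled with 100 instead of NaN.
filled = info.reindex(["A", "B", "C", "D", "E"], fill_value=100)
print(filled)
```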

Output:

     P    Q    R    S
A  100  100  100  100
B  100  100  100  100
C  100  100  100  100
D  100  100  100  100
E  100  100  100  100

Example 2:

This example shows how the reindex() function reindexes the column axis.
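The listing is not shown, so here is a sketch under the same assumptions as before: a frame named info with placeholder values, and labels taken from the outputs in this section.

```python
import pandas as pd

# Placeholder values: only the row and column labels come from this section.
info = pd.DataFrame(
    {"P": [1, 2, 3, 4, 5], "Q": [6, 7, 8, 9, 10],
     "R": [11, 12, 13, 14, 15], "S": [16, 17, 18, 19, 20]},
    index=["Parker", "William", "Smith", "Terry", "Phill"],
)

# The rows are kept, but none of the requested columns exist in info,
# so every cell of the result is NaN.
by_columns = info.reindex(columns=["A", "B", "D", "E"])
print(by_columns)
```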

Output:

           A    B    D    E
Parker   NaN  NaN  NaN  NaN
William  NaN  NaN  NaN  NaN
Smith    NaN  NaN  NaN  NaN
Terry    NaN  NaN  NaN  NaN
Phill    NaN  NaN  NaN  NaN

Notice that NaN values are present in the new columns after reindexing. We can pass the fill_value argument to the function to replace these NaN values.
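A sketch of the same column reindex with a fill value. The name info and the cell values are assumptions. The labels and the value 37 come from this section.

```python
import pandas as pd

# Placeholder values: only the row and column labels come from this section.
info = pd.DataFrame(
    {"P": [1, 2, 3, 4, 5], "Q": [6, 7, 8, 9, 10],
     "R": [11, 12, 13, 14, 15], "S": [16, 17, 18, 19, 20]},
    index=["Parker", "William", "Smith", "Terry", "Phill"],
)

# Columns missing from info are filled with 37 instead of NaN.
filled_columns = info.reindex(columns=["A", "B", "D", "E"], fill_value=37)
print(filled_columns)
```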

Output:

          A   B   D   E
Parker   37  37  37  37
William  37  37  37  37
Smith    37  37  37  37
Terry    37  37  37  37
Phill    37  37  37  37
