Pandas Index

A pandas Index holds the labels attached to the rows and columns of a DataFrame. It organizes the data and provides fast access to particular rows and columns. Selecting data through the index is also called subset selection.

When a DataFrame is rendered, for example in a Jupyter notebook, the index values appear in bold. Each individual index value is called a label.

To compare data access time with and without an index, we can use the IPython cell magic %%timeit to time the various access operations.
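Because %%timeit is only available inside IPython or Jupyter, the sketch below uses the standard-library timeit module so that it also runs as a plain script. The frame, the column names key and value, and the two helper functions are hypothetical and chosen only for illustration. Actual timings depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 100,000 rows with a unique integer key.
n = 100_000
df = pd.DataFrame({"key": np.arange(n), "value": np.arange(n) * 2})
indexed = df.set_index("key")  # same data, but "key" is now the index
target = n - 1


def lookup_without_index():
    # Compares every value in the "key" column, then selects the match.
    return df.loc[df["key"] == target, "value"].iloc[0]


def lookup_with_index():
    # Label-based lookup through the index.
    return indexed.loc[target, "value"]


t_scan = timeit.timeit(lookup_without_index, number=50)
t_index = timeit.timeit(lookup_with_index, number=50)
print(f"without index: {t_scan:.4f} s for 50 lookups")
print(f"with index:    {t_index:.4f} s for 50 lookups")
```

Both functions return the same value, so only the access path differs between the two measurements.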

An index can also be thought of as an address through which any data in a Series or DataFrame can be accessed. A DataFrame combines three components: the index, the columns, and the data.
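The three components can be inspected separately. The following is a minimal sketch on a small hypothetical frame. The labels "a" and "b" are illustrative and not taken from the tutorial's data.

```python
import pandas as pd

# Small hypothetical frame with explicit row labels.
df = pd.DataFrame(
    {"Name": ["John Idle", "Smith Gilliam"], "Salary": [50000.0, 65000.0]},
    index=["a", "b"],
)

print(df.index)       # row labels
print(df.columns)     # column labels
print(df.to_numpy())  # the underlying data as a NumPy array
```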

Axis and axes

An axis refers to one dimension of a DataFrame: axis 0 runs along the rows (the index) and axis 1 runs along the columns. The axes are the collection of both, that is, the row labels and the column labels together.
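As a short illustration, the sketch below reads both axes from a hypothetical two-row frame and shows how the axis argument changes the direction of a reduction.

```python
import pandas as pd

# Hypothetical numeric frame used only to demonstrate the axis argument.
df = pd.DataFrame({"Salary": [50000.0, 65000.0], "Leaves": [10, 8]})

# df.axes is a list holding the row labels and the column labels.
row_axis, column_axis = df.axes

col_totals = df.sum(axis=0)  # reduces along the rows: one total per column
row_totals = df.sum(axis=1)  # reduces along the columns: one total per row

print(row_axis)
print(column_axis)
print(col_totals)
print(row_totals)
```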

Creating index

First, we load a CSV file that contains the data to be indexed.
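The original listing and file name are not shown here. As a sketch, the same rows can be read from an in-memory string so that the example runs without a file on disk. The variable names csv_text and info are assumptions.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file (the real file name is not given).
csv_text = """Name,Hire Date,Salary,Leaves Remaining
John Idle,03/15/14,50000.0,10
Smith Gilliam,06/01/15,65000.0,8
Parker Chapman,05/12/14,45000.0,10
Jones Palin,11/01/13,70000.0,3
Terry Gilliam,08/12/14,48000.0,7
Michael Palin,05/23/13,66000.0,8
"""

# With no index column specified, pandas assigns a default integer index.
info = pd.read_csv(io.StringIO(csv_text))
print(info)
```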

Output:

   Name             Hire Date    Salary      Leaves Remaining
0  John Idle        03/15/14     50000.0     10
1  Smith Gilliam    06/01/15     65000.0     8
2  Parker Chapman   05/12/14     45000.0     10
3  Jones Palin      11/01/13     70000.0     3
4  Terry Gilliam    08/12/14     48000.0     7
5  Michael Palin    05/23/13     66000.0     8

Example1:
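The listing for this example is missing. The output is consistent with selecting three columns by name using the indexing operator, so the sketch below assumes that approach. The frame is rebuilt inline from the rows shown earlier, and the name subset is hypothetical.

```python
import pandas as pd

# Rebuild the sample data inline so the example is self-contained.
info = pd.DataFrame({
    "Name": ["John Idle", "Smith Gilliam", "Parker Chapman",
             "Jones Palin", "Terry Gilliam", "Michael Palin"],
    "Hire Date": ["03/15/14", "06/01/15", "05/12/14",
                  "11/01/13", "08/12/14", "05/23/13"],
    "Salary": [50000.0, 65000.0, 45000.0, 70000.0, 48000.0, 66000.0],
    "Leaves Remaining": [10, 8, 10, 3, 7, 8],
})

# A list of column labels inside [] returns a new DataFrame with those columns.
subset = info[["Name", "Hire Date", "Salary"]]
print(subset)
```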

Output:

    Name            Hire Date     Salary
0  John Idle         03/15/14     50000.0
1  Smith Gilliam     06/01/15     65000.0
2  Parker Chapman    05/12/14     45000.0
3  Jones Palin       11/01/13     70000.0
4  Terry Gilliam     08/12/14     48000.0
5  Michael Palin     05/23/13     66000.0

Example2:
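The listing for this example is also missing. Assuming it selects two columns in the same way, a minimal sketch looks like this (the name name_salary is hypothetical):

```python
import pandas as pd

# Sample data rebuilt inline; only the columns needed here are included.
info = pd.DataFrame({
    "Name": ["John Idle", "Smith Gilliam", "Parker Chapman",
             "Jones Palin", "Terry Gilliam", "Michael Palin"],
    "Hire Date": ["03/15/14", "06/01/15", "05/12/14",
                  "11/01/13", "08/12/14", "05/23/13"],
    "Salary": [50000.0, 65000.0, 45000.0, 70000.0, 48000.0, 66000.0],
})

# Select the Name and Salary columns; the integer index is kept unchanged.
name_salary = info[["Name", "Salary"]]
print(name_salary)
```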

Output:

   Name             Salary
0  John Idle        50000.0
1  Smith Gilliam    65000.0
2  Parker Chapman   45000.0
3  Jones Palin      70000.0
4  Terry Gilliam    48000.0
5  Michael Palin    66000.0

Set index

The 'set_index' method sets the DataFrame index using existing columns. The new index can either replace the existing index or extend it with an additional level.

It accepts one or more column labels, or arrays of the correct length such as a Series or an Index.
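The original listing is not shown. The two-level index in the output is consistent with passing an Index and a Series of squares together, so the sketch below assumes that call. The variable names are hypothetical. It also shows replacing the index with a column and extending it with append=True.

```python
import pandas as pd

info = pd.DataFrame({
    "Name": ["Parker", "Terry", "Smith", "William"],
    "Year": [2011, 2009, 2014, 2010],
    "Leaves": [10, 15, 9, 4],
})

# Two arrays of the right length become a two-level index: (1, 1), (2, 4), ...
s = pd.Series([1, 2, 3, 4])
by_arrays = info.set_index([pd.Index([1, 2, 3, 4]), s ** 2])
print(by_arrays)

# Replace the default index with an existing column.
by_column = info.set_index("Year")

# Extend the existing index with another column instead of replacing it.
expanded = by_column.set_index("Name", append=True)
print(expanded)
```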

Output:

            Name       Year    Leaves
1   1      Parker      2011     10
2   4      Terry       2009     15
3   9      Smith       2014     9 
4   16     William     2010     4

Multiple Index

A DataFrame can also have more than one index level. pandas represents such a hierarchical index as a MultiIndex.
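As an illustration separate from the example that follows, one common way to obtain a MultiIndex is to pass several column labels to set_index. The data and names here are hypothetical.

```python
import pandas as pd

info = pd.DataFrame({
    "Name": ["Parker", "Terry", "Smith", "William"],
    "Year": [2011, 2009, 2014, 2010],
    "Leaves": [10, 15, 9, 4],
})

# Passing a list of column labels creates one index level per column.
multi = info.set_index(["Year", "Name"])
print(multi.index)

# A row is addressed by a tuple with one label per level.
parker_leaves = multi.loc[(2011, "Parker"), "Leaves"]
print(parker_leaves)
```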

Example1:

Output:

MultiIndex(levels=[[nan, None, NaT, 128, 2]],
codes=[[0, -1, 1, 2, 3, 4]])

Reset index

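We can reset the index with the 'reset_index' method. It moves the current index into a regular column, named index when the index has no name, and restores the default integer index. The example below uses a DataFrame named 'cm'.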

Example:
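The definition of 'cm' is not shown, so the sketch below rebuilds it from the output as an assumption. The names restored and dropped are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of 'cm' with row labels 1 to 4.
cm = pd.DataFrame(
    {
        "name": ["William", "Smith", "Parker", "Phill"],
        "Language": ["C", "Java", "Python", np.nan],
    },
    index=[1, 2, 3, 4],
)

# The old index becomes a column called "index"; rows are renumbered from 0.
restored = cm.reset_index()
print(restored)

# drop=True discards the old index instead of keeping it as a column.
dropped = cm.reset_index(drop=True)
```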

Output:

   index   name      Language
0  1       William   C
1  2       Smith     Java
2  3       Parker    Python
3  4       Phill     NaN
