How to set index

Here’s a tutorial on how to set an index with the Python Pandas library.

Pandas offers a dedicated set_index method that lets you turn one or more existing columns into the index of a dataframe.

import pandas as pd

my_df = pd.DataFrame({'Id': ['1001', '1002', '1003', '1007'],
                      'Column1': [2, 5, None, 46],
                      'Column2': [12, None, 5, 22]})

print(f'My dataframe before I set the index: \n{my_df}')

This is my example dataframe. I will use its Id column as the index.

How to set index in Pandas

To set the index, I call the set_index method on the dataframe.

import pandas as pd

my_df = pd.DataFrame({'Id': ['1001', '1002', '1003', '1007'],
                      'Column1': [2, 5, None, 46],
                      'Column2': [12, None, 5, 22]})

print(f'My dataframe before I set the index: \n{my_df}')

my_df.set_index(['Id'], drop=True, inplace=True)

print(f'My dataframe with the index: \n{my_df}')

# print(my_df.index)


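As you can see, the “Id” column is now the index. With drop=True (which is also the default), Pandas removes Id from the regular columns, because otherwise the same values would appear twice. The inplace=True parameter applies the change directly to my_df instead of returning a new dataframe.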
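The effect of drop and inplace is easier to see side by side. Below is a minimal sketch that reuses the example dataframe; the names indexed_df and kept_df are my own and only serve the illustration.

```python
import pandas as pd

my_df = pd.DataFrame({'Id': ['1001', '1002', '1003', '1007'],
                      'Column1': [2, 5, None, 46],
                      'Column2': [12, None, 5, 22]})

# Without inplace=True, set_index returns a new dataframe
# and leaves my_df unchanged.
indexed_df = my_df.set_index('Id')

# drop=False keeps 'Id' as a regular column in addition to the index.
kept_df = my_df.set_index('Id', drop=False)

print(list(indexed_df.columns))  # ['Column1', 'Column2']
print(list(kept_df.columns))     # ['Id', 'Column1', 'Column2']
print(indexed_df.index.name)     # Id
print('Id' in my_df.columns)     # True, the original is untouched
```

Assigning the result to a new variable is a common alternative to inplace=True, since it keeps the original dataframe available.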

How to verify integrity

In addition, the set_index method can check that the new index is unique. Setting the verify_integrity parameter to True makes Pandas raise an error instead of creating an index that contains duplicate values.

# Run this on the original dataframe, where 'Id' is still a regular column.
my_df.set_index(['Id'], drop=True, inplace=True, verify_integrity=True)

Note: With verify_integrity=True, the values in the column must be unique. Otherwise, Pandas raises a ValueError whose message begins with “Index has duplicate keys”.
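To see the check in action, here is a small sketch with a deliberately duplicated Id. The dataframe dup_df and the variable error_message are made up for this illustration and are not part of the example above.

```python
import pandas as pd

# Hypothetical dataframe in which the Id '1002' appears twice.
dup_df = pd.DataFrame({'Id': ['1001', '1002', '1002'],
                       'Column1': [2, 5, 46]})

# Check for duplicates up front.
print(dup_df['Id'].is_unique)  # False

error_message = None
try:
    dup_df.set_index('Id', verify_integrity=True)
except ValueError as err:
    # Pandas refuses to build an index with duplicate values.
    error_message = str(err)
    print(f'Could not set the index: {error_message}')
```

Checking is_unique first lets you decide how to handle the duplicates before you call set_index at all.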

See also:
Documentation of set_index method
