Here’s a tutorial on how to set an index in the Python Pandas library.
Pandas offers a dedicated set_index method that lets you use one or more existing columns as the index of a dataframe.
import pandas as pd
my_df = pd.DataFrame({'Id': ['1001', '1002', '1003', '1007'],
                      'Column1': [2, 5, None, 46],
                      'Column2': [12, None, 5, 22]})
print(f'My dataframe before I set the index: \n{my_df}')
This is my example dataframe to which I will add an index on the Id column.
How to set index in Pandas
To add an index to the dataframe, I use the set_index method.
import pandas as pd
my_df = pd.DataFrame({'Id': ['1001', '1002', '1003', '1007'],
                      'Column1': [2, 5, None, 46],
                      'Column2': [12, None, 5, 22]})
print(f'My dataframe before I set the index: \n{my_df}')
my_df.set_index(['Id'], drop=True, inplace=True)
print(f'My dataframe with the index: \n {my_df}')
# print(my_df.index)

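As you can see, I have set the index on the “Id” column. With drop=True (the default), Pandas removes Id from the regular columns, because otherwise the same values would appear twice: once as the index and once as a column. With inplace=True, the change is applied directly to my_df instead of being returned as a new dataframe.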
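To make the effect of these two parameters easier to see, here is a small sketch that uses the same example dataframe. The variable names indexed_df and kept_df are my own and serve only as illustration.

```python
import pandas as pd

my_df = pd.DataFrame({'Id': ['1001', '1002', '1003', '1007'],
                      'Column1': [2, 5, None, 46],
                      'Column2': [12, None, 5, 22]})

# Without inplace=True, set_index returns a new dataframe
# and leaves my_df unchanged.
indexed_df = my_df.set_index('Id')

# With drop=False, Id stays as a regular column
# in addition to becoming the index.
kept_df = my_df.set_index('Id', drop=False)

print(list(my_df.columns))       # original dataframe still has the Id column
print(list(indexed_df.columns))  # Id is now the index, not a column
print(list(kept_df.columns))     # Id is both the index and a column
```

Assigning the result to a new variable is a common alternative to inplace=True when you want to keep the original dataframe around.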
How to verify integrity
In addition, the set_index method can check the uniqueness of the new index. Setting the verify_integrity parameter to True makes Pandas check the new index for duplicates and raise an error instead of creating an index with duplicate values.
my_df.set_index(['Id'], drop=True, inplace=True, verify_integrity=True)
Note: Before setting verify_integrity=True, make sure that the values in the column are unique. Otherwise, Pandas raises an error that starts with “ValueError: Index has duplicate keys: Index”, followed by the duplicated values.
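The sketch below shows what this looks like in practice. The dataframe dup_df and its values are made up for this example: the Id '1001' appears twice on purpose so that the check fails.

```python
import pandas as pd

# Hypothetical data: the Id '1001' is duplicated on purpose.
dup_df = pd.DataFrame({'Id': ['1001', '1001', '1003'],
                       'Column1': [2, 5, 46]})

try:
    dup_df.set_index('Id', verify_integrity=True)
    error_message = None
except ValueError as err:
    # Pandas refuses to build an index that contains duplicates.
    error_message = str(err)
    print(f'Could not set the index: {error_message}')

# One way to check a column for duplicates before setting the index.
print(dup_df['Id'].is_unique)
```

Checking is_unique first lets you decide how to handle duplicates before calling set_index, rather than catching the error afterwards.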
See also:
Documentation of set_index method
