
How to join two dataframes on multiple columns

You can join two pandas DataFrames on multiple columns by calling pd.merge (or the equivalent DataFrame.merge method) and passing a list of column names to the on parameter.

Here’s an example of how to perform an inner join on two DataFrames based on two common columns, key1 and key2:


import pandas as pd

# Two DataFrames that share the key columns key1 and key2
df1 = pd.DataFrame({'key1': [1, 2, 3, 4, 5], 'key2': [10, 20, 30, 40, 50], 'col1': [100, 200, 300, 400, 500]})
df2 = pd.DataFrame({'key1': [2, 4, 6, 8, 10], 'key2': [20, 40, 60, 80, 100], 'col2': [1000, 2000, 3000, 4000, 5000]})

# Inner join: keep only rows whose (key1, key2) pair appears in both DataFrames
result = pd.merge(df1, df2, on=['key1', 'key2'], how='inner')
print(result)

pd.merge takes two DataFrames as input and joins them on the columns named in the on parameter. In this example, on is set to ['key1', 'key2'], so two rows are matched only when both key1 and key2 are equal. The how parameter specifies the type of join. Here it is set to 'inner', which is also the default, so only rows with a match in both DataFrames are kept. Other accepted values include 'left', 'right', and 'outer'.
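To see how the how parameter changes the outcome, here is a minimal sketch that reuses the same two DataFrames and compares row counts for three join types. The variable names inner, left, and outer are chosen for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({'key1': [1, 2, 3, 4, 5], 'key2': [10, 20, 30, 40, 50], 'col1': [100, 200, 300, 400, 500]})
df2 = pd.DataFrame({'key1': [2, 4, 6, 8, 10], 'key2': [20, 40, 60, 80, 100], 'col2': [1000, 2000, 3000, 4000, 5000]})

keys = ['key1', 'key2']

# Only (key1, key2) pairs present in both DataFrames
inner = pd.merge(df1, df2, on=keys, how='inner')
# Every row of df1; col2 is NaN where df2 has no matching pair
left = pd.merge(df1, df2, on=keys, how='left')
# Every row of either DataFrame; missing values are filled with NaN
outer = pd.merge(df1, df2, on=keys, how='outer')

print(len(inner), len(left), len(outer))  # 2 5 8
```

The inner join keeps the two shared pairs, the left join keeps all five rows of df1, and the outer join adds the three pairs that appear only in df2.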

The result of the join is a new DataFrame that contains only the rows where both key1 and key2 match in the two input DataFrames. In this example, the result has four columns (key1, key2, col1, and col2) and two rows, for the key pairs (2, 20) and (4, 40).
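As a quick check of the join's output, the sketch below rebuilds the example and inspects the columns and rows of the result. The names columns and rows are introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'key1': [1, 2, 3, 4, 5], 'key2': [10, 20, 30, 40, 50], 'col1': [100, 200, 300, 400, 500]})
df2 = pd.DataFrame({'key1': [2, 4, 6, 8, 10], 'key2': [20, 40, 60, 80, 100], 'col2': [1000, 2000, 3000, 4000, 5000]})

result = pd.merge(df1, df2, on=['key1', 'key2'], how='inner')

# Key columns appear once, followed by the non-key columns of each input
columns = result.columns.tolist()
print(columns)  # ['key1', 'key2', 'col1', 'col2']

# One dict per matched row
rows = result.to_dict('records')
print(rows)
```

Only the pairs (2, 20) and (4, 40) occur in both inputs, so rows holds two records, each combining col1 from df1 with col2 from df2.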
