How to join two DataFrames

You can join two pandas DataFrames with the merge function (pd.merge) or the equivalent DataFrame.merge method. It takes two DataFrames as input and combines them into a single DataFrame based on one or more common columns.

Here’s an example of how to perform an inner join on two DataFrames based on a column named key:

import pandas as pd

df1 = pd.DataFrame({'key': [1, 2, 3, 4, 5], 'col1': [10, 20, 30, 40, 50]})
df2 = pd.DataFrame({'key': [2, 4, 6, 8, 10], 'col2': [100, 200, 300, 400, 500]})

result = pd.merge(df1, df2, on='key', how='inner')


In this example, the on parameter is set to 'key' to specify that the join should be based on the key column. The how parameter is set to 'inner' to specify that an inner join should be performed.

The result of the join is a new DataFrame that contains only the rows whose key value appears in both input DataFrames. In this example, only the keys 2 and 4 appear in both, so the result has two rows and three columns: key, col1, and col2.
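To check this, the sketch below rebuilds the two DataFrames from the example and inspects the merged result. Converting the columns to plain lists with tolist() is only a display choice made here, not part of the merge itself.

```python
import pandas as pd

# Same input data as in the example above
df1 = pd.DataFrame({'key': [1, 2, 3, 4, 5], 'col1': [10, 20, 30, 40, 50]})
df2 = pd.DataFrame({'key': [2, 4, 6, 8, 10], 'col2': [100, 200, 300, 400, 500]})

# Inner join: keep only keys present in both DataFrames
result = pd.merge(df1, df2, on='key', how='inner')

print(list(result.columns))      # ['key', 'col1', 'col2']
print(result['key'].tolist())    # [2, 4]
print(result['col1'].tolist())   # [20, 40]
print(result['col2'].tolist())   # [100, 200]
```

Only keys 2 and 4 survive, because they are the only values present in the key column of both inputs.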

Types of joins

You can perform different types of joins by specifying a different value for the how parameter. The following are the most common types of joins:

  • Inner join (how='inner'): Returns only the rows whose key values appear in both DataFrames. This is the default.
  • Left join (how='left'): Returns all rows from the left DataFrame. Where there is no match in the right DataFrame, the right-hand columns are filled with NaN.
  • Right join (how='right'): Returns all rows from the right DataFrame. Where there is no match in the left DataFrame, the left-hand columns are filled with NaN.
  • Outer join (how='outer'): Returns all rows from both DataFrames. Where a row has no match in the other DataFrame, that DataFrame's columns are filled with NaN.
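As a rough illustration of how these join types differ, the sketch below runs the same merge with each value of how on the two example DataFrames and records which keys end up in the result. The keys_by_join dictionary and the missing_col2 variable are names introduced here for illustration only.

```python
import pandas as pd

# Same input data as in the earlier example
df1 = pd.DataFrame({'key': [1, 2, 3, 4, 5], 'col1': [10, 20, 30, 40, 50]})
df2 = pd.DataFrame({'key': [2, 4, 6, 8, 10], 'col2': [100, 200, 300, 400, 500]})

# Collect the keys that each join type keeps (illustrative helper dict)
keys_by_join = {}
for how in ['inner', 'left', 'right', 'outer']:
    merged = pd.merge(df1, df2, on='key', how=how)
    keys_by_join[how] = merged['key'].tolist()
    print(how, keys_by_join[how])

# inner [2, 4]
# left  [1, 2, 3, 4, 5]
# right [2, 4, 6, 8, 10]
# outer [1, 2, 3, 4, 5, 6, 8, 10]

# In a left join, unmatched rows get NaN in the right-hand column
left = pd.merge(df1, df2, on='key', how='left')
missing_col2 = int(left['col2'].isna().sum())
print(missing_col2)  # 3 (keys 1, 3 and 5 have no match in df2)
```

Because the unmatched rows contain NaN, pandas stores col2 as floating point in the left and outer results rather than as integers.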
