
How to merge two dataframes

Here is a short tutorial on how to merge two dataframes using the pandas Python library.

The most common way to merge dataframes is to use the pandas merge function.

Inner merge

import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

print(pd.merge(df1, df2, on="strings", how="inner"))

First, I create two dataframes. Next, I merge them on the strings column using the pandas merge function.

   strings  numbers_x  numbers_y
0  string1        103        105
1  string2        105        144
2  string3        201        195

I chose the inner method, so pandas keeps only the rows whose strings value is present in both dataframes.
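The numbers_x and numbers_y column names in the output above appear because both dataframes have a numbers column, and pandas adds default suffixes to tell them apart. Here is a small sketch of how the suffixes parameter of merge can replace them; the names "_first" and "_second" are my own choice for illustration, not anything pandas requires.

```python
import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

# suffixes replaces the default ("_x", "_y") pair for overlapping column names.
# "_first" and "_second" are illustrative names only.
merged = pd.merge(df1, df2, on="strings", how="inner",
                  suffixes=("_first", "_second"))

print(merged)
```

Descriptive suffixes can make it easier to see which dataframe each column came from.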

There are other merge methods as well.

Left merge

import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

print(pd.merge(df1, df2, on="strings", how="left"))

   strings  numbers_x  numbers_y
0  string1        103      105.0
1  string2        105      144.0
2  string3        201      195.0
3  string4        122        NaN

The left method keeps every row of the first (left) dataframe and adds the matching values from the right dataframe. Where there is no match, as for string4, pandas fills in NaN.
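In the left merge output, numbers_y is printed as 105.0 rather than 105. The missing value for string4 is stored as NaN, which forces the column to a float type. If you prefer to keep whole numbers, one possible approach (a sketch, not the only option) is to convert the column to the nullable "Int64" type of pandas.

```python
import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

merged = pd.merge(df1, df2, on="strings", how="left")

# numbers_y became float64 because of the NaN in the string4 row.
print(merged["numbers_y"].dtype)

# "Int64" (capital I) is the nullable integer type; the missing value stays missing.
converted = merged["numbers_y"].astype("Int64")
print(converted)
```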

If you need every row of the right dataframe instead, use the right method.

Right merge

import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

print(pd.merge(df1, df2, on="strings", how="right"))

   strings  numbers_x  numbers_y
0  string1      103.0        105
1  string2      105.0        144
2  string3      201.0        195
3  string5        NaN        101
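A right merge can also be seen as a left merge with the two dataframes swapped. The sketch below compares both for the example data; the variable names right, left_swapped and aligned are only for illustration. Note that after swapping, the _x and _y suffixes swap too, so I rename the columns before comparing.

```python
import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

right = pd.merge(df1, df2, on="strings", how="right")

# Same idea with the arguments swapped: df2 is now the left dataframe.
left_swapped = pd.merge(df2, df1, on="strings", how="left")

# Swap the suffixes back and reorder the columns to match the right merge.
aligned = left_swapped.rename(
    columns={"numbers_x": "numbers_y", "numbers_y": "numbers_x"}
)[list(right.columns)]

print(right)
print(aligned)
```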

If you need every record from both dataframes, use the outer method.

Outer merge

import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

print(pd.merge(df1, df2, on="strings", how="outer"))

   strings  numbers_x  numbers_y
0  string1      103.0      105.0
1  string2      105.0      144.0
2  string3      201.0      195.0
3  string4      122.0        NaN
4  string5        NaN      101.0
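With an outer merge it is not always obvious which dataframe each row came from. As a small sketch, the indicator parameter of merge adds a _merge column with the values left_only, right_only or both. The variable names merged and counts are just for this example.

```python
import pandas as pd

df1 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string4"],
                    "numbers": [103, 105, 201, 122]})

df2 = pd.DataFrame({"strings": ["string1", "string2", "string3", "string5"],
                    "numbers": [105, 144, 195, 101]})

# indicator=True adds a "_merge" column describing the origin of each row.
merged = pd.merge(df1, df2, on="strings", how="outer", indicator=True)
print(merged)

# Count how many rows were found in both dataframes or in only one of them.
counts = merged["_merge"].value_counts()
print(counts)
```

Filtering on the _merge column is one way to find the records that exist in only one of the dataframes.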


Now you know how to merge two dataframes in Pandas.

To get to know the more advanced options, see the pandas documentation for merge.
