Pandas has a powerful feature called merge(), which lets you easily perform SQL-style joins for your data analysis tasks.
Let's dive into the four main types of SQL joins: Inner Join, Left Join, Right Join, and Outer Join.
Pandas Inner Join
Pandas' merge function to combine two DataFrames, df_left and df_right, based on a common column called 'key' using an inner join.
In simpler terms, it's like combining two sets of data where we only include the rows that have matching values in the 'key' column in both sets.
pd.merge(df_left, df_right, on='key', how='inner')
Pandas Left Join
A Left Join takes all the data from the left DataFrame (in this case, df_left) and adds any matching data from the right DataFrame (df_right). If there's no match for a particular row in the left data frame, that part of the result will be empty (NaN).
pd.merge(df_left, df_right, on='key', how='left')
Pandas Right Join
A Right Join gives you all the data from the right DataFrame (df_right) and includes any matching data from the left DataFrame (df_left). If there's no match for a specific row in the right data frame, that part of the result will be empty and marked as NaN.
pd.merge(df_left, df_right, on='key', how='right')
Pandas Outer Join
An Outer Join, also known as a Full Outer Join, includes all records from either the left or the right data frame when there's a match in at least one of them. If a row from the left DataFrame doesn't match with any in the right DataFrame, its right DataFrame columns will show NaN, and vice versa.
pd.merge(df_left, df_right, on='key', how='outer')
For those of you new to this newsletter - stay connected on LinkedIn, follow on Twitter, and follow on Facebook Page.
To support Data with Vividus in delivering weekly content, Click HERE to support a coffee. Your support means the world!
This post is public so feel free to share it, Thank you for reading…