VLOOKUP is a powerful function used extensively in spreadsheet software like Microsoft Excel to search for a value in a table and return the corresponding data from another column. In this article, we will discuss how to achieve similar functionality in Python using the popular data manipulation library, Pandas, with the help of .map() and .merge() methods. 

Loading a Sample Pandas DataFrame

I’ve included a sample Pandas DataFrame below so that you can follow the instruction line-by-line. Simply copy the code and paste it in your preferred code editor. Although your results will undoubtedly differ, feel free to use your own DataFrame if you have one.

				
					import pandas as pd

# create a dictionary of data
data = {'Name': ['John', 'Emma', 'Peter', 'Hannah'],
        'Age': [25, 30, 21, 35],
        'Country': ['USA', 'Canada', 'Australia', 'UK']}

# create a dataframe from the dictionary
df = pd.DataFrame(data)

# print the dataframe
print(df)

				
			

Output:

				
					#      Name  Age    Country
# 0    John   25        USA
# 1    Emma   30     Canada
# 2   Peter   21  Australia
# 3  Hannah   35         UK
				
			

In this example, the DataFrame has three columns: ‘name’, ‘age’, and ‘country, and four rows with the corresponding values. You can customize the data and column names to fit your need.

Using the .map() Method to Replicate VLOOKUP

The .map() function in Pandas enables us to associate values with a Pandas series or a column within a DataFrame. To map values to a column in a Pandas DataFrame, we can use a dictionary. In this case, the dictionary’s key represents the value in the column, while the dictionary’s value is the new value we want to assign.

Here’s an example:

				
					import pandas as pd

# create a dictionary of lookup values and results
country_map = {'USA': 'United States', 'Canada': 'Canada', 'Australia': 'Australia', 'UK': 'United Kingdom'}

# create a dataframe
data = {'Name': ['John', 'Emma', 'Peter', 'Hannah'],
        'Age': [25, 30, 21, 35],
        'Country': ['USA', 'Canada', 'Australia', 'UK']}
df = pd.DataFrame(data)

# use .map() method to replace values in the 'Country' column
df['Country'] = df['Country'].map(country_map)

# print the updated dataframe
print(df)

				
			

Output:

				
					     Name  Age         Country
0    John   25   United States
1    Emma   30          Canada
2   Peter   21       Australia
3  Hannah   35  United Kingdom
				
			

In this code, we first create a dictionary country_map with the lookup values and their corresponding results. We then create a dataframe df with the ‘Name’, ‘Age’, and ‘Country’ columns.

Next, we use the .map() method to replace the values in the ‘Country’ column with their corresponding results from the country_map dictionary. The updated values are then stored back in the ‘Country’ column of the dataframe.

Finally, we print the updated dataframe using the print() function.

This method can be used to replace any set of values in a column with their corresponding results.

Using Pandas .merge() Method to Replicate VLOOKUP

There might be instances when you’re dealing with extensively normalized data tables that need to be combined. Thankfully, Pandas supplies a remarkably useful function called .merge(), which permits us to merge two DataFrames effectively.

Here’s an example:

				
					import pandas as pd

# create a dataframe with lookup values and results
country_df = pd.DataFrame({'Country': ['USA', 'Canada', 'Australia', 'UK'],
                           'Country Name': ['United States', 'Canada', 'Australia', 'United Kingdom']})

# create a dataframe
data = {'Name': ['John', 'Emma', 'Peter', 'Hannah'],
        'Age': [25, 30, 21, 35],
        'Country': ['USA', 'Canada', 'Australia', 'UK']}
df = pd.DataFrame(data)

# merge the two dataframes based on the 'Country' column
merged_df = pd.merge(df, country_df, on='Country', how='left')

# print the merged dataframe
print(merged_df)

				
			

Output:

				
					     Name  Age    Country    Country Name
0    John   25        USA   United States
1    Emma   30     Canada          Canada
2   Peter   21  Australia       Australia
3  Hannah   35         UK  United Kingdom
				
			

In this code, we first create a separate dataframe country_df with the lookup values and their corresponding results. The dataframe has two columns – ‘Country’ and ‘Country Name’.

Next, we create the original dataframe df with the ‘Name’, ‘Age’, and ‘Country’ columns.

Then, we use the .merge() method to merge the two dataframes based on the ‘Country’ column. We use the on parameter to specify the common key to merge on, and the how parameter to specify the type of merge we want (in this case, a left join).

The merged dataframe is stored in the variable merged_df. We can then print it using the print() function.

This method can be used to merge any two dataframes based on a common key, and replace any set of values in a column with their corresponding results from the merged dataframe.

Wrap up

Pandas simplifies the process of replicating VLOOKUP-style functions, addressing many of the limitations VLOOKUP has, such as only merging on the leftmost column.

In this tutorial, you discovered how to utilize Python and Pandas to mimic the widely used Excel VLOOKUP function. The VLOOKUP function facilitates a left-join between two tables, enabling you to retrieve values from an additional table.

You discovered how to apply the Pandas .map() method for associating a dictionary with another Pandas DataFrame column. Furthermore, you learned the usage of the Pandas merge() function, which permits the combination of two DataFrames based on either a single key or multiple keys.

Both methods can be used to replace any set of values in a column with their corresponding results, and which method to choose depends on the specific use case and the structure of the data.

Here you can find the official documentation for .map() and .merge().

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html
https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.merge.html


Thanks for reading. Happy coding!