2882. Drop Duplicate Rows

DataFrame customers

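+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| customer_id | int    |
| name        | object |
| email       | object |
+-------------+--------+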

Instructions

  • There are some duplicate rows in the DataFrame based on the email column.
  • Write a solution to remove these duplicate rows and keep only the first occurrence.
  • The result format is in the following example.

Example

Input:

Output:

Explanation:

Alice (customer_id = 4) and Finn (customer_id = 5) both use john@example.com, so only the first occurrence of this email (Alice's row) is retained.

Submissions

python
import pandas as pd

def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    return customers.drop_duplicates(subset='email', keep='first')
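
A quick way to sanity-check the submission is to run it on a small hand-made DataFrame. The rows below are illustrative sample data, not the problem's own example tables; only the two rows sharing john@example.com mirror the case described in the example explanation.

```python
import pandas as pd

def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    return customers.drop_duplicates(subset='email', keep='first')

# Illustrative sample data (assumed for this sketch, not taken from the problem).
customers = pd.DataFrame({
    "customer_id": [1, 4, 5, 6],
    "name": ["Ella", "Alice", "Finn", "Violet"],
    "email": [
        "ella@example.com",
        "john@example.com",
        "john@example.com",   # repeats the email of customer_id 4
        "violet@example.com",
    ],
})

result = dropDuplicateEmails(customers)
print(result)

# customer_id 5 is dropped because its email already appeared on an earlier row.
print(result["customer_id"].tolist())  # [1, 4, 6]
```

The surviving rows keep their original index labels (0, 1 and 3 here). If a renumbered index is wanted, drop_duplicates also accepts ignore_index=True.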

Explanation

Python (Pandas)
Submitted by @noeyislearning
  • import pandas as pd: Import the pandas library to work with DataFrames.
  • def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame: Define a function called dropDuplicateEmails that takes the customers DataFrame as input and returns a DataFrame.
  • return customers.drop_duplicates(subset='email', keep='first'): Compare rows on the email column only and, for each repeated email, keep the first row in the original order. The result is a new DataFrame; the input customers is not modified.
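
To see what keep='first' does, the same result can be built by hand with a boolean mask from DataFrame.duplicated. This is only a sketch of an equivalent approach, not the submitted solution, and the sample rows are assumed for illustration.

```python
import pandas as pd

# Illustrative sample data (assumed for this sketch).
customers = pd.DataFrame({
    "customer_id": [1, 4, 5, 6],
    "name": ["Ella", "Alice", "Finn", "Violet"],
    "email": [
        "ella@example.com",
        "john@example.com",
        "john@example.com",
        "violet@example.com",
    ],
})

# duplicated() flags every row whose email already appeared on an earlier row;
# with keep="first", the first occurrence itself is NOT flagged.
is_repeat = customers.duplicated(subset="email", keep="first")
print(is_repeat.tolist())  # [False, False, True, False]

# Keeping only the unflagged rows gives the same rows as drop_duplicates.
deduped = customers[~is_repeat]
print(deduped.equals(customers.drop_duplicates(subset="email", keep="first")))  # True
```

Passing keep='last' instead would flag the earlier row and retain the later one, which is why the problem's "keep only the first occurrence" requirement maps to keep='first'.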