Pandas DataFrame.drop_duplicates()

The drop_duplicates() method handles a common data cleaning task: removing duplicate rows from a DataFrame. By default, a row counts as a duplicate when its values in every column match those of an earlier row.

Syntax

DataFrame.drop_duplicates(subset=None, keep='first', inplace=False)

Parameters

subset: A column label or a list of column labels. Only these columns are considered when identifying duplicates. The default is None, which means all columns are compared.
keep: Controls which of the duplicate rows, if any, is kept. It accepts three values; the default is 'first':
- first: Drops the duplicate rows except for the first occurrence.
- last: Drops the duplicate rows except for the last occurrence.
- False: Drops every row that has a duplicate, including the first occurrence.
inplace: Takes a boolean value. The default is False.

If it is True, the duplicate rows are removed from the original DataFrame itself and the method returns None. If it is False, the original DataFrame is left unchanged and a new DataFrame is returned.
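The effect of subset and keep can be seen in a minimal sketch like the one below. The sample data is hypothetical (it is not part of the example that follows): "Parker" appears twice with different ages, so the rows are duplicates only when the Name column alone is compared.

```python
import pandas as pd

# Hypothetical sample data: "Parker" appears twice with different ages.
df = pd.DataFrame({
    "Name": ["Parker", "Smith", "William", "Parker"],
    "Age": [21, 32, 29, 40],
})

# Compare only the "Name" column; keep the first row of each duplicate group.
first = df.drop_duplicates(subset="Name", keep="first")

# Same comparison, but keep the last row of each duplicate group.
last = df.drop_duplicates(subset="Name", keep="last")

# keep=False drops every row whose "Name" occurs more than once.
none_kept = df.drop_duplicates(subset="Name", keep=False)

print(first.index.tolist())      # [0, 1, 2]
print(last.index.tolist())       # [1, 2, 3]
print(none_kept.index.tolist())  # [1, 2]
```

Note that the surviving rows keep their original index labels in every case.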

Return

It returns a new DataFrame with the duplicate rows removed. If inplace=True is passed, it modifies the original DataFrame and returns None instead.
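The difference in return value can be checked with a short sketch. The variable names and sample data here are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data with one fully duplicated row.
df = pd.DataFrame({"Name": ["Parker", "Smith", "Parker"], "Age": [21, 32, 21]})

# Default (inplace=False): a new DataFrame is returned, df is unchanged.
deduped = df.drop_duplicates()
print(len(df), len(deduped))  # 3 2

# inplace=True: df itself is modified and the call returns None.
result = df.drop_duplicates(inplace=True)
print(result)   # None
print(len(df))  # 2
```

Because the in-place call returns None, writing df = df.drop_duplicates(inplace=True) would replace the DataFrame with None. Use either the assignment or inplace=True, not both.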

Example

import pandas as pd
emp = {"Name": ["Parker", "Smith", "William", "Parker"],
"Age": [21, 32, 29, 21]}
info = pd.DataFrame(emp)
print(info)

Output

        Name     Age
0     Parker     21
1     Smith      32
2     William    29
3     Parker     21

import pandas as pd
emp = {"Name": ["Parker", "Smith", "William", "Parker"],
"Age": [21, 32, 29, 21]}
info = pd.DataFrame(emp)
# Row 3 repeats row 0 in every column, so it is dropped (keep='first' is the default)
info = info.drop_duplicates()
print(info)

Output

       Name    Age
0    Parker    21
1    Smith     32
2    William   29
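To preview which rows would be removed before dropping them, one option is the related DataFrame.duplicated() method, which returns a boolean mask. The sketch below reuses the data from the example above and assumes the default keep='first'.

```python
import pandas as pd

info = pd.DataFrame({
    "Name": ["Parker", "Smith", "William", "Parker"],
    "Age": [21, 32, 29, 21],
})

# duplicated() marks each row that repeats an earlier row (keep="first" by default).
mask = info.duplicated()
print(mask.tolist())  # [False, False, False, True]

# Filtering out the marked rows gives the same rows as drop_duplicates().
filtered = info[~mask]
print(filtered.equals(info.drop_duplicates()))  # True
```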
