How to Remove Duplicate Contacts When Exporting Gmail
Open a raw export of your inbox and the first thing you notice is repetition. The colleague you email daily is there forty times. Your accountant appears in every invoice thread. A newsletter shows up for every issue it ever sent. None of that is an error — it is simply what happens when you list emails one per row. But it makes for a useless contact list, so this guide explains where the duplicates come from and the fastest way to get rid of them.
Why duplicates happen in the first place
An email export and a contact list are two different things, and the confusion between them is the root cause. A message-level export answers "what emails do I have?" — naturally one row per email. A contact list answers "who do I correspond with?" — one row per person. When you want the second but export the first, every repeated conversation becomes a repeated row.
Several specific patterns inflate the count:
- Long threads. A single back-and-forth can be twenty messages, each from the same two people.
- Sent and received. The same contact appears both as a sender and as a recipient.
- Casing differences. Maria@acme.com and maria@acme.com are the same mailbox but look different to a naive match.
- Display name drift. "Maria Lopez", "Maria L." and "M. Lopez" can all sit on one address.
The fast fix: de-duplicate before export
The cleanest approach is to remove duplicates while the list is being built, not after. With Gmail Exporter the de-duplication is a single click before the CSV is written, matching on the email address so each unique contact appears once:
- Open the inbox, label or search you want to turn into a contact list.
- Click the de-duplicate option. Repeated addresses collapse to one row each.
- Export to CSV. The file you download is already clean — no spreadsheet surgery needed.
Because this happens inside your browser, the matching and cleanup never leave your device. It is the same flow used when building a clean email list from Gmail; de-duplication is the step that makes that list trustworthy.
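To make the idea of de-duplicating while the list is being built concrete, here is a minimal Python sketch. It is not Gmail Exporter's actual code; the message structure (a `from` pair and a list of `to` pairs) and the function name `build_contact_list` are assumptions made for illustration. It shows how keying on a normalized address collapses long threads, sent and received copies, and casing variants into one row.

```python
def build_contact_list(messages):
    """Collect one row per unique address from message-level records.

    Each message is assumed (hypothetically) to look like:
    {"from": (name, address), "to": [(name, address), ...]}
    """
    contacts = {}  # normalized address -> first display name seen
    for message in messages:
        # Senders and recipients are treated alike: both are correspondents.
        for name, address in [message["from"], *message["to"]]:
            key = address.strip().lower()
            if key and key not in contacts:
                contacts[key] = name.strip()
    return [{"email": email, "name": name} for email, name in contacts.items()]


# Three messages in one thread, with casing and display-name drift.
messages = [
    {"from": ("Maria Lopez", "Maria@acme.com"), "to": [("Me", "me@example.com")]},
    {"from": ("Me", "me@example.com"), "to": [("Maria L.", "maria@acme.com")]},
    {"from": ("Maria Lopez", "maria@acme.com"), "to": [("Me", "me@example.com")]},
]
contacts = build_contact_list(messages)
for contact in contacts:
    print(contact["email"], "|", contact["name"])
```

Six address mentions across three messages come out as two rows. The sketch keeps the first display name it sees for each address, which is one simple policy among several.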
De-duplicating an existing CSV in a spreadsheet
If you already have a messy file, you can still clean it. The trick is to normalize before you de-duplicate so near-matches collapse properly.
In Google Sheets
- Add a helper column with =LOWER(A2) and fill it down to lowercase every address.
- Copy that column and paste it back as values.
- Select the data and choose Data → Data cleanup → Remove duplicates, matching on the lowercased column.
In Excel
- Use =LOWER(A2) in a helper column, then paste as values.
- Select your range and click Data → Remove Duplicates.
- Tick only the email column so rows are judged by address, not by name.
Doing the lowercasing first is what catches the casing duplicates that a plain Remove Duplicates would otherwise miss. For more on getting the file into a spreadsheet cleanly, see exporting Gmail to Excel.
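If you prefer a script to a spreadsheet, the same normalize-then-de-duplicate idea can be sketched with Python's standard csv module. The column names `email` and `name` and the sample rows are assumptions; adjust them to match your file. The sketch also trims stray whitespace, which goes slightly beyond the lowercasing step described above.

```python
import csv
import io


def dedupe_rows(rows, email_field="email"):
    """Keep the first row for each address, comparing lowercased, trimmed values."""
    seen = set()
    unique = []
    for row in rows:
        key = row[email_field].strip().lower()
        if key in seen:
            continue  # a later repeat of an address we already kept
        seen.add(key)
        unique.append({**row, email_field: key})
    return unique


# Stand-in for a messy export; in practice, open your CSV file instead.
raw_csv = """email,name
Maria@acme.com,Maria Lopez
maria@acme.com,Maria L.
accountant@example.com,Sam Reed
 MARIA@ACME.COM ,M. Lopez
"""
rows = list(csv.DictReader(io.StringIO(raw_csv)))
clean = dedupe_rows(rows)
for row in clean:
    print(row["email"], "|", row["name"])
```

Without the normalization step, all three spellings of Maria's address would survive as separate rows.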
Choosing which duplicate to keep
When two rows share an address but differ elsewhere — one has a name, one does not — you want to keep the richer row. A simple way:
- Sort so the best row is on top. Sort by name (non-blank first) or by date (most recent first) before de-duplicating.
- Most tools keep the first occurrence, so sorting puts the row you want in that position.
- Spot-check role addresses. Decide deliberately whether to keep or drop info@, sales@ and noreply@.
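The sort-then-keep-first approach above can be sketched in Python as follows. The field names (`email`, `name`, `last_seen`) are hypothetical, and combining both sort rules (named rows first, then most recent) is one possible choice rather than the only sensible one. Role addresses are only flagged here, leaving the keep-or-drop decision to you.

```python
from datetime import date

# Prefixes named in the list above; extend to suit your own data.
ROLE_PREFIXES = ("info@", "sales@", "noreply@")


def keep_best_rows(rows):
    """Sort so the richest row comes first, then keep the first row per address."""
    ranked = sorted(
        rows,
        # Rows with a name sort before blank names; newer dates before older.
        key=lambda r: (r["name"].strip() == "", -r["last_seen"].toordinal()),
    )
    best = {}
    for row in ranked:
        best.setdefault(row["email"].lower(), row)  # first occurrence wins
    return list(best.values())


rows = [
    {"email": "maria@acme.com", "name": "", "last_seen": date(2024, 5, 1)},
    {"email": "Maria@acme.com", "name": "Maria Lopez", "last_seen": date(2024, 3, 10)},
    {"email": "maria@acme.com", "name": "Maria L.", "last_seen": date(2024, 4, 2)},
    {"email": "info@acme.com", "name": "", "last_seen": date(2024, 1, 15)},
]
best = keep_best_rows(rows)
role_rows = [r for r in best if r["email"].lower().startswith(ROLE_PREFIXES)]
for row in best:
    flag = " (role address, review)" if row in role_rows else ""
    print(row["email"], "|", row["name"] or "(no name)", flag)
```

In this sample the blank-name row for Maria is the most recent, but it loses to the named rows because the name rule is applied first.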
Before vs after
| Raw export | De-duplicated list |
|---|---|
| One row per message | One row per contact |
| Frequent contacts repeated dozens of times | Each person once |
| Mixed casing inflates the count | Casing normalized |
| Hard to import into a CRM | Clean, mappable columns |
Duplicates vs. near-duplicates
It is worth separating two problems people lump together. An exact duplicate is the same address appearing twice — easy to collapse on the email column. A near-duplicate is trickier: the same human reached at two different addresses, such as maria@acme.com and m.lopez@gmail.com. No tool can know those are the same person from the address alone, so de-duplication by email will, correctly, keep both. If merging people across addresses matters to you, add a "person" column and reconcile by name manually after the automatic pass. Trying to automate that step usually merges the wrong rows, so a quick human review is safer than an aggressive rule.
The same logic applies to role addresses. maria@acme.com and sales@acme.com may both reach Maria, but they are distinct mailboxes and should usually stay as separate rows unless you have a reason to fold them together. Treat de-duplication as removing literal repeats, and treat "merging people" as a separate, deliberate cleanup.
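For the manual review step, a script can help by pointing at candidates without merging anything. The sketch below, with a hypothetical `review_candidates` function and assumed `email` and `name` fields, groups rows by a normalized display name and reports only the names that appear on more than one address. It deliberately changes no rows, in line with the advice to leave the merge decision to a person.

```python
from collections import defaultdict


def review_candidates(rows):
    """Return display names that appear on more than one address.

    This only flags possible near-duplicates for a human to check;
    it never merges rows.
    """
    by_name = defaultdict(set)
    for row in rows:
        # Lowercase and collapse repeated spaces so trivial variants match.
        name = " ".join(row["name"].lower().split())
        if name:
            by_name[name].add(row["email"].lower())
    return {
        name: sorted(emails)
        for name, emails in by_name.items()
        if len(emails) > 1
    }


rows = [
    {"email": "maria@acme.com", "name": "Maria Lopez"},
    {"email": "m.lopez@gmail.com", "name": "maria  lopez"},
    {"email": "sales@acme.com", "name": "Acme Sales"},
]
candidates = review_candidates(rows)
for name, emails in candidates.items():
    print(name, "->", ", ".join(emails))
```

Matching on exact normalized names will miss variants such as "Maria L.", and it can pair two different people who share a name, which is why the output is a review list and not a merge.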
Keep it clean going forward
De-duplication is most valuable on contact lists, where repeats are pure noise. If your goal is instead a full record of every message — for a backup or an archive — you actually want every row, so do not de-duplicate that export. Match the cleanup to the job: read extracting all email addresses from Gmail for contact-style exports, and backing up your entire Gmail inbox when completeness matters more than tidiness.
Frequently asked questions
Why does my Gmail export have so many duplicates?
A raw export lists one row per message, so a contact you exchanged 30 emails with appears 30 times. De-duplication collapses those into one row per address.
How do I remove duplicates in one click?
Gmail Exporter de-duplicates by email address before the file is created, so the CSV you download already has one row per unique contact.
How do I remove duplicates in a spreadsheet?
In Sheets use Data → Data cleanup → Remove duplicates on the email column; in Excel use Data → Remove Duplicates. Lowercase the column first so different casings collapse.
Why do different capitalizations count as duplicates?
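In practice, email addresses are treated as case-insensitive: the domain part always is, and Gmail and nearly every other provider ignore case in the part before the @ as well. So Maria@acme.com and maria@acme.com are the same mailbox, and lowercasing before de-duplicating merges them.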
Will de-duplication delete data I need?
It removes repeat rows for the same contact, not your emails. For full message history, keep the message-level export; for a contact list, one row per person is correct.
Is the de-duplication done privately?
Yes. With a local extension the matching happens in your browser before the file is written — nothing is uploaded.