
How to Clean Phone Numbers in Google Sheets (And Why It Breaks at Scale)

Google Sheets can handle basic data cleanup with formulas, but E.164 phone formatting, smart deduplication, and large datasets expose its limits. Here is what Sheets can do, what it cannot, and how to handle the rest.

March 2026·12 min read

What Google Sheets Can Do for Data Cleaning

Google Sheets is the first tool most people reach for when they need to clean data, and for good reason. It is free, it is familiar, it runs in any browser, and it includes a set of built-in functions that handle basic cleaning tasks competently. Before we discuss its limitations, it is important to acknowledge what Sheets does well.

TRIM and CLEAN: Removing Whitespace and Control Characters

The TRIM() function removes leading and trailing spaces from text and collapses repeated spaces between words into a single space. The CLEAN() function removes non-printable ASCII characters (control characters with codes 0-31). Together, they handle the most basic level of text cleanup, and for simple whitespace issues they work well.

=TRIM(CLEAN(A2))

Removes extra spaces and control characters from the text in cell A2
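
If you script part of your cleanup outside Sheets, the same step can be approximated in a few lines of Python. This is a minimal sketch, not an exact reimplementation of the Sheets functions; the name trim_clean is made up for illustration.

```python
import re

def trim_clean(text: str) -> str:
    """Rough equivalent of =TRIM(CLEAN(A2)) for a single value."""
    # CLEAN: drop non-printable ASCII control characters (codes 0-31).
    no_controls = re.sub(r"[\x00-\x1f]", "", text)
    # TRIM: collapse runs of spaces, then strip leading and trailing spaces.
    return re.sub(r" +", " ", no_controls).strip(" ")

print(trim_clean("  Jane \t  Doe\n "))  # Jane Doe
```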

SUBSTITUTE: Find and Replace

SUBSTITUTE() replaces specific text strings. It is useful for removing unwanted characters from phone numbers, like dashes, dots, and parentheses. You can nest multiple SUBSTITUTE calls to strip several characters at once.

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"-",""),".","")," ","")

Removes dashes, dots, and spaces from a phone number
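
For comparison, the same three replacements outside Sheets might look like the Python sketch below. It assumes you only want to delete dashes, dots, and spaces, exactly as the nested formula does; strip_separators is a hypothetical helper name.

```python
# Characters to delete, mirroring the three nested SUBSTITUTE calls.
STRIP_CHARS = str.maketrans("", "", "-. ")

def strip_separators(phone: str) -> str:
    """Delete dashes, dots, and spaces; leave every other character alone."""
    return phone.translate(STRIP_CHARS)

print(strip_separators("555.123-4567"))  # 5551234567
```

Like the formula, this leaves parentheses and any other stray characters in place, which is why the pattern-based approach in the next section is usually the better fit.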

REGEXREPLACE: Pattern-Based Cleanup

REGEXREPLACE() is the most powerful built-in cleaning function. It uses regular expressions to find and replace patterns in text. You can strip everything except digits and the plus sign from a phone number with a single formula.

=REGEXREPLACE(A2, "[^0-9+]", "")

Strips everything except digits and + from a phone number
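
The same character class works unchanged in most regex engines. Here is a small Python sketch (the function name is an assumption) that also shows what stripping alone does not fix:

```python
import re

def keep_digits_and_plus(phone: str) -> str:
    """Same pattern as the formula: delete anything that is not 0-9 or +."""
    return re.sub(r"[^0-9+]", "", phone)

print(keep_digits_and_plus("(555) 123-4567"))      # 5551234567
print(keep_digits_and_plus("+44 (0)7700 900000"))  # +4407700900000
```

The second result still contains the domestic trunk 0 after the country code, so it is not a valid E.164 number even though every formatting character is gone.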

UPPER, LOWER, PROPER: Case Standardization

Case functions convert text to uppercase, lowercase, or title case. PROPER() capitalizes the first letter of each word, which works well for names in most cases (though it fails on names like "McDonald" or "O'Brien").

Remove Duplicates: Built-In Menu Option

Google Sheets has a built-in Data > Data cleanup > Remove duplicates menu option that identifies and removes duplicate rows based on the columns you select. For simple exact-match deduplication, it works. You select your range, choose which columns to compare, and Sheets deletes the duplicates.

What Google Sheets Cannot Do (Or Does Poorly)

The functions above handle surface-level cleaning. But real-world data cleaning requires capabilities that Google Sheets fundamentally lacks. Here are the critical gaps.

E.164 Phone Number Formatting

Stripping formatting characters from a phone number is easy in Sheets. Actually converting a number to proper E.164 format is not. E.164 conversion requires knowing the country of origin to add the correct country code, removing trunk prefixes (the leading 0 used in UK, Australian, German, and many other countries' domestic dialing), and validating that the resulting number has the correct length for its country. No combination of REGEXREPLACE, SUBSTITUTE, and IF statements can reliably handle this for mixed international data.

Consider the challenge: your CSV has a mix of US numbers formatted as (555) 123-4567, UK numbers as 07700 900000, and Australian numbers as 0412 345 678. To convert these to E.164, you need to detect the country of each number, strip the trunk prefix where applicable, prepend the country code, and validate the result. In Google Sheets, this would require a nested IF formula checking number length and patterns, plus a lookup table of country codes, plus manual intervention for ambiguous cases. It would be fragile, slow, and wrong for edge cases.
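
To make the steps concrete, here is a minimal Python sketch of that pipeline. It assumes the country of each row is already known and passed in as region, which is the hard part in real data. The REGION_RULES table and the to_e164 name are hypothetical and cover only the three countries in the example, with simplified length rules.

```python
import re
from typing import Optional

# Hypothetical, simplified rules for illustration only. A real table needs
# every country and all of its valid national number lengths.
REGION_RULES = {
    "US": {"code": "1", "trunk": "", "lengths": (10,)},
    "GB": {"code": "44", "trunk": "0", "lengths": (10,)},
    "AU": {"code": "61", "trunk": "0", "lengths": (9,)},
}

def to_e164(raw: str, region: str) -> Optional[str]:
    """Return an E.164 string, or None when the value needs manual review."""
    rules = REGION_RULES.get(region)
    if rules is None:
        return None
    cleaned = re.sub(r"[^0-9+]", "", raw)
    if cleaned.startswith("+"):
        # Already international: accept only the expected country code.
        digits = cleaned[1:]
        if not digits.startswith(rules["code"]):
            return None
        national = digits[len(rules["code"]):]
    else:
        national = cleaned
        # Strip the domestic trunk prefix where the region uses one.
        if rules["trunk"] and national.startswith(rules["trunk"]):
            national = national[len(rules["trunk"]):]
    # Validate: digits only, and a length this region allows.
    if not national.isdigit() or len(national) not in rules["lengths"]:
        return None
    return "+" + rules["code"] + national

print(to_e164("(555) 123-4567", "US"))  # +15551234567
print(to_e164("07700 900000", "GB"))    # +447700900000
print(to_e164("0412 345 678", "AU"))    # +61412345678
```

Returning None instead of guessing keeps questionable numbers out of the cleaned column, at the cost of a manual review queue.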

Side-by-side: Google Sheets vs NoSheet

Google Sheets: 3 formulas across 3 helper columns, plus 2 manual steps

B2: =REGEXREPLACE(A2,"[^0-9]","")

C2: =IF(LEN(B2)=10,"+1"&B2, IF(LEN(B2)=11,"+"&B2, IF(LEFT(B2,2)="44","+"&B2, "MANUAL")))

D2: =IF(C2="MANUAL","REVIEW",C2)

Then: Copy formulas down all rows

Then: Paste-as-values, delete helper columns

NoSheet: One click

1. Upload CSV

2. Select phone column

3. Click "Format to E.164"

4. Download cleaned file

Handles all countries, trunk prefixes, validation

Smart Date Format Detection Across Mixed Formats

Google Sheets attempts to parse dates automatically, but its parser makes assumptions that can silently corrupt your data. The date "01/02/2026" will be interpreted as January 2nd in a US-locale sheet and February 1st in a UK-locale sheet. If your CSV contains dates from international sources with mixed formats (some MM/DD/YYYY, some DD/MM/YYYY, some YYYY-MM-DD), Sheets will parse every value according to the spreadsheet's single locale setting, with no way to specify a format per row. Any date whose first two fields are both 12 or lower (like 03/04/2026) is valid in either order, so it will be silently misinterpreted with no warning.
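
The ambiguity is easy to demonstrate in code. The Python sketch below is an illustration of the problem, not any particular tool's detection logic. It labels a slash-separated date by which field orders are possible, and it checks only field ranges, not whether the day exists in that month.

```python
import re

def classify_slash_date(value: str) -> str:
    """Return 'MDY', 'DMY', 'ambiguous', or 'invalid' for an NN/NN/YYYY string."""
    match = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", value.strip())
    if not match:
        return "invalid"
    first, second = int(match.group(1)), int(match.group(2))
    # A field can be a month if it is 1-12 and a day if it is 1-31.
    mdy_possible = 1 <= first <= 12 and 1 <= second <= 31
    dmy_possible = 1 <= first <= 31 and 1 <= second <= 12
    if mdy_possible and dmy_possible:
        return "ambiguous"
    if mdy_possible:
        return "MDY"
    if dmy_possible:
        return "DMY"
    return "invalid"

print(classify_slash_date("03/04/2026"))  # ambiguous
print(classify_slash_date("25/12/2026"))  # DMY
print(classify_slash_date("12/25/2026"))  # MDY
```

Values labeled ambiguous are the ones worth routing to a human or resolving from context, such as the source system that produced the row.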

NoSheet's date standardizer detects the actual format of each value in the column using pattern analysis and converts all dates to your chosen output format. It flags ambiguous dates for review rather than guessing. Read more in our guide to standardizing dates in spreadsheets.

Email Validation Beyond Syntax

Google Sheets has no built-in email validation function. You can write a REGEXMATCH formula to check basic syntax (does it contain @ and a dot?), but that catches only the most egregious errors. It will not detect common domain typos (gmial.com, yaho.com, outlok.com), disposable email addresses, or role-based addresses (info@, admin@, noreply@) that will harm your deliverability. Real email validation requires domain-level checks that are impossible within a spreadsheet formula.
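
As a rough illustration of checks that go past syntax, here is a Python sketch that flags the typo domains and role prefixes mentioned above. The lookup tables are hypothetical and deliberately tiny, and the sketch does no DNS or mailbox verification, so treat it as a first-pass filter only.

```python
import re
from typing import List

# Hypothetical lookup tables for illustration; real lists are much longer.
DOMAIN_TYPOS = {
    "gmial.com": "gmail.com",
    "yaho.com": "yahoo.com",
    "outlok.com": "outlook.com",
}
ROLE_PREFIXES = {"info", "admin", "noreply"}
# Basic shape only: something@something.something with no spaces.
SYNTAX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def check_email(address: str) -> List[str]:
    """Return a list of issues found; an empty list means nothing was flagged."""
    email = address.strip().lower()
    if not SYNTAX.match(email):
        return ["bad syntax"]
    local, domain = email.rsplit("@", 1)
    issues = []
    if domain in DOMAIN_TYPOS:
        issues.append("possible typo: " + DOMAIN_TYPOS[domain])
    if local in ROLE_PREFIXES:
        issues.append("role-based address")
    return issues

print(check_email("jane@gmial.com"))    # ['possible typo: gmail.com']
print(check_email("info@example.com"))  # ['role-based address']
```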

Intelligent Deduplication

The built-in Remove Duplicates feature does exact matching only. If your CSV has "John Smith" and "john smith" and "JOHN SMITH", Sheets treats these as three distinct records. If one row has "john@example.com" and another has "john@example.com " (with a trailing space), Sheets will keep both. Real deduplication needs case-insensitive matching, whitespace normalization, and the ability to choose which duplicate to keep (most complete, most recent, first occurrence). Sheets offers none of this.
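
The matching rules this paragraph calls for can be sketched in Python as follows. The sketch assumes rows are dictionaries of strings and keeps the most complete row per key, with ties going to the first occurrence. Both function names are made up for illustration.

```python
from typing import Dict, List

def normalize(value: str) -> str:
    """Comparison key: collapse whitespace and ignore case."""
    return " ".join(value.split()).casefold()

def filled_count(row: Dict[str, str]) -> int:
    """Number of non-blank fields, used as a completeness score."""
    return sum(1 for v in row.values() if v.strip())

def dedupe(rows: List[Dict[str, str]], key_field: str) -> List[Dict[str, str]]:
    """Keep one row per normalized key, preferring the most complete row."""
    best: Dict[str, Dict[str, str]] = {}
    for row in rows:
        key = normalize(row.get(key_field, ""))
        current = best.get(key)
        # Replace only when the new row is strictly more complete.
        if current is None or filled_count(row) > filled_count(current):
            best[key] = row
    return list(best.values())

rows = [
    {"email": "john@example.com", "name": ""},
    {"email": "John@Example.com ", "name": "John Smith"},
    {"email": "amy@example.com", "name": "Amy"},
]
print(len(dedupe(rows, "email")))  # 2
```

Swapping filled_count for a timestamp comparison would give a "most recent" policy instead.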

Handling 100K+ Rows

This is where Sheets hits its ceiling. Google Sheets has a documented limit of 10 million cells per spreadsheet, but the practical performance limit hits much earlier. At 50,000 rows, applying formulas to every row causes noticeable lag. At 100,000 rows, the sheet becomes sluggish to the point of being unusable. Scrolling stutters, formulas take seconds to recalculate, and the browser tab may crash entirely. If you are cleaning a dataset with more than 50,000 rows, Google Sheets is not a viable tool.

The 10 Million Cell Limit and the Performance Cliff

Google Sheets officially supports 10 million cells per spreadsheet. For a file with 20 columns, that means a maximum of 500,000 rows. But the official limit and the practical limit are very different things. Sheets was designed as a collaborative spreadsheet tool, not a data processing engine. Every cell is tracked for real-time collaboration, version history, and formula dependencies. That overhead adds up fast.

In practice, performance degrades in a predictable pattern. At 10,000 rows, Sheets works fine for most operations. At 25,000 rows, complex formulas (especially ARRAYFORMULA applied to entire columns) start to slow down. At 50,000 rows, you will notice lag on every operation, including scrolling, sorting, and filtering. At 100,000 rows, the sheet becomes essentially non-functional for interactive use. Formula recalculation can take 30 seconds or more, and the browser tab may show "Page Unresponsive" warnings.

This matters for data cleaning because cleaning operations are inherently row-by-row. If you need to apply TRIM, REGEXREPLACE, and a date conversion formula to every row, you are adding three formula columns. For a 50,000-row dataset, that is 150,000 additional cells of live formulas, all recalculating every time you edit a cell. The sheet grinds to a halt.
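
The arithmetic behind these figures is simple enough to check in a few lines of Python. The helper names are illustrative, and the cell limit is the 10 million figure cited above.

```python
CELL_LIMIT = 10_000_000  # per-spreadsheet cell limit cited in this article

def max_rows(columns: int) -> int:
    """Largest row count that fits under the cell limit for a column count."""
    return CELL_LIMIT // columns

def formula_cells(rows: int, helper_columns: int) -> int:
    """Live formula cells added by filling helper columns down every row."""
    return rows * helper_columns

print(max_rows(20))              # 500000
print(formula_cells(50_000, 3))  # 150000
print(max_rows(20 + 3))          # 434782
```

The last line shows a side effect of helper columns: adding three of them to a 20-column file lowers the theoretical row ceiling from 500,000 to 434,782.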

Apps Script Custom Functions: Powerful but Limited

Google Apps Script lets you write custom JavaScript functions that run inside Sheets. In theory, you could write a custom function that formats phone numbers to E.164, validates email addresses, or performs intelligent deduplication. In practice, Apps Script has severe limitations that make it impractical for data cleaning at scale.

Execution time limits: Custom functions have a 30-second execution timeout per call. A function that handles a single cell rarely hits it, but thousands of per-cell calls are slow, and a function that accepts a whole range to cut down on calls must finish the entire range within those 30 seconds. Moving the work to a trigger-based script brings its own quota: Google limits total trigger runtime to 90 minutes per day for consumer accounts and 6 hours for Workspace accounts.

Speed: Custom functions are JavaScript running on Google's servers, and when used as an ordinary per-cell formula they are invoked once per cell. Each call has network latency overhead. Processing 10,000 cells with a custom function can take minutes, during which the affected cells sit in a loading state and the sheet is sluggish. Compare that to a compiled Rust backend that processes 500,000 rows in under 5 seconds.

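Limited service access from custom functions: Custom functions (those invoked with =MYFUNCTION()) can only call Apps Script services that do not require user authorization. UrlFetchApp is one of them, so HTTP requests are possible, but each request counts against the daily URL Fetch quota and must fit inside the 30-second limit. That makes per-cell lookups against an email validation API or an external country-code database impractical for large columns. For bulk work you need a menu- or trigger-based script instead, which adds complexity and removes the formula-like simplicity.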

Why NoSheet Fills the Gap

NoSheet was built to handle exactly the tasks that Google Sheets cannot do well or cannot do at all. It is browser-based like Sheets, so there is nothing to install. It is no-code like Sheets, so there are no formulas to write. But under the surface, it uses a Rust-powered backend that processes data at native speed, with no cell count limits and no formula recalculation overhead.

Where Google Sheets requires three formulas across three helper columns plus manual cleanup to attempt phone number formatting (and still fails on international numbers), NoSheet does it with a single click. Where Sheets' date parser silently misinterprets ambiguous dates, NoSheet detects the actual format and flags ambiguous values. Where Sheets' Remove Duplicates only does exact matching, NoSheet deduplicates with case-insensitive, whitespace-normalized matching and lets you choose which row to keep.

Most importantly, NoSheet handles large files without breaking a sweat. A 200,000-row CSV that would crash a Google Sheet processes in seconds. There is no 10 million cell limit, no formula recalculation lag, and no browser tab crashes. You upload the file, apply your cleaning operations, and download the result. The data never hits a performance cliff because it is processed by compiled code on the server, not by a JavaScript spreadsheet engine in your browser.

Google Sheets remains an excellent tool for collaborative editing, quick calculations, and working with small datasets. For data cleaning specifically, its limitations become apparent the moment your requirements go beyond basic text functions. NoSheet picks up exactly where Sheets leaves off. Also check out our comparison against Excel Power Query to see how NoSheet compares to Microsoft's data cleaning tools, or browse our guide to no-code data cleaning for a broader perspective on the space.
