Data Cleaning
Fix Zip Codes Losing Leading Zeros in CSV Files
Excel silently strips leading zeros from zip codes, turning "01234" into "1234" and breaking your imports. Learn which states are affected, how to prevent the problem, and how to restore lost zeros in bulk.
Why Excel Strips Leading Zeros from Zip Codes
When you open a CSV file in Microsoft Excel, Excel analyzes each column and automatically assigns a data type. If a column contains values that look like numbers, Excel formats them as numbers. The zip code "01234" looks like the number 1234 to Excel, so it silently strips the leading zero. The same thing happens to "00501" (Holtsville, NY), "02134" (Boston, MA), and every other zip code that starts with zero.
This is not a bug. It is Excel working exactly as designed. Numbers do not have leading zeros in mathematics, so Excel removes them to normalize the value. The problem is that zip codes are not numbers. They are identifiers. A five-digit zip code is a fixed-length string where every character matters, including leading zeros. "01234" and "1234" are not the same zip code. In fact, "1234" is not a valid US zip code at all.
The damage happens the moment you open the CSV in Excel. Even if you immediately save and close the file without making any changes, the leading zeros are gone. Excel modifies the data on open, not on save. This catches people off guard because they assume that opening a file to "just take a look" is a read-only operation. With Excel and CSVs, it is not.
Which US States and Territories Are Affected
Any zip code starting with zero is affected, which covers the entire northeastern United States and several territories. The problem impacts a significant percentage of the US population. Here is the complete list of affected zip code ranges and their states:
| Zip Range | State / Territory | Example |
|---|---|---|
| 00501-00544 | New York (IRS) | 00501 Holtsville |
| 00600-00988 | Puerto Rico | 00901 San Juan |
| 01001-02791 | Massachusetts | 02134 Boston (Allston) |
| 02801-02940 | Rhode Island | 02860 Pawtucket |
| 03031-03897 | New Hampshire | 03101 Manchester |
| 03901-04992 | Maine | 04101 Portland |
| 05001-05907 | Vermont | 05401 Burlington |
| 06001-06928 | Connecticut | 06101 Hartford |
| 07001-08989 | New Jersey | 07102 Newark |
| 09000-09898 | Military (APO/FPO) | 09001 APO AE |
That is over 40 million people whose zip codes start with zero. If your dataset contains contacts from the northeastern US, Puerto Rico, or military addresses, you almost certainly have affected records. And the problem compounds: ZIP+4 codes (like "01234-5678") also lose their leading zero, becoming "1234-5678" which fails validation in most systems.
How to Prevent Excel from Stripping Zeros
The best strategy is to prevent the damage in the first place. There are several approaches, depending on your workflow.
Method 1: Use Excel's Data Import Wizard
Instead of double-clicking a CSV to open it, use Excel's import function. In Excel, go to Data → Get Data → From File → From Text/CSV. In the import wizard, you can set each column's data type. Set the zip code column to "Text" and Excel will preserve the leading zeros. This works reliably but requires you to remember to use the import wizard every time instead of double-clicking.
Method 2: Pre-format the Column as Text
Open a blank workbook, select the column where zip codes will go, right-click, choose Format Cells, and set the format to "Text." Then paste or import your data into that column. Because the column is already formatted as text, Excel will not strip the zeros. The drawback is that this only works if you set up the formatting before the data arrives.
Method 3: Quote Values in the CSV
If you control how the CSV is generated, wrapping zip code values in double quotes and prepending an equals sign forces Excel to treat them as text. The value would look like this in the raw CSV: ="01234". This is a hack, and it embeds Excel-specific formatting into what should be a platform-neutral data file, but it works.
Method 4: Use Google Sheets Instead
Google Sheets does not aggressively auto-format columns. When you import a CSV into Google Sheets, it typically preserves leading zeros in zip codes. If your workflow allows it, using Google Sheets as your CSV viewer and editor avoids the problem entirely. However, downloading from Google Sheets and opening in Excel brings the problem back.
How to Restore Leading Zeros After They Have Been Stripped
If the damage is already done and your zip codes have been truncated, restoration is possible but requires care. The approach depends on whether you have a three-digit, four-digit, or five-digit value in the column.
The Padding Approach
Since US zip codes are always five digits, any zip code with fewer than five digits is missing leading zeros. You can pad them back using a formula or a text function.
// Excel formula to pad zip codes to 5 digits:
=TEXT(A2, "00000")
// Google Sheets equivalent:
=TEXT(A2, "00000")
// Python (pandas):
df['zip'] = df['zip'].astype(str).str.zfill(5)
// Results:
1234 → 01234
501 → 00501
2134 → 02134
90210 → 90210 (unchanged, already 5 digits)
The Validation Problem
Simple padding works in most cases, but there is a subtle risk. If the original data contained non-US postal codes (Canadian codes like "K1A 0B1", UK codes like "EC1A 1BB"), those will have been mangled by Excel in different ways. Padding a Canadian postal code fragment to five digits produces nonsense. Before bulk-padding, verify that the column actually contains US zip codes.
Another edge case is ZIP+4 codes. If Excel stripped the leading zero from "01234-5678," the result is "1234-5678." Padding the numeric portion to five digits gives "01234-5678," which is correct. But if the value was stored as a number and the hyphen caused Excel to interpret it differently, you may have a completely corrupted value that padding cannot fix.
Cross-Referencing with City and State
If your dataset includes city and state columns alongside zip codes, you can validate and correct zip codes by cross-referencing. A zip code of "2134" in a row with the city "Boston" and state "MA" is almost certainly "02134." The USPS maintains a complete database mapping zip codes to cities and states, and several free APIs provide lookup functionality.
This cross-reference approach is especially valuable when you are not sure whether a four-digit value originally had one leading zero or if a three-digit value had two. The city and state give you the context to make the right call.
Beyond Zip Codes: Other Data Excel Destroys
Leading zeros in zip codes are the most well-known casualty of Excel's auto-formatting, but they are not the only one. The same behavior affects several other data types that teams regularly handle in CSV files:
Phone numbers: A phone number like "0044 7911 123456" (UK format) loses the leading zero. In some countries, the leading zero is a critical part of the number. Our guide on how Excel changes phone numbers covers this problem in detail.
Product codes and SKUs: A SKU like "007842" becomes "7842," which may not match your inventory system. Product codes are identifiers, not numbers, and leading zeros matter.
Account numbers: Bank account numbers, insurance policy numbers, and employee IDs often have fixed-length formats with leading zeros. Excel strips all of them.
Dates: Excel also converts date-like strings (like "1-2" which might be a part number) into date values. Our date standardization guide addresses this related issue.
The Impact on Imports and Integrations
Stripped zip codes do not just look wrong. They cause real failures in downstream systems. Here are the most common consequences:
CRM import rejection: Salesforce, HubSpot, and most CRMs validate zip codes on import. A four-digit value that should be five digits will fail validation, and the entire row may be skipped. If you are importing 10,000 contacts and 15% have northeastern zip codes, you just lost 1,500 contacts from your import.
Shipping calculation errors: E-commerce platforms and shipping APIs use zip codes to calculate rates and delivery times. An invalid zip code means no rate quote, a wrong rate quote, or a failed shipment.
Tax calculation failures: Sales tax is determined by zip code in many jurisdictions. Wrong zip codes mean wrong tax calculations, which can create compliance issues.
Audience targeting problems: Facebook Ads, Google Ads, and direct mail campaigns that use zip code targeting will miss or mis-target customers in the entire northeastern US. Read our guide on increasing Facebook audience match rates to understand how data quality affects ad targeting.
How NoSheet Protects Your Zip Codes
NoSheet is not Excel. When you upload a CSV, NoSheet reads the file as raw text without any auto-formatting. Zip codes with leading zeros are preserved exactly as they appear in the original file. There is no type inference that silently converts text to numbers.
For files that have already been damaged by Excel, NoSheet's CSV Cleaner includes automatic zip code detection and repair. It identifies columns that contain zip code data (based on column headers, value patterns, and context from adjacent city/state columns), detects truncated values, and pads them back to five digits. For ZIP+4 codes, it handles the extended format as well.
The tool also validates repaired zip codes against the USPS database when city and state data is available, giving you confidence that the restored values are correct. This is significantly more reliable than blind padding, which can produce valid-looking but incorrect zip codes in edge cases.
If your zip code issues are part of a larger data quality problem, our complete CSV cleaning guide covers all the common issues you are likely to encounter, from encoding problems to duplicate records.
Never Lose a Leading Zero Again
Upload your CSV and NoSheet will detect truncated zip codes, restore leading zeros, and validate against city/state data. No formulas, no Excel damage.
Fix Your Zip Codes Now