How to validate CSV (find broken rows fast)
Most “CSV conversion” failures are actually CSV structure failures: wrong delimiter, broken quoting, embedded newlines, or rows that don’t match the header column count. This guide shows a practical workflow to find the exact row that breaks parsing.
What “invalid CSV” usually means
CSV has a spec (RFC 4180), but files in the wild rarely follow it strictly. “Invalid” usually means: a parser can’t produce a stable table because it can’t reliably tell where a row ends or how many columns belong in each row.
In practice, almost every CSV problem is one of these:
- Wrong delimiter (comma vs semicolon vs tab vs pipe).
- Broken quotes (missing closing quote, or unescaped quotes).
- Embedded newlines in a field without quoting.
- Row/column mismatch (some rows split into more/fewer columns than the header).
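Each of these failure modes is easy to reproduce with Python’s standard csv module. A minimal sketch (the inline samples are hypothetical):

```python
import csv
from io import StringIO

def parse(text):
    """Parse CSV text with default settings and return the list of rows."""
    return list(csv.reader(StringIO(text)))

# Wrong delimiter: semicolons are not split, so every row has one column.
print(parse("a;b;c\n1;2;3"))  # [['a;b;c'], ['1;2;3']]

# Row/column mismatch: an extra comma makes a 3-field row under a 2-field header.
print(parse("a,b\n1,2,3"))    # [['a', 'b'], ['1', '2', '3']]

# Broken quoting: an unescaped quote merges text the author meant as separate fields.
print(parse('name,note\nBob,"5" tall,x'))
```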
Step 1: confirm the delimiter
If your file looks like it has only one column after conversion, the delimiter is probably wrong. Excel exports use semicolons in locales where the comma is the decimal separator, and TSV uses a tab.
Quick check: split the header row by the delimiter and see if the column count makes sense.
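The stdlib can do this check for you: csv.Sniffer inspects a sample and guesses the dialect. A minimal sketch (the sample text is hypothetical):

```python
import csv

# A semicolon-delimited sample, as a regional Excel export might produce.
sample = "id;name;email\n1;Ada;a@example.com\n2;Bob;b@example.com\n"

# Restricting candidates makes the guess more reliable on short samples.
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
print(repr(dialect.delimiter))  # ';'

# Sanity check: splitting the header by the guessed delimiter gives a plausible count.
print(len(sample.splitlines()[0].split(dialect.delimiter)))  # 3
```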
Step 2: validate quoting rules
Double quotes are special in CSV. If a field contains the delimiter or a newline, the whole field must be wrapped in quotes. A literal quote inside a quoted field is escaped by doubling it: "".
A common hard error is an unterminated quoted field (a missing closing quote). Once a quote is “opened”, the parser keeps consuming text until it finds the closing quote, which can make the rest of the file look corrupted.
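Both behaviors are easy to see with the stdlib parser. In this sketch (inline samples are hypothetical), a correctly doubled quote parses cleanly, while a missing closing quote swallows the rest of the input into one field:

```python
import csv
from io import StringIO

# Correct: the field contains a delimiter and doubled literal quotes, so it is quoted.
good = 'name,quote\nAda,"She said ""hi"", then left"\n'
print(list(csv.reader(StringIO(good))))
# [['name', 'quote'], ['Ada', 'She said "hi", then left']]

# Broken: the unterminated quote consumes the newline and the next row.
bad = 'name,quote\nAda,"unterminated\nBob,ok'
print(list(csv.reader(StringIO(bad))))
# [['name', 'quote'], ['Ada', 'unterminated\nBob,ok']]
```

Note that Python’s reader only raises on this when constructed with strict=True; by default it silently returns the merged field, which is exactly why the damage can go unnoticed until conversion.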
Step 3: verify a stable column count
If the header has 12 columns, most rows should parse into 12 fields. Rows with 11 or 13 usually point to a specific break: an extra delimiter inside a value, a missing delimiter, or a newline splitting a row in half.
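Finding those rows is a short loop with the stdlib csv module. A minimal sketch (sample data is hypothetical); note the use of the reader’s line_num attribute, which stays accurate even when quoted fields span multiple physical lines:

```python
import csv
from io import StringIO

def find_mismatches(text, delimiter=","):
    """Yield (line_number, field_count) for rows that don't match the header width."""
    rows = csv.reader(StringIO(text), delimiter=delimiter)
    expected = len(next(rows))  # header defines the expected column count
    for row in rows:
        if len(row) != expected:
            yield rows.line_num, len(row)

sample = "id,name,email\n1,Ada,a@x\n2,Bob,extra,b@x\n3,Eve\n"
print(list(find_mismatches(sample)))  # [(3, 4), (4, 2)]
```

The first tuple points at a row with an extra delimiter inside a value; the second at a row that lost a field, e.g. to a stray newline.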
Use the CSV Validator tool
- Paste CSV into CSV Validator.
- Keep delimiter on Auto (or force the correct one).
- Leave First row is header on if your CSV has headers.
- Read the report: expected columns, distribution, and the first mismatched rows.
- Fix the first failing row (or re-export), then validate again.
Interpreting the report (patterns)
- Most rows have 1 column: delimiter is wrong.
- Some rows have more columns: an unquoted delimiter or a stray quote.
- Some rows have fewer columns: embedded newline or missing delimiter.
- Unterminated quotes: broken export; fix/re-export is usually faster than manual edits.
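The column-count distribution the report shows can be reproduced locally with collections.Counter. A sketch (the sample is hypothetical) showing the first pattern above, then the same file parsed with the right delimiter:

```python
import csv
from collections import Counter
from io import StringIO

# A semicolon-delimited file, first parsed with the (wrong) default comma.
sample = "a;b;c\n1;2;3\n4;5\n"
dist = Counter(len(row) for row in csv.reader(StringIO(sample)))
print(dist)   # Counter({1: 3}) -> every row has 1 column: delimiter is wrong

# Parsed with the correct delimiter, one short row remains to fix.
dist2 = Counter(len(row) for row in csv.reader(StringIO(sample), delimiter=";"))
print(dist2)  # Counter({3: 2, 2: 1})
```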
After validation: convert safely
Once the structure is stable, conversion becomes predictable. Convert CSV to JSON locally, then spot-check a few rows near the previously failing row number.
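Local CSV-to-JSON conversion is a couple of lines with the stdlib (a sketch; the input and field names are hypothetical):

```python
import csv
import json
from io import StringIO

text = "id,name\n1,Ada\n2,Bob\n"  # an already-validated CSV

# DictReader keys each row by the header, so the JSON carries field names.
records = list(csv.DictReader(StringIO(text)))
print(json.dumps(records, indent=2))
# [{"id": "1", "name": "Ada"}, {"id": "2", "name": "Bob"}], pretty-printed
```

Everything arrives as strings; convert numeric columns explicitly if the downstream consumer needs typed values.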
FAQ
Do I need a “real” CSV spec? Usually no. You just need a consistent delimiter and consistent quoting rules so every row parses the same way.
Is it safe to use online “CSV fixers”? If the file contains private data, avoid uploads. Use local/no-upload tools and verify via the browser Network tab.
Local verification snippet
Run a quick local check before export/convert:
```python
import csv
from io import StringIO

text = open("data.csv", encoding="utf-8").read()  # hypothetical path; use your file
sample = text[:50000]  # keep first chunk for fast local triage (may clip the last row)
rows = list(csv.reader(StringIO(sample)))
print('rows:', len(rows), 'columns(first row):', len(rows[0]) if rows else 0)
```
Quick fix checklist
- Reproduce the error on a minimal input (a few rows around the first failing line).
- Check the delimiter, the quoting, and the per-row column count.
- Apply the smallest safe fix, or re-export from the source system.
- Re-validate on the full, production-like file before converting.