CSV Data Profiler
Drop a CSV and get an instant per-column profile: type, fill rate, distinct values, min / max and the most common values.
Frequently asked questions
- What does the data profiler tell me?
- For each column: the inferred type (number, boolean, date, string or empty), how many cells are filled and how many are empty, the number of distinct values, the min and max for numeric columns, and the three most common values with their counts.
- How does type inference work?
- We sample up to 200 non-empty values per column. A column is typed `number` / `boolean` / `date` only when ≥80% of those samples parse cleanly; otherwise it falls back to `string`. The 80% threshold means a few stray values do not demote an otherwise clean column to `string`.
- What date formats are detected?
- ISO (YYYY-MM-DD), ISO slash (YYYY/MM/DD), `DD-Mon-YYYY` (e.g. `02-Jan-2026`) and ISO datetime (only the date portion is used). Ambiguous slash formats, such as US MM/DD/YYYY versus European DD/MM/YYYY, need the dedicated date converter.
- Will my CSV be uploaded?
- No. The whole profile runs in your browser using Papa Parse, so your data never reaches our server.
- Why is fill rate useful?
- Empty cells are a leading indicator of optional fields, broken pipelines or upstream filter bugs. A column expected to be 100% filled but profiling at 87% usually means data is being dropped somewhere upstream.
- How big a file can I profile?
- Hundreds of thousands of rows usually finish in a few seconds. The footer counts every row exactly; only type sampling is bounded (200 samples per column).
- Why might the 'distinct' count seem too high?
- Whitespace differences and casing variants count as separate distinct values. Run the file through the Cleaner with 'trim whitespace' on for a more honest count.
- What does 'top values' show me?
- The three most common non-empty cell values per column, with their counts. Useful for spotting categorical columns, outliers and typos (e.g. a status column showing `paid: 9,812 · pending: 41 · paif: 2`).
- Are min/max shown for non-numeric columns?
- No — they only make sense for numeric columns and are omitted everywhere else.
- Can I export the profile as a report?
- Not yet. Copy the table into a sheet for now; a downloadable Markdown / PDF report is on the roadmap.
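
The sampling, threshold and per-column rules described in the answers above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual source: every name here (`inferType`, `profileColumn`, `ColumnProfile`) is hypothetical, and so are the order of the type checks, the accepted boolean tokens (`true` / `false` only), the date regexes, and the choice to treat whitespace-only cells as empty. It takes one column as an array of strings, such as a CSV parser might produce.

```typescript
// Hypothetical sketch of the profiling rules from the FAQ; not the tool's real code.
type ColumnType = "number" | "boolean" | "date" | "string" | "empty";

const SAMPLE_SIZE = 200; // FAQ: up to 200 non-empty values are sampled per column
const THRESHOLD = 0.8;   // FAQ: at least 80% of the sample must parse cleanly

// Assumed patterns for the date formats listed in the FAQ.
const DATE_PATTERNS: RegExp[] = [
  /^\d{4}-\d{2}-\d{2}(?:[T ].*)?$/, // ISO date, or ISO datetime (date prefix only)
  /^\d{4}\/\d{2}\/\d{2}$/,          // ISO slash
  /^\d{2}-(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)-\d{4}$/i, // DD-Mon-YYYY
];

const isFilled = (v: string): boolean => v.trim() !== "";
const isNumber = (v: string): boolean => isFilled(v) && Number.isFinite(Number(v));
const isBoolean = (v: string): boolean => /^(?:true|false)$/i.test(v.trim()); // assumed tokens
const isDate = (v: string): boolean => DATE_PATTERNS.some((p) => p.test(v.trim()));

function inferType(values: string[]): ColumnType {
  const sample = values.filter(isFilled).slice(0, SAMPLE_SIZE);
  if (sample.length === 0) return "empty";
  // Assumed check order: number, then boolean, then date.
  const checks: Array<[ColumnType, (v: string) => boolean]> = [
    ["number", isNumber],
    ["boolean", isBoolean],
    ["date", isDate],
  ];
  for (const [type, test] of checks) {
    const hits = sample.filter(test).length;
    if (hits / sample.length >= THRESHOLD) return type;
  }
  return "string"; // fallback when no type reaches the threshold
}

interface ColumnProfile {
  type: ColumnType;
  filled: number;
  empty: number;
  distinct: number;
  min?: number; // only set for numeric columns
  max?: number;
  top: Array<[string, number]>; // up to three [value, count] pairs
}

function profileColumn(values: string[]): ColumnProfile {
  const filledValues = values.filter(isFilled);
  // Raw values are counted as-is, so "Paid", "paid" and "paid " are all distinct.
  const counts = new Map<string, number>();
  for (const v of filledValues) counts.set(v, (counts.get(v) ?? 0) + 1);
  const top = Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 3);
  const type = inferType(values);
  const profile: ColumnProfile = {
    type,
    filled: filledValues.length,
    empty: values.length - filledValues.length,
    distinct: counts.size,
    top,
  };
  if (type === "number") {
    // Stray non-numeric cells are skipped; reduce avoids argument-count limits
    // that Math.min(...nums) can hit on very large columns.
    const nums = filledValues.filter(isNumber).map(Number);
    profile.min = nums.reduce((a, b) => Math.min(a, b));
    profile.max = nums.reduce((a, b) => Math.max(a, b));
  }
  return profile;
}
```

In this sketch the filled, empty, distinct and top-value counts scan the whole column while type inference looks at only the first 200 non-empty cells, which mirrors the split between exact counting and bounded sampling described above. A column such as `["1", "2", "3", "4", "n/a"]` sits exactly on the 80% boundary and is still typed `number`.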