Zebra AI includes a Data Cleaning Assistant that helps you quickly prepare uploaded datasets for analysis and dashboarding by detecting common formatting issues and applying safe, reversible fixes. It’s designed to reduce manual cleanup so you can get to insights faster.

What it can detect
- Multiple header rows: Extra header lines before the real data starts.
- Dates in column headers: Wide-format time periods like 01-2025, Q1-2024 used as separate columns.
- Totals/sums as rows or columns: Aggregated totals embedded in the data grid.
- Empty “spacing” cells/rows: Cosmetic separators that don’t carry data.
- Non-repeating values that should carry down: Header-like values that appear once and implicitly apply to following rows (common in hierarchical tables).
- Non-standard missing values in numeric columns: Values like “NA”, “N/A”, “null”, “none” used in place of true empties.
- Non-numeric characters in numeric columns: Currency symbols, percent signs, or mixed separators (e.g., 1.000,00) that block numeric parsing.
It doesn’t invent problems. If your file is already clean, it will skip cleaning entirely.
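To make a few of these checks concrete, here is a minimal sketch of how two of them could be approximated by hand before uploading. It is not the assistant's actual implementation; the names `MISSING_MARKERS`, `profile_column`, and `is_spacer_row` are hypothetical, and the marker list is taken only from the examples in the list above.

```python
import re

# Hypothetical marker set, based only on the examples listed above.
MISSING_MARKERS = {"na", "n/a", "null", "none"}

def profile_column(values):
    """Count empties, missing-value markers, plain numbers, and other text."""
    counts = {"empty": 0, "marker": 0, "numeric": 0, "other": 0}
    for raw in values:
        text = raw.strip()
        if text == "":
            counts["empty"] += 1
        elif text.lower() in MISSING_MARKERS:
            counts["marker"] += 1
        elif re.fullmatch(r"[+-]?\d+(\.\d+)?", text):
            counts["numeric"] += 1
        else:
            # Currency symbols, percent signs, mixed separators, free text
            counts["other"] += 1
    return counts

def is_spacer_row(row):
    """A row is a cosmetic spacer if every cell is blank."""
    return all(cell.strip() == "" for cell in row)

column = ["12.5", "N/A", "", "$1,200", "7", "null", "45%"]
profile = profile_column(column)
```

A column where `marker` or `other` is non-zero but most values are numeric is a likely candidate for cleaning; a column where everything is `numeric` or `empty` can be left alone.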

What it can fix automatically
- Merge and standardize headers to one clear header row.
- Unpivot date headers (wide → long) so time periods become rows.
- Remove totals rows/columns that pollute analysis.
- Forward-fill non-repeating values to the appropriate rows.
- Normalize “missing” markers to true missing values.
- Clean numeric columns by removing symbols and normalizing thousand/decimal separators.
- Drop empty spacer rows/columns.
It works iteratively and conservatively. If a step doesn’t improve the data, it automatically reverts and tries a different approach.
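For readers who want to see what these transformations look like in practice, the sketch below reproduces three of them (forward-fill, unpivot, and numeric cleanup) in plain Python. It is an illustration under stated assumptions, not the assistant's own code: the function names, the `period` and `value` column names, and the separator heuristic (when both "." and "," appear, the one that occurs last is treated as the decimal separator) are all hypothetical choices.

```python
import re

def forward_fill(rows, column):
    """Copy the last non-empty value of `column` down into blank cells."""
    last_seen = ""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows stay untouched
        if row[column].strip():
            last_seen = row[column]
        else:
            row[column] = last_seen
        filled.append(row)
    return filled

def unpivot(rows, id_columns, var_name="period", value_name="value"):
    """Turn every non-identifier column into its own (period, value) row."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key in id_columns:
                continue
            record = {name: row[name] for name in id_columns}
            record[var_name] = key
            record[value_name] = value
            long_rows.append(record)
    return long_rows

def parse_number(text):
    """Best-effort parse of a formatted number; None if it cannot be parsed."""
    cleaned = re.sub(r"[^\d.,+-]", "", text)  # drop symbols, %, spaces, letters
    if not re.search(r"\d", cleaned):
        return None
    last_dot, last_comma = cleaned.rfind("."), cleaned.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Assumption: the separator that appears last is the decimal one
        if last_comma > last_dot:
            cleaned = cleaned.replace(".", "").replace(",", ".")
        else:
            cleaned = cleaned.replace(",", "")
    elif last_comma != -1:
        if re.fullmatch(r"[+-]?\d{1,3}(,\d{3})+", cleaned):
            cleaned = cleaned.replace(",", "")   # 1,200 -> thousands grouping
        else:
            cleaned = cleaned.replace(",", ".")  # 3,5 -> decimal comma
    try:
        return float(cleaned)
    except ValueError:
        return None

wide = [
    {"region": "North", "product": "A", "01-2025": "1.000,00", "02-2025": "$1,200.50"},
    {"region": "", "product": "B", "01-2025": "7", "02-2025": "N/A"},
]
long_rows = unpivot(forward_fill(wide, "region"), ["region", "product"])
amounts = [parse_number(row["value"]) for row in long_rows]
```

Here the two wide rows become four long rows, the blank region inherits "North", and the "N/A" cell ends up as a true missing value (`None`). Note that a value such as "1.200" stays ambiguous under this heuristic and is read as 1.2, which is one reason a conservative tool checks results before keeping a change.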

What it cannot fix (you’ll need to handle)
- Business logic decisions: Complex deduplication rules, merging entities, or interpreting domain-specific hierarchies.
- Ambiguous data entry errors: Misspellings, mislabeled categories, or inconsistent codes that require human judgment.
- Schema redesigns: Major model changes (joins across spreadsheets, custom keys, or multi-table restructuring).
- Unit conversions and semantics: Converting mixed units (e.g., kg vs lbs) or aligning currencies without clear guidance.
- Source-system fixes: Issues originating in a database, Power BI model, or ETL that should be corrected at the source.
When in doubt, follow the manual Data Preparation requirements and best practices so that Zebra AI interprets your data consistently.
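As an example of a fix you would make yourself before uploading, the sketch below normalizes a mixed kg/lbs column to kilograms. It assumes your file has an explicit unit column; the column names `weight` and `unit` and the function `to_kilograms` are hypothetical.

```python
LB_TO_KG = 0.45359237  # definition of the international avoirdupois pound

def to_kilograms(value, unit):
    """Convert a weight to kilograms; fail loudly on units we do not know."""
    unit = unit.strip().lower()
    if unit in ("kg", "kgs"):
        return value
    if unit in ("lb", "lbs"):
        return value * LB_TO_KG
    raise ValueError(f"Unknown unit: {unit!r}")

rows = [
    {"weight": 10.0, "unit": "kg"},
    {"weight": 22.0, "unit": "lbs"},
]
normalized = [round(to_kilograms(r["weight"], r["unit"]), 3) for r in rows]
```

Raising an error on an unknown unit, rather than guessing, keeps the decision with the person who knows the data.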

Where it works (and where it doesn’t)
- Supported: Uploaded files (e.g., CSV/XLSX) you bring into the app.
- Not supported: Direct SQL connections and Power BI connections. The assistant won’t modify data from those sources. Make fixes upstream, or export a file, clean it, and re-import.

How to use it
- Upload your dataset: Add your CSV/XLSX to start.
- Data Quality Check runs automatically: The assistant scans for the issues above.
- Review and select issues to resolve: The assistant then plans, executes, and evaluates the changes automatically.
- Inspect results: You’ll see progress updates and a cleaned preview.
- Download the cleaned dataset and resume the analysis: Save a copy locally once you’re satisfied.

Download step: why it matters
- Keep a stable, reusable copy: The cleaned dataset preserves standardized headers, types, and structure.
- Resume the same story later: Re-upload the downloaded cleaned file in future sessions to continue working on the same charts and dashboards without re-cleaning.
- Share with teammates: Ensure everyone uses the same “golden” version to avoid drift.

Tips for best results
- Upload the rawest practical file: The assistant is built for messy spreadsheets.
- Let it run end-to-end: It may solve multiple issues in one go.
- If something looks off: Download, review, and make minor manual adjustments; then re-upload and proceed.
- If automatic cleaning fails: Clean the data manually using the Data Preparation guide.
If you’re working from SQL or Power BI, consider exporting the data to a file, cleaning it with the assistant, and then bringing it back into your workflow. This lets you continue the same stories consistently across sessions.
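If you need a starting point for the export step, the sketch below writes a query result to a CSV file with a single header row. It uses Python's built-in `sqlite3` module and an in-memory database purely as a stand-in for your own SQL source; the `sales` table, the query, the file name, and the helper `export_query_to_csv` are all hypothetical.

```python
import csv
import os
import sqlite3
import tempfile

def export_query_to_csv(connection, query, path):
    """Run a query and write the result, with one header row, to a CSV file."""
    cursor = connection.execute(query)
    headers = [column[0] for column in cursor.description]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        writer.writerows(cursor)
    return headers

# Stand-in database; replace with a connection to your own source.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (region TEXT, period TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "01-2025", 10.0), ("South", "01-2025", 7.5)],
)

export_path = os.path.join(tempfile.gettempdir(), "sales_export.csv")
headers = export_query_to_csv(
    connection, "SELECT region, period, amount FROM sales", export_path
)
connection.close()
```

The resulting file can then be uploaded like any other CSV.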