About · CureData

A workbench for people who care what's in the file.

Most data never gets validated. It gets reviewed by a human who opens a sheet, eyeballs it, and hopes. CureData replaces the hoping with a deterministic pass: upload, compose, run, export.

Mission

Make it faster to trust a dataset than to mistrust it. Keep rules readable, runs reproducible, and the files on your own machine.

Principles

Six commitments.

  1. Trust

    Deterministic by design

    The same file plus the same ruleset produces the same report — every time, with issues keyed to rows.

  2. Composition

    Rules are small, named, and namespaced

    Compose from primitives. No opaque SDK; no hidden side effects.

  3. Reading

    Optimized for review

    Rule files are JSON-first. Diffs read cleanly in a pull request.

  4. Motion

    Feedback in the flow

    Streaming progress, column profiling, and issue tables that stay fast on large files.

  5. Residency

    Local before remote

    Phase 01 runs entirely in the browser. Files never leave the device unless you save a ruleset.

  6. Boundary

    Interoperable at the edge

    CSV, XLSX, and JSON today. Tomorrow: HTTP, SQL, and warehouse lookups with explicit controls.
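
To make the first three principles concrete, here is a minimal sketch of a JSON-first ruleset run as a deterministic pass. It is not CureData's actual rule format or engine: the field names (`id`, `column`, `check`, `limit`), the rule ids, and the `run` function are all assumptions made for illustration.

```python
import json

# Hypothetical ruleset. The schema and the "core." namespace are
# illustrative assumptions, not CureData's real rule format.
RULESET_JSON = """
[
  {"id": "core.not_empty", "column": "email", "check": "not_empty"},
  {"id": "core.max_length", "column": "name", "check": "max_length", "limit": 5}
]
"""

# Primitive checks: pure functions of (value, rule), no side effects.
CHECKS = {
    "not_empty": lambda value, rule: value.strip() != "",
    "max_length": lambda value, rule: len(value) <= rule["limit"],
}


def run(rows, ruleset):
    """Return issues in a stable order: by row number, then by rule position."""
    issues = []
    for row_number, row in enumerate(rows, start=1):
        for rule in ruleset:
            value = row.get(rule["column"], "")
            if not CHECKS[rule["check"]](value, rule):
                issues.append(
                    {"row": row_number, "rule": rule["id"], "column": rule["column"]}
                )
    return issues


rows = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace Hopper", "email": ""},
]
ruleset = json.loads(RULESET_JSON)
report = run(rows, ruleset)
```

Because every check is a pure function and the iteration order is fixed, running the same rows against the same ruleset yields an identical report, and each issue carries the row number it came from.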

How it works

Three movements.

  1. Source

    Drop a CSV, XLSX, or JSON. Auto-detect headers, sample rows, infer types.

  2. Compose

    Pick rules from packs or author your own. Test against the sample inline.

  3. Run & export

    Validate the full file, inspect KPIs and column profiles, export issues to CSV.
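
The Source step (read headers, sample rows, infer types) can be sketched roughly as follows. This is an illustration under stated assumptions, not CureData's implementation: it assumes the first CSV row is the header, and the type names (`integer`, `number`, `string`, `empty`) and the `profile` and `infer_type` helpers are hypothetical.

```python
import csv
import io
import itertools

CSV_TEXT = """id,price,city
1,9.50,Oslo
2,12,Bergen
3,,Oslo
"""


def infer_type(values):
    """Pick the narrowest type that fits every non-empty sample value."""
    present = [v for v in values if v != ""]
    if not present:
        return "empty"
    for name, cast in (("integer", int), ("number", float)):
        try:
            for v in present:
                cast(v)
            return name
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "string"


def profile(text, sample_size=100):
    """Infer one type per column from the first `sample_size` data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # assumption: the first row holds column names
    sample = list(itertools.islice(reader, sample_size))
    return {
        column: infer_type([row[i] for row in sample if i < len(row)])
        for i, column in enumerate(header)
    }


column_types = profile(CSV_TEXT)
```

Sampling keeps inference cheap on large files, at the cost that a value outside the sample can contradict the inferred type; the full validation run is where such rows would surface as issues.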

Roadmap

Where we're going.

  • Phase 01 (now)

    Local rules

    Client-side validation, profiling, packs, CSV export, saved rulesets.

  • Phase 02

    External sources

    HTTP, SQL, Snowflake, CSV lookups — with TTL caching and per-rule timeouts.

  • Phase 03

    Pipelines & scoring

    Weighted severities, quality scores, CI hooks, scheduled runs.