CureData Documentation
Define, automate, and enforce business-grade data validations, all in a single-page app.
Getting started
Upload a CSV or XLSX file (the most common formats) or a JSON file. We auto-detect headers and sample five rows. You can also start with a blank schema.
Email,Amount,StartDate,EndDate,State
alice@example.com,12.50,2024-01-10,2024-01-12,CA
bob@example.com,0,2024-02-01,2024-02-01,WA
,19.99,2024-03-05,2024-03-06,TX
- Header row required (first line).
- Delimiter is auto-detected (comma by default). Quoted fields may contain delimiters and newlines.
- Dates: prefer ISO (YYYY-MM-DD).
- Numbers: plain decimal (no currency symbols).
- Saved rulesets map to headers using a strategy: EXACT / CI / FUZZY.
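Delimiter auto-detection could work roughly like the sketch below. This is only an illustration, not the app's actual detector: the detectDelimiter name, the candidate list, and the "most frequent outside quotes wins" heuristic are all assumptions.

```javascript
// Hypothetical helper: guess the delimiter from the header line.
// Candidate delimiters are an assumption; the docs only state that comma is the default.
const CANDIDATES = [",", ";", "\t", "|"];

function detectDelimiter(headerLine) {
  const counts = new Map(CANDIDATES.map((d) => [d, 0]));
  let inQuotes = false;
  for (const ch of headerLine) {
    if (ch === '"') {
      inQuotes = !inQuotes; // an escaped quote ("") toggles twice, so state is preserved
    } else if (!inQuotes && counts.has(ch)) {
      counts.set(ch, counts.get(ch) + 1); // only count delimiters outside quoted fields
    }
  }
  let best = ","; // fall back to comma when no candidate appears
  let bestCount = 0;
  for (const [delim, count] of counts) {
    if (count > bestCount) {
      best = delim;
      bestCount = count;
    }
  }
  return best;
}

console.log(JSON.stringify(detectDelimiter("Email,Amount,StartDate,EndDate,State")));
```

A real detector would likely inspect several sampled rows rather than the header alone, so that a stray character in one header name does not decide the result.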
Schemas & columns
A schema is a named collection of columns. Each column carries 0..n rules. Order doesn’t matter — rules are evaluated in an optimized pass.
{
  name: "StartDate",
  rules: [
    { type: "type", expected: "date" },
    { type: "noFutureDate" }
  ]
}
{
  name: "EndDate",
  rules: [
    { type: "type", expected: "date" },
    { type: "dateOrder", otherField: "StartDate", op: ">=" }
  ]
}
Header matching strategy
When you apply a saved ruleset to a new upload, we need to match your file’s headers to the ruleset’s column keys. The header_match_strategy controls how strictly we match names.
| Strategy | Match rule | Example | When to use |
|---|---|---|---|
| EXACT | Names must match exactly | ruleset key NPI ⇄ header NPI | Strict internal feeds with stable headers |
| CI | Case-insensitive match | ruleset key NPI ⇄ header npi, Npi | Vendors who vary capitalization only |
| FUZZY | Loose/approximate name match | ruleset key ProviderNPI ⇄ header NPI Number | External files with inconsistent naming |
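The three strategies in the table can be sketched as a single matching function. EXACT and CI follow directly from the table. The FUZZY branch is an assumption: the docs only say "loose/approximate", so this sketch uses token overlap (Jaccard similarity) with a hypothetical threshold of 0.3. The function names are illustrative too.

```javascript
// Split a header into lowercase word tokens: "ProviderNPI" -> ["provider", "npi"].
function tokens(name) {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // break camelCase boundaries
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((t) => t.toLowerCase());
}

// Jaccard overlap between two token sets (0 = disjoint, 1 = identical).
function tokenOverlap(a, b) {
  const setA = new Set(tokens(a));
  const setB = new Set(tokens(b));
  const shared = [...setA].filter((t) => setB.has(t)).length;
  const union = new Set([...setA, ...setB]).size;
  return union === 0 ? 0 : shared / union;
}

// fuzzyThreshold is a made-up default, not a documented setting.
function headerMatches(key, header, strategy, fuzzyThreshold = 0.3) {
  switch (strategy) {
    case "EXACT":
      return key === header;
    case "CI":
      return key.toLowerCase() === header.toLowerCase();
    case "FUZZY":
      return tokenOverlap(key, header) >= fuzzyThreshold;
    default:
      throw new Error(`Unknown strategy: ${strategy}`);
  }
}

console.log(headerMatches("NPI", "Npi", "CI"));
console.log(headerMatches("ProviderNPI", "NPI Number", "FUZZY"));
```

With these assumptions, ProviderNPI and NPI Number share one token out of three distinct tokens, which clears the 0.3 threshold. A stricter threshold reduces false matches at the cost of more unmatched headers.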
Rule Packs (one-click presets)
Walkthrough: Sample Orders
- orderId → required, type:string, length 8–36, unique_in_file
- email → required, type:string, email
- phone → type:string, phoneUS
- amount → type:number, range ≥ 0, precision 2
- order_date → type:date, noFutureDate
- ship_date → type:date, dateOrder: ship_date ≥ order_date
- status → enum [Pending, Processing, Shipped, Delivered, Cancelled]
- ifThen → If status="Cancelled" then ship_date blank
- state / zip → stateCodeUS, zipUS
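Written in the column/rules object shape used under Schemas & columns, the walkthrough might look like the sketch below. Treat it as illustrative: the top-level name and columns keys, the exact parameter spellings beyond those listed in the rules reference, and attaching ifThen to the status column are assumptions.

```javascript
// Illustrative schema for the Sample Orders walkthrough (key names partly assumed).
const sampleOrders = {
  name: "Sample Orders",
  columns: [
    { name: "orderId", rules: [
      { type: "required" },
      { type: "type", expected: "string" },
      { type: "length", min: 8, max: 36 },
      { type: "unique_in_file" },
    ] },
    { name: "email", rules: [
      { type: "required" },
      { type: "type", expected: "string" },
      { type: "email" },
    ] },
    { name: "phone", rules: [
      { type: "type", expected: "string" },
      { type: "phoneUS" },
    ] },
    { name: "amount", rules: [
      { type: "type", expected: "number" },
      { type: "range", min: 0 },
      { type: "precision", scale: 2 },
    ] },
    { name: "order_date", rules: [
      { type: "type", expected: "date" },
      { type: "noFutureDate" },
    ] },
    { name: "ship_date", rules: [
      { type: "type", expected: "date" },
      { type: "dateOrder", otherField: "order_date", op: ">=" },
    ] },
    { name: "status", rules: [
      { type: "enum", values: ["Pending", "Processing", "Shipped", "Delivered", "Cancelled"] },
      // Cross-field rule: a cancelled order must not carry a ship date.
      { type: "ifThen",
        if: { field: "status", op: "==", value: "Cancelled" },
        then: { field: "ship_date", mustBe: "blank" } },
    ] },
    { name: "state", rules: [{ type: "stateCodeUS" }] },
    { name: "zip", rules: [{ type: "zipUS" }] },
  ],
};

console.log(sampleOrders.columns.map((c) => c.name).join(", "));
```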
Rules reference
Core rules ship built-in. Add as many as needed per column; regex and enum can appear more than once on the same column. Note: Cross-field rules like dateOrder and ifThen reference other columns.
| Rule | What it checks | Parameters | Expected value | Good example | Bad example |
|---|---|---|---|---|---|
| required: Field cannot be blank or missing. | Value is present (not empty, not null). | — | any | "ORD-123" | (empty) |
| type:string: Force the value to be text. | Value parses as a string. | expected = string | text | "hello" | 42 (number) |
| type:number: Force the value to be numeric. | Value parses as a number. | expected = number | number | 19.99 | "N/A" |
| type:date: Force the value to be a valid date. | Value parses as a date. | expected = date | date | 2024-05-01 | "not-a-date" |
| type:boolean: Force the value to be yes/no. | Value is boolean-ish (true/false, yes/no). | expected = boolean | boolean | true | "maybe" |
| length: Restrict minimum / maximum characters. | Text length is within bounds. | min, max | text | orderId length 12 | orderId length 2 |
| enum: Only allow these values. | Value is one of the provided list. | values: [Pending, Processing, Shipped, Delivered, Cancelled] | text (categorical) | "Shipped" | "Unknown" |
| regex: Advanced pattern rule. | Value matches a regular expression. | pattern, flags, name | text | "ORD-202501-0042" | "ORD-Jan-42" |
| range: Limit number to a range. | Number is ≥ min and ≤ max (inclusive by default). | min, max, inclusive=true | number | 0 ≤ amount ≤ 100000 | -4 |
| precision: Limit digits after the decimal. | Number has at most N decimals. | scale: 2 | number | 19.99 | 19.9999 |
| dateOrder: Ensure dates are in logical order. | This date compares correctly to another column. | otherField, op: >, ≥, =, ≤, < | date | ship_date ≥ order_date | ship_date 2025-01-01 < order_date 2025-02-01 |
| noFutureDate: Date must be today or earlier. | Date is ≤ today. | — | date | yesterday | next week |
| notExpired: Date must be today or later. | Date is ≥ today. | — | date | next month | last year |
| ifThen: Apply rules conditionally. | IF condition is true THEN target must satisfy requirement. | if:{field, op, value} → then:{field, mustBe/op} | varies | IF status=Cancelled THEN ship_date blank | Cancelled with ship_date present |
| email: Must look like name@example.com. | Basic email shape. | — | text | "user@example.com" | "name@@example" |
| phoneUS: Valid 10-digit US phone. | Accepts common US formats with 10 digits. | — | text | "(415) 555-0123" | "555-12-1234" |
| stateCodeUS: Two-letter US state like CA, TX. | Matches official 2-letter postal code. | — | text | "CA" | "California" |
| zipUS: 5-digit or ZIP+4. | ##### or #####-#### | — | text | "94107-1234" | "9410A" |
| Luhn: Checksum validation (Luhn). | Passes Luhn checksum (often 10+ digits). | — | text/number | "79927398713" | "79927398710" |
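To make a few rows of the table concrete, here is a minimal sketch of how the Luhn, zipUS, precision, noFutureDate, and notExpired checks could be written. The function signatures are hypothetical, and the date checks assume ISO YYYY-MM-DD strings and treat "today" as the UTC calendar day of ctx.now; the app may resolve "today" differently.

```javascript
// Luhn checksum: double every second digit from the right, subtract 9 if > 9, sum, mod 10.
function luhn(value) {
  const digits = String(value).replace(/[\s-]/g, ""); // tolerate spaces and dashes (assumption)
  if (!/^\d+$/.test(digits)) return false;
  let sum = 0;
  let doubleIt = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" has char code 48
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}

// zipUS: ##### or #####-####
const zipUS = (value) => /^\d{5}(-\d{4})?$/.test(String(value));

// precision: at most `scale` digits after the decimal point, checked on the raw text.
function precision(value, scale) {
  const m = /^-?\d+(?:\.(\d+))?$/.exec(String(value).trim());
  return m !== null && (m[1] ?? "").length <= scale;
}

// Date rules read "now" from a ctx object so they can be tested with a fixed clock.
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;
const today = (ctx) => ctx.now.toISOString().slice(0, 10); // UTC day, an assumption
const noFutureDate = (value, ctx) => ISO_DATE.test(value) && value <= today(ctx);
const notExpired = (value, ctx) => ISO_DATE.test(value) && value >= today(ctx);

const fixedCtx = { now: new Date("2024-06-15T12:00:00Z") };
console.log(luhn("79927398713"), zipUS("94107-1234"), precision("19.99", 2));
console.log(noFutureDate("2024-06-14", fixedCtx), notExpired("2024-06-14", fixedCtx));
```

Comparing ISO date strings lexicographically works because the format orders year, month, and day from most to least significant. Other date formats would need parsing first.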
Operators & Glossary
- == / = equal to
- != not equal
- >, ≥, ≤, < greater/less (dates and numbers)
- exists value present (not empty)
- blank value must be empty
- string any text; length/regex/enum apply
- number decimal; range/precision apply
- date ISO recommended (YYYY-MM-DD)
- boolean true/false (1/0, yes/no accepted)
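The operators and the boolean coercion listed above could be implemented along these lines. This is a sketch under assumptions: the helper names are hypothetical, both ASCII (>=, <=) and symbol (≥, ≤) spellings are accepted, and values are compared as given, so numbers should be parsed before calling compare while ISO dates can be compared as strings.

```javascript
// A value is blank when it is missing or contains only whitespace.
const isBlank = (v) => v === undefined || v === null || String(v).trim() === "";

// Accepts true/false, 1/0, yes/no (case-insensitive); returns undefined otherwise.
function parseBoolean(v) {
  const s = String(v).trim().toLowerCase();
  if (["true", "1", "yes"].includes(s)) return true;
  if (["false", "0", "no"].includes(s)) return false;
  return undefined;
}

// Evaluate one glossary operator; `right` is ignored for exists/blank.
function compare(left, op, right) {
  switch (op) {
    case "==":
    case "=":
      return left === right;
    case "!=":
      return left !== right;
    case ">":
      return left > right;
    case ">=":
    case "≥":
      return left >= right;
    case "<=":
    case "≤":
      return left <= right;
    case "<":
      return left < right;
    case "exists":
      return !isBlank(left);
    case "blank":
      return isBlank(left);
    default:
      throw new Error(`Unknown operator: ${op}`);
  }
}

// ifThen: when the condition holds, the target field must satisfy `mustBe`.
function checkIfThen(row, rule) {
  const cond = rule.if;
  if (!compare(row[cond.field], cond.op, cond.value)) return true; // condition not met
  return compare(row[rule.then.field], rule.then.mustBe);
}

const cancelRule = {
  if: { field: "status", op: "==", value: "Cancelled" },
  then: { field: "ship_date", mustBe: "blank" },
};
console.log(checkIfThen({ status: "Cancelled", ship_date: "" }, cancelRule));
console.log(checkIfThen({ status: "Cancelled", ship_date: "2025-01-01" }, cancelRule));
```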
Validation context & phases
Phase 1 runs client-only with local rules. A lightweight ctx object provides things like now.
const ctx = { now: new Date() }
Exporting results
The demo exports a CSV of issues. Columns: rowIndex, column, code, message, value.
rowIndex,column,code,message,value
12,Email,required,"Value is required",""
42,Amount,range,"Must be between 0 and 1000","-12"
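An export in this shape could be produced with a small serializer like the one below. It is a sketch, not the demo's actual exporter: the issuesToCsv name and the issue object shape are assumptions, and it assumes column and code never contain commas or quotes, so only message and value are quoted, as in the sample.

```javascript
// Column order matches the documented export header.
const COLUMNS = ["rowIndex", "column", "code", "message", "value"];

// Wrap text in double quotes, doubling any embedded quotes (standard CSV escaping).
const quote = (text) => `"${String(text).replace(/"/g, '""')}"`;

function issuesToCsv(issues) {
  const lines = [COLUMNS.join(",")];
  for (const issue of issues) {
    lines.push(
      [
        issue.rowIndex,
        issue.column,
        issue.code,
        quote(issue.message),
        quote(issue.value ?? ""), // a missing value is exported as an empty quoted field
      ].join(",")
    );
  }
  return lines.join("\n");
}

console.log(
  issuesToCsv([
    { rowIndex: 12, column: "Email", code: "required", message: "Value is required", value: "" },
    { rowIndex: 42, column: "Amount", code: "range", message: "Must be between 0 and 1000", value: "-12" },
  ])
);
```

Always quoting message and value keeps the output stable even when an offending value itself contains commas, quotes, or newlines.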