When you send a raw transaction string such as AMZN Mktp US*1A2B3C4D5 Amzn.com/bill WA to ParseTx, it passes through a five-stage pipeline before you receive a clean, structured result. Understanding this pipeline helps you predict response times, interpret status values, and build confidently on top of the API, because once a string is resolved and cached, the same input always produces the same output.
The Enrichment Pipeline
Input Normalization
Before anything else, ParseTx sanitizes your raw transaction string. This stage makes no network calls and completes in under 1ms on the edge.
The normalizer performs the following operations in sequence:
- Strips prompt injection vectors: scans for 12+ known jailbreak signatures (e.g. "ignore all previous instructions", "[INST]") and removes them entirely.
- Removes numeric PII: any numeric sequence longer than 4 digits (card numbers, order IDs, routing numbers) is stripped before the string ever reaches the AI engine.
- Cleans noisy substrings: trailing dates ("12/03", "06-11"), location fragments, and processor-appended order-ID patterns are removed.
- Truncates to 64 characters: enforces a hard cap to prevent Denial-of-Wallet (DoW) attacks that pad inputs to maximise AI token consumption.
- Uppercases the result: normalizes casing so that "Netflix.com on demand" and "NETFLIX.COM ON DEMAND" resolve to the same cache key.
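The operations listed above can be sketched in a few lines of Python. This is an illustrative approximation, not ParseTx's actual normalizer: the signature list contains only the two examples named here, the date regex and whitespace collapsing are assumptions, and location-fragment and order-ID cleanup are omitted because their patterns are not documented.

```python
import re

# Hypothetical list: only the two signatures named in the text are included.
INJECTION_SIGNATURES = ["ignore all previous instructions", "[INST]"]
MAX_LENGTH = 64  # hard cap stated in the text


def normalize(raw: str) -> str:
    """Approximate the documented normalization steps, in order."""
    text = raw

    # 1. Strip known prompt-injection signatures (case-insensitive match).
    for signature in INJECTION_SIGNATURES:
        text = re.sub(re.escape(signature), " ", text, flags=re.IGNORECASE)

    # 2. Remove numeric PII: any run of more than 4 digits.
    text = re.sub(r"\d{5,}", " ", text)

    # 3. Remove a trailing date such as 12/03 or 06-11 (assumed pattern).
    text = re.sub(r"\s*\b\d{1,2}[/-]\d{1,2}\s*$", "", text)

    # Assumption: collapse the whitespace left behind by the removals.
    text = re.sub(r"\s+", " ", text).strip()

    # 4. Truncate to the hard cap.
    text = text[:MAX_LENGTH].rstrip()

    # 5. Uppercase so that casing variants share one cache key.
    return text.upper()
```

Under these assumptions, normalize("Netflix.com on demand") and normalize("NETFLIX.COM ON DEMAND 12/03") both return NETFLIX.COM ON DEMAND, which is why casing variants and trailing dates end up on the same cache key.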
"status": "rejected". Do not retry these items.Cache Lookup
The normalized string is looked up in ParseTx’s global edge cache, distributed worldwide for low-latency access from any region.
- Cache hit → the stored result is returned immediately. No AI call is made. Response time is typically under 50ms.
- Cache miss → the request proceeds to the AI enrichment stage.
"source": "cache" and "confidence": 1 in the response. The cache is pre-warmed with the top 50,000 global merchants, so the most common transaction strings resolve instantly on first request.AI Enrichment
On a cache miss, the normalized string is forwarded to ParseTx’s AI inference engine. A strict structured schema is enforced — categories, booleans, and confidence scores are all validated before the result is used.
- Results with confidence ≥ 0.80 are written back to the edge cache for 30 days. Results below this threshold are returned to you but not cached, so the next request for the same string will try again.
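The lookup and write-back policy can be modeled with a small in-memory cache. The sketch below only illustrates the rules stated here (0.80 threshold, 30-day lifetime, "source" values). The class, the resolve function, and the enrich callback are hypothetical names, and a real edge cache would be distributed rather than a local dictionary.

```python
import time
from typing import Callable, Optional

CACHE_THRESHOLD = 0.80                 # minimum confidence for write-back
CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days


class EnrichmentCache:
    """Hypothetical in-memory stand-in for the edge cache."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._entries: dict = {}  # key -> (expires_at, result)

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, result = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # entry is older than 30 days
            return None
        return result

    def put(self, key: str, result: dict) -> None:
        self._entries[key] = (self._clock() + CACHE_TTL_SECONDS, result)


def resolve(key: str, cache: EnrichmentCache,
            enrich: Callable[[str], dict]) -> dict:
    """Return a cached result, or enrich and cache it if confident enough."""
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: no AI call, confidence is reported as 1.
        return {**cached, "source": "cache", "confidence": 1}

    result = enrich(key)  # cache miss: call the (hypothetical) AI engine
    if result["confidence"] >= CACHE_THRESHOLD:
        cache.put(key, result)
    return {**result, "source": "llm"}
```

Because low-confidence results are never stored, a second call for the same string reaches the enrich callback again, which matches the "try again" behavior described above.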
"status": "retry" — safe to resubmit.Deterministic Output
Once a merchant string is resolved and cached, ParseTx guarantees deterministic output: the same input always returns the same merchant, category, mcc_code, and all other fields. There is no LLM variance on repeat requests.
This is the property that makes ParseTx suitable for production bookkeeping, analytics, and reporting. If RECURRING PMT AUTHORIZED ON 05/01 NETFLIX.COM maps to "category": "Entertainment" today, it maps to "category": "Entertainment" six months from now.
Unlike calling an LLM directly, ParseTx’s canonical cache means identical inputs are resolved once and fanned out consistently to every caller. You never pay for the same merchant string twice, and you never get a different answer on a subsequent call.
Fault-Isolated Batch Response
For batch requests, every item in your array is processed inside its own isolated try/catch block. A single item that times out, triggers a rate limit, or encounters a transient upstream error does not fail the rest of the batch.
- Items that succeed return "status": "complete".
- Items that fail transiently return "status": "retry"; you can resubmit them.
- Items that are garbage return "status": "rejected"; do not retry these.
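A batch response returns 200 OK even when individual items fail. You must inspect each item’s status field to determine whether it needs action. If any item in the batch returns retry, the envelope-level status is "partial" instead of "complete".
Pipeline Flow
Input Normalization → Cache Lookup → AI Enrichment (cache miss only) → Deterministic Output → Fault-Isolated Batch Response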
Batch Deduplication
If you send the same transaction string multiple times within a single batch request, ParseTx resolves it once and fans the result back to each position in your array. You are billed once per unique string, not once per array element. Deduplication happens after normalization, so "NETFLIX.COM ON DEMAND" and "netflix.com on demand" are treated as the same string.
Response Times at a Glance
- Cache Hit (under 50ms): the normalized string is found in the global edge cache. No AI call is made. Confidence is always 1. Source is "cache".
- AI Enrichment (1.5 – 3.5 seconds): a cache miss is forwarded to the AI inference engine. Results with confidence ≥ 0.80 are cached for 30 days after the first call. Source is "llm".