ParseTx Transaction Enrichment API: Common Questions

How is ParseTx different from calling ChatGPT directly?

Calling ChatGPT (or any LLM) directly gets you roughly the same raw intelligence, but none of the infrastructure that makes it production-ready. Here’s what ParseTx adds on top:

A pre-warmed cache. ParseTx maintains a global canonical merchant database seeded with tens of thousands of verified merchant records. When you send a transaction string we’ve seen before, you get the result back in under 50ms from the edge, with no model inference and no cost to you. A raw LLM call charges you for every single request, including duplicates you’ve already paid for.

Deterministic output. LLMs are probabilistic by nature. The same input can return "McDonald's", "McDonalds", or "McDonald's Restaurant" on different runs. For downstream bookkeeping, analytics, or categorization pipelines, that inconsistency is catastrophic. ParseTx’s cache-first architecture guarantees that once a merchant string is resolved and cached, it returns the identical structured result every time.

True batch processing. You can submit up to 500 transactions in a single API call via POST /v1/enrich/async. No raw LLM API supports that natively; you’d be managing 500 sequential HTTP requests, rate limits, timeouts, and partial failures yourself.

PII sanitization before inference. ParseTx strips card numbers, account numbers, and personal names from transaction strings before they leave your infrastructure boundary. A direct LLM call sends everything to the model as-is.

A structured, validated schema. Every response includes merchant, domain, category, mcc_code, is_subscription, and confidence, validated against a strict schema on every call. Raw LLM output requires you to parse, validate, and handle hallucinated fields yourself.

Do cache hits cost money?

No. You are only charged for transactions that require AI inference, that is, cache misses where the merchant string hasn’t been seen before.

When a transaction string matches an entry in our global edge cache, the result is served instantly at no cost to you. The $0.005 per-transaction fee applies exclusively to the subset of your requests that trigger a live Gemini API call.

In practice, the most common consumer merchant strings (Amazon, Netflix, Uber, Starbucks, and thousands more) are pre-seeded in the cache on day one, so a significant portion of typical production traffic will never incur an inference charge at all. As our canonical merchant database grows with every new request we process, the cache hit rate increases over time, meaning your effective cost per transaction goes down the longer you use the API.
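As a rough worked example of the pricing above, the sketch below estimates a monthly bill from a total transaction count and a cache hit rate. The function name and the hit rate you pass in are assumptions for illustration; ParseTx does not publish a specific hit rate here, so measure your own.

```javascript
// $0.005 per inference, expressed in mills (thousandths of a dollar)
// so the arithmetic stays in integers until the final division.
const PRICE_PER_MISS_MILLS = 5;

// Hypothetical helper: estimates the bill in dollars.
// hitRate is the fraction of requests served from cache (0 to 1).
function estimateMonthlyCost(totalTransactions, hitRate) {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError("hitRate must be between 0 and 1");
  }
  // Only cache misses trigger inference and are billed.
  const misses = Math.round(totalTransactions * (1 - hitRate));
  return (misses * PRICE_PER_MISS_MILLS) / 1000;
}
```

With an illustrative 70% hit rate, 100,000 transactions would mean 30,000 billable inferences, or $150, compared with $500 if every request missed the cache.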

What happens if a transaction string can't be parsed?

ParseTx isolates failures at the item level so that a single bad input never takes down your entire batch. Each item in the results array includes a status field that tells you exactly what happened:

status: "complete" — The transaction was successfully enriched.
status: "rejected" — The input was determined to be garbage (too short, pure entropy, no recognizable merchant signal). Do not retry this item — it will be rejected again.
status: "retry" — A transient error occurred (typically an upstream timeout from the Gemini API). It is safe to resubmit this item in a subsequent request.

The rest of your batch is processed normally regardless of any individual failures. If you submit 50 transactions and two of them are rejected, you receive 48 enriched results and two clearly flagged rejections in the same response object.

Per-request timeouts on Gemini API calls are capped at 10 seconds. If the upstream model doesn’t respond in time, affected items return with status: "retry" rather than blocking the entire request.
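The status handling described above might look like the following in a response handler. This is a minimal sketch: it assumes only that each item carries the status field documented here, and the extra "other" bucket is a defensive assumption for values not listed on this page.

```javascript
// Group result items by their documented status values.
function partitionByStatus(results) {
  const buckets = { complete: [], rejected: [], retry: [], other: [] };
  for (const item of results) {
    const known = Object.prototype.hasOwnProperty.call(buckets, item.status);
    // Anything with an undocumented status lands in "other" for inspection.
    buckets[known ? item.status : "other"].push(item);
  }
  return buckets;
}
```

Items in the retry bucket can be resubmitted in a later request; items in the rejected bucket should be dropped or surfaced to the user, since resubmitting them produces the same rejection.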

Is my transaction data used to train AI models?

No. ParseTx routes all production API requests through Google’s paid Cloud API tier, which is governed by Google’s Data Processing Addendum (DPA). Under that agreement, Google explicitly prohibits the use of your request data for model training or human review beyond abuse detection.

This is a critical distinction from the free Google AI Studio tier, where training use is permitted. We enforce paid-tier routing in production as a hard architectural requirement, not as an optional configuration.

Additionally, ParseTx itself never logs raw transaction strings. Our usage logs store only the anonymized SHA-256 hash of the normalized input, not the original text. This means there is no internal ParseTx data store that contains your users’ raw transaction descriptors.
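To illustrate what storing a hash instead of the raw string means, here is a small Node.js sketch. The normalization rules (trim, collapse whitespace, lowercase) are an assumption made for this example; the actual normalization ParseTx applies is not documented on this page.

```javascript
const { createHash } = require("crypto");

// ASSUMPTION: illustrative normalization only, not ParseTx's real rules.
function normalizeDescriptor(raw) {
  return raw.trim().replace(/\s+/g, " ").toLowerCase();
}

// Returns the hex-encoded SHA-256 digest of the normalized descriptor.
function hashDescriptor(raw) {
  return createHash("sha256")
    .update(normalizeDescriptor(raw), "utf8")
    .digest("hex");
}
```

Two descriptors that differ only in spacing or letter case produce the same 64-character digest under these rules, and the original text cannot be read back out of the log entry.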

What is the maximum batch size?

The limit depends on which endpoint you use:

Endpoint	Mode	Max Transactions
`POST /v1/enrich`	Synchronous	10
`POST /v1/enrich/async`	Asynchronous	500

The synchronous endpoint returns results directly in the response body. The asynchronous endpoint immediately returns a 202 Accepted with a job_id; poll GET /v1/enrich/jobs/:id to retrieve the completed results.

For volumes exceeding 500 transactions, submit multiple async jobs in parallel. There’s no limit on the number of concurrent jobs per API key under the standard plan.

If you’re processing large bank statement imports or batch report exports, the async endpoint is the right choice even for smaller volumes — it’s more resilient to upstream latency spikes and returns immediately so you don’t block your application thread.
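The async flow above (split into batches of at most 500, submit, then poll) could be sketched as follows. The endpoints, the 202 response, and job_id come from this page; the base URL, the Bearer authorization header, the { transactions } request body, and the job-level status and results fields are assumptions for illustration, so check the API reference for the real shapes.

```javascript
const MAX_ASYNC_BATCH = 500; // documented limit for POST /v1/enrich/async

// Split a list of transactions into async-sized batches.
function chunkTransactions(transactions, size = MAX_ASYNC_BATCH) {
  const batches = [];
  for (let i = 0; i < transactions.length; i += size) {
    batches.push(transactions.slice(i, i + size));
  }
  return batches;
}

// Submit one batch, then poll the job until it reports completion.
// ASSUMPTIONS: baseUrl, the auth header, the request body shape, and the
// job's `status` / `results` fields are hypothetical.
async function enrichAsync(batch, options) {
  const { baseUrl, apiKey, fetchImpl = fetch, pollMs = 1000, maxPolls = 60 } = options;
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  };
  const submit = await fetchImpl(`${baseUrl}/v1/enrich/async`, {
    method: "POST",
    headers,
    body: JSON.stringify({ transactions: batch }),
  });
  if (submit.status !== 202) {
    throw new Error(`Unexpected status ${submit.status}`);
  }
  const { job_id } = await submit.json();

  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const res = await fetchImpl(`${baseUrl}/v1/enrich/jobs/${job_id}`, { headers });
    const job = await res.json();
    if (job.status === "complete") return job.results;
    // Wait before the next poll so the jobs endpoint is not hammered.
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error(`Job ${job_id} did not complete after ${maxPolls} polls`);
}
```

For more than 500 transactions, one option is to run the batches concurrently with Promise.all(chunkTransactions(all).map((b) => enrichAsync(b, options))) and flatten the returned arrays.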

How do I render merchant logos?

ParseTx returns the merchant’s verified canonical domain field (e.g., "netflix.com", "amazon.com") rather than serving logo image files directly. You use that domain to render logos client-side via a favicon or logo service of your choice.

A few common approaches:

<!-- Google Favicons (free, no auth required) -->
<img src="https://www.google.com/s2/favicons?domain=netflix.com&sz=128" alt="Netflix logo" />

<!-- Logo.dev (higher quality, token required) -->
<img src="https://img.logo.dev/netflix.com?token=YOUR_TOKEN" alt="Netflix logo" />

<!-- Brandfetch (premium quality) -->
<img src="https://cdn.brandfetch.io/netflix.com/w/400/h/400" alt="Netflix logo" />

This approach is intentional. Storing and serving trademarked brand logos would expose ParseTx (and your application) to intellectual property liability. By returning only the domain, ParseTx acts as a textual data standardization tool; the rendering decision, and the associated legal responsibility, stays with your application layer.

See the Logos guide in the sidebar for a full walkthrough of logo rendering patterns, fallback strategies, and placeholder handling for merchants with no resolvable domain.
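As one possible way to wire this up, the sketch below builds a Google Favicons URL from the domain field and falls back to a placeholder when no domain is available. The placeholder path is hypothetical, and treating a missing domain as null or an empty string is an assumption; the Logos guide is the authoritative source for fallback behavior.

```javascript
// Hypothetical local asset shown when a merchant has no resolvable domain.
const PLACEHOLDER_LOGO = "/img/merchant-placeholder.svg";

// Build a logo URL from the `domain` field using the Google Favicons endpoint.
function logoUrl(domain, size = 128) {
  if (typeof domain !== "string" || domain.trim() === "") {
    return PLACEHOLDER_LOGO;
  }
  // URLSearchParams handles query-string encoding of the values.
  const params = new URLSearchParams({ domain: domain.trim(), sz: String(size) });
  return `https://www.google.com/s2/favicons?${params}`;
}
```

Swapping in Logo.dev or Brandfetch only changes the URL template; the null-domain check stays the same.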

Can I cancel my subscription?

Yes, at any time. Visit the Stripe Customer Portal at billing.stripe.com/p/login/14AeVfcyM3Ucd6DeisdIA00 to:

Update or replace your payment method
Download past invoices
Cancel your subscription

Cancellations take effect at the end of your current billing cycle. You retain full API access until that date, and you won’t be charged again after it. Because ParseTx uses Stripe Metered Billing, your final invoice will reflect only the transactions you actually processed during that cycle — there are no cancellation fees or early termination charges.

Cancelling your subscription deactivates your API key at the end of the billing period. If you want to re-subscribe later, you’ll receive a new API key after completing sign-up again.

What does the confidence score mean?

The confidence field is a float between 0.0 and 1.0 that represents how certain ParseTx is about the merchant identification it returned.

Score	What it means
`1.0`	Cache hit — result is from the verified canonical merchant database
`≥ 0.80`	High-confidence AI inference — result was cached for future requests
`< 0.80`	Low-confidence inference — result was returned but not cached

Results below 0.80 indicate that the AI model was uncertain about the merchant identification. The result is still returned so you can handle it in your application, but it was not written to the cache to prevent cache poisoning from potentially inaccurate mappings.

In your application, you might choose to surface low-confidence results differently, for example by showing them to your user for manual review or flagging them in your analytics pipeline. You can filter on a confidence threshold in your response handler:

const highConfidence = results.filter(r => r.confidence >= 0.80);
const needsReview = results.filter(r => r.confidence < 0.80);

Is there a free tier or trial?

There is no free tier, but there are also no minimum charges, ever. You pay exactly $0.005 per transaction processed, billed monthly based on actual usage. If you make zero API calls in a month, your bill is $0.

Before adding a payment method, you can explore the API behavior interactively using the Enrichment Sandbox at parsetx.dev. The sandbox lets you type any raw transaction string and see the full structured JSON response in real time.

A credit card is required to generate a production API key. This requirement exists to prevent automated abuse of the infrastructure, not to create a paywall. Your card won’t be charged until you actually enrich transactions.

What is an MCC code?

MCC stands for Merchant Category Code, a four-digit numeric code defined by the ISO 18245 standard and assigned to merchant types by the major payment networks (Visa, Mastercard, American Express, and Discover).

Examples:

MCC	Merchant Type
`5411`	Grocery Stores, Supermarkets
`5812`	Eating Places, Restaurants
`5734`	Computer Software Stores
`7011`	Hotels, Motels, Resorts
`4816`	Computer Network Services

ParseTx returns the mcc_code field when it can be reliably determined. MCC codes are particularly valuable for:

Expense categorization in accounting and ERP integrations
Tax reporting where expense type affects deductibility
Compliance in financial applications that must classify spend by merchant type
Subscription detection logic in personal finance management apps

When the MCC code cannot be confidently determined, the field returns null rather than guessing. A hallucinated MCC code in a financial compliance context is worse than a missing one.

MCC codes describe merchant types, not individual merchants. Two completely different businesses can share the same MCC. For merchant-level identification, use the merchant and domain fields.
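Because mcc_code can be null, client code needs an explicit fallback. The sketch below maps a few of the codes from the table above to labels and handles the null case. Whether the field arrives as a string or a number is an assumption, so the helper accepts both; the fallback label strings are illustrative.

```javascript
// A few entries from the examples table; extend as needed.
const MCC_LABELS = new Map([
  ["5411", "Grocery Stores, Supermarkets"],
  ["5812", "Eating Places, Restaurants"],
  ["7011", "Hotels, Motels, Resorts"],
]);

// Hypothetical helper: returns a display label for an mcc_code value.
function mccLabel(mccCode) {
  // ParseTx returns null when the code cannot be determined confidently.
  if (mccCode === null || mccCode === undefined) return "Uncategorized";
  // Normalize numbers and strings to a four-character key.
  const key = String(mccCode).padStart(4, "0");
  return MCC_LABELS.get(key) ?? `Unknown MCC ${key}`;
}
```

Keeping null distinct from an unrecognized code lets a compliance pipeline route "not determined" items to review instead of silently bucketing them.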