Testing Tools

OCR & Redaction Tool

Select an image or PDF, extract its text locally with Tesseract.js, and apply regex-based redaction overlays. Nothing is sent to a server.

Upload document
Drop an image (PNG, JPG, WebP) or a PDF. For PDFs, only the first three pages are processed, to keep OCR fast.

Processing happens entirely in your browser. PDFs are rendered locally via pdf.js, and OCR uses Tesseract.js—no data ever leaves this page.
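As a rough illustration of the three-page limit, the page selection could look like the sketch below. The helper name pagesToProcess and its limit parameter are hypothetical and not taken from this tool; the only library fact assumed is that pdf.js numbers pages starting from 1.

```typescript
// Hypothetical helper: decide which 1-based page numbers to render and OCR.
// The default limit of 3 mirrors the "first three pages" rule described above.
function pagesToProcess(numPages: number, limit: number = 3): number[] {
  // Clamp to the document length and never return a negative count.
  const count = Math.max(0, Math.min(Math.floor(numPages), limit));
  const pages: number[] = [];
  for (let n = 1; n <= count; n++) {
    pages.push(n);
  }
  return pages;
}
```

Each returned number would then be rendered to a canvas and handed to the OCR step; shorter documents simply yield fewer pages.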

Redaction patterns
Provide one regex per line. We default to common PII patterns; adjust or add your own.
Need examples? Try \b\d{4}\b for four-digit codes or Invoice\s+#\d+ to capture invoice references.
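A minimal sketch of how a one-regex-per-line list might be compiled and run against extracted text. The names compilePatterns and findMatches, the result shapes, and the choice of the global flag are all assumptions made for illustration; the tool's actual implementation may differ.

```typescript
interface CompiledPattern {
  source: string; // the pattern exactly as typed on its line
  regex: RegExp;
}

interface Match {
  pattern: string;
  text: string;
  start: number; // offset into the extracted text
  end: number;   // exclusive
}

// Split the input into one pattern per line, skip blank lines, and collect
// invalid patterns as error messages instead of throwing.
function compilePatterns(input: string): { patterns: CompiledPattern[]; errors: string[] } {
  const patterns: CompiledPattern[] = [];
  const errors: string[] = [];
  const lines = input.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const source = lines[i].trim();
    if (source === "") continue;
    try {
      patterns.push({ source, regex: new RegExp(source, "g") });
    } catch (err) {
      errors.push(`line ${i + 1}: ${source}`);
    }
  }
  return { patterns, errors };
}

// Run every pattern over the text and return all matches ordered by position.
function findMatches(text: string, patterns: CompiledPattern[]): Match[] {
  const matches: Match[] = [];
  for (const { source, regex } of patterns) {
    regex.lastIndex = 0;
    let m: RegExpExecArray | null;
    while ((m = regex.exec(text)) !== null) {
      if (m[0] === "") {
        regex.lastIndex++; // step past empty matches to avoid an infinite loop
        continue;
      }
      matches.push({ pattern: source, text: m[0], start: m.index, end: m.index + m[0].length });
    }
  }
  return matches.sort((a, b) => a.start - b.start);
}
```

Reporting a bad pattern by line number, rather than aborting, lets the remaining patterns still apply while the user fixes the typo.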
Upload a document to view redaction previews, match summaries, and extracted text.
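To draw a redaction preview, a regex match has to be translated from character offsets into rectangles on the page. The sketch below shows one possible approach, assuming the OCR step yields words with a text string and a bounding box of the form { x0, y0, x1, y1 }. The OcrWord type and both function names are hypothetical. Note that offsets must refer to the text rebuilt by layoutWords, not to any other text output of the OCR engine.

```typescript
interface Box { x0: number; y0: number; x1: number; y1: number; }
interface OcrWord { text: string; bbox: Box; } // assumed shape of an OCR word
interface Span { start: number; end: number; } // end is exclusive

// Rebuild the text by joining words with single spaces, recording the
// character range each word occupies in that rebuilt string.
function layoutWords(words: OcrWord[]): { text: string; ranges: Span[] } {
  let text = "";
  const ranges: Span[] = [];
  for (const w of words) {
    if (text !== "") text += " ";
    ranges.push({ start: text.length, end: text.length + w.text.length });
    text += w.text;
  }
  return { text, ranges };
}

// Return the boxes of every word that overlaps the matched span.
// A match covering part of a word masks the whole word (the cautious choice).
function boxesForSpan(words: OcrWord[], ranges: Span[], span: Span): Box[] {
  const boxes: Box[] = [];
  for (let i = 0; i < words.length; i++) {
    if (ranges[i].start < span.end && span.start < ranges[i].end) {
      boxes.push(words[i].bbox);
    }
  }
  return boxes;
}
```

Each returned box can be painted on a canvas with fillRect(x0, y0, x1 - x0, y1 - y0). Because patterns run on the single-space-joined text, a pattern that expects a line break between words would need adjusting.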
Redaction best practices

Review every document before sharing. Automated masking helps, but manual verification ensures no sensitive data slips through.

Export match CSVs for audit logs or as fixtures for integration tests. Combine the exports with form fuzzing to stress-test both client and server validation.
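For reference, a match export might be serialised along the lines below. The column set (page, pattern, text, start, end) and the names MatchRow, csvField, and matchesToCsv are assumptions for illustration, not the tool's documented export format. Quoting follows the usual CSV convention of wrapping fields that contain commas, quotes, or line breaks and doubling embedded quotes.

```typescript
interface MatchRow {
  page: number;
  pattern: string;
  text: string;
  start: number;
  end: number;
}

// Quote a field only when it contains a comma, a double quote, or a line break.
function csvField(value: string | number): string {
  const s = String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Build the CSV text: one header row, then one row per match, CRLF-terminated.
function matchesToCsv(rows: MatchRow[]): string {
  const header = ["page", "pattern", "text", "start", "end"];
  const lines = [header.join(",")];
  for (const r of rows) {
    lines.push([r.page, r.pattern, r.text, r.start, r.end].map(csvField).join(","));
  }
  return lines.join("\r\n") + "\r\n";
}
```

Keep in mind that an export like this contains the matched sensitive text itself, so for audit logs it may be safer to drop or hash the text column.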

Upcoming enhancements

Coming soon: queue multiple documents, batch-export redacted images, and share saved regex patterns with teammates.

Interested in full client-side redaction of PDFs including metadata? Let us know, and we’ll prioritise it on the roadmap.