FileKit

Extract Text (OCR) Beta

Your files never leave your device

Maximum size: 50 MB per file. Supported formats: JPG, PNG, WebP, BMP, TIFF, PDF.

How OCR works

FileKit uses Tesseract.js, a WebAssembly port of the Tesseract OCR engine, to recognise text entirely in your browser. The language model is downloaded once (~4 MB for English) and cached locally — nothing is uploaded. For best results, use high-contrast images with clearly printed text at a resolution of at least 150 DPI.
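
When the physical size of the scanned page is known, the 150 DPI guideline reduces to simple arithmetic. The sketch below is illustrative only: the helper names and the US Letter page size are assumptions made for the example and are not part of FileKit or Tesseract.js.

```typescript
// Hypothetical helpers for illustration; not part of FileKit or Tesseract.js.
const MIN_DPI = 150; // recommended minimum resolution from the text above

// Effective resolution of a scan: pixels along one edge divided by
// the physical length of that edge in inches.
function effectiveDpi(pixels: number, inches: number): number {
  if (inches <= 0) {
    throw new RangeError("inches must be positive");
  }
  return pixels / inches;
}

// Smallest pixel count along an edge that still reaches the target DPI.
function minPixelsFor(inches: number, dpi: number = MIN_DPI): number {
  return Math.ceil(inches * dpi);
}

// True when the scan reaches the recommended minimum along the given edge.
function meetsRecommendedDpi(pixels: number, inches: number): boolean {
  return effectiveDpi(pixels, inches) >= MIN_DPI;
}

// Example (assumed page size): a US Letter page is 8.5 in x 11 in.
const letterWidthPx = minPixelsFor(8.5);
const letterHeightPx = minPixelsFor(11);
console.log(`Scan US Letter at ${letterWidthPx} x ${letterHeightPx} px or larger`);
```

A photo taken with a phone has no fixed physical size, so in that case the check is only a rough guide: what matters is that the page fills most of the frame.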

How to OCR a Document

  1. Upload an image or scanned PDF

    Drag and drop a scanned document, photo of a page, or screenshot. Supported formats are JPG, PNG, WebP, BMP, TIFF, and PDF.

  2. Select the language

    Choose the primary language of the document: English, Chinese (Simplified), Japanese, or English+Chinese combined. Correct language selection improves accuracy significantly.

  3. Extract and copy text

    FileKit runs Tesseract.js (WebAssembly OCR) entirely in your browser. The recognised text appears in an editable area, where you can copy it or download it as a .txt file.
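
The checks implied by the steps above can be sketched as follows. This is a hypothetical illustration, not FileKit's actual code: the function names, the reading of 50 MB as 50 × 1024 × 1024 bytes, and the extension aliases `jpeg` and `tif` are all assumptions. The language codes (`eng`, `chi_sim`, `jpn`) are the standard Tesseract trained-data names, joined with `+` when more than one language is recognised together.

```typescript
// Hypothetical pre-OCR checks; names and limits are assumptions for illustration.
const MAX_BYTES = 50 * 1024 * 1024; // assumption: "50 MB" read as binary megabytes

// Formats listed on the page, plus the common aliases "jpeg" and "tif" (assumed).
const SUPPORTED_EXTENSIONS = new Set<string>([
  "jpg", "jpeg", "png", "webp", "bmp", "tif", "tiff", "pdf",
]);

type LanguageChoice =
  | "English"
  | "Chinese (Simplified)"
  | "Japanese"
  | "English+Chinese";

// Menu options mapped to Tesseract trained-data codes.
const LANGUAGE_CODES: Record<LanguageChoice, string> = {
  "English": "eng",
  "Chinese (Simplified)": "chi_sim",
  "Japanese": "jpn",
  "English+Chinese": "eng+chi_sim",
};

// Lower-cased text after the last dot, or "" when the name has no dot.
function fileExtension(name: string): string {
  const dot = name.lastIndexOf(".");
  return dot < 0 ? "" : name.slice(dot + 1).toLowerCase();
}

// Returns null when the file is acceptable, otherwise a short reason.
function validateUpload(name: string, sizeBytes: number): string | null {
  if (!SUPPORTED_EXTENSIONS.has(fileExtension(name))) {
    return "unsupported format";
  }
  if (sizeBytes > MAX_BYTES) {
    return "file exceeds 50 MB";
  }
  return null;
}

// Looks up the Tesseract language code for a menu choice.
function tesseractLang(choice: LanguageChoice): string {
  return LANGUAGE_CODES[choice];
}
```

In a browser, the two arguments to `validateUpload` would typically come from the `name` and `size` properties of a `File` object, and the string returned by `tesseractLang` is what would be handed to the OCR engine as its language setting.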
