Before you convert: check what type of PDF you have
Some PDFs already contain selectable text (you can highlight it). Others are scanned images (you can’t). OCR is specifically for scanned/image-only PDFs.
You can’t select or copy any text
Your PDF is likely scanned/image-only. Use OCR to extract the text.
Ctrl+F finds nothing
Search won’t work on image PDFs. OCR converts the page image into real characters.
Text copies as gibberish
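The PDF has a text layer, but the layer is unreliable, usually because an earlier OCR pass went wrong. Wrong language, low-quality scans, skew, blur, or heavy shadows can all break OCR.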
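The three checks above can be approximated in code if you have many files to sort. This is a minimal sketch, not how this tool works internally. It assumes you have already pulled the text of each page out of the PDF with a PDF library of your choice. The function name, the thresholds, and the "readable character" rule are all illustrative assumptions.

```python
import unicodedata


def classify_text_layer(page_texts, min_chars=20, min_readable=0.85):
    """Guess what kind of PDF produced these per-page text strings.

    page_texts: list of str, one entry per page (hypothetical input that
    you extract yourself with any PDF library).
    Returns "image-only", "garbled", or "text".
    """
    text = "".join(page_texts)
    visible = [ch for ch in text if not ch.isspace()]

    # Almost no characters: the pages are most likely scanned images.
    if len(visible) < min_chars:
        return "image-only"

    # Count letters, digits, and punctuation. Replacement characters,
    # private-use code points, and control characters fall outside these
    # Unicode categories and are typical of a broken text layer.
    readable = sum(
        1 for ch in visible if unicodedata.category(ch)[0] in ("L", "N", "P")
    )
    if readable / len(visible) < min_readable:
        return "garbled"
    return "text"
```

A result of "image-only" or "garbled" suggests the file is a candidate for OCR. The heuristic cannot catch gibberish made of ordinary letters, so treat it as a first filter only.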
Long PDF? Convert only the pages you need (500-page cap)
Fastest workflow: split the PDF to keep only relevant pages, then OCR the smaller file. This avoids page limits and reduces processing time.
Split pages first
Use Split PDF to extract the pages you care about, then upload that smaller PDF here.
Compress if the scan is huge
If the file size is very large, run Compress PDF to speed up uploads and processing.
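If you need every page of a long document rather than a selection, the same split-first idea applies: cut the file into consecutive parts that each fit under the 500-page cap, then OCR the parts one by one. The sketch below only works out the page ranges. The function name is hypothetical, and the actual splitting is still done with Split PDF or another tool.

```python
def split_into_chunks(total_pages, cap=500):
    """Return 1-based, inclusive (first, last) page ranges of at most `cap` pages."""
    if total_pages < 0 or cap < 1:
        raise ValueError("total_pages must be >= 0 and cap must be >= 1")
    return [
        (first, min(first + cap - 1, total_pages))
        for first in range(1, total_pages + 1, cap)
    ]


# A 1,200-page scan becomes three parts that each fit under the cap.
print(split_into_chunks(1200))
```

For a 1,200-page file this gives pages 1 to 500, 501 to 1000, and 1001 to 1200.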
How to convert PDF to text (3 steps)
1) Upload your PDF
Upload a PDF file. If it’s longer than 500 pages, split it and upload only the required pages.
2) Choose the correct language (recommended)
Selecting the right language improves OCR accuracy, especially for accents and similar-looking letters.
3) Copy or export the extracted text
Copy the extracted text and reuse it anywhere. If you want a searchable PDF instead of raw text, use Make PDF Searchable or OCR PDF.
OCR accuracy tips (so the extracted text is clean)
- Select the correct language before OCR (biggest easy win).
- Prefer sharp scans: blur and shadows hurt OCR.
- Rotate/deskew pages so lines are straight.
- If the text is tiny, re-scan at higher quality when possible.
- For tables, consider exporting to Excel after OCR for cleanup.
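Even with a clean scan, extracted text often keeps the line breaks of the printed page. A small post-processing pass can tidy that up. The following is a minimal sketch of one possible cleanup, assuming plain text with "\n" line endings. The function name and the specific rules are illustrative, not part of this tool.

```python
import re


def clean_ocr_text(raw):
    """Tidy OCR output: rejoin split words, unwrap lines, collapse spaces."""
    # Rejoin words hyphenated across a line break: "recog-\nnition" -> "recognition".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Turn a single line break inside a paragraph into a space; keep blank lines.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    # Normalize paragraph breaks to exactly one blank line.
    text = re.sub(r" *\n{2,} *", "\n\n", text)
    return text.strip()


sample = "Optical character recog-\nnition reads the\npage  image.\n\n\nSecond   paragraph."
print(clean_ocr_text(sample))
```

The hyphen rule also joins genuine compounds that happen to break at a line end (for example "well-" followed by "known"), so review the result when exact wording matters.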
Security & privacy
OCR requires processing your document to recognize text. If you’re testing, use sample PDFs or redact sensitive data. Review the policies for retention/deletion details.
FAQs
How do I extract text from a PDF?
Upload your PDF. If it’s a scanned/image-only PDF, OCR will recognize the characters and produce real text you can copy and reuse.
How do I extract text from a scanned PDF (scanned PDF to text)?
Scanned PDFs are images. OCR reads the page image and converts it to text. Select the correct language for best accuracy, then copy the extracted text.
Why can’t I copy text from my PDF?
Many PDFs are scans or flattened images with no text layer. OCR is required to convert the page image into selectable/copyable text.
Is there a page limit?
Yes. OCR is capped at the first 500 pages to keep processing fast. If your file is longer, split the PDF and OCR only the pages you need.
Will OCR preserve formatting (tables/columns)?
OCR focuses on recognizing text. Simple paragraphs often extract well, but complex formatting like tables and multi-column layouts may need cleanup after extraction.
How can I improve OCR accuracy?
Use clear scans, avoid blur/shadows, keep pages straight, and pick the correct language before OCR. If text is tiny, re-scan at higher quality if possible.
Related tools
Convert, edit, and prepare PDFs for OCR using these tools:
Extract text from your PDF now
Upload a PDF, choose the language, and copy the extracted text. If it’s long, split pages first to stay under the cap.