Guntosha

FAQ

Embedded OCR text disappears

PDFs containing embedded OCR text generated by Adobe Acrobat can have that text corrupted when the PDF is saved. We have confirmed that this happens frequently with PDFs in non-Latin scripts, especially Japanese.

This is caused by an issue in iOS's built-in PDF framework (PDFKit), and it also occurs in the "Preview" app on iPadOS/iOS 26. shelff saves the PDF whenever you add markers or annotations, and the corruption happens at that point.

Before annotating a PDF, we recommend backing it up. After annotating, close and reopen the PDF and confirm that the OCR text is still selectable. If the OCR text in a PDF turns out to be corrupted this way, please refrain from annotating that PDF in shelff.
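
If you would rather run this check on a computer than by eye, the comparison could be sketched roughly as follows in Python. This is only an illustration and not part of shelff: it assumes the `pdftotext` command from Poppler is installed and that both the backup and the annotated copy have been exported from the device. The function names and the 0.9 threshold are hypothetical choices made for this example.

```python
import subprocess
import sys
from collections import Counter


def extract_text(pdf_path: str) -> str:
    """Return the text layer of a PDF using Poppler's `pdftotext` (assumed installed)."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" sends the extracted text to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def ocr_text_retention(before: str, after: str) -> float:
    """Fraction of non-whitespace characters in `before` that are still present in `after`."""
    before_counts = Counter(ch for ch in before if not ch.isspace())
    after_counts = Counter(ch for ch in after if not ch.isspace())
    total = sum(before_counts.values())
    if total == 0:
        return 1.0  # the backup had no text layer, so nothing could be lost
    kept = sum((before_counts & after_counts).values())
    return kept / total


if __name__ == "__main__" and len(sys.argv) == 3:
    backup_pdf, annotated_pdf = sys.argv[1], sys.argv[2]
    ratio = ocr_text_retention(extract_text(backup_pdf), extract_text(annotated_pdf))
    print(f"{ratio:.0%} of the original text layer survived")
    if ratio < 0.9:  # illustrative threshold, not a value recommended by shelff
        print("The OCR text may be corrupted; keep the backup and re-run OCR.")
```

Run it with the backup first and the annotated copy second. A ratio well below 1 suggests the text layer was damaged when the PDF was saved, though a character-count comparison like this is only a rough signal.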

If the OCR text in a PDF has been corrupted, please re-run OCR using the software that originally performed it (such as Adobe Acrobat).

We are continuing to investigate ways to fix the root cause of this issue in shelff.