Extracting Text from PDF Documents

Updated on Jun 5, 2026

Environment

Version | Product | Author
Q1 2025 | RadPdfProcessing | Desislava Yordanova

Description

Learn how to extract the text content from a PDF document.

Solution

Follow these steps:

  1. Import the PDF document with the PdfFormatProvider.

  2. Export the RadFixedDocument content to text with the TextFormatProvider. If the PDF document contains text fragments, the provider exports them to a plain text result.

csharp
    using System.Diagnostics;
    using System.IO;
    using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
    using Telerik.Windows.Documents.Fixed.FormatProviders.Text;
    using Telerik.Windows.Documents.Fixed.Model;

    string filePath = "input.pdf";

    // Step 1: import the PDF file into a RadFixedDocument.
    PdfFormatProvider pdfProvider = new PdfFormatProvider();
    RadFixedDocument fixedDocument;
    using (Stream stream = File.OpenRead(filePath))
    {
        fixedDocument = pdfProvider.Import(stream);
    }

    // Step 2: export the document's text fragments to a plain string.
    TextFormatProvider textProvider = new TextFormatProvider();
    string documentContent = textProvider.Export(fixedDocument);
    Debug.WriteLine(documentContent);

The TextFormatProvider does not cover every scenario, because the result depends on how the content is stored inside the document. A common case is a scanned document whose pages are images that contain text. The approach above cannot export such content as plain text, because the text is not stored as text fragments; it is represented as Path elements. For scanned documents, use the OcrFormatProvider, which converts images of typed, handwritten, or printed text into machine-encoded text.
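One way to decide when to fall back to OCR is to check whether the exported string contains any visible characters. The following is a minimal sketch, not part of the Telerik API: `ExtractionCheck` and `HasNoExtractableText` are hypothetical names, and the optional `pagesSeparator` argument assumes your export settings may insert a separator string between pages that should be ignored by the check.

```csharp
using System;

public static class ExtractionCheck
{
    // Hypothetical helper, not part of the Telerik API.
    // Returns true when the exported text has no visible characters left
    // after removing an optional page separator string.
    public static bool HasNoExtractableText(string exportedText, string pagesSeparator = null)
    {
        if (string.IsNullOrEmpty(exportedText))
        {
            return true;
        }

        // Strip the separator only when one was supplied;
        // string.Replace throws on an empty search value.
        string remaining = string.IsNullOrEmpty(pagesSeparator)
            ? exportedText
            : exportedText.Replace(pagesSeparator, string.Empty);

        return string.IsNullOrWhiteSpace(remaining);
    }

    public static void Main()
    {
        Console.WriteLine(HasNoExtractableText("Invoice 42"));     // False
        Console.WriteLine(HasNoExtractableText(" \r\n "));         // True
        Console.WriteLine(HasNoExtractableText("\n---\n", "---")); // True
    }
}
```

In the example from the Solution section, you could pass `documentContent` to this helper and route the file to an OCR step when it returns true.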

See Also