Extracting Text from PDF Documents
Updated on Jun 5, 2026
Environment
| Version | Product | Author |
|---|---|---|
| Q1 2025 | RadPdfProcessing | Desislava Yordanova |
Description
Learn how to extract the text content from a PDF document.
Solution
Follow these steps:
- Import the PDF document with the PdfFormatProvider.
- Export the RadFixedDocument content to text with the TextFormatProvider. If the PDF document contains text fragments, the provider exports them to a plain text result.
```csharp
using System.Diagnostics;
using System.IO;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.FormatProviders.Text;
using Telerik.Windows.Documents.Fixed.Model;

string filePath = "input.pdf";

// Import the PDF file into a RadFixedDocument.
PdfFormatProvider pdfProvider = new PdfFormatProvider();
RadFixedDocument fixedDocument;
using (Stream stream = File.OpenRead(filePath))
{
    fixedDocument = pdfProvider.Import(stream);
}

// Export the document's text fragments as plain text.
TextFormatProvider textProvider = new TextFormatProvider();
string documentContent = textProvider.Export(fixedDocument);
Debug.WriteLine(documentContent);
```
The TextFormatProvider may not cover all scenarios, depending on the internal document content. A common case is a document with scanned images that contain text information. In this case, the approach above does not produce plain text because the text is represented as Path elements rather than as text fragments. Use the OcrFormatProvider to convert images of typed, handwritten, or printed text from a scanned document into machine-encoded text.
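One way to decide when the OCR route is needed is to inspect the string that the text export returned. The following is a minimal sketch, not part of the Telerik API: the `ExtractionCheck` class, the `LooksScanned` method, and the `minVisibleChars` threshold are hypothetical names introduced here for illustration. It assumes that a scanned document yields an export with little or no visible text. If the export adds separator text of its own, the threshold would need to be raised to account for it.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not a Telerik type): a heuristic for spotting
// documents whose text export came back effectively empty.
public static class ExtractionCheck
{
    // Returns true when the exported text contains fewer than
    // minVisibleChars non-whitespace characters. Such a result suggests
    // the PDF holds no text fragments and may need OCR instead.
    public static bool LooksScanned(string exportedText, int minVisibleChars = 1)
    {
        if (string.IsNullOrWhiteSpace(exportedText))
        {
            return true;
        }

        // Count every character that is not whitespace.
        int visibleChars = exportedText.Count(c => !char.IsWhiteSpace(c));
        return visibleChars < minVisibleChars;
    }
}
```

A caller could pass the exported string to `ExtractionCheck.LooksScanned` and switch to the OcrFormatProvider only when the method returns true. This is a heuristic: a document that mixes real text with scanned pages would pass the check even though part of its content is missing from the export.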