This is a migrated thread and some comments may be shown as answers.

PDF Viewer - table recognition

PDFViewer

2 Answers, 1 is accepted

Accepted

answered on 07 Dec 2017, 08:23 AM

Hello Mohamed,

The PDF format is optimized for viewing, not for preserving the semantics of the content. In the general case, nothing in the file indicates that part of the content is a table. In most cases a table is just a set of paths (borders) and text fragments (words), or it may even be an image.

More sophisticated software, such as MS Word, can perform an OCR analysis of the PDF content elements and try to "detect" that certain content is a table.

That said, the answer to both of your questions is no. You can try to detect the semantics yourself using RadPdfProcessing and its document model, but this would be tricky and would most probably work only for certain classes of PDF documents.
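To give an idea of what such a detection could look like, here is a minimal sketch of one possible heuristic, written in Python for brevity. It does not use any RadPdfProcessing API. It assumes you have already extracted the border paths as straight line segments and the words as positioned text fragments, with the origin at the top-left corner and y growing downward. The names `cluster` and `detect_table`, the tuple layouts, and the tolerance value are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def cluster(values, tol=1.0):
    """Merge nearly equal coordinates into one representative each."""
    merged = []
    for v in sorted(values):
        if not merged or v - merged[-1] > tol:
            merged.append(v)
    return merged


def detect_table(segments, words, tol=1.0):
    """Guess a table from border segments and positioned words.

    segments: (x1, y1, x2, y2) straight border lines (assumed input format).
    words:    (x, y, text) fragments (assumed input format).
    Returns a list of rows, each a list of cell strings, or None when the
    segments do not form at least a one-cell grid.
    """
    # Vertical segments give column boundaries, horizontal ones row boundaries.
    xs = cluster([s[0] for s in segments if abs(s[0] - s[2]) <= tol], tol)
    ys = cluster([s[1] for s in segments if abs(s[1] - s[3]) <= tol], tol)
    if len(xs) < 2 or len(ys) < 2:
        return None

    n_cols, n_rows = len(xs) - 1, len(ys) - 1
    cells = [[[] for _ in range(n_cols)] for _ in range(n_rows)]

    # Visit words in reading order and drop each one into its enclosing cell.
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        col = bisect_right(xs, x) - 1
        row = bisect_right(ys, y) - 1
        if 0 <= col < n_cols and 0 <= row < n_rows:
            cells[row][col].append(text)

    return [[" ".join(cell) for cell in row] for row in cells]


# A 2 x 2 grid: three vertical and three horizontal border lines.
segments = [
    (0, 0, 0, 40), (50, 0, 50, 40), (100, 0, 100, 40),
    (0, 0, 100, 0), (0, 20, 100, 20), (0, 40, 100, 40),
]
words = [(5, 5, "Name"), (55, 5, "Qty"), (5, 25, "Bolt"), (55, 25, "12")]
table = detect_table(segments, words)
print(table)
```

A heuristic like this only covers tables with fully drawn, axis-aligned borders. Borderless tables, merged cells, rotated content, or tables stored as images would need a different approach, which is why it will work only for certain classes of documents.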

Regards,
Boby
Progress Telerik
