Is there a way to enumerate objects in a pdf?

1 Answer 137 Views
PdfProcessing
Ed asked on 04 Oct 2022, 07:44 AM

Hi,

I want to be able to extract embedded images and text from a PDF. Is this even possible?

If so, can you point me in the right direction?

Thanks... Ed

 

1 Answer, 1 is accepted

Dimitar
Telerik team
answered on 04 Oct 2022, 12:47 PM

Hi Ed,

What is your application type? If you are using .NET Framework, this can be achieved with the following approach:

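// Namespaces below assume the .NET Framework build of RadPdfProcessing.
using System;
using System.IO;
using System.Windows.Media.Imaging;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.Model.Objects;
using Telerik.Windows.Documents.Fixed.Model.Text;

static void Main(string[] args)
{
    var pdfProvider = new PdfFormatProvider();
    var document = pdfProvider.Import(File.ReadAllBytes(@"..\..\sampledoc.pdf"));

    int count = 0; // running index used to name the extracted image files
    foreach (var page in document.Pages)
    {
        // Walk the content elements placed on the page.
        foreach (var item in page.Content)
        {
            if (item is TextFragment)
            {
                // Text is stored as fragments; print each one as it is found.
                var text = ((TextFragment)item).Text;
                Console.WriteLine(text);
            }
            else if (item is Image)
            {
                // Convert the embedded image to a WPF BitmapSource and save it as PNG.
                var image = (Image)item;
                BitmapSource source = image.ImageSource.GetBitmapSource();

                SaveClipboardImageToFile(@"C:\my_temp\image" + count++ + ".png", source);
            }
        }
    }
}

// Encodes the given bitmap as PNG and writes it to filePath, overwriting any existing file.
public static void SaveClipboardImageToFile(string filePath, BitmapSource image)
{
    using (var fileStream = new FileStream(filePath, FileMode.Create))
    {
        BitmapEncoder encoder = new PngBitmapEncoder();
        encoder.Frames.Add(BitmapFrame.Create(image));
        encoder.Save(fileStream);
    }
}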
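
One thing to watch in the snippet above: FileStream with FileMode.Create creates the file but not a missing folder, so C:\my_temp has to exist before the first image is saved. A minimal sketch of one way to handle that, using only System.IO. The ImagePathHelper class and BuildImagePath method are hypothetical names introduced for illustration; they are not part of the Telerik API.

```csharp
using System;
using System.IO;

public static class ImagePathHelper
{
    // Hypothetical helper (not part of Telerik's API): makes sure the output
    // folder exists and returns a path such as <outputFolder>\image3.png.
    public static string BuildImagePath(string outputFolder, int index)
    {
        // Directory.CreateDirectory does nothing if the folder already exists.
        Directory.CreateDirectory(outputFolder);
        return Path.Combine(outputFolder, "image" + index + ".png");
    }
}
```

With this in place, the save call could become SaveClipboardImageToFile(ImagePathHelper.BuildImagePath(@"C:\my_temp", count++), source);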
I am looking forward to your reply.

Regards,
Dimitar
Progress Telerik


Ed
commented on 04 Oct 2022, 12:53 PM

Looks like almost exactly what the doctor ordered! Many thanks. I will play with it and get back to you.

Thanks again ... Ed

 

Ed
commented on 07 Oct 2022, 06:52 AM

All I can say is you get a Snickers bar!