how to rebuild a page

12 posts, 2 answers
  1. Charles
    7 posts
    Member since: Feb 2021

    Posted 22 Mar

    Hello, I have a project adapted from the code in Telerik's ManipulatePages example project, specifically the FitAndPositionMultiplePagesOnSinglePage routine.

    Except instead of taking four pages and putting them on one, I'm taking two 5.5x8.5 pages and putting them on a new 11x8.5 page, left to right.

    It's working great, when the input PDF is properly formatted.

    I have a handful of PDFs that were created in some unknown way that appear to be malformed.  When they are fed through my project, the odd pages never make it to the right side of the output.  It looks like, among other things, the CropBox size isn't the same as the MediaBox size on the source doc, which I suspect is a source (maybe not THE source) of the problem.

    It seems I should be able to read the content of each page in the source doc and insert it into a new page in the output doc, essentially rebuilding the page instead of just copying it from PdfFileSource.Pages[]?

    However, I'm at a loss as to how to read the source page, as the pages in PdfFileSource.Pages don't seem to expose their content.

    Help is appreciated! :)

    Thanks,

    Charles

  2. Answer
    Dimitar
    Admin
    2983 posts

    Posted 23 Mar

    Hi Charles,

    Yes, this is expected: the content is read dynamically and is not loaded into memory, which is why you cannot access it.

    To access the content in code, you have to import the file into a RadFixedDocument. Here is an example:

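    using System;
    using System.IO;
    using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;

    // Import loads the whole document into memory as a RadFixedDocument.
    var provider = new PdfFormatProvider();
    var document = provider.Import(File.ReadAllBytes(@"..\..\SampleDoc.pdf"));

    // Each page exposes its content elements (text fragments, images, paths).
    foreach (var page in document.Pages)
    {
        foreach (var item in page.Content)
        {
            Console.WriteLine(item);
        }
    }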
    

    After this is done you can create a new page with the desired content and pass it to the PdfStreamWriter: 

    // File.Create truncates an existing result.pdf; File.OpenWrite would leave stale trailing bytes.
    using (PdfStreamWriter fileWriter = new PdfStreamWriter(File.Create(@"..\..\result.pdf")))
    {
        RadFixedPage newPage = new RadFixedPage();
        var position = new SimplePosition();
        position.Translate(100, 100);
        newPage.Content.Add(new TextFragment("TextFragment") { Position = position });
    
        fileWriter.WritePage(newPage);
    }
    

    I hope this helps. Should you have any other questions do not hesitate to ask.

    Regards,
    Dimitar
    Progress Telerik

    Love the Telerik and Kendo UI products and believe more people should try them? Invite a fellow developer to become a Progress customer and each of you can get a $50 Amazon gift voucher.

  3. Charles
    Posted 23 Mar in reply to Dimitar

    Thanks, this is very helpful.

    Apparently the PdfFormatProvider has strict rules on the format of the imported PDF, as the Import call fails on my problematic PDF with:

    'StartXRef keyword cannot be found.'

    Charles

  4. Charles
    Posted 23 Mar in reply to Charles

    I get the same error on other PDFs as well, including ones that otherwise work fine in the two-pages-into-one processing described above.

    Charles

  5. Dimitar
    Admin
    Posted 24 Mar

    Hello Charles,

    To investigate what is causing this error, I need one of the affected files. With it I can determine whether the file is invalid and whether we can handle this case so that the file is imported correctly. Since this forum thread is public, I suggest opening a new ticket (which is a private thread) and attaching one of the files that cause this issue.

    Thank you in advance for your patience and cooperation.  

    Regards,
    Dimitar
    Progress Telerik

  6. Charles
    Posted 24 Mar in reply to Dimitar

    I understand.  The challenge is every PDF viewer I've used renders them fine.  It's just your app that says it's invalid.

    I'm going to try having the PDFs recreated, and only as a last resort will I open a ticket.

    Thanks.

  7. Dimitar
    Admin
    Posted 25 Mar

    Hi Charles,

    Yes, some invalid scenarios are handled by the PDF viewers. We are trying to handle such cases as well, but there are still documents that cannot be loaded. Providing the document will allow me to determine the exact cause and log it for improvement in our feedback portal.

    Let me know if I can assist you further.

    Regards,
    Dimitar
    Progress Telerik

    Virtual Classroom, the free self-paced technical training that gets you up to speed with Telerik and Kendo UI products quickly just got a fresh new look + new and improved content including a brand new Blazor course! Check it out at https://learn.telerik.com/.

  8. Charles
    Posted 25 Mar in reply to Dimitar

    Hello, I'm sorry, I just double-checked my work and found I was using the wrong input stream.  The above code works just fine for loading my PDFs, including those that are malformed, when I use the correct input stream. :)

    My apologies!
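
    For anyone who hits the same 'StartXRef keyword cannot be found.' message: a quick look at the bytes before calling Import can show whether the stream is empty or truncated rather than the PDF itself being at fault. Below is a minimal sketch using only the .NET base library. The PdfSanityCheck class and the LooksLikeCompletePdf name are made up for illustration, and the 1024-byte tail window is an assumption, not something the Telerik importer is known to use.

```csharp
using System;
using System.IO;
using System.Text;

static class PdfSanityCheck
{
    // Hypothetical helper: returns true when the bytes start with "%PDF-"
    // and the "startxref" keyword appears within the last 1024 bytes.
    public static bool LooksLikeCompletePdf(byte[] data)
    {
        if (data == null || data.Length < 5)
        {
            return false;
        }

        string head = Encoding.ASCII.GetString(data, 0, 5);

        // Only the tail is inspected, because the keyword sits near the end of the file.
        int tailLength = Math.Min(data.Length, 1024);
        string tail = Encoding.ASCII.GetString(data, data.Length - tailLength, tailLength);

        return head == "%PDF-" && tail.Contains("startxref");
    }

    static void Main(string[] args)
    {
        // File paths are taken from the command line.
        foreach (string path in args)
        {
            bool ok = LooksLikeCompletePdf(File.ReadAllBytes(path));
            Console.WriteLine($"{path}: {ok}");
        }
    }
}
```

    If the check returns false for a file that opens fine in a viewer, the bytes being passed to Import are probably not the bytes of that file (an empty or partially read stream, for example).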

  9. Charles
    Posted 25 Mar

    I haven't had any success so far in recreating a page.

    While I can access the content now from page.Content (all TextFragments), replicating or transferring those TextFragments to a new page isn't working.

    The TextFragment can't be added to a new page because the Parent property is already defined.

    I tried creating a new TextFragment with all the same properties as the source item and adding that to the new page, but the output was mangled.  (There wasn't an easy way to "clone" a TextFragment, as far as I could tell..?)

    When I say mangled, everything was in the right place and the words are there, but there are special characters (E with a tilde above it) all over the place.

    At the moment I don't think I need a way to do this, but it could come in handy in the future if you can quickly show how it's done!

    Thanks.

  10. Answer
    Dimitar
    Admin
    Posted 29 Mar

    Hi Charles,

    There is an internal clone method that can be used for such cases. You can invoke it via reflection. Here is an example:

    var provider = new PdfFormatProvider();
    var document = provider.Import(File.ReadAllBytes(@"..\..\SampleDoc.pdf"));
    
    RadFixedDocument newDocument = new RadFixedDocument();
    var cloneMethod = typeof(TextFragment).GetMethod("CreateClonedInstance", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
    
    foreach (var page in document.Pages)
    {
        var newPage = newDocument.Pages.AddPage();
    
        foreach (var item in page.Content)
        {
            // Only text fragments are cloned here; other content elements are skipped.
            if (item is TextFragment textFragment)
            {
                var newFragment = (TextFragment)cloneMethod.Invoke(textFragment, null);
                newPage.Content.Add(newFragment);
            }
        }
    }
    
    var resultBytes = provider.Export(newDocument);
    File.WriteAllBytes(@"..\..\result.pdf", resultBytes);
    
    

    In addition, it seems that a font is missing, which is why you are getting invalid characters. If your document uses a specific embedded font that is not available on the operating system, you need to register it manually. This is also necessary if you are using the .NET Standard version of the assemblies.
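
    As a rough sketch of manual registration, assuming the .NET Framework build of the assemblies (where FontFamily, FontStyles and FontWeights come from WPF): the font file path and the family name below are placeholders, and the FontSetup class is made up for illustration. The .NET Standard assemblies use their own font types, so the type names would differ there.

```csharp
using System.IO;
using System.Windows;        // FontStyles, FontWeights (PresentationCore)
using System.Windows.Media;  // FontFamily
using Telerik.Windows.Documents.Fixed.Model.Fonts;

static class FontSetup
{
    // Hypothetical helper: call once before the document is exported.
    public static void RegisterCustomFont()
    {
        // Placeholder path: point this at the font file the document relies on.
        byte[] fontData = File.ReadAllBytes(@"..\..\CustomFont.ttf");

        // Placeholder family name: it should match the name used by the text fragments.
        var family = new FontFamily("Custom Font");

        FontsRepository.RegisterFont(family, FontStyles.Normal, FontWeights.Normal, fontData);
    }
}
```

    Bold or italic variants are separate registrations, each with its own font file and the matching FontStyles and FontWeights values.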

    Let me know if I can assist you further.

    Regards,
    Dimitar
    Progress Telerik

  11. Charles
    Posted 29 Mar in reply to Dimitar

    Thanks!  Maybe as an enhancement you could change the modifier on that method from internal to public? :)

    Charles

  12. Dimitar
    Admin
    Posted 31 Mar

    Hello Charles,

    I have forwarded your request to the team and they will consider it. 

    Do not hesitate to contact us if you have other questions.

    Regards,
    Dimitar
    Progress Telerik