This is a migrated thread and some comments may be shown as answers.

PDF Device Information Settings - Local chars damaged in PDF file properties

Jozef asked on 09 Nov 2020, 07:59 AM

When you generate a PDF you can set various document properties (DocumentTitle, DocumentAuthor, DocumentSubject, DocumentKeywords). If you use special characters such as "\", "(" or ")", Telerik automatically escapes them with "\" or writes them directly where possible.

We have a problem when we use some of our local characters (some work, some do not).

Example: we call deviceInfo.Add("DocumentSubject", "Účtovný denník"); and the result is stored inside the PDF file as /Subject(Ú\rtovný denník).

So Telerik automatically converts č -> \r, but only this one character.

We tried:

1) Also setting deviceInfo.Add("DocumentNaturalLanguage", "sk-SK"), but this doesn't help. (A language identifier in accordance with RFC 1766 that specifies the natural language for all text in the document. If absent, the language will be retrieved from the report's culture. Example value: en-US.)

2) Writing this character as the numeric sequence "\226", but Telerik automatically escapes it to "\\226", so the result is the literal string "\226".

Any clues? :-)

6 Answers, 1 is accepted

answered on 11 Nov 2020, 09:25 PM

Hi Jozef,

Thank you for using the Telerik Reporting Forums. I was able to reproduce this issue and need to reach out to the development team for further investigation.

Once I have more information either I or someone from the team will reply with an update.

In the meantime, please let me know if you need any additional information. Thank you for developing with Telerik Reporting.

Regards,

Eric R | Senior Technical Support Engineer
Progress Telerik

answered on 12 Nov 2020, 10:56 AM

Hi Jozef,

I investigated the current issue and determined where the problem is. The PDF Subject is not output correctly because of the default ASCII encoding used to write the PDF metadata contents. In the current scenario the characters cannot be displayed correctly with ASCII encoding - they would either require an encoding that represents Unicode characters (usually UTF-8) or a specific ANSI codepage like Windows-1252.
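A small sketch of why only č ends up damaged, assuming (this is not confirmed in this thread) that each character is narrowed to a single byte before the PDF string is escaped. Ú, ý and í all have code points below U+0100 and survive, while č is U+010D, whose low byte 0x0D is a carriage return and would be escaped as \r. The helper NarrowAndEscape is hypothetical and only imitates that behaviour; it is not Telerik's code.

```csharp
using System;
using System.Text;

public static class LowByteDemo
{
    // Hypothetical helper: keeps only the low byte of each UTF-16 code unit,
    // then applies the usual PDF literal-string escapes.
    public static string NarrowAndEscape(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            // Assumption: the writer drops the high byte (U+010D -> 0x0D).
            byte b = unchecked((byte)c);
            switch (b)
            {
                case 0x0D: sb.Append("\\r"); break;   // carriage return
                case 0x0A: sb.Append("\\n"); break;   // line feed
                case (byte)'(': sb.Append("\\("); break;
                case (byte)')': sb.Append("\\)"); break;
                case (byte)'\\': sb.Append("\\\\"); break;
                default: sb.Append((char)b); break;   // byte shown as a Latin-1 character
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Returns the same text that was found in the /Subject entry.
        Console.WriteLine(NarrowAndEscape("Účtovný denník"));
    }
}
```

Under this assumption the call returns Ú\rtovný denník, which matches the /Subject value reported in the question.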

According to the PDF compliance standards, the PDF metadata, being part of XMP (Extensible Metadata Platform), must be encoded using UTF-8. This allows the characters in the metadata fields to be shown correctly. Our PDF rendering extension supports three compliance standards: PDF/A-1b, PDF/A-2b and PDF/A-3b. All of them produce a PDF with metadata encoded using UTF-8, which in the current scenario results in a correctly displayed subject field. Please check the PDF Compliance specification to determine which one to use and add a line that sets it in the deviceInfo instance, as shown below (example with the A-1b standard):

deviceInfo.Add("ComplianceLevel", "PDF/A-1b");

As a result, the PDF description shows the subject as expected.
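For context, a minimal sketch of where the ComplianceLevel entry sits when a report is rendered to PDF programmatically. The report type name and the output path are placeholders invented for illustration; the device info keys are the ones used in this thread.

```csharp
using System.Collections;
using System.IO;
using Telerik.Reporting;
using Telerik.Reporting.Processing;

public static class PdfExportSketch
{
    public static void Export(string outputPath)
    {
        // Hypothetical report type; replace with your own report class.
        var reportSource = new TypeReportSource
        {
            TypeName = "MyCompany.Reports.JournalReport, MyCompany.Reports"
        };

        var deviceInfo = new Hashtable();
        deviceInfo.Add("DocumentTitle", "Účtovný denník");
        deviceInfo.Add("DocumentSubject", "Účtovný denník");
        // Selects a PDF/A standard, so the metadata is written as UTF-8 XMP.
        deviceInfo.Add("ComplianceLevel", "PDF/A-1b");

        var processor = new ReportProcessor();
        RenderingResult result = processor.RenderReport("PDF", reportSource, deviceInfo);

        File.WriteAllBytes(outputPath, result.DocumentBytes);
    }
}
```

PDF/A-2b or PDF/A-3b can be substituted for the ComplianceLevel value if one of those standards suits the document better.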

Hope this helps.

Regards,
Ivan Hristov
Progress Telerik

answered on 13 Nov 2020, 09:23 AM

PDF-subject-correct-encoding.rar

Hello Jozef,

I'm sorry to hear you're still having trouble with the encoding of the PDF subject. Thanks for attaching a sample report. I examined it, but I was unable to find a problem in the subject field contents on my side. When I open the produced test.pdf file, its subject field looks correct to me: it shows Účtovný denník, as expected.

I also ran the code, and the produced PDF files look correct on my machine and on the two other machines I used in my test.

I shot a short video demonstrating how it works on my computer. Please check it and let me know if I'm missing something.

Regards,
Ivan Hristov
Progress Telerik

answered on 13 Nov 2020, 01:18 PM

chromeb9ab6ea93a254511bc353059e7fb646e.png

firefox352a93be1c6a44bd9ae02c5661be927b.png

foxit.png

I found where the problem is: it's a viewer-dependent error.

A few examples:

Chrome: bad, wrong caption in the title bar. You need to add the DocumentTitle property: deviceInfo.Add("DocumentTitle", "Účtovný denník");

Firefox: bad when you open Document properties.

Edge: shows no info in the title bar and has no Document properties function, so I don't know.

Acrobat Reader: OK.

Foxit Reader: bad when you open Document properties.

See attached screenshots.

answered on 17 Nov 2020, 09:57 AM

Hi Jozef,

Thanks for investigating it further. Yes, it seems that PDF viewers interpret such characters differently. I can confirm that Acrobat Reader and Adobe Acrobat DC display the characters correctly, while Chrome and Foxit Reader fail on the second character. In our tests we rely heavily on Adobe Acrobat and Reader to examine the produced PDF files, which is why I was unable to see the problem in the first place. I'm not sure why the other viewers are unable to render the characters from the metadata correctly, but after a quick search I found that Foxit had similar problems before, as you can see here: https://forums.foxitsoftware.com/forum/portable-document-format-pdf-tools/foxit-reader/152566-diacritics-croatian. Unfortunately I can't tell whether that particular issue relates to the current scenario, but I believe their support can provide helpful advice.

Regards,
Ivan Hristov
Progress Telerik