HtmlFormatProvider imports only unformatted text

Deltaohm asked on 14 Nov 2023, 10:18 AM

Hi

I attached a simple program that imports an HTML file (with formatting and images) and exports it.
The exported file is very different from the original: the formatting and the images are missing.

Thank you 

Luigi

Vladislav
Telerik team
answered on 16 Nov 2023, 01:21 PM | edited on 16 Nov 2023, 01:25 PM

Hi Gianluca,

Thank you for providing the sample app. With it, I was able to observe the formatting issues. I have split my answer into several points covering the main discrepancies I was able to pinpoint while comparing both HTML files.

  1. The differences are mostly related to incompatibilities between the two formats. WordsProcessing is based on the Office Open XML standard (used by DOCX documents), and while importing, it maps the HTML elements to its own model. However, this standard doesn't have the same elements and definitions as HTML. One example is the div elements, which are imported as paragraphs. As a result, the element's width is ignored, so the content is centered inside the window and does not respect the dimensions of the rectangle it should fit. This is also what causes the indented text to be rendered to the left, as the paragraph does not have positioning properties.
    As a possible workaround, if you have control over the generation of the HTML document, you can use tables to format the text.
  2. Another issue I have spotted is that the placeholder hyperlink tag (at the start of every "page" inside the source HTML document) spans all the content of the imported document. I have logged a new bug on your behalf - WordsProcessing: HtmlFormatProvider: Importing a document containing a hyperlink (<a>) tag without content extends it over the trailing elements. You can subscribe to the item to receive updates on its status. In appreciation for pointing this issue out to us, I have updated your Telerik points. The only workaround until the issue is fixed is to remove the empty hyperlinks from the source HTML.
  3. The missing background color is a known issue - WordsProcessing: HtmlFormatProvider: The style set in a <div> is not applied to the content of its children. You can subscribe to the public item to receive status updates on its progress. The only workaround is to apply the same style to the child elements if you have control over the creation of the HTML document.
  4. The image not showing is caused by this known missing feature - WordsProcessing: Add support for the background tag. You can vote for its implementation and subscribe to receive notifications when the status changes. Until the feature is implemented, you can use an ordinary <img> element instead.
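
The workaround from point 2 (removing the empty hyperlinks from the source HTML) could be scripted as a pre-processing step that runs before the file is handed to the import. What follows is only a minimal sketch in Python, independent of the Telerik API. The function name `strip_empty_anchors` and the sample markup are hypothetical, and the regular expression assumes that no attribute value contains a literal `>` character. If the empty anchors serve as in-document link targets, removing them would also drop those targets.

```python
import re

# Matches an <a ...> element whose content is empty or whitespace only,
# for example <a name="page1"></a>. Matching is case-insensitive.
# Assumption: attribute values never contain a literal '>' character.
EMPTY_ANCHOR = re.compile(r"<a\b[^>]*>\s*</a\s*>", re.IGNORECASE)


def strip_empty_anchors(html: str) -> str:
    """Return the HTML text with content-free <a> elements removed."""
    return EMPTY_ANCHOR.sub("", html)


# Hypothetical sample: one empty placeholder anchor, one regular link.
sample = '<div><a name="page1"></a><p>First <a href="#x">link</a></p></div>'
print(strip_empty_anchors(sample))
```

Hyperlinks that contain text or other elements are left untouched. Only anchors with nothing between the opening and closing tags are removed.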

I would like to apologize for the inconvenience these issues might be causing you.

Regards,
Vladislav
Progress Telerik
