I am doing some document processing and want to use Telerik to display and edit the documents. I need access to the paraId and textId attributes in the DOCX "document" XML. I have explored the object model in the debugger and do not see these attributes. Are they available in the imported model?
Thanks.
1 Answer, 1 is accepted
Dess | Tech Support Engineer, Principal
Telerik team
answered on 26 Mar 2025, 06:55 AM
Hi, John,
The WordsProcessing library allows you to create and modify documents in various formats such as DOCX, RTF, HTML, and TXT, convert from one format to another, and export to PDF. Using the DocxFormatProvider, you can import the document's content into the WordsProcessing model.
All properties that the Paragraph object offers are listed in the Modifying a Paragraph section of the online documentation. However, the "paraId" attribute is not exposed as a separate property that can be extracted. Could you please share the exact requirement you are trying to achieve? Once we have a better understanding of the precise case, we will be able to think about an appropriate solution and provide further assistance. Thank you for your cooperation.
I am looking forward to your reply.
Regards,
Dess | Tech Support Engineer, Principal
Progress Telerik
We are working on a product that aligns original and updated DOCX documents, developing a dictionary that (basically) says "doc 1 paragraph 5 aligns to doc 2 paragraphs 11, 31". This information is then used to auto-answer (and auto-generate) test questions between the documents. For example, an organization updates a policy document and wants to know if all the answers from the old quiz are answered in the new document, AND see the passages that contain the answers. To that end, we need to identify DOCX paragraphs. It turns out that "paraId" is not necessarily unique, so our pipeline adds a custom XML attribute of its own.
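To make the alignment dictionary concrete, here is a minimal Python sketch of a one-to-many mapping with a reverse lookup. The class name AlignmentMap, its method names, and the use of plain integers as paragraph keys are all hypothetical choices for illustration, not part of our actual product or of any Telerik API.

```python
from collections import defaultdict


class AlignmentMap:
    """One-to-many alignment between paragraphs of an old and a new document.

    Keys are whatever uniquely identifies a paragraph (here: plain integers).
    """

    def __init__(self):
        self._forward = defaultdict(list)   # old paragraph -> new paragraphs
        self._reverse = defaultdict(list)   # new paragraph -> old paragraphs

    def add(self, old_para, new_para):
        # Ignore duplicate pairs so each alignment is stored once.
        if new_para not in self._forward[old_para]:
            self._forward[old_para].append(new_para)
            self._reverse[new_para].append(old_para)

    def targets(self, old_para):
        """Paragraphs in the new document aligned to old_para."""
        return list(self._forward.get(old_para, []))

    def sources(self, new_para):
        """Paragraphs in the old document aligned to new_para."""
        return list(self._reverse.get(new_para, []))


alignment = AlignmentMap()
alignment.add(5, 11)
alignment.add(5, 31)
print(alignment.targets(5))   # [11, 31]
print(alignment.sources(31))  # [5]
```

Keeping both directions makes it cheap to jump from a passage in either document to its counterparts when the two are shown side by side.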
Once processing is complete, I need to display the two docs side-by-side and show the alignments. So I really need the paragraph attributes. I was hoping to use Telerik, as I am most familiar with it, to display the formatted documents. Although Words Processing will import the DOCX files, the attributes are lost so I can't implement the functionality to seek to aligned passages.
Reading the attributes and adding them to the Words Processing model might allow me to do that. Maybe it would allow others to create applications from annotated DOCX files and use Telerik to display the formatted document.
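For anyone wanting to try the "read the attributes" half outside Telerik, here is a rough Python sketch that pulls paraId and textId straight from the DOCX package using only the standard library. It assumes the main part lives at word/document.xml (the usual location, though strictly the package relationships decide that) and that the attributes use the Word 2010 "w14" namespace. The function name read_paragraph_ids is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace and the Word 2010 extension namespace
# that carries paraId / textId.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14_NS = "http://schemas.microsoft.com/office/word/2010/wordml"


def read_paragraph_ids(docx_file):
    """Return one dict per w:p element, in document order.

    docx_file may be a path or a binary file-like object.
    Assumes the main document part is word/document.xml.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    rows = []
    # root.iter() also visits paragraphs nested in tables and text boxes.
    for index, para in enumerate(root.iter(f"{{{W_NS}}}p")):
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        rows.append({
            "index": index,
            # None when the attribute is absent.
            "paraId": para.get(f"{{{W14_NS}}}paraId"),
            "textId": para.get(f"{{{W14_NS}}}textId"),
            "text": text,
        })
    return rows
```

Because paraId is not guaranteed to be unique, the sketch also records the paragraph's position, which can serve as a fallback key.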
(Right now, I'm basically re-implementing Words Processing (!) and converting the DOCX to a flow document.)
Thanks for your attention.
Dess | Tech Support Engineer, Principal
Telerik team
commented on 07 Apr 2025, 01:13 PM
Hi, John,
Thank you for the additional information. It helped me understand the basic context of the scenario. However, it is still not entirely clear to me whether you need a UI control for displaying the changes.
Regarding the document processing part ("For example, an organization updates a policy document and wants to know if all the answers from the old quiz are answered in the new document, AND see the passages that contain the answers."), I would suggest considering bookmarks. A Bookmark refers to a location in the document and has a unique name. Every Bookmark has a corresponding BookmarkRangeStart and BookmarkRangeEnd, which are inline elements. Thus, you can define the desired ranges and get the content within each defined range. More information is available in the following article: Inserting a Bookmark
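As an illustration of how paragraph identity could be carried in bookmarks at the DOCX level, here is a hedged Python sketch that wraps each top-level paragraph of a parsed w:body element in a w:bookmarkStart / w:bookmarkEnd pair. The helper names w and bookmark_paragraphs and the "para_" naming scheme are hypothetical. The sketch only edits the in-memory XML tree; writing the result back into a package needs extra care, since ElementTree may rename namespace prefixes on output. Whether the bookmarks then appear under the same names after import should be verified rather than assumed.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def bookmark_paragraphs(body, prefix="para_"):
    """Wrap each direct-child w:p of a w:body element in a named bookmark.

    Returns the bookmark names in paragraph order. New ids start above the
    highest existing bookmark id (assumes existing ids are decimal integers).
    """
    existing = [int(b.get(w("id"), "0")) for b in body.iter(w("bookmarkStart"))]
    next_id = max(existing, default=-1) + 1

    names = []
    for index, para in enumerate(body.findall(w("p"))):
        name = f"{prefix}{index}"
        bookmark_id = str(next_id + index)
        start = ET.Element(w("bookmarkStart"),
                           {w("id"): bookmark_id, w("name"): name})
        end = ET.Element(w("bookmarkEnd"), {w("id"): bookmark_id})
        # w:pPr, when present, has to stay the first child of w:p.
        position = 1 if len(para) and para[0].tag == w("pPr") else 0
        para.insert(position, start)
        para.append(end)
        names.append(name)
    return names
```

Each bookmark name then acts as a stable, unique handle for one paragraph, which is the role the custom attribute played in the original XML.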
As a side note, if you plan to run your application in a Windows environment, an appropriate UI control for displaying the document's updates is the WPF RichTextBox, which offers Track Changes functionality.
Please, let me know if there is anything else I can assist you with.