Convert Docx to HTML, modify online and convert back to Docx

5 posts, 0 answers
  1. Pierre Yves
    3 posts
    Member since:
    Jul 2016

    Posted 13 Jul

    Hi,

    As a proof of concept, I want to load a Docx in a Kendo Editor and save it back. So far, I'm able to:

    1. load a Docx and convert it to HTML
    2. modify the HTML using the Telerik ASP.NET MVC Editor
    3. save it back to a new Docx document

    The issue I have is that the new Docx doesn't have the same margins as the original document, and it also takes 4 pages instead of 2. You can take a look at the attached screenshots (before/after).

    Is there a way to configure the styles and the page setup?

    Here is the code that I use. Basically, I have a Kendo Editor in my view that loads the HTML, and a button to save the data back.

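    // Usings needed at the top of the controller file:
    // using System;
    // using System.IO;
    // using System.Web.Mvc;
    // using Telerik.Windows.Documents.Flow.FormatProviders.Docx;
    // using Telerik.Windows.Documents.Flow.FormatProviders.Html;
    // using Telerik.Windows.Documents.Flow.Model;

    [HttpPost]
    public ActionResult Editor(string editor)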
    {
        // HTML decode the value before using it.
        var notes = System.Web.HttpUtility.HtmlDecode(editor);
          
        var htmlProvider = new HtmlFormatProvider();
        RadFlowDocument document = htmlProvider.Import(notes);
     
        SaveDocx(document);
         
        return View();
    }
     
    public ActionResult Editor()
    {
        RadFlowDocument sourceDoc = GetSourceDocument();
        SaveDocx(sourceDoc);
        var myHtmlContent = GetHtmlFromDocument(sourceDoc);
     
        ViewBag.MyHtmlContent = myHtmlContent;
     
        return View();
    }
     
    private RadFlowDocument GetSourceDocument()
    {
        var docxProvider = new DocxFormatProvider();

        // Import directly from the file stream; the extra MemoryStream copy of the same file was redundant.
        using (Stream input = System.IO.File.OpenRead(@"c:\Temp\TelerikMvcApp1\TelerikMvcApp1\Fiche.docx"))
        {
            return docxProvider.Import(input);
        }
    }
     
    private string GetHtmlFromDocument(RadFlowDocument document)
    {
        // HtmlFormatProvider exports straight to a string, so no intermediate stream is needed.
        var htmlProvider = new HtmlFormatProvider();
        return htmlProvider.Export(document);
    }
     
    private void SaveDocx(RadFlowDocument document)
    {
        // File.Create truncates any existing file, whereas File.OpenWrite would leave stale trailing bytes.
        using (Stream output = System.IO.File.Create(String.Format(@"c:\Temp\TelerikMvcApp1\TelerikMvcApp1\Fiche{0}.docx", DateTime.Now.Ticks)))
        {
            DocxFormatProvider docxProvider = new DocxFormatProvider();
            docxProvider.Export(document, output);
        }
    }

     

    thanks

  2. Boby
    Admin
    595 posts

    Posted 15 Jul

    Hello Pierre,

    By the way, DOCX import and export is a planned feature for the Kendo Editor, so you can follow the issue on GitHub:
    Import from RTF and DOCX along with Export to RTF, DOCX and PDF
    or the UserVoice item:
    Export Kendo UI Editor content to PDF and word file

    Otherwise, I suppose that the document you are importing has some features that are unsupported when importing from DOCX, or when subsequently importing from HTML and exporting to DOCX. Could you please open a support ticket and send us the problematic document, as well as the exact versions of the editor and the Document Processing binaries, so that we can investigate the issue further?

    Regards,
    Boby
    Telerik by Progress

  3. Pierre Yves
    3 posts
    Member since:
    Jul 2016

    Posted 15 Jul in reply to Boby

    Hi Boby, 

    Thanks for your answer. Since we haven't purchased a licence yet, I'm not able to create a support ticket.

    For the binaries, I used the following versions from the trial based on UI for ASP.NET MVC Q2 2016 SP1:

    Telerik.Windows.Documents.Core.dll 2016.2.421.45
    Telerik.Windows.Documents.Flow.dll 2016.2.421.45
    Kendo.Mvc 2016.2.607.545

    You can get my docx file here: https://drive.google.com/open?id=0Bxxu6yivT-LMNXNoTDdQbUJEdG8

  4. Boby
    Admin
    595 posts

    Posted 18 Jul

    Hello Pierre Yves,

    I am seeing some differences, but not exactly the ones from the screenshot. Please find attached the document I get from the round trip, without any customization of the editor-produced document. The differences I see are:
    - Hidden text is not supported.
    - Section margins are not preserved on export to HTML.
    - Cell paddings in the converted documents are set as local properties, instead of being inherited from the whole table.

    Note that such differences are sometimes expected. Our document model resembles the one of MS Word, and it supports more properties than HTML. Sometimes there are also more complex differences between the models (e.g. styles and style inheritance), so it isn't always possible for a document to preserve its original look after an HTML export - HTML import round trip.
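
    Since the section margins are lost on the way through HTML, one possible workaround is to copy the page setup from the original document onto the document imported from HTML before saving it. The following is only a minimal sketch, not a confirmed fix: PageSetupHelper and CopyPageSetup are hypothetical names (not part of the Telerik API), and it assumes the original Docx is still available on the server and that its first section is representative of the whole document. It restores page size and margins only, so hidden text and table cell paddings are not addressed and the page count may still differ.

    ```csharp
    using System.Linq;
    using Telerik.Windows.Documents.Flow.Model;

    // Hypothetical helper for illustration; not part of the Telerik API.
    public static class PageSetupHelper
    {
        // Copies page size and margins from the first section of the source
        // document to every section of the target document.
        public static void CopyPageSetup(RadFlowDocument source, RadFlowDocument target)
        {
            Section reference = source.Sections.FirstOrDefault();
            if (reference == null)
            {
                return; // Nothing to copy from.
            }

            foreach (Section section in target.Sections)
            {
                section.PageSize = reference.PageSize;
                section.PageMargins = reference.PageMargins;
            }
        }
    }
    ```

    In the POST action from the first post, this could be called as PageSetupHelper.CopyPageSetup(GetSourceDocument(), document); right before SaveDocx(document).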


    Regards,
    Boby
    Telerik by Progress

  5. Pierre Yves
    3 posts
    Member since:
    Jul 2016

    Posted 19 Jul in reply to Boby

    Thanks, 

    I will take a look at this.
