This is a migrated thread and some comments may be shown as answers.

Style problems with imported HTML

Evan asked on 19 Oct 2012, 02:40 PM
Hello,

I have noticed two separate issues when loading HTML documents into the WPF RadRichTextBox.

The first issue is that a document with capitalized <STYLE> tags does not render at all, even if the tags contain no styles.

The second issue is that the CSS style for "a:link, span.MsoHyperlink" that Microsoft Outlook adds to emails is being applied to the body tag of the resulting document.  This causes all text within the RadRichTextBox to be underlined, not just hyperlinks.

I have created a sample application to reproduce these issues.

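<Window x:Class="RadRichTextBoxTestApp.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:telerik="http://schemas.telerik.com/2008/xaml/presentation"
        Title="MainWindow" Height="350" Width="525"
        SizeToContent="WidthAndHeight">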
    <Window.Resources>
      <Style TargetType="{x:Type telerik:RadRichTextBox}">
        <Style.Setters>
          <Setter Property="Height" Value="100" />
          <Setter Property="Width" Value="300" />
        </Style.Setters>
      </Style>
    </Window.Resources>
    <Grid>
      <Grid.RowDefinitions>
        <RowDefinition Height="Auto"/>
        <RowDefinition Height="Auto"/>
        <RowDefinition Height="Auto"/>
        <RowDefinition Height="Auto"/>
      </Grid.RowDefinitions>
      <telerik:HtmlDataProvider RichTextBox="{Binding ElementName=box0}" Html="{Binding DocumentWithCapitalizedStyleTags}" />
      <telerik:RadRichTextBox Grid.Row="0" x:Name="box0" Margin="10"/>
      <telerik:HtmlDataProvider RichTextBox="{Binding ElementName=box1}" Html="{Binding DocumentWithLowerCaseStyleTags}" />
      <telerik:RadRichTextBox Grid.Row="1" x:Name="box1" Margin="10"/>
      <telerik:HtmlDataProvider RichTextBox="{Binding ElementName=box2}" Html="{Binding DocumentWithMsoHyperlinkStyles}" />
      <telerik:RadRichTextBox Grid.Row="2" x:Name="box2" Margin="10"/>
      <telerik:HtmlDataProvider RichTextBox="{Binding ElementName=box3}" Html="{Binding DocumentWithInlineHyperlinkStyles}" />
      <telerik:RadRichTextBox Grid.Row="3" x:Name="box3" Margin="10"/>
    </Grid>
</Window>

using System.Windows;
 
namespace RadRichTextBoxTestApp
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            DataContext = new MainWindowViewModel();
            InitializeComponent();
        }
    }
 
    public class MainWindowViewModel
    {
        public string DocumentWithCapitalizedStyleTags
        {
            get { return "<html><head><STYLE></STYLE></head><body>HELLO WORLD!</body></html>"; }
        }
 
        public string DocumentWithLowerCaseStyleTags
        {
            get { return DocumentWithCapitalizedStyleTags.Replace("<STYLE", "<style").Replace("</STYLE","</style"); }
        }
 
        public string DocumentWithMsoHyperlinkStyles
        {
            get
            {
                return
@"
<html><head><style>
a:link, span.MsoHyperlink
    {mso-style-priority:99; color:blue; text-decoration:underline;}
</style></head><body>HELLO WORLD! <a href=""http://telerik.com"">Telerik</a></body></html>
";
            }
        }
 
        public string DocumentWithInlineHyperlinkStyles
        {
            get { return DocumentWithMsoHyperlinkStyles.Replace(", span.MsoHyperlink", ""); }
        }
    }
}

5 Answers, 1 is accepted

Mihail
Telerik team
answered on 22 Oct 2012, 03:45 PM
Hi Evan,

Thank you for contacting us about these issues.

Our HTML parser has problems with capitalized <STYLE> tags. We have logged this issue in our to-do list, but it is not clear if and when it will be fixed, because the main focus of HtmlFormatProvider is XHTML-compliant input, and capitalized tags do not conform to the XHTML specification. We have nonetheless upgraded your Telerik points for bringing up this topic, because this specific behavior should at least be documented.

As for the second issue, we couldn't reproduce it. The a:link style is applied to the hyperlink and the rest of the text is shown without styling. Could you give more details on how to reproduce the incorrect behavior?

Looking forward to your reply.

Kind regards,
Mihail
the Telerik team

Explore the entire Telerik portfolio by downloading Telerik DevCraft Ultimate.

Evan
answered on 22 Oct 2012, 06:09 PM
Attached is a screenshot of the sample application I provided.  The third RadRichTextBox is displaying the following html document:

<html><head><style>
a:link, span.MsoHyperlink
    {mso-style-priority:99; color:blue; text-decoration:underline;}
</style></head><body>HELLO WORLD! <a href="http://telerik.com">Telerik</a></body></html>

In my screenshot, the contents of the third RadRichTextBox are all underlined, including the "HELLO WORLD!" text, which should be unformatted.  When you run the sample application, are the contents of the third and fourth RadRichTextBoxes identical?

Thanks for the explanation about the <STYLE> tags.  I am upgrading an old Mshtml-based email editor control to the RadRichTextBox, and the Mshtml editor capitalized all HTML tags.  My new control needs to be able to load HTML documents that were created by the old one, and also documents created by Outlook.  For this reason I have implemented the following workarounds on the Document dependency property of my Editor control.  If there is a more elegant workaround I would greatly appreciate any suggestions.

public static readonly DependencyProperty DocumentProperty = DependencyProperty.Register(
    "Document",
    typeof(string),
    typeof(Editor),
    new FrameworkPropertyMetadata(String.Empty, FrameworkPropertyMetadataOptions.BindsTwoWayByDefault, DocumentPropertyChanged));
 
private static void DocumentPropertyChanged(DependencyObject d, DependencyPropertyChangedEventArgs args)
{
    if (Equals(args.NewValue, args.OldValue)) return;
 
    // Nothing to normalize when the new value is null
    string document = args.NewValue as string;
    if (document == null) return;
     
    // Fix capitalized STYLE tags from Mshtml generated documents
    document = Regex.Replace(document, "<\\s?STYLE", "<style", RegexOptions.CultureInvariant);
    document = Regex.Replace(document, "</\\s?STYLE", "</style", RegexOptions.CultureInvariant);
 
    // Fix MsoHyperlink styles from Outlook generated documents
    // (the dot is escaped so that it matches only a literal '.')
    document = Regex.Replace(document, "a:link\\s*,\\s*span\\.MsoHyperlink", "a:link", RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.CultureInvariant);
    document = Regex.Replace(document, "a:visited\\s*,\\s*span\\.MsoHyperlinkFollowed", "a:visited", RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.CultureInvariant);
 
    d.SetValue(DocumentProperty, document);
}
 
public string Document
{
    get
    {
        return (string)GetValue(DocumentProperty);
    }
    set
    {
        SetValue(DocumentProperty, value);
    }
}


Mihail
Telerik team
answered on 23 Oct 2012, 05:24 PM
Hi Evan,

We still can't reproduce the problem. The attached screenshot shows the result on our end.

Evidently something in your application differs from our test projects. Please prepare a simple working project where the problem can be observed and attach it to a support ticket so that we can continue to investigate.

Greetings,
Mihail
the Telerik team


Evan
answered on 23 Oct 2012, 10:36 PM
I have created the support ticket you requested for the second issue:  http://www.telerik.com/account/support-tickets/view-ticket.aspx?threadid=621506

I am also still looking for suggestions of a better way to work around the capitalized <STYLE> issue than the DependencyPropertyChanged implementation I described previously.

Thanks,
Evan

Boby
Telerik team
answered on 24 Oct 2012, 10:54 AM
Hi Evan,
We received the sample application in the support ticket you opened. The issue is fixed in the latest official release, 2012 Q3. The problem was that the "span.MsoHyperlink" style was imported as the default span style. You can quickly test the fix in our online Silverlight demo, which uses an implementation almost identical to the WPF one.
Data providers don't provide an extensibility point for changing the document before import, so you should replace the "STYLE" occurrences before passing the HTML string to the provider. Your approach is therefore suitable.
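
A minimal sketch of that pre-processing step, pulled out of the dependency property callback so it can be reused and tested on its own. The class name HtmlImportNormalizer and its Normalize method are hypothetical and not part of the Telerik API. The sketch assumes it is acceptable to lower-case <STYLE> tags in any casing and to drop the Outlook "span.MsoHyperlink" selectors wherever they follow a comma.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the Telerik API): cleans an HTML string
// before it is handed to the data provider.
public static class HtmlImportNormalizer
{
    // Matches "<STYLE" or "</STYLE" in any casing, allowing whitespace before the tag name.
    private static readonly Regex StyleTag = new Regex(
        @"<(/?)\s*style\b",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Matches ", span.MsoHyperlink" or ", span.MsoHyperlinkFollowed" in a selector list.
    private static readonly Regex MsoHyperlinkSelector = new Regex(
        @"\s*,\s*span\.MsoHyperlink(Followed)?\b",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static string Normalize(string html)
    {
        // Null or empty input has nothing to normalize.
        if (string.IsNullOrEmpty(html)) return html;

        // Lower-case the tag name and keep the optional closing slash.
        string result = StyleTag.Replace(html, "<${1}style");

        // Drop the Outlook-specific selectors so only a:link / a:visited remain.
        return MsoHyperlinkSelector.Replace(result, string.Empty);
    }
}
```

The callback (or the view model) would then call HtmlImportNormalizer.Normalize on the string before it reaches the Html property. On versions that include the 2012 Q3 fix, the second replacement should no longer be needed.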

Don't hesitate to contact us if you have other questions.



All the best,
Boby
the Telerik team

