Hi
I need to extract text from a range of DocumentElements,
for example the text between a BookmarkRangeStart and its paired BookmarkRangeEnd.
My solution is to walk from NextSibling to NextSibling and check whether each sibling is a Span;
if it is, I take its text. If it is not a Span, I recursively walk through its Children collection,
checking each child the same way and recursing again, or stopping when there are no Children.
The walk ends when I reach the BookmarkRangeEnd as the last sibling.
This seems to work.
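To make the walk described above concrete, here is a minimal sketch in Python rather than in the library's own API. The Node class and the function names (link, collect_text, text_between) are stand-ins made up for illustration; only the element kinds (Span, BookmarkRangeStart, BookmarkRangeEnd) come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a document element, not the library's type."""
    kind: str                                   # e.g. "Span", "BookmarkRangeStart"
    text: str = ""                              # only used when kind == "Span"
    children: List["Node"] = field(default_factory=list)
    next_sibling: Optional["Node"] = None


def link(nodes):
    """Chain the given nodes as consecutive siblings and return the list."""
    for left, right in zip(nodes, nodes[1:]):
        left.next_sibling = right
    return nodes


def collect_text(node, out):
    """Append a Span's text, or recurse into the children of anything else."""
    if node.kind == "Span":
        out.append(node.text)
    else:
        for child in node.children:
            collect_text(child, out)


def text_between(start, end_kind="BookmarkRangeEnd"):
    """Walk the siblings after `start` until a node of kind `end_kind`."""
    out = []
    node = start.next_sibling
    while node is not None and node.kind != end_kind:
        collect_text(node, out)
        node = node.next_sibling
    return "".join(out)


# Example: one plain Span, then a container element holding another Span.
start = Node("BookmarkRangeStart")
group = Node("Group", children=[Node("Span", "world")])
link([start, Node("Span", "Hello "), group,
      Node("BookmarkRangeEnd"), Node("Span", "ignored")])
print(text_between(start))  # Hello world
```

Note that this sketch assumes the BookmarkRangeEnd is a sibling of the BookmarkRangeStart; if the end marker sits at a different level of the tree, the loop simply runs to the last sibling.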
Questions:
1) Is there an easier / more reliable way to get the text of a "range" of elements?
(I use TextFormatter.Export for complete Documents, but not for parts.)
2) Is text always located inside Spans?
3) Can Spans be nested? If so, does the Text property of the "parent" Span contain
the text content of all "child" Spans (like InnerText in HTML)?
4) Is there a straightforward way to get the text from a Table as a CSV / TSV string?
Greetings,
Chris