Rich Text Format

From Wikipedia, the free encyclopedia
Filename extension: .rtf
Internet media type: text/rtf[1], application/rtf[2]
Type code: 'RTF.'[3][4][5]
Uniform Type Identifier: public.rtf
Magic number: {\rtf
Developed by: Microsoft
Latest release: 1.9.1 (19 March 2008)
Type of format: document file format
Open format?: No

The Rich Text Format (often abbreviated RTF) is a proprietary[6][7] document file format with a published specification, developed by Microsoft Corporation since 1987 for Microsoft products and for cross-platform document interchange.

Most word processors are able to read and write some versions of RTF.[8] There are several revisions of the RTF specification, and the portability of a file depends on which version of RTF it uses.[7][9] RTF specifications are changed and published with major Microsoft Word/Microsoft Office versions.

It should not be confused with enriched text (MIME type "text/enriched", RFC 1896) or its predecessor Rich Text (MIME type "text/richtext", RFC 1341 and RFC 1521), nor with IBM's RFT-DCA (Revisable Format Text-Document Content Architecture); these are completely different specifications.

History

Richard Brodie, Charles Simonyi, and David Luebbert, members of the Microsoft Word development team, developed the original RTF in the middle to late 1980s. Its syntax was influenced by the TeX typesetting language.[citation needed] The first RTF reader and writer shipped in 1987 as part of Microsoft Word 3.0 for Macintosh, which implemented the RTF version 1.0 specification. All subsequent releases of Microsoft Word for the Macintosh and all versions for Windows can read and write files in RTF format.

Microsoft holds the copyright to RTF and maintains the format. As of March 2008, the current version is 1.9.1. According to Microsoft's Office 2010 resource kit documentation, Microsoft is discontinuing enhancements to the RTF specification. Further, some new features in Word 2010 and later versions will not save properly to the RTF format.[10]

Version changes

Rich Text Format (RTF) specifications are changed and published with major Microsoft Word/Microsoft Office versions.

RTF specifications for Microsoft Word[15][16]
RTF version | Publication date | Microsoft Word version | MS Word release date | Notes
----------- | ---------------- | ---------------------- | -------------------- | -----
1.0 | 1987 | Microsoft Word 3 | 1987 | latest revision 6/92[17][18]; the 1992 revision defines support for Microsoft Object Linking and Embedding (OLE) objects and Macintosh Edition Manager subscriber objects
1.1 | | | | font embedding: font data may be located inside the file
1.2 | 1993 | | | [19][20]
1.3 | January 1994 | Microsoft Word 6 | 1993 | 1/94 GC0165[21][22]
1.4 | September 1995 | Microsoft Word 95/Word 7 | 1995 |
1.5 | April 1997 | Microsoft Word 97/Word 8 | 1997 | Unicode RTF: supports 16-bit Unicode character encoding scheme[12]
1.6 | May 1999 | Microsoft Word 2000/Word 9 | 1999 |
1.7 | August 2001 | Microsoft Word 2002/Word 10 | 2001 | 8/2001, Word 2002 RTF Specification[13]
1.8 | April 2004 | Microsoft Word 2003/Word 11 | 2003 | 10/2003, Word 2003 RTF Specification[4]
1.9.1 | 19 March 2008 (1.9: January 2007[23]) | Microsoft Word 2007/Word 12 | 2006 | use of XML markup (Custom XML Tags, SmartTags, Math elements) in an RTF document, following the Office Open XML specification (Ecma-376, Part 4)[14]

Code example

As an example, the following RTF code:

{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard This is some {\b bold} text.\par }

would be rendered like this when read by a program that supports RTF:

This is some bold text.

Braces ({ and }) define a group; groups can be nested. A backslash (\) starts an RTF control code. A valid RTF document is a group that starts with the \rtf control code.

In the example above, the \b control code invokes boldface type; the example uses a group to limit the scope of the boldface control code. All other text characters will be rendered as plain text. The \par control code indicates the end of a paragraph.
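The grouping and control-code rules described above can be sketched in a few lines of Python. This is a minimal illustration, not a conforming RTF reader: the function name rtf_to_text and the short list of skipped destinations are assumptions made for this example, and hex escapes, \u escapes, \bin data and ignorable (\*) destinations are not handled.

```python
import re

# One token per match: a group brace, a control word with an optional signed
# numeric parameter (plus one optional delimiting space), a control symbol,
# or a run of plain text.
TOKEN = re.compile(
    r"(?P<open>\{)"
    r"|(?P<close>\})"
    r"|\\(?P<word>[a-zA-Z]+)(?P<param>-?\d+)? ?"
    r"|\\(?P<symbol>[^a-zA-Z])"
    r"|(?P<text>[^\\{}]+)"
)

# Destinations whose content is not document text.
# Assumed list for this sketch; it is far from complete.
SKIPPED_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info"}


def rtf_to_text(rtf: str) -> str:
    """Return the visible text of a very simple RTF string."""
    out = []
    stack = []    # saved 'skip' state of each enclosing group
    skip = False  # True while inside a skipped destination
    for m in TOKEN.finditer(rtf):
        if m.group("open"):
            stack.append(skip)      # entering a group: remember the state
        elif m.group("close"):
            skip = stack.pop()      # leaving a group: restore the state
        elif m.group("word"):
            word = m.group("word")
            if word in SKIPPED_DESTINATIONS:
                skip = True
            elif word == "par" and not skip:
                out.append("\n")    # \par ends a paragraph
        elif m.group("text") and not skip:
            # Raw line breaks in the file are not part of the text.
            out.append(m.group("text").replace("\r", "").replace("\n", ""))
    return "".join(out)


sample = r"{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard This is some {\b bold} text.\par }"
print(repr(rtf_to_text(sample)))
```

Formatting control words such as \b are recognised but ignored here, so only the text survives. The stack mirrors the way a group limits the scope of whatever is set inside it; unbalanced braces would raise an IndexError in this sketch.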

Character encoding

RTF files are normally written as 7-bit ASCII text, but RTF can encode characters beyond ASCII using escape sequences. The character escapes are of two types: code page escapes and, starting with RTF 1.5, Unicode escapes. In a code page escape, a backslash and a typewriter apostrophe are followed by two hexadecimal digits that denote a character taken from a Windows code page. For example, if the code page is set to Windows-1256, the sequence \'c8 encodes the Arabic letter bāʼ (ب).
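As a rough illustration of a code page escape, the following Python sketch decodes \'xx sequences with one of Python's built-in codecs. The function name decode_hex_escapes is hypothetical, and the sketch assumes a single-byte code page; multi-byte code pages would need adjacent escapes to be combined before decoding.

```python
import re

# A backslash, an apostrophe, then exactly two hexadecimal digits.
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")


def decode_hex_escapes(fragment: str, codepage: str = "cp1252") -> str:
    r"""Replace each \'xx escape with the character it denotes in codepage."""
    def replace(match):
        byte_value = int(match.group(1), 16)
        # Interpret the single byte in the given code page.
        return bytes([byte_value]).decode(codepage)
    return HEX_ESCAPE.sub(replace, fragment)


# With Windows-1256 selected, \'c8 is the Arabic letter bāʼ (U+0628).
print(decode_hex_escapes(r"\'c8", "cp1256"))
```

A byte that is undefined in the chosen code page raises UnicodeDecodeError here; a real reader would more likely substitute a placeholder character.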

For a Unicode escape, the control word \u is used, followed by a 16-bit signed decimal integer giving the Unicode code point number. For the benefit of programs without Unicode support, this must be followed by the nearest representation of the character in the specified code page. For example, \u1576? gives the Arabic letter bāʼ (ب) and specifies that older programs without Unicode support should render it as a question mark instead.

The control word \uc0 can be used to indicate that subsequent Unicode escape sequences within the current group do not specify a substitution character.

Until the release of RTF specification version 1.5 in 1997, RTF handled only 7-bit characters directly, with 8-bit characters encoded as hexadecimal (using \'xx). RTF control words (since RTF 1.5) generally accept signed 16-bit numbers as arguments, so Unicode values greater than 32767 must be expressed as negative numbers.[12] If a Unicode character is outside the Basic Multilingual Plane (BMP), it cannot be expressed in RTF.[24] Unicode support was added because of text handling changes in Microsoft Word: Microsoft Word 97 is a partially Unicode-enabled application that handles text using the 16-bit Unicode character encoding scheme.[12] Microsoft Word 2000 and later versions are Unicode-enabled applications that handle text using the same 16-bit scheme.[3]
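The signed 16-bit rule can be made concrete with a short Python sketch. The function name rtf_escape and the default "?" fallback are choices made for this example; it assumes a single-character fallback after each \u escape, and it rejects characters outside the BMP, following the limitation stated above.

```python
def rtf_escape(text: str, fallback: str = "?") -> str:
    r"""Encode text as RTF body text, using \uN escapes for non-ASCII characters."""
    parts = []
    for ch in text:
        code = ord(ch)
        if ch in "\\{}":
            parts.append("\\" + ch)   # escape RTF's own syntax characters
        elif code < 128:
            parts.append(ch)          # 7-bit ASCII passes through unchanged
        elif code <= 0xFFFF:
            if code > 32767:
                code -= 65536         # values above 32767 become negative numbers
            parts.append(f"\\u{code}{fallback}")
        else:
            # Treated as not expressible, per the statement above.
            raise ValueError(f"U+{code:04X} is outside the BMP")
    return "".join(parts)


print(rtf_escape("\u0628"))   # Arabic letter bāʼ, code point 1576
print(rtf_escape("\ufffd"))   # code point 65533, written as a negative number
```

Subtracting 65536 maps the range 32768 to 65535 onto -32768 to -1, which is how a 16-bit value reads when interpreted as signed.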

RTF files are usually 7-bit ASCII plain text. RTF consists of control words, control symbols, and groups. RTF files can be easily transmitted between PC-based operating systems because they are encoded as text files with 7-bit graphic ASCII characters. Converters that communicate with Microsoft Word for MS Windows or Macintosh should expect data to be transferred as 8-bit characters, and binary data can contain any 8-bit values.[14]

RTF supports embedding the fonts used in a document, but this feature is not widely supported in software implementations.[25][26][27] RTF also supports generic font family names used for font substitution: roman (serif), swiss (sans-serif), modern (monospace), script, decorative, technical.[18] Font substitution based on these names is not widely supported either, e.g. in OpenOffice.org or AbiWord.

Human readability

Unlike most word processing formats, good RTF code can be made human-readable. When an RTF file is opened in a text editor, without formatting or processing of formatting, the alphanumeric text is legible and the markup language (formatting) elements are not too distracting or counter-intuitive. The RTF files produced by most programs, such as Microsoft Word (MS Word), will contain such a large number of control codes for compatibility with older programs that most files will easily be an order of magnitude larger than the raw text and very difficult to read.[28][29] Formats such as MS Word's .doc are, in contrast, binary formats with only a few scraps of legible text.

Human-readable XML-based formats have since become more common, but at the time of RTF's initial release its level of readability was rare among document formats. Note that the XML-based OpenDocument and Office Open XML formats are often not immediately human-readable because they are a bundle of several different files within a ZIP archive.

RTF is a data format for expressing text documents. It is not really a markup language. It was never meant for intuitive and easy typing.[29][30] If some Unicode characters (e.g. diacritics) are used in an RTF document, it is difficult to read and understand document text content from the RTF code. RTF also supports Microsoft OLE embedded objects and Macintosh Edition Manager subscriber objects (since RTF 1.0) which are not human readable.

Common uses and interoperability

Most word processing software implementations support RTF format importing and exporting (following some version of the RTF specification), and/or direct editing, often making it a "common" format between otherwise incompatible word processing software and operating systems. These factors contribute to its interoperability, but the result depends on which version of RTF is being used.[7] There are several consciously designed or accidentally born RTF dialects.[31] Most applications that read RTF files silently ignore unknown RTF control words.[31]

RTF is the internal markup language used by Microsoft Word.[29] Since 1987, RTF files have been transferable between many old and new computer systems (and now over the internet) despite differences between operating systems and their versions. There are some compatibility problems, however, e.g. between the 1987 RTF 1.0 and later specifications, or between RTF 1.0-1.4 and RTF 1.5+ in the use of Unicode characters.[32][33][34] This portability makes it a useful format for basic formatted text documents such as instruction manuals, résumés, letters, and modest information documents. These documents at minimum support bold, italic, and underline text formatting. Left-, center-, and right-aligned text is also typically supported, as are font specification and document margins.

Font and margin defaults, as well as style presets and other functions, vary according to program defaults. There may also be subtle differences between the versions of the RTF specification implemented in different programs and program versions. Nevertheless, the RTF format is consistent enough from computer to computer to be considered highly portable and moderately acceptable for cross-platform use. For greater consistency between more modern computers, a format such as PDF may be preferred; but PDFs are not traditionally distributed as editable documents, whereas RTF files are.

Use of Microsoft Object Linking and Embedding (OLE) objects or Macintosh Edition Manager subscriber objects limits interoperability, because these objects are not widely supported in programs for viewing or editing RTF files (e.g. embedding of other files inside the RTF, such as tables or charts from a spreadsheet application).[35][36][37][38][39] If software that understands an OLE object is not available, the object is usually replaced by a picture (a bitmap representation of the object) or not displayed at all.[40][41]

Unlike Microsoft Word's DOC format, as well as the newer Office Open XML and OpenDocument formats, RTF does not support macros. For this reason, RTF is recommended over these formats when the spread of computer viruses is a concern. However, having the .RTF extension does not guarantee that a file is safe, since Microsoft Word will open standard DOC files renamed with an RTF extension and run any contained macros as usual. Manual examination of a file in a plain text editor such as Notepad, or use of the file command in UNIX-like systems, is required to determine whether or not a suspect file is really RTF.[8][42]
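The manual check described above can be approximated in code by comparing the first bytes of a file with the RTF magic number {\rtf. The following Python sketch uses hypothetical function names; a matching signature only shows how the file begins and does not prove that the rest of the file is well-formed or harmless.

```python
RTF_MAGIC = b"{\\rtf"   # the five bytes {, \, r, t, f


def looks_like_rtf(data: bytes) -> bool:
    r"""Return True if data begins with the RTF signature {\rtf."""
    return data.startswith(RTF_MAGIC)


def file_looks_like_rtf(path: str) -> bool:
    """Read only the first bytes of the file and test them."""
    with open(path, "rb") as handle:
        return looks_like_rtf(handle.read(len(RTF_MAGIC)))


print(looks_like_rtf(b"{\\rtf1\\ansi Hello}"))
# A binary Word document renamed to .rtf starts with different bytes
# (the OLE compound file header), so the check fails for it.
print(looks_like_rtf(bytes.fromhex("d0cf11e0a1b11ae1")))
```

Opening the file in binary mode avoids any dependence on the platform's default text encoding.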

Implementations

Each RTF implementation usually supports only some versions or subsets of the RTF specification.[7] Many of the available RTF converters cannot understand all the new features in the latest RTF specifications.[32][43]

The WordPad editor in Microsoft Windows creates RTF files by default. It once defaulted to the Microsoft Word 6.0 file format, but write support for Word documents (.doc) was dropped in a security update. Read support was also dropped in Windows 7. WordPad does not support some RTF features, such as headers and footers.[44] RTF is also the data format for "rich text controls" in MS Windows APIs.[29]

The default text editor for Mac OS X, TextEdit, can also view, edit and save RTF files as well as RTFD files. TextEdit currently (as of July 2009) has limited ability to edit RTF document margins. Much older Mac word processing application programs such as MacWrite and WriteNow were able to view, edit, and save RTF files as well.

The free and open-source word processors AbiWord, OpenOffice.org, KWord, and Bean can view, edit and save RTF files. (AbiWord and OpenOffice.org use RTF 1.6 when a new file is saved.) The RTF format is also used in the Ted word processor. These implementations may be of interest to those who need to learn how to implement RTF support in their own project and link it to other application functionality.

The open-source script rtf2xml can partially convert RTF to XML.[45][46]

SIL International’s Toolbox freeware application for developing and publishing dictionaries uses RTF as its most common form of document output. RTF files produced by Toolbox are designed to be used in Microsoft Word, but can also be used by other RTF-aware word processors.

RTF can be used on some ebook readers because of its interoperability,[citation needed][47] simplicity, and low CPU processing requirements, and some devices, including BeBook, work best with this format.

Criticism

The Rich Text Format was the standard file format for text-based documents in applications developed for Microsoft Windows. Microsoft did not initially make the RTF specification publicly available, making it difficult for competitors to develop document conversion features in their applications. Because Microsoft's developers had access to the specification, Microsoft's applications had better compatibility with the format. When Microsoft changed the RTF specification, Microsoft's own applications had a lead in time-to-market, because competitors had to redevelop their applications after studying the newer version of the format. Novell alleged that Microsoft's practices were anticompetitive in its antitrust complaint against Microsoft.[48][49] The RTF specification also lacks some of the semantic definitions necessary to read, write and modify documents.[50]

References

  1. ^ "Text Media Types". iana.org. 1993-06-08. http://www.iana.org/assignments/media-types/text/. Retrieved 2010-03-13. 
  2. ^ "Application Media Types". iana.org. 2007-06-18. http://www.iana.org/assignments/media-types/application/rtf. Retrieved 2010-08-20. 
  3. ^ a b c Microsoft Corporation (1999-05). "Rich Text Format (RTF) Specification, version 1.6". http://msdn.microsoft.com/en-us/library/aa140280(office.10).aspx. Retrieved 2010-03-13. 
  4. ^ a b c Microsoft Corporation (2004-04-20). "Word 2003: Rich Text Format (RTF) Specification, version 1.8". http://www.microsoft.com/downloads/details.aspx?familyid=AC57DE32-17F0-4B46-9E4E-467EF9BC5540&displaylang=en. Retrieved 2010-03-13. 
  5. ^ John Siracusa (2005-04-28). "Mac OS X 10.4 Tiger - File types revisited". http://arstechnica.com/apple/reviews/2005/04/macosx-10-4.ars/11. Retrieved 2010-03-13. 
  6. ^ "tutorial: Rich Text Format (RTF)". Colorado State University. http://accessproject.colostate.edu/udl/modules/word/tut_rtf.cfm. Retrieved 2010-03-13. 
  7. ^ a b c d "4.3 Non-HTML file formats". e-Government Unit. 2002-05. http://archive.cabinetoffice.gov.uk/e-government/resources/handbook/html/4-3.asp. Retrieved 2010-03-13. 
  8. ^ a b "Benefits of Rich Text Format (RTF)". Desktop Publishing, Presentations & Word Processing. ETR Associates. http://web.archive.org/web/20080323033333/http://www.seniortechcenter.org/desktop_publishing/rich_text_format.php. 
  9. ^ "Sean M. Burke - RTF-Writer - The RTF Cookbook". http://search.cpan.org/~sburke/RTF-Writer/lib/RTF/Cookbook.pod#NOTES. Retrieved 2010-03-13. 
  10. ^ Changes in Word 2010
  11. ^ Microsoft Corporation. "RTF - Rich Text Format". http://www.faqs.org/faqs/graphics/fileformats-faq/part3/section-127.html. Retrieved 2010-03-13. 
  12. ^ a b c d Microsoft Corporation. "Rich Text Format (RTF) Version 1.5 Specification". http://www.biblioscape.com/rtf15_spec.htm. Retrieved 2010-03-13. 
  13. ^ a b Microsoft Corporation (2001-08-31) (EXE (ZIP)), Word 2002 Tool: Rich Text Format Specification - 8/2001– Word 2002 RTF Specification, http://download.microsoft.com/download/Word2002/Install/1.7/W98NT42KMeXP/EN-US/W2KRTFSF.exe, retrieved 2010-03-13 
  14. ^ a b c Microsoft Corporation (2008-03-20). "Word 2007: Rich Text Format (RTF) Specification, version 1.9.1". http://www.microsoft.com/downloads/details.aspx?FamilyId=DD422B8D-FF06-4207-B476-6B5396A18A2B&displaylang=en. Retrieved 2010-03-13. 
  15. ^ "Information about the Rich Text Format (RTF) version specifications for various versions of Word". 2007-02-21. http://support.microsoft.com/kb/924944. Retrieved 2010-03-13. 
  16. ^ "Those who forget Santayana…". Rob Weir. 2007-12-20. http://www.robweir.com/blog/2007/12/those-who-forget-santayana.html. Retrieved 2010-03-13. 
  17. ^ Microsoft Corporation (RTF), Rich-Text Format (RTF) Specification - RTF Version 1.0, http://www.snake.net/software/RTF/Old/RTF-Spec-1.0.rtf, retrieved 2010-03-13 
  18. ^ a b Microsoft Corporation (1992-06) (TXT), Microsoft Product Support Services Application Note (Text File) - GC0165: RICH-TEXT FORMAT (RTF) SPECIFICATION, http://latex2rtf.sourceforge.net/RTF-Spec-1.0.txt, retrieved 2010-03-13 
  19. ^ Microsoft Corporation (RTF), Rich Text Format Specification v. 1.2, http://www.snake.net/software/RTF/Old/RTF-Spec-1.2.rtf, retrieved 2010-03-13 
  20. ^ (PDF) Rich Text Format Specification v. 1.2, http://latex2rtf.sourceforge.net/RTF-Spec-1.2.pdf, retrieved 2010-03-13 
  21. ^ Microsoft Corporation (1994-01) (RTF), Rich Text Format (RTF) Specification - RTF Version 1.3, http://www.snake.net/software/RTF/RTF-Spec-1.3.rtf, retrieved 2010-03-13 
  22. ^ Microsoft Corporation (1994-01) (TXT), Rich Text Format (RTF) Specification - RTF Version 1.3, http://latex2rtf.sourceforge.net/RTF-Spec-1.3.txt, retrieved 2010-03-13 
  23. ^ "RTF 1.9 Specification (Word 2007)". Greg Duncan. 2007-01-09. http://coolthingoftheday.blogspot.com/2007/01/rtf-19-specification-word-2007.html. Retrieved 2010-03-13. 
  24. ^ Sean M. Burke. RTF pocket guide. Google Books. http://books.google.com/books?id=4N_lVcyyhqMC&lpg=PP1&ots=dAf2jliqxf&dq=RTFPocketGuide&pg=PA33#v=onepage&q=&f=false. Retrieved 2010-03-15. 
  25. ^ "Embedded fonts are not displayed as expected in the documents that are saved as RTF in Word". Microsoft Corporation. 2007-02-20. http://support.microsoft.com/kb/275953. Retrieved 2010-03-17. 
  26. ^ "Embedding fonts in RTF file". 2005-04-23. https://list.unm.edu/cgi-bin/wa?A2=ind0504d&L=rtf-l&T=0&F=&S=&P=60. Retrieved 2010-03-17. 
  27. ^ "OpenOffice.org Issue - MS Interoperability: embedd fonts into the document". http://www.openoffice.org/issues/show_bug.cgi?id=20370. Retrieved 2010-03-17. 
  28. ^ Sean M. Burke (2008-07-12). "Rich Text Format - MSWord generates some scary RTF". http://interglacial.com/rtf/. Retrieved 2010-03-13. 
  29. ^ a b c d Sean M. Burke (2003-07). "RTF Pocket Guide". http://www.amazon.co.uk/gp/product/product-description/0596004753/ref=dp_proddesc_0/277-8520360-9056562?ie=UTF8&n=266239&s=books. Retrieved 2010-03-13. 
  30. ^ RTF Pocket Guide by O'Reilly Media, http://www.scribd.com/doc/15490806/RTF-Pocket-Guide-by-OReilly-Media, retrieved 2010-03-13 
  31. ^ a b Mark de Does (2009-10-23). "Ted, an easy rich text processor". http://ftp.nluug.nl/pub/editors/ted/TedDocument-en_US.html. Retrieved 2010-03-13. 
  32. ^ a b "How to Import Microsoft Word Files into WordPerfect for DOS". http://www.columbia.edu/~em36/wpdos/wordtowpdos.html. Retrieved 2010-03-13. 
  33. ^ "Abiword Help - File Formats". http://www.abiword.org/help/en-US/info/infoformats.html. Retrieved 2010-03-13. 
  34. ^ "Opening Rich Text Format (RTF) files". http://www.mackichan.com/index.html?techtalk/v30/30ts79.htm~mainFrame. Retrieved 2010-03-13. 
  35. ^ Bruce Byfield (2005-08-23). "FOSS word processors compared: OOo Writer, AbiWord, and KWord". http://www.linux.com/archive/feed/47307. Retrieved 2010-04-06. 
  36. ^ "Sharing files between OpenOffice.org and Microsoft Office". 2005-07-28. http://www.linux.com/archive/feed/46599. Retrieved 2010-04-06. 
  37. ^ "SoftMaker Office 2008 focuses on compatibility with Microsoft Office". 2008-11-20. http://www.linux.com/archive/feature/153229. Retrieved 2010-04-06. 
  38. ^ "SoftMaker Office 2006 beta: Not a killer app". 2006-11-21. http://www.linux.com/archive/feed/58330. Retrieved 2010-04-06. 
  39. ^ Philippe Lagadec (2006-11-30) (PDF), OpenOffice / OpenDocument and Microsoft Office 2007 / Open XML security, http://pacsec.jp/psj06/psj06lagadec-e.pdf, retrieved 2010-04-06 
  40. ^ "OLE object - bitmap representation?". http://www.keyongtech.com/2560234-ole-object-bitmap-representation. Retrieved 2010-04-06. 
  41. ^ "A Rich Edit Control That Displays Bitmaps and Other OLE Objects". http://www.codeproject.com/KB/edit/COleRichEditCtrl.aspx. Retrieved 2010-04-06. 
  42. ^ "Avoiding Macro Viruses". SANS Institute. http://www.sans.org/resources/macro.php. 
  43. ^ Wilfried Hennings (2010). "Converters from PC Textprocessors to LaTeX - Overview - Converting from RTF". http://www.tug.org/utilities/texconv/index.html. Retrieved 2010-03-13. 
  44. ^ "Why does RTF not work properly in WordPad and NotePad?". http://www.familysearch.org/eng/home/faq/faq_fileviewer.asp#Why_does_RTF_not. Retrieved 2010-03-13. 
  45. ^ "rtf2xml: convert MS RTF to XML". http://sourceforge.net/projects/rtf2xml/. Retrieved 2010-06-05. 
  46. ^ "rtf2xml - The Man Page". http://rtf2xml.sourceforge.net/docs/man-page.html. Retrieved 2010-06-05. 
  47. ^ "HANDBOOK ON MINIMUM INFORMATION INTEROPERABILITY STANDARDS (MIOS)". Cape Gateway. 2002-04-16. http://www.capegateway.gov.za/Text/2004/10/mios_v3_16_april_02.pdf. Retrieved 2010-07-11. 
  48. ^ Novell (2004-11-12) (PDF), Novell Files WordPerfect Antitrust Lawsuit against Microsoft, http://www.novell.com/news/press/archive/2004/11/complaint.pdf, retrieved 2010-03-13 
  49. ^ "The Novell Antitrust Complaint (as text) & A Law About Antitrust and Standards Writing". 2004-11-17. http://gl.scofacts.org/gl-20041115214025458.html. Retrieved 2010-03-13. 
  50. ^ Hannes Schmidt (2004-08-06). "Microsoft RTF Specification Nightmare". http://diaryproducts.net/for/geek/microsoft_rtf_specification_nightmare. Retrieved 2010-06-05. 
Retrieved from "http://en.wikipedia.org/wiki/Rich_Text_Format"