Thursday, December 1, 2011

What is DITA?

The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering information. Although its main applications have so far been in technical publications, DITA is also used for other types of documents such as policies and procedures. The DITA architecture and a related DTD and XML Schema were originally developed by IBM. The architecture incorporates ideas in XML architecture, such as modular information architecture, various features for content reuse, and specialization, that had been developed over previous decades. DITA is now an OASIS standard.

The first word in the name "Darwin Information Typing Architecture" is a reference to the naturalist Charles Darwin. The key concept of "specialization" in DITA is in some ways analogous to Darwin's concept of evolutionary adaptation, with a specialized element inheriting the properties of the base element from which it is specialized.

DITA content is written as modular topics, as opposed to long "book-oriented" files. A DITA map contains links to topics, organized in the sequence (which may be hierarchical) in which they are intended to appear in finished documents. A DITA map defines the table of contents for deliverables. Relationship tables in DITA maps can also specify which topics link to each other. Modular topics can be easily reused in different deliverables. However, the strict topic-orientation of DITA makes it an awkward fit for content that contains lengthy narratives that do not lend themselves to being broken into small, standalone chunks. Experts stress the importance of content analysis in the early stages of implementing structured authoring.

Fragments of content within topics (or less commonly, the topics themselves) can be reused through the use of content references (conref), a transclusion mechanism.

DITA includes extensive metadata elements and attributes, which make topics easier to find.

DITA specifies three basic topic types: Task, Concept and Reference. Each of the three basic topic types is a specialization of a generic Topic type, which contains a title element, a prolog element for metadata, and a body element. The body element contains paragraph, table, and list elements, similar to HTML.

A Task topic is intended for a procedure that describes how to accomplish a task. A Task topic lists a series of steps that users follow to produce an intended outcome. The steps are contained in a taskbody element, which is a specialization of the generic body element. The steps element is a specialization of an ordered list element. Concept information is more objective, containing definitions, rules, and guidelines.

A Reference topic is for topics that describe command syntax, programming instructions, and other reference material, and usually contains detailed, factual material. DITA allows adding new elements and attributes through specialization of base DITA elements and attributes. Through specialization, DITA can accommodate new topic types, element types, and attributes as needed for specific industries or companies.

The extensibility of DITA permits organizations to specialize DITA by defining specific information structures and still use standard tools to work with them. The ability to define company-specific information architectures enables companies to use DITA to enrich content with metadata that is meaningful to them, and to enforce company-specific rules on document structure. DITA map and topic documents are XML files. As with HTML, any images, video files, or other files which need to appear in output are inserted via reference. Any XML editor can therefore be used to write DITA content, with the exception of editors that support only a limited set of XML schemas (such as XHTML editors). Various editing tools have been developed that provide specific features to support DITA, such as visualization of conrefs.

Next time: technology for managing content.

No comments:

Post a Comment