
White Paper: Document Transformation

- Migration from Unstructured to Structured Documentation

Abstract
Organizations have long struggled to manage large collections of documentation and continue to look for easier, more efficient ways to do so. The difficulty has several causes, such as documents being created and stored in various formats and content being organized and presented inconsistently. Together, these conditions are what is called unstructured documentation. A structured approach to documentation, by contrast, makes it possible to retrieve valuable information for the right purpose at the right time. This document addresses the problems of unstructured documentation and the reasons for migrating to structured documentation. It also explores the methods and means of resolving the problems caused by unstructured documentation.

Introduction
Over the past several years, the growing volume of documents produced in digital form has made them difficult for organizations to maintain and manage. Because of their dynamic nature, fast growth rate, and unstructured format, it is increasingly difficult to identify and retrieve valuable information from these documents. More importantly, the usefulness of a document depends primarily on the ease and efficiency with which its information can be retrieved. Migrating unstructured documents to structured documents facilitates document management.

What is Structured Documentation?
Structured documentation is a method of designing, planning, and executing the different stages of the documentation life cycle with the help of tools and techniques. The tools can be software tools based on the approach of structured writing, in which information is broken down into smaller units, structured, and labeled. This results in information being analyzed, organized, and presented so that it can be quickly accessed, comprehended, and acted upon.

Characteristics of Structured Documentation
Structured documentation has the following characteristics:
- It is semantic/content oriented rather than presentational
- Components of the document have an identifiable structure
For example, a document created in Word (without templates) is not structured, whereas information presented in HTML, FrameMaker, and Word (with templates) is somewhat structured, and documents created in XML and DocBook are strictly structured. Furthermore, structured documentation aids in the generation of media-neutral documentation.
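The difference between presentational and semantic markup can be made concrete with a minimal Python sketch. The XML fragment and its element names (`procedure`, `title`, `step`, `note`) are assumptions chosen for illustration and do not come from any particular standard. The sketch only shows that labeled units can be retrieved by what they mean rather than by how they look.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup: the tags describe what the content is,
# not how it should look on a page.
SOURCE = """
<procedure id="replace-filter">
  <title>Replacing the filter</title>
  <step>Switch off the unit.</step>
  <step>Remove the old filter.</step>
  <note>Dispose of the old filter according to local rules.</note>
</procedure>
"""

def extract_units(xml_text):
    """Return the labeled units of a procedure as a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "steps": [step.text for step in root.findall("step")],
        "notes": [note.text for note in root.findall("note")],
    }

units = extract_units(SOURCE)
print(units["title"])
for number, step in enumerate(units["steps"], start=1):
    print(f"{number}. {step}")
```

Because every unit carries a label, a program can pull out only the steps, or only the notes, without guessing from fonts or indentation.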

Why Migrate?
While working with unstructured documentation, writers often face the following situations:
- Although information is available, reusing it, in part or in its entirety, is time consuming
- Multi-authored documents lack consistency and standardization
- Documents cannot be rendered in multiple formats because they lack internal structure
- Converting information from one format to another is time consuming and inefficient
- Because the same information must be stored in different formats, storage and maintenance are affected

Structuring the Unstructured


Structured documentation allows companies to address tight schedules, cost, and resource constraints by helping them plan and implement the various phases of writing a document and maintain consistency in the writing and formatting of content. This in turn aids effective storage and sharing of information and the quick and easy retrieval of documents.

The Writers Block www.twb.in


According to a press release from Aberdeen Group, a Harte-Hanks Company, a new report states that structured documentation, along with content management, technical illustration, 3D visualization, and translation memory technologies, is paying off substantially for some technical documentation departments. For example, among the companies surveyed, 83% of best-in-class companies (those that meet product launch dates 100% of the time) use structured documentation authoring tools such as XML and help technologies.

The need to migrate to structured documentation arises:
- When there is a need to increase productivity
- To reduce research time and have a higher turnaround of information
- To have quick access to current documents
- To benefit from new technologies

Advantages of structured documentation:
- Migrating to a structured environment ensures consistency within and across documents, even when a document is multi-authored
- Information written in a structured format is easy to access
- Information can be retrieved quickly and easily
- Together, these enable single sourcing, which makes storage, reuse, and retrieval of information (in part or in its entirety) quick, easy, and efficient
- Single sourcing optimizes the publication and distribution of information in a structured manner across various formats

Structured documentation also enables single sourcing, a method much sought after in this technology-driven age with its constant need to meet timelines. This concept is extremely helpful when publishing documents on multiple media such as CD, paper, and the Internet.

What is Single Sourcing?


Single sourcing, also known as single source publishing, allows:
- The same content to be used in different documents and in various formats. A manufacturing firm may have several products whose user guides share a common contact detail. Rather than maintaining duplicate versions of this contact detail (one in each manual), the manuals can share the content by flowing it into each document at the time of publication.
- The creation of documents in various formats from the same content. A company might use the same content in online help, a printed document, a Web page, and an Interactive Voice Response system. With a single source solution, the company only has to update the one source file for the content and regenerate the four outputs.
Ideally, the tools used for single sourcing do not require human intervention to customize the formatting or content for the various outputs. This can be difficult to achieve without some kind of content management system, which often employs XML technology.

Advantages of Single Sourcing
Single sourcing eliminates the need to duplicate content, saves on translation costs, reduces maintenance costs, improves consistency, and reduces errors. It also helps in publishing the same document in various formats. Tools used for single sourcing help in structuring documents and automating the entire process of content management.
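The two uses of single sourcing described in this section, sharing one piece of content across several manuals and producing several outputs from one source, can be sketched in a few lines of Python. Everything named here is hypothetical: the `include` element, the `ref` attribute, the product titles, and the contact details are illustrative assumptions, not features of any specific publishing tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical shared fragment, maintained in exactly one place.
SHARED = {
    "contact": "<contact><phone>555-0100</phone>"
               "<email>support@example.com</email></contact>",
}

# Hypothetical manuals that reference the shared fragment by name
# instead of carrying their own copy of it.
MANUALS = {
    "pump": "<manual><title>Pump User Guide</title><include ref='contact'/></manual>",
    "valve": "<manual><title>Valve User Guide</title><include ref='contact'/></manual>",
}

def resolve_includes(manual_xml, shared):
    """Replace each top-level <include ref='...'/> with the fragment it names."""
    root = ET.fromstring(manual_xml)
    for index, child in enumerate(list(root)):
        if child.tag == "include":
            root[index] = ET.fromstring(shared[child.get("ref")])
    return root

def to_text(root):
    """Render the resolved manual as plain text."""
    contact = root.find("contact")
    return "\n".join([
        root.findtext("title"),
        "Phone: " + contact.findtext("phone"),
        "Email: " + contact.findtext("email"),
    ])

def to_html(root):
    """Render the same resolved manual as a small HTML fragment."""
    contact = root.find("contact")
    return (
        f"<h1>{escape(root.findtext('title'))}</h1>"
        f"<p>Phone: {escape(contact.findtext('phone'))}</p>"
        f"<p>Email: {escape(contact.findtext('email'))}</p>"
    )

for name, manual_xml in MANUALS.items():
    resolved = resolve_includes(manual_xml, SHARED)
    print(to_text(resolved))
    print(to_html(resolved))
```

If the contact detail changes, only the entry in `SHARED` is edited, and regenerating the outputs updates every manual in every format.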


Tools and Technology


Migration of unstructured documentation can be brought about effectively by mastering a combination of new tools (such as the structure features in FrameMaker), new concepts (structured authoring), and new technology (XML, structure definitions, and perhaps Extensible Stylesheet Language Transformations (XSLT)). To be able to focus on document content, it is important to adopt a standardized structure that includes assisted content creation and is independent of platforms and tools. Some of the standards used in transformation are:
- Standard Generalized Markup Language (SGML)
- Extensible Markup Language (XML)
Structured FrameMaker, used along with XML, helps in producing structured documentation that is ready to be imported or exported.

Structured Documentation Techniques


Structured writing is intended to structure information in a consistent, easily human-readable manner, while XML helps maintain consistency and, through semantic and structural tagging, extends the content to be machine-readable as well.

Standardization
Structured methodology provides the means to standardize documents from both an organizational and a semantic/content-oriented standpoint. Standardization based on structure and content enhances the potential for reuse of the information for both print and electronic delivery.

Organization
Within a publishing environment, the concepts and technology used for migration help in identifying and maintaining the structure and content of information, independent of formatting specifications. When a consistent organization is maintained within the data, the information can be reused across formats and publications with a minimum of effort. For example, XML can serve as an aid in the editorial process by providing a standard methodology for describing the meaning, structure, and other properties of the data.

Consistency
Inconsistency in the organization and presentation of content has several causes: multiple authors working on the same project, an author's shifting mindset when writing at different times, and the lack of an established protocol to guide the author. All these limitations can be overcome by adopting tools and technology that promote structured documentation techniques.
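One way such tools promote consistency is by checking every document against an agreed structure. The sketch below is a hand-written Python check, used here as a simple stand-in for validation against a DTD or schema. The rule itself (a `topic` containing `title`, `summary`, and `body`, in that order) is a hypothetical house rule assumed for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical house rule: every <topic> holds exactly these children,
# in this order. A real project would express this in a DTD or schema.
REQUIRED_CHILDREN = ["title", "summary", "body"]

def check_topic(xml_text):
    """Return a list of problems; an empty list means the topic is consistent."""
    root = ET.fromstring(xml_text)
    if root.tag != "topic":
        return [f"root element is <{root.tag}>, expected <topic>"]
    problems = []
    found = [child.tag for child in root]
    if found != REQUIRED_CHILDREN:
        problems.append(f"children are {found}, expected {REQUIRED_CHILDREN}")
    for child in root:
        # An element with no text and no nested elements carries no content.
        if not (child.text or "").strip() and len(child) == 0:
            problems.append(f"<{child.tag}> is empty")
    return problems

GOOD = ("<topic><title>Install</title><summary>How to install.</summary>"
        "<body>Run the installer.</body></topic>")
BAD = "<topic><body>Run the installer.</body><title></title></topic>"

print(check_topic(GOOD))
print(check_topic(BAD))
```

Run against every document written by every author, a check of this kind reports structural drift as soon as it appears, whoever wrote the document and whenever it was written.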

Transformation Glitches
Although transforming unstructured documentation into structured documentation seems simple, the technology involved has not proved easy to learn and practice.

Portability and Availability of Tools
A number of high-quality tools with demonstrated maturity are available, but most are not free. For those that are free, portability issues remain a problem.

Technical Writers Vs DTD Developers
While there can be a strong relationship between the authoring and editing of content and structured markup, all too often conflicts arise between technical writers and Document Type Definition (DTD)/schema designers and programmers. The perceived need for editorial license and creative freedom by many authors/editors clashes with the need for rigid structure to facilitate ease of programming for markup technologists and programmers. The battles are commonly between format and structure, looseness and rigidity, and are often more philosophical than practical.

The Concept is Easy, Reality is not
Sometimes the restrictions placed on the writer by the DTD (one of several SGML and XML schema languages) diminish the overall quality of the content. Other times the nature of the information diminishes the quality of the tagging. Striking a balance between truly important flexibility and the need for consistent structure in a data model is problematic, to say the least. Unfortunately, there is no remedy for this persistent problem. There will usually be some level of compromise mandated, depending on the complexity of the information set. Steps must be taken to keep the need for compromise minimal.

Drawback of Not Keeping In Line with Technology
Many editors come from a print or desktop-publishing environment, where formatting the page is a key part of their responsibilities. For some editors, formatting is intertwined with content because of their experience and training. Structural and content-based tagging devoid of format is a foreign concept for them. The transition from thinking of documents as a rendered view of information to thinking of them as pieces of data, and understanding the logical interrelations of these pieces, is a difficult task.

Approach to Migration
For a company to adapt to the technology employed in migrating to structured documentation, it will have to understand and follow these suggestions:
- The first step is to understand what technical writing is, and the strong relationship between the concepts of technical writing and the purpose of the semantic technology that aids structured documentation.
- A quality assurance program for both content and markup is critical to success.
- Institute training to introduce writing and editing professionals to the new publishing paradigm. Supplement it on an ongoing basis to keep skills current, and revisit it periodically if old format-oriented habits reappear.
- There must be a coordinated effort between the content specialists and the DTD developers and programmers in order to produce useful, media-neutral documentation.
- Lines of communication between the content specialists and the DTD developers and programmers must be open and used.

Summary
Structure encourages consistency and modular writing, both of which are bound to prove helpful. At some point in the evolution of your documentation set, there will be a demand for changes related to translation or to a change of editing tool. There may also be instances where you are asked to present a document in a new format. In such cases, structure will make the task much less painful, and may even be the factor that makes it possible. Furthermore, if you can convert to XML, you have the chance to automate those changes with a script.
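As an illustration of the kind of scripted change that XML makes possible, the following minimal Python sketch renames one element across a small documentation set. The file names, the document contents, and the choice of renaming `warning` to `caution` are all hypothetical; a real change would depend on the structure definition in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical documentation set, keyed by file name.
DOCS = {
    "guide.xml": "<guide><warning>Hot surface.</warning>"
                 "<para>Allow the unit to cool.</para></guide>",
    "faq.xml": "<faq><para>See the guide.</para>"
               "<warning>Sharp edge.</warning></faq>",
}

def rename_elements(xml_text, old_tag, new_tag):
    """Rename every <old_tag> element; return the new XML and the change count."""
    root = ET.fromstring(xml_text)
    # Collect the matches first so that renaming does not disturb iteration.
    matches = list(root.iter(old_tag))
    for element in matches:
        element.tag = new_tag
    return ET.tostring(root, encoding="unicode"), len(matches)

updated_docs = {}
total_changed = 0
for name, xml_text in DOCS.items():
    updated_docs[name], changed = rename_elements(xml_text, "warning", "caution")
    total_changed += changed

print(f"Renamed {total_changed} element(s) in {len(updated_docs)} document(s)")
```

The same change made by hand in unstructured documents would mean opening and editing each file; with labeled elements, the script finds every occurrence regardless of how it is formatted.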

References
Goldfarb, Charles F., and Paul Prescod. The XML Handbook, 3rd ed. Upper Saddle River, NJ: Prentice Hall PTR, 2001.
Rudder, Douglas. Technical Writing and XML, Reconciling Editorial License with Structured Markup. <drudder@drugfacts.com>
Aberdeen Group. Press Release: Top Performers Authoring Structured Documentation, Hitting Product Launch Dates Every Single Time. Wednesday January 3, 3:48 pm ET.

TWB Contacts
TWB is a leading technical documentation development and design company. Should you want more information from the TWB repository of information on techniques in technical documentation, please contact:

Global Sales
Anindya Shankar
anindya.shankar@twb.in
+91.9880280022

India Sales
Vinisha Gunther
vinisha.gunther@twb.in
+91.9901189163
