SCORM and DITA for Reusable Content

This document analyzes the relationship between SCORM and DITA from the perspective of enabling global content reuse, including repurposing for e-learning delivery. The author’s background is in general content repurposing, including repurposing content for learning deliverables. The analysis makes three realistic assumptions: that gigabytes or terabytes of “legacy content” (content developed outside of any structure or metadata tagging system) are available; that reworking all previous content to abide by newly adopted standards is not likely to be cost-effective; and that new learning content will conform to standards as an organization adopts them.


Repurposing of content has been an objective in both the content and learning realms for years. Repurposing promises to reduce effort and ensure consistency of content. For content to be used across deliverables, it is common to normalize the content and associate metadata with it, so that it can be easily reused: searched, edited, added to, deleted from, and transformed into various deliverables. The level of reuse can vary from reusing an entire document to reusing text strings or individual graphics within a document. At some point, repurposing ever smaller objects within a document yields diminishing returns: more effort is required to process, store, search, and use the objects, with less and less value returned. Smaller objects (content objects) may be used to build learning objects.

To maximize opportunities for interoperability between various components of an end-to-end learning solution including development, student registration, hosting of content, content management, assessment, etc., the repurposing system should leverage learning standards that describe learning objects and content.


Standards that impact e-learning are emerging and changing, but the Sharable Content Object Reference Model (SCORM) from the Advanced Distributed Learning (ADL) initiative is particularly visible at this time.

One of the purposes of SCORM is to “foster creation of reusable learning content as ‘instructional objects’ within a common technical framework for computer-based and Web-based learning” (SCORM 2004 2nd Edition Overview). The framework includes models for content aggregation, for run-time environment interaction between the content and the system delivering it, and for sequencing and navigation.

The SCORM Content Aggregation Model (CAM) book provides information on how to describe learning content with metadata, how to package those components for use with interoperable systems, and how to define sequencing information for the components. The CAM metadata framework covers the following:

  • Low-level assets, such as an audio file or a fragment of text
  • Sharable content objects (SCOs), each a collection of learning assets that is launched as the lowest-level meaningful learning object
  • Content aggregation metadata that describes the package as a whole
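
To make the three levels concrete, the following Python sketch builds a simplified package manifest that lists SCOs and plain assets as resources and references the SCOs from an organization. It is an illustration under stated assumptions, not a schema-valid SCORM manifest: XML namespaces are omitted (in a real SCORM 2004 manifest the `scormType` attribute is namespace-qualified), and the function name `build_manifest` and all identifiers and file paths are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_manifest(course_title, scos, assets):
    """Build a simplified, namespace-free manifest tree.

    scos and assets are lists of (identifier, href) pairs.
    Assumption: a real manifest also needs namespaces, schema
    declarations, and a file list for each resource.
    """
    manifest = ET.Element("manifest", identifier="MANIFEST-1")

    # The organization is the aggregation level: it orders the SCOs.
    orgs = ET.SubElement(manifest, "organizations", default="ORG-1")
    org = ET.SubElement(orgs, "organization", identifier="ORG-1")
    ET.SubElement(org, "title").text = course_title

    resources = ET.SubElement(manifest, "resources")

    # SCOs are launchable, so each one gets an item in the organization.
    for ident, href in scos:
        item = ET.SubElement(org, "item",
                             identifier=f"ITEM-{ident}", identifierref=ident)
        ET.SubElement(item, "title").text = ident
        ET.SubElement(resources, "resource", identifier=ident,
                      type="webcontent", scormType="sco", href=href)

    # Plain assets are packaged but not launched on their own.
    for ident, href in assets:
        ET.SubElement(resources, "resource", identifier=ident,
                      type="webcontent", scormType="asset", href=href)

    return manifest


manifest = build_manifest(
    "Fan maintenance",
    scos=[("SCO-1", "sco1/index.html")],
    assets=[("IMG-1", "img/fan.png")],
)
print(ET.tostring(manifest, encoding="unicode"))
```

Keeping assets and SCOs in one resource list, distinguished only by type, mirrors the idea that the same low-level asset can be packaged into many different aggregations.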

SCORM metadata is organized into the following elements:

  • General–Describes the resource as a whole, including title, language, description, keywords, level of aggregation (e.g., course or lesson), and more
  • Lifecycle–Contains versioning and authoring information
  • Meta-metadata–Describes the metadata itself
  • Technical–Contains technical information needed to use the learning content, including format, size, location, and platform requirements
  • Educational–Describes level of interactivity, type of learning activity (e.g., exercise, figure, text, assessment), age range, learning time, and more
  • Rights–Includes cost, copyright, and restriction information and description
  • Relation–Describes the relationship between the current content or asset object and another target object
  • Annotation–Comments on the educational use
  • Classification–Includes purpose and taxonomy path
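
As a rough illustration of how a record organized into these categories might be assembled, the Python sketch below builds a small metadata tree. It is a simplification under stated assumptions: the category element names loosely follow the list above, each field is stored as flat text, and the helper `build_metadata` and the sample values are hypothetical. A real SCORM metadata record follows a stricter schema.

```python
import xml.etree.ElementTree as ET

# The nine categories, in the order listed above (names are illustrative).
CATEGORIES = (
    "general", "lifeCycle", "metaMetadata", "technical", "educational",
    "rights", "relation", "annotation", "classification",
)


def build_metadata(values):
    """Build a simplified metadata tree.

    values maps a category name to a dict of {field: text}.
    Unknown category names are rejected; empty categories are skipped.
    """
    unknown = set(values) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")

    record = ET.Element("lom")
    for category in CATEGORIES:
        fields = values.get(category)
        if not fields:
            continue
        node = ET.SubElement(record, category)
        for name, text in fields.items():
            ET.SubElement(node, name).text = text
    return record


record = build_metadata({
    "general": {"title": "Replacing the fan", "language": "en"},
    "educational": {"learningResourceType": "exercise"},
    "rights": {"cost": "no"},
})
print(ET.tostring(record, encoding="unicode"))
```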

SCORM describes a learning object, but it does not define the internal information architecture of the object. The closest SCORM comes to describing the structure of the content is the Educational element, which identifies the type of learning activity.

From a content repurposing perspective, this can actually be very positive, since no assumptions are made regarding how a developer or instructional designer might reorganize content as it is being repurposed. Any piece of content may be used in any position that any content developer–regardless of organizational domain–deems appropriate. Reuse is maximized. This is important given that content may be useful across types of e-learning deliverables, as well–from a standard Web-based training course to a game-based deliverable or types of e-learning that have yet to be defined.

However, the flexibility does nothing to support adherence to a given structure for known learning deliverable types. In SCORM, metadata can be used to define a lesson as a lesson, but there is no mechanism for making sure that the lesson includes an overview, learning objectives, content, summary, reinforcement activity, and an assessment, or any other specified sequence that an organization determines will deliver pedagogically-sound results.


DITA (the Darwin Information Typing Architecture), developed by IBM in 2000 and now an OASIS standard, is an XML-based architecture for creating topic-oriented, information-typed content that can be repurposed. DITA also provides a mechanism for creating new topic types and describing new information domains.

The DITA specification covers base document types used for authoring and organizing content. In DITA, the highest content structure is a “topic.” A topic is the basic unit of authoring and of reuse. DITA documents can contain any sequence of the following topics:

  • Concept topics–provide background for the viewer to understand essential information. These answer the “What is?” question.
  • Task topics–provide well-defined procedural information. These answer the “How do I?” question.
  • Reference topics–provide regularly structured information for quick lookup. Examples include the commands in a programming language or quick facts.

Concept, task, and reference topics may be nested within one another. Entire topics can be reused or a chunk of content within a DITA topic can point to an equivalent chunk of content in another DITA topic.
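
The chunk-level reuse described above corresponds to DITA's `conref` attribute, which points at an element in another topic using the form `file#topicid/elementid`. The Python sketch below shows the idea with two tiny topics and a deliberately minimal resolver. The topics, the file name `safety.dita`, and the function `resolve_conrefs` are hypothetical, and a real DITA processor applies many more rules (type compatibility checks, attribute merging, ranges) than this sketch does.

```python
import copy
import xml.etree.ElementTree as ET

# A concept topic that owns the reusable paragraph.
SOURCE_TOPIC = """<concept id="safety">
  <title>Safety basics</title>
  <conbody>
    <p id="warning">Disconnect power before servicing.</p>
  </conbody>
</concept>"""

# A task topic that pulls the paragraph in by reference.
TASK_TOPIC = """<task id="replace_fan">
  <title>Replacing the fan</title>
  <taskbody>
    <context><p conref="safety.dita#safety/warning"/></context>
    <steps><step><cmd>Remove the cover.</cmd></step></steps>
  </taskbody>
</task>"""


def resolve_conrefs(topic, library):
    """Copy referenced content into every element that has a conref.

    library maps a file name to the parsed root element of that topic.
    Raises KeyError if the file, topic id, or element id is not found.
    """
    for element in list(topic.iter()):
        ref = element.get("conref")
        if ref is None:
            continue
        file_name, _, fragment = ref.partition("#")
        topic_id, _, element_id = fragment.partition("/")
        source = library[file_name]
        if source.get("id") != topic_id:
            raise KeyError(ref)
        target = next(
            (e for e in source.iter() if e.get("id") == element_id), None)
        if target is None:
            raise KeyError(ref)
        # Copy text and children so the source topic is left untouched.
        element.text = target.text
        element.extend([copy.deepcopy(child) for child in target])
        del element.attrib["conref"]
    return topic


library = {"safety.dita": ET.fromstring(SOURCE_TOPIC)}
task = resolve_conrefs(ET.fromstring(TASK_TOPIC), library)
print(task.find("./taskbody/context/p").text)
```

Because the warning text lives in exactly one topic, a correction made there reaches every topic that references it the next time the content is processed.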

DITA maps define which topics are needed and in what order they are needed. They define the information architecture. DITA maps are documents that collect references to one or more DITA topic files and indicate relationships among DITA topics. They can serve as outlines or tables of contents for DITA deliverables.
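
A minimal sketch of a map, built in Python, may help show how little a map contains beyond references and ordering. The `map`, `title`, and `topicref` elements and the `href` and `type` attributes are standard DITA; the helper functions, the course title, and the topic file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_topicrefs(parent, entries):
    """Append nested topicref elements.

    entries is a list of (href, topic_type, children) tuples, where
    children is a list of the same shape.
    """
    for href, topic_type, children in entries:
        ref = ET.SubElement(parent, "topicref", href=href, type=topic_type)
        add_topicrefs(ref, children)


def build_map(title, entries):
    """Build a DITA map whose nesting acts as the table of contents."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    add_topicrefs(root, entries)
    return root


def reading_order(ditamap):
    """Return topic hrefs in document order (depth-first, parents first)."""
    return [ref.get("href") for ref in ditamap.iter("topicref")]


course = build_map("Fan maintenance", [
    ("fan_overview.dita", "concept", [
        ("replace_fan.dita", "task", []),
        ("fan_part_numbers.dita", "reference", []),
    ]),
    ("safety.dita", "concept", []),
])
print(reading_order(course))
```

The topics themselves are untouched by the map, so the same topic file can appear in any number of maps, each with its own order and hierarchy.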

Maps also describe the context in which the topics will be read. They can include author, publisher, copyright, source (original format of the content), critdates (milestones in the publishing cycle), permissions for accessing content, a resource identifier, audience, category, keywords, and product information (definition of the product or platform for the topic).

DITA domains define elements associated with a particular subject area. As an example, in the programming subject area, it is common to treat code examples and syntax examples with specific typographic conventions. The programming domain provides the details.

DITA’s object-oriented design makes use of a specialization concept that allows content designers or developers to create new information types, such as topic types or map types. Through specialization, DITA provides an opportunity for learning organizations to specify outlines that align with pedagogically sound best practices. A very common pattern in learning is to have an introduction, learning objectives, body of content (task, concept, and reference topics), reinforcement activity, summary, and assessment. Specialization can be used to define any number of new e-learning maps to cover traditional Web-based training or even simulation content or game-based learning content.
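
The kind of constraint a specialized lesson type could impose can be sketched as a simple ordering check. Real DITA specialization is expressed in DTD or schema modules and enforced by an XML validator, not in Python; the sketch below only illustrates the rule itself. The section names, the `LESSON_OUTLINE` list, and the function `check_lesson` are hypothetical.

```python
# The lesson pattern described above, in required order.
LESSON_OUTLINE = [
    "introduction", "objectives", "body",
    "reinforcement", "summary", "assessment",
]

# Topic types that count as body content.
BODY_TYPES = {"concept", "task", "reference"}


def check_lesson(sections):
    """Return a list of problems; an empty list means the lesson conforms.

    sections is the ordered list of section or topic type names.
    Consecutive concept/task/reference topics collapse into one body.
    """
    collapsed = []
    for name in sections:
        kind = "body" if name in BODY_TYPES else name
        if kind == "body" and collapsed and collapsed[-1] == "body":
            continue
        collapsed.append(kind)

    problems = [f"missing: {part}"
                for part in LESSON_OUTLINE if part not in collapsed]
    problems += [f"unexpected: {part}"
                 for part in collapsed if part not in LESSON_OUTLINE]
    if not problems and collapsed != LESSON_OUTLINE:
        problems.append("sections are out of order or repeated")
    return problems


print(check_lesson(["introduction", "objectives", "concept", "task",
                    "reinforcement", "summary", "assessment"]))
print(check_lesson(["introduction", "concept", "summary"]))
```

This is exactly the check that SCORM metadata alone cannot express: a package can be labeled a lesson without containing any of these parts.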


SCORM essentially wraps content and makes no specification regarding the underlying pedagogical soundness of the content. For interoperability with other learning systems and processes, content repurposing systems can benefit from mapping the SCORM XML schema to a repository database schema. As assets or sharable content objects are created, added to, deleted from, and repurposed, SCORM metadata can be associated with the assets (in some cases auto-generated). Under those conditions, the necessary SCORM metadata is available when a course is exported or transformed into a SCORM package.
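
One way to picture such a mapping is a small relational schema in which each asset or SCO has rows of metadata keyed by category and field. The Python sketch below uses an in-memory SQLite database; the table layout, the function names, and the rule that auto-generates a technical location from the file path are all assumptions made for illustration, not part of SCORM.

```python
import sqlite3

SCHEMA = """
CREATE TABLE asset (
    id         TEXT PRIMARY KEY,
    href       TEXT NOT NULL,
    scorm_type TEXT NOT NULL CHECK (scorm_type IN ('asset', 'sco'))
);
CREATE TABLE metadata (
    asset_id TEXT NOT NULL REFERENCES asset(id),
    category TEXT NOT NULL,   -- e.g. general, technical, rights
    field    TEXT NOT NULL,
    value    TEXT NOT NULL
);
"""


def open_repository():
    """Create an empty in-memory repository."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_asset(conn, asset_id, href, scorm_type, metadata):
    """Store an asset plus its metadata.

    metadata maps (category, field) to a value. The technical
    location is auto-generated from href rather than typed in.
    """
    conn.execute("INSERT INTO asset VALUES (?, ?, ?)",
                 (asset_id, href, scorm_type))
    rows = [(asset_id, "technical", "location", href)]
    rows += [(asset_id, category, field, value)
             for (category, field), value in metadata.items()]
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", rows)


def metadata_for_export(conn, asset_id):
    """Return (category, field, value) rows ready to write into a package."""
    cursor = conn.execute(
        "SELECT category, field, value FROM metadata "
        "WHERE asset_id = ? ORDER BY category, field",
        (asset_id,),
    )
    return cursor.fetchall()


repo = open_repository()
add_asset(repo, "SCO-1", "sco1/index.html", "sco",
          {("general", "title"): "Replacing the fan"})
print(metadata_for_export(repo, "SCO-1"))
```

Storing metadata as rows rather than fixed columns keeps the repository independent of any one metadata schema, which matters when legacy content arrives with little or no metadata at all.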

Through specialization, DITA can help to enforce good instructional design by providing guidance on the content categories that should be included in a reusable learning topic, and can help ensure consistency by providing a mechanism for describing elements in the learning domain. Each deliverable across domains (a marketing datasheet, a technical course, a new course announcement document, etc.) may require a different map. The different domains may have different requirements for describing the same element in different contexts. Optimally, content in a content management system should be available independently of any mapping or structural constraints. Unstructured content in the repository can be used in conjunction with structured content also available in the repository. New topics that are created can be incorporated back into the database, so that they are available for reuse and for transformation with XSLT, transformation scripts, or both.
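
As a small example of what a transformation script might look like, the Python sketch below turns a DITA topic into an HTML fragment by renaming a handful of elements. It is a toy under stated assumptions: the tag mapping covers only four elements, inline markup and mixed content are not preserved, and the names `TAG_MAP` and `topic_to_html` are hypothetical. A production pipeline would more likely use XSLT or an existing DITA publishing toolchain.

```python
import xml.etree.ElementTree as ET

# DITA element -> HTML element. Anything not listed is treated as a
# wrapper (for example conbody) and its children are processed in place.
TAG_MAP = {"title": "h1", "p": "p", "ul": "ul", "li": "li"}


def topic_to_html(topic_xml):
    """Convert a DITA topic string into a simple HTML fragment string."""
    topic = ET.fromstring(topic_xml)
    # The topic type becomes a CSS class so each deliverable can style it.
    container = ET.Element("div", {"class": topic.tag})

    def convert(source, destination):
        for child in source:
            html_tag = TAG_MAP.get(child.tag)
            if html_tag is None:
                convert(child, destination)
                continue
            node = ET.SubElement(destination, html_tag)
            node.text = child.text
            convert(child, node)

    convert(topic, container)
    return ET.tostring(container, encoding="unicode")


html = topic_to_html(
    '<concept id="fans"><title>Fans</title>'
    '<conbody><p>Fans move air.</p></conbody></concept>'
)
print(html)
```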

SCORM and DITA are not mutually exclusive. Used together, they can significantly improve opportunities for reusability and repurposing of e-learning and other content deliverables.
