Every document and every working group using SGML can define its own DTD according to its needs and preferences. As a result, interchange between different areas would require both the DTD and the document instance. Moreover, a new DTD could be defined for every text to be encoded in SGML. This is not what SGML was designed for.
Predefined DTDs exist for different kinds of texts, or are being developed for text encoding, for example by the Text Encoding Initiative (TEI) (see TEI, 1994). Consequently, if the predefined DTD is available on the recipient's side of an interchange process, only the document instance, which refers to the DTD, has to be transmitted, so less information has to be sent. Furthermore, the recipient can reuse programs and scripts written for other documents with the same DTD on new document instances. Using predefined DTDs thus means less work and a higher degree of interchangeability.
Different texts require different structures and different semantic information. A fictional text, for example, has a different structure (chapters, sections, paragraphs) than a dictionary (individual entries with well-structured content). Consequently, following the classification of texts, there are different DTDs, one DTD per class of texts. Certain parts of a DTD may be useful for other DTDs as well; therefore, SGML allows DTDs to be combined. One DTD may call for another one to be included, which can also allow comparative work on more than one class of texts in certain respects.
There are different kinds of DTDs: publicly available DTDs, for example those in international standards, and local, user-specified DTDs. To find the correct DTD, public identifiers are used; they replace the operating-system-specific way of referring to a file, such as paths in DOS or UNIX environments. A document type declaration with a public identifier has the form:
<!DOCTYPE name PUBLIC description [optional local additions]>
Here the static elements are set in capitals, while the information on the DTD is in lower case letters. description is itself a variable for a number of components:
registered with ISO//owner//DTD class and specification//language
The individual categories are separated by double slashes. registered with ISO holds the value + or -; language gives the two-character ISO language code.
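The decomposition of description into its categories can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the text (four categories separated by double slashes, a registration field of + or -); the names PublicIdentifier and parse_public_identifier are hypothetical and not part of SGML or of any library, and other forms of identifier are not handled.

```python
from typing import NamedTuple


class PublicIdentifier(NamedTuple):
    """Components of a public identifier as described in the text (hypothetical type)."""
    registered: bool    # True for "+", False for "-"
    owner: str          # e.g. "W3C"
    text_class: str     # first word of the third category, e.g. "DTD"
    specification: str  # rest of the third category, e.g. "HTML 3.2"
    language: str       # two-character language code, e.g. "EN"


def parse_public_identifier(identifier: str) -> PublicIdentifier:
    """Split an identifier of the form registered//owner//class specification//language."""
    parts = identifier.split("//")
    if len(parts) != 4:
        raise ValueError(f"expected 4 categories separated by '//', got {len(parts)}")
    registration, owner, class_and_spec, language = parts
    if registration not in ("+", "-"):
        raise ValueError(f"registration field must be '+' or '-', got {registration!r}")
    # The third category holds the class keyword followed by the specification.
    text_class, _, specification = class_and_spec.partition(" ")
    return PublicIdentifier(
        registered=(registration == "+"),
        owner=owner,
        text_class=text_class,
        specification=specification.strip(),
        language=language,
    )
```

Calling parse_public_identifier("-//W3C//DTD HTML 3.2//EN") yields an unregistered identifier owned by W3C, with class DTD, specification HTML 3.2 and language EN.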
An example of a public identifier, with an interpretation of its individual components, is given in Table 3.1.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">

| Component | Value | Interpretation |
|---|---|---|
| name | html | Hypertext Markup Language |
| registered with ISO | - | Not registered with ISO |
| owner | W3C | World Wide Web Consortium |
| class | HTML | Hypertext Markup Language |
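How a public identifier can stand in for a system-specific file path may be illustrated with a small Python sketch. It extracts the name and the quoted identifier from a document type declaration and looks the identifier up in a local table. The table CATALOG, the path dtds/html32.dtd and the function resolve_doctype are assumptions made for this illustration only; real SGML systems use their own lookup mechanisms, and the regular expression covers only the simple declaration form shown here.

```python
import re

# Matches declarations of the form <!DOCTYPE name PUBLIC "identifier" ...>.
DOCTYPE_PATTERN = re.compile(
    r'<!DOCTYPE\s+(?P<name>[^\s>]+)\s+PUBLIC\s+"(?P<public_id>[^"]*)"',
    re.IGNORECASE,
)

# Hypothetical local table mapping public identifiers to DTD files.
CATALOG = {
    "-//W3C//DTD HTML 3.2//EN": "dtds/html32.dtd",
}


def resolve_doctype(declaration, catalog=CATALOG):
    """Return (name, local DTD path); the path is None if the identifier is unknown."""
    match = DOCTYPE_PATTERN.search(declaration)
    if match is None:
        raise ValueError("not a document type declaration with a public identifier")
    return match.group("name"), catalog.get(match.group("public_id"))


if __name__ == "__main__":
    print(resolve_doctype('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">'))
```

Because the document instance carries only the identifier, the same instance can be processed on DOS or UNIX systems as long as each side maintains its own table.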
Fri May 21 13:04:11 MET DST 1999