
COURSE OBJECTIVES:

3/22/2012

XML Introduction
DTD and Validity
Attribute Declarations in a DTD
Entities and External DTD subsets
Embedding non XML data
Namespaces
DOM and SAX Parsers
Schemas

by FaaDoOEngineers.com

WHAT IS XML ?

XML stands for eXtensible Markup Language.
It is a set of rules for defining semantic tags.
It is a markup language that defines a syntax for building domain-specific, semantic, structured markup languages.

INTRODUCTION
XML
Technology for creating markup languages
Enables document authors to describe data of any type
Allows creating new tags
HTML limits document authors to a fixed tag set

WHY XML ?
Lightweight document format
Supports cross-platform use
User-friendly language
Supports Unicode
Describes structure and semantics
Does not focus on formatting
Documents can be validated

INTRODUCTION TO XML MARKUP


XML document (intro.xml)
Marks up message as XML
Commonly stored in text files
Extension .xml

An Example: intro.xml

1  <?xml version = "1.0"?>
2
3  <!-- Fig.: intro.xml -->
4  <!-- Simple introduction to XML markup -->
5
6  <myMessage>
7     <message>Welcome to XML!</message>
8  </myMessage>

Line numbers are not part of the XML document. We include them for clarity.
Line 1: the document begins with the declaration <?xml version = "1.0"?>, which specifies XML version 1.0.
Lines 3-4: comments.

Fig. Simple XML document containing a message.

INTRODUCTION TO XML MARKUP (CONT.)


XML documents
Must contain exactly one root element
Attempting to create more than one root element is erroneous
Elements must be nested properly
Incorrect: <x><y>hello</x></y>
Correct: <x><y>hello</y></x>
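
The two rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the two_roots sample are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


correct = "<x><y>hello</y></x>"
incorrect = "<x><y>hello</x></y>"    # tags closed in the wrong order
two_roots = "<x>one</x><x>two</x>"   # more than one root element

for sample in (correct, incorrect, two_roots):
    print(sample, "->", is_well_formed(sample))
```

Only the first sample is accepted; the parser rejects the other two with a ParseError.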

PARSERS AND WELL-FORMED XML DOCUMENTS


XML parser
Processes XML document
Reads XML document
Checks syntax
Reports errors (if any)
Allows programmatic access to the document's contents

PARSERS AND WELL-FORMED XML DOCUMENTS (CONT.)


XML document syntax
Considered well formed if syntactically correct
Single root element
Each element has start tag and end tag
Tags properly nested
Attribute (discussed later) values in quotes
Proper capitalization
Case sensitive

PARSERS AND WELL-FORMED XML DOCUMENTS (CONT.)


XML parsers support
Document Object Model (DOM)
Builds tree structure containing document data in memory
Simple API for XML (SAX)
Generates events when tags, comments, etc. are encountered
Events are notifications to the application
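
To make the difference between the two parser styles concrete, here is a small sketch using Python's standard library (xml.dom.minidom for DOM, xml.sax for SAX). The EventLogger class is a hypothetical handler written for this illustration; the sample document is the message from intro.xml.

```python
import xml.sax
from xml.dom import minidom

DOC = "<myMessage><message>Welcome to XML!</message></myMessage>"

# DOM: the whole document is loaded into an in-memory tree first,
# and the application then walks that tree.
dom = minidom.parseString(DOC)
message = dom.getElementsByTagName("message")[0]
dom_text = message.firstChild.data


# SAX: the parser calls back into a handler as it meets each piece
# of the document; no tree is built.
class EventLogger(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


logger = EventLogger()
xml.sax.parseString(DOC.encode("utf-8"), logger)

print(dom_text)
for event in logger.events:
    print(event)
```

With DOM the application asks the tree for what it wants. With SAX it only sees a stream of notifications and must keep any state it needs itself.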

PARSING AN XML DOCUMENT WITH MSXML


XML document
Contains data
Does not contain formatting information
Load XML document into Internet Explorer
Document is parsed by msxml
Places plus (+) or minus (-) signs next to container elements
Plus sign indicates that all child elements are hidden
Clicking plus sign expands container element
Displays children
Minus sign indicates that all child elements are visible
Clicking minus sign collapses container element
Hides children
Error generated if document is not well formed


XML DOCUMENT SHOWN IN IE.

ERROR MESSAGE FOR A MISSING END TAG.



CHARACTERS
Characters that may be represented in XML document
e.g., ASCII character set
Letters of English alphabet
Digits (0-9)
Punctuation characters, such as !, - and ?
Carriage returns
Line feeds
Unicode characters
Enables computers to process characters for several languages

CHARACTERS VS. MARKUP


XML must differentiate between
Markup text
Enclosed in angle brackets (< and >)
e.g., child elements
Character data
Text between start tag and end tag
e.g., intro.xml, line 7: Welcome to XML!

LIFE OF AN XML DOCUMENT


Editors
Parsers and Processors
Browsers and Other Tools

XML documents adhere to a set of rules about their form. There are two levels of conformity to the XML standard:
Well-formedness
Validity

RELATED TECHNOLOGIES

XML doesn't operate in a vacuum. Using XML as more than a data format requires interaction with a number of related technologies. These include:
HTML
CSS
XSL
URLs and URIs
XLinks and XPointers
The Unicode Character Set


DOCUMENT TYPE DEFINITIONS AND VALIDITY

DOCUMENT TYPE DEFINITION (DTD)


Data that is sent along with a DTD and conforms to it is known as valid XML. Data sent without a DTD is known as well-formed XML, provided it follows the XML syntax rules. With both valid and well-formed XML, XML-encoded data is self-describing, since descriptive tags are intermixed with the data.

DTDs help ensure that different people and programs can read each other's files. The DTD defines exactly what is and is not allowed to appear inside a document.

DTD FOR OUR SIMPLE XML

A DTD embedded in a document consists of a left square bracket character ([), followed by a series of markup declarations, followed by a right square bracket character (]). It sits inside the DOCTYPE declaration:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE Simple [
  <!ELEMENT Simple ANY>
]>
<Simple>
  This is the simplest XML document I have ever seen
</Simple>

DTD DECLARATIONS
Element type declarations
Attribute-list declarations
Entity declarations
Notation declarations
Processing instructions
Comments
Parameter entity references

HELLO XML WITH DTD


<?xml version="1.0" standalone="yes"?>
<!DOCTYPE GREETING [
  <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>
  Hello XML!
</GREETING>

LISTING THE ELEMENTS

The first step to creating a DTD appropriate for a particular document is to understand the structure of the information that will be encoded using the elements defined in the DTD.

<?xml version="1.0" standalone="yes"?>
<Root>
  <Element1>
    <Element11></Element11>
  </Element1>
</Root>

ELEMENT DECLARATIONS

Each tag used in a valid XML document must be declared with an element declaration in the DTD. This specifies the name and possible contents of an element. The list of contents is also called the content specification. Occurrence indicators control how often a child may appear:
* : may occur any number of times (zero or more children)
? : may or may not occur (zero or one child)
+ : must occur at least once (one or more children)
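
As an illustration of the occurrence indicators, the sketch below asks Python's xml.parsers.expat module to report the element declarations it finds in an internal DTD subset. The catalog document and its element names are hypothetical. expat is a non-validating parser, so it only reads the declarations and does not enforce them.

```python
from xml.parsers import expat

# Hypothetical document with an internal DTD subset, for illustration only.
DOC = """<?xml version="1.0"?>
<!DOCTYPE catalog [
  <!ELEMENT catalog (title, cd+, note?, extra*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT cd (#PCDATA)>
  <!ELEMENT note (#PCDATA)>
  <!ELEMENT extra (#PCDATA)>
]>
<catalog><title>Demo</title><cd>One</cd></catalog>"""

# Map expat's quantifier constants back to the DTD symbols.
SYMBOLS = {
    expat.model.XML_CQUANT_NONE: "",
    expat.model.XML_CQUANT_OPT: "?",
    expat.model.XML_CQUANT_REP: "*",
    expat.model.XML_CQUANT_PLUS: "+",
}

occurrence = {}


def on_element_decl(name, model):
    # model is a (type, quantifier, name, children) tuple
    _type, _quant, _name, children = model
    for _ctype, cquant, cname, _grandchildren in children:
        occurrence[(name, cname)] = SYMBOLS[cquant]


parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)

for (parent, child), symbol in occurrence.items():
    print(f"{parent} -> {child}{symbol}")
```

Here title carries no indicator (exactly one), cd must occur at least once, note is optional, and extra may occur any number of times.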

ATTRIBUTE DECLARATIONS IN DTDS


SPECIFYING DEFAULT VALUES FOR ATTRIBUTES

Instead of specifying an explicit default attribute value, an attribute declaration can require that a value be provided, allow the value to be omitted completely, or force the attribute to always use the default value. These requirements are specified with three keywords:
#REQUIRED : a value must be provided
#IMPLIED : the value may be omitted
#FIXED : the default value is always used
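
The three keywords can be seen in a declaration like the one embedded in the sketch below. The attribute names ALT and FORMAT are made up for illustration. The code uses the AttlistDeclHandler callback of Python's xml.parsers.expat module to read the declarations back, and the keyword is inferred from the default and required arguments that the callback receives.

```python
from xml.parsers import expat

# Hypothetical internal DTD subset showing one attribute per keyword.
DOC = """<?xml version="1.0"?>
<!DOCTYPE IMAGE [
  <!ELEMENT IMAGE (#PCDATA)>
  <!ATTLIST IMAGE
      SRC    CDATA #REQUIRED
      ALT    CDATA #IMPLIED
      FORMAT CDATA #FIXED "bmp">
]>
<IMAGE SRC="a.bmp">abc</IMAGE>"""

declared = {}


def on_attlist_decl(elname, attname, att_type, default, required):
    # Infer the keyword from the (default, required) pair.
    if default is not None and required:
        keyword = "#FIXED"
    elif required:
        keyword = "#REQUIRED"
    elif default is None:
        keyword = "#IMPLIED"
    else:
        keyword = "explicit default"
    declared[attname] = (keyword, default)


parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOC, True)

for name, (keyword, default) in declared.items():
    print(name, keyword, default)
```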


EMBEDDING NON XML DATA


NOTATIONS

The first problem that we encounter when working with non-XML data in an XML document is identifying the format of the data and telling the XML application how to read and display the non-XML data. For example, it would be inappropriate to try to draw an MP3 sound file on the screen. Furthermore, no application understands all possible file formats. Ideally, we want documents to tell the application the format of the external entity, so we don't have to rely on the application recognizing the file type by a magic number or a potentially unreliable file name extension.

NOTATION

A notation gives a name to a non-XML data format. An attribute of type NOTATION must take one of the declared notation names as its value. The notation itself is declared with a path using SYSTEM or with a string using PUBLIC.

EXAMPLE OF NOTATION

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT IMAGES (IMAGE+)>
<!ELEMENT IMAGE (#PCDATA)>
<!NOTATION iPATH SYSTEM "C:\windows\a.bmp">
<!ATTLIST IMAGE SRC NOTATION (iPATH) #REQUIRED>

USING NOTATION

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE IMAGES SYSTEM "C:\N.dtd">
<IMAGES>
  <IMAGE SRC="iPATH">abc</IMAGE>
</IMAGES>

Note: Because an XML processor cannot parse bmp files, we need to use an external program for displaying or editing them. When the parser encounters a use of the notation name, it simply passes the path declared in the notation on to the application.

CONDITIONAL SECTIONS
Include declarations
Keyword INCLUDE
Exclude declarations
Keyword IGNORE
Often used with entities
Parameter entities
Preceded by percent character (%)
Creates entities specific to DTD
Can be used only inside DTD in which they are declared


XML NAMESPACES


CONFLICTING ISSUES

Namespaces ensure that element names do not conflict, and clarify who defined which term.
Namespaces do not give instructions on how to process the elements.


Readers still need to know what the elements mean and decide how to process them.
Namespaces simply keep the names straight.

XML NAMESPACES
Naming collisions
Two different elements have same name
<subject>Math</subject>
<subject>Thrombosis</subject>
Namespaces
Differentiate elements that have same name
<school:subject>Math</school:subject>
<medical:subject>Thrombosis</medical:subject>
school and medical are namespace prefixes
Prepended to element and attribute names
Tied to uniform resource identifier (URI)
Series of characters for differentiating names
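
A short sketch of how a namespace-aware parser keeps the two subject elements apart, using Python's xml.etree.ElementTree. The wrapper element directory and the two URIs are made-up placeholders, since the slides do not give URIs for school and medical.

```python
import xml.etree.ElementTree as ET

# The wrapper element and both URIs are placeholders for illustration.
DOC = """
<directory xmlns:school="urn:example:school"
           xmlns:medical="urn:example:medical">
  <school:subject>Math</school:subject>
  <medical:subject>Thrombosis</medical:subject>
</directory>
"""

root = ET.fromstring(DOC)

# ElementTree replaces each prefix with its URI: {uri}localname
tags = [child.tag for child in root]
print(tags)

# Look up one of the two elements through a prefix-to-URI mapping.
prefixes = {"school": "urn:example:school", "medical": "urn:example:medical"}
school_subject = root.find("school:subject", prefixes).text
medical_subject = root.find("medical:subject", prefixes).text
print(school_subject, medical_subject)
```

The parser identifies elements by URI plus local name, so the prefix text is only shorthand and the two subject elements never collide.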

XML NAMESPACES (CONT.)


Creating namespaces
Use xmlns keyword
xmlns:text = "urn:deitel:textInfo"
xmlns:image = "urn:deitel:imageInfo"
Creates two namespace prefixes, text and image
urn:deitel:textInfo is URI for prefix text
urn:deitel:imageInfo is URI for prefix image
Default namespaces
Child elements of this namespace do not need prefix
xmlns = "urn:deitel:textInfo"
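
The effect of a default namespace can be sketched as follows with Python's xml.etree.ElementTree. The two URIs come from the text above. The element names directory, file and description and the attribute filename are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# URIs are from the slide; element and attribute names are illustrative.
DOC = """
<directory xmlns="urn:deitel:textInfo"
           xmlns:image="urn:deitel:imageInfo">
  <file filename="book.xml">
    <description>A book list</description>
  </file>
  <image:file filename="funny.jpg">
    <image:description>A funny picture</image:description>
  </image:file>
</directory>
"""

root = ET.fromstring(DOC)

# Unprefixed elements pick up the default namespace.
child_tags = [child.tag for child in root]
print(root.tag)
print(child_tags)

# Unprefixed attributes do not pick up the default namespace.
first_file_attributes = dict(root[0].attrib)
print(first_file_attributes)
```

The unprefixed file element ends up in urn:deitel:textInfo and the prefixed one in urn:deitel:imageInfo, while the filename attribute stays outside any namespace.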


XML SCHEMAS


XML SCHEMA

An XML schema defines the structure of an XML document.
It defines the list of elements and attributes that can be used in an XML document.
It also specifies the order in which these elements appear in the XML document and their data types.
Microsoft was among the contributors to the XML Schema Definition (XSD) language.
XSD has become a W3C Recommendation for creating valid XML documents.

INTRODUCTION
XML Path Language (XPath)
Syntax for locating information in XML document
e.g., attribute values
String-based language of expressions
Not structural language like XML
Used by other XML technologies
XSLT
XPointer

L & K India - Education

EXAMPLE FOR NODES


<?xml version = "1.0"?>

<!-- Fig.: simple.xml -->
<!-- Simple XML document -->

<book title = "C++ How to Program" edition = "3">

   <sample>
      <![CDATA[

         // C++ comment
         if ( this->getX() < 5 && value[ 0 ] != 3 )
            cerr << this->displayError();
      ]]>
   </sample>

   C++ How to Program by Deitel &amp; Deitel
</book>

Root node: the document as a whole
Comment nodes: the two comments before the book element
Element nodes: book and sample
Attribute nodes: title and edition
Text nodes: the CDATA content of sample and the text inside book
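
The node kinds in this example can be listed programmatically. The sketch below uses Python's xml.dom.minidom, whose DOM node types correspond only roughly to the XPath node kinds named here (the DOM document object plays the part of the root node). The document text is a compacted copy of simple.xml.

```python
from xml.dom import minidom, Node

DOC = """<?xml version = "1.0"?>
<!-- Fig.: simple.xml -->
<!-- Simple XML document -->
<book title = "C++ How to Program" edition = "3">
   <sample><![CDATA[
      // C++ comment
      if ( this->getX() < 5 && value[ 0 ] != 3 )
         cerr << this->displayError();
   ]]></sample>
   C++ How to Program by Deitel &amp; Deitel
</book>"""

doc = minidom.parseString(DOC)   # the document object stands in for the root node

# Comment nodes are children of the document itself.
comments = [n.data for n in doc.childNodes if n.nodeType == Node.COMMENT_NODE]

# Attribute nodes hang off the book element.
book = doc.documentElement
attributes = dict(book.attributes.items())

# Element and text nodes are children of book.
elements = [n.tagName for n in book.childNodes if n.nodeType == Node.ELEMENT_NODE]
text = "".join(
    n.data for n in book.childNodes if n.nodeType == Node.TEXT_NODE
).strip()

print(comments)
print(attributes)
print(elements)
print(text)
```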

LOCATING NODES
XML documents can be represented as a tree of nodes. XPath uses a pattern expression to identify nodes in an XML document. An XPath pattern is a slash-separated list of child element names that describes a path through the XML document. The pattern "selects" elements that match the path. The following XPath expression selects all the price elements of all the cd elements of the catalog element:

/catalog/cd/price

If the path starts with a slash ( / ), it represents an absolute path to an element.
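
A minimal sketch of this selection in Python follows. The standard xml.etree.ElementTree module supports only a subset of XPath and evaluates paths relative to an element, so the absolute path /catalog/cd/price is written as ./cd/price starting from the catalog root element. The catalog data is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog data; only the element names come from the text above.
DOC = """
<catalog>
  <cd><title>First</title><price>10.90</price></cd>
  <cd><title>Second</title><price>9.90</price></cd>
  <book><price>25.00</price></book>
</catalog>
"""

root = ET.fromstring(DOC)   # root is the <catalog> element

# Equivalent of /catalog/cd/price, evaluated from <catalog>.
prices = [p.text for p in root.findall("./cd/price")]
print(prices)
```

The price inside book is not selected, because the path names cd as the intermediate step.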



WHAT IS XSLT ?

XSLT stands for XSL Transformations
XSLT is the most important part of XSL
XSLT transforms an XML document into another XML document
XSLT uses XPath to navigate in XML documents
XSLT is a W3C Recommendation

XSL - MORE THAN A STYLE SHEET LANGUAGE


XSL consists of three parts:
XSLT - a language for transforming XML documents
XPath - a language for navigating in XML documents
XSL-FO - a language for formatting XML documents

XSL


PRESENTING XML

There are two style sheet languages available for use with XML in Internet Explorer:
Cascading Style Sheets (CSS)
Extensible Stylesheet Language (XSL)

An important point to consider in choosing a style sheet language for a particular document is whether the structure of the XML document is suitable for display. With CSS, the structure of the XML content must be virtually identical to the structure of the presentation. Since one of the goals of XML is a complete separation of content from display, many XML documents are difficult to display as you might wish using CSS.

BROWSERS SUPPORTING XML AND XSLT.


Mozilla Firefox
As of version 1.0.2, Firefox has support for XML and XSLT (and CSS).

Mozilla
Mozilla includes Expat for XML parsing and has support to display XML + CSS. Mozilla also has some support for Namespaces. Mozilla is available with an XSLT implementation.

Netscape
As of version 8, Netscape uses the Mozilla engine, and therefore it has the same XML / XSLT support as Mozilla.

Opera
As of version 9, Opera has support for XML and XSLT (and CSS). Version 8 supports only XML + CSS.

Internet Explorer
As of version 6, Internet Explorer supports XML, Namespaces, CSS, XSLT, and XPath. Version 5 is NOT compatible with the official W3C XSL Recommendation.

DIFFERENCE BETWEEN CSS AND XSL ?

XML does not use predefined tags (we can use any tag names we like), so the meaning of these tags is not known in advance. A <table> element could mean an HTML table, a piece of furniture, or something else, and a browser does not know how to display it. XSL describes how the XML document should be displayed!
