XML Unit 2 Notes

UNIT – II
XML Technology
1.1 XML Introduction

XML stands for eXtensible Markup Language. It is a markup language much like HTML, but
there are several differences between them:
HTML includes a collection of predefined tags that you can use right away in editing your HTML
files, e.g. <font> and <h1>, but XML tags are not predefined.
To use XML, users have to define their own tags for their specific application before using them.
For example, to describe a note, <note>, <to>, <from>, <heading>, and <body> are defined in
advance and used in a nested fashion to present the following XML file as a note:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don’t forget the party!</body>
</note>
With XML, one can define whatever tags needed, which together compose a user-defined
markup language similar to HTML, and then use the language to describe data. Specifically XML
uses Document Type Definition (DTD) or an XML Schema to define tags.
In this sense, XML is viewed as a meta-language since it can be used to define and describe a
markup language instead of concrete data directly. That is also why it is called extensible.
XML was designed to describe data while HTML was designed for displaying data. If you
remember, HTML tags control the way data is presented to browser users, color, font size,
spacing, etc. Differently XML aims to deal with the logic meaning of data, or semantics.
In the above example, the text wrapped in <from> and </from> is the name of the sender of the
note. This enables the fulfillment of the task of finding all the notes written by a specific person.
So XML was designed to describe data and to focus on what data is while HTML was designed to
display data and to focus on how data looks.
Actually XML and HTML can complement each other. For example, we use XML files to store
data on a web server machine and when a request arrives, a servlet runs to retrieve data in
XML, compose a HTML file, and finally output it to the client.
This way you can concentrate on using HTML for data layout and display, and be sure that
changes in the underlying data will not require any changes to your HTML.
Besides XML’s role of storing data, with XML, data can be exchanged between incompatible
systems.
In the real world, computer systems and databases contain data in incompatible formats. One of
the most time-consuming challenges for developers has been to exchange data between such
systems over the Internet.
Converting the data to XML can greatly reduce this complexity and create data that can be read
by many different types of applications, since XML data is stored in plain text format, which is
software- and hardware-independent.
XML Syntax
The syntax rules of XML are very simple and very strict. The rules are very easy to learn, and
very easy to use. Because of this, creating software that can read and manipulate XML is very
easy to do.
An Example XML Document

XML documents use a self-describing and simple syntax. For example, the following is a
complete XML file presenting a note:
<?xml version="1.0" encoding="ISO-8859-1"?>

<note>
<to>Tove</to>
<from>Jani</from>
<body>Don’t forget me this weekend!</body>
</note>
The first line in the document – the XML declaration – defines the XML version and the
character encoding used in the document. In this case the document conforms to the 1.0
specification of XML.
The next line and the last line describe the root element of the document. So when you look at
the document, you easily detect this is a note to Tove from Jani. So XML is pretty self-
descriptive.
Different from HTML, XML has a syntax that is much more rigid.
• With XML, it is illegal to omit the closing tag.
In HTML some elements do not have to have a closing tag to be able to present content in the
way the user wants them to. For example:
<p>This is a paragraph
<p>This is another paragraph
In XML, however, all elements must have a closing tag, like this:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
Note that you might have noticed from the previous example that the XML declaration did not
have a closing tag. This is not an error. The declaration is not a part of the XML document itself.
It is not an XML element, and it should not have a closing tag.
• Unlike HTML, XML tags are case sensitive. With XML, the tag <Letter> is different from the tag
<letter>.
Opening and closing tags must therefore be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
• Improper nesting of tags makes no sense to XML.

In HTML some elements can be improperly nested within each other and still display content in
the desired way like this:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
<b><i>This text is bold and italic</i></b>
• All XML documents must contain a single tag pair to define a root element.
All other elements must be within this root element. All elements can have sub elements (child
elements). Sub elements must be correctly nested within their parent element:
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
• With XML, it is illegal to omit quotation marks around attribute values. XML elements can
have attributes in name/value pairs just like in HTML. In XML the attribute value must always
be quoted. Study the two XML documents below. The first one is incorrect, the second is
correct:
<note date=12/11/2002>
<to>Tove</to>
<from>Jani</from>
</note>
--------------------------------------------------------------------------------------------------------------------------
<note date="12/11/2002">
<to>Tove</to>
<from>Jani</from>
</note>
The error in the first document is that the date attribute in the note element is not quoted.
With XML, the white space in your document is not truncated. This is unlike HTML. With HTML,
a sentence like this:
Hello my name is Tove,
will be displayed like this:
XML Elements
XML elements define the framework of an XML document and the structure of data items.
XML Elements are Extensible

XML was invented to be extensible and XML documents can be extended to carry more
information.
Take the above note as an example again. Let’s imagine that we created an application that
extracted the <to>, <from>, and <body> elements from the XML document to produce this
output:
MESSAGE
To: Tove
From: Jani
Don’t forget me this weekend!
Imagine that the author of the XML document added some extra information to it:
<note>
<date>2002-08-01</date>
<to>Tove</to>
<from>Jani</from>
</note>
Should the application break or crash? No. The application should still be able to find the <to>,
<from>, and <body> elements in the XML document and produce the same output.
XML documents are Extensible.
XML Elements have Relationships

Elements are related as parents and children.
To understand XML terminology, you have to know how relationships between XML elements
are named, and how element content is described.
Imagine that this is a description of a book:

My First XML
Introduction to XML
• What is HTML
• What is XML XML
Syntax
• Elements must have a closing tag
• Elements must be properly nested
Imagine that this XML document describes the book:
<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
<chapter>XML Syntax
<para>Elements must have a closing tag</para>
<para>Elements must be properly nested</para>
</chapter>
</book>
book is the root element. title, prod, and chapter are child elements of book. book is the parent
element of title, prod, and chapter. title, prod, and chapter are siblings (or sister elements)
because they have the same parent.
Elements have Content

Elements can have different content types.
An XML element is everything from (including) the element’s start tag to (including) the
element’s end tag.
An element can have element content, mixed content, simple content, or empty content. An
element can also have attributes.
In the example above, book has element content, because it contains other elements. Chapter
has mixed content because it contains both text and other elements. para has simple content (or
text content) because it contains only text. prod has empty content, because it carries no
information.
In the example above only the prod element has attributes. The attribute named id has the
value “33-657”. The attribute named media has the value “paper”.
Element Naming Rules:

XML elements must follow these naming rules:
• Names can contain letters, numbers, and other characters
• Names must not start with a number or punctuation character
• Names must not start with the letters xml (or XML or Xml ..)
 Names cannot contain spaces
XML Attributes
XML elements can have attributes in the start tag, just like HTML. Attributes are used to provide
additional information about elements.
From HTML you will remember this: <IMG SRC="computer.gif">. The SRC attribute provides
additional information about the IMG element.
In HTML (and in XML) attributes provide additional information about elements:

<img src="computer.gif">
<a href="demo.asp">
Attributes often provide information that is not a part of the data. In the example below, the file
type is irrelevant to the data, but important to the software that wants to manipulate the
element:
<file type="gif">computer.gif</file>
As we said before, attribute values in XML must always be enclosed in quotes, but either single
or double quotes can be used. Note that if the attribute value itself contains double quotes it is
necessary to use single quotes and vice versa, like in this example:
<gangster name=’George "Shotgun" Ziegler’>
Sometimes information may be placed as attributes or child elements. Take a look at these
examples:
<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
and
<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
In the first example sex is an attribute. In the last, sex is a child element. Both examples provide
the same information.
There are no rules about when to use attributes, and when to use child elements. Generally
attributes are handy in HTML, but in XML you should try to avoid them. Use child elements if
the information feels like data.
Here are some of the problems using attributes:
• attributes cannot contain multiple values (child elements can)
• attributes are not easily expandable (for future changes)
• attributes cannot describe structures (child elements can)
• attributes are more difficult to manipulate by program code
• attribute values are not easy to test against a Document Type Definition (DTD) – which is
used to define the legal elements of an XML document
1.2 XML DTD

Similar to HTML, XML with correct syntax is Well Formed XML. That is a well formed XML
document is a document that conforms to the XML syntax rules that were described in the
previous sections.
More specifically, to be well formed, an XML document must be validated against a Document
Type Definition (DTD). The purpose of a DTD is to define the legal building blocks of an XML
document. It defines the document structure with a list of legal elements. a DTD can be specified
internally or externally. The following is an example of internal DTD for the above
note example:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<body>Don’t forget me this weekend</body>
</note>
The DTD above is interpreted like this: !DOCTYPE note (in line 2) defines that this is a
document of the type note. !ELEMENT note (in line 3) defines the note element as having four
elements: “to,from,heading,body”. !ELEMENT to (in line 4) defines the to element to be of the
type “#PCDATA”. !ELEMENT from (in line 5) defines the from element to be of the type
“#PCDATA” and so on ...
If the DTD is external to your XML source file, it should be wrapped in a DOCTYPE definition
with the following syntax:
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
</note>
And the following is a copy of the file “note.dtd” containing the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
W3C supports an alternative to DTD called XML Schema. If interested, you may read more about
XML Schema in related books.
The W3C XML specification states that a program should not continue to process an XML
document if it finds a validation error. The reason is that XML software should be easy to write,
and that all XML documents should be compatible.
With HTML it was possible to create documents with lots of errors (like when you forget an end
tag). One of the main reasons that HTML browsers are so big and incompatible, is that they have
their own ways to figure out what a document should look like when they encounter an HTML
error. With XML this should not be possible.
1.3 XML Schema
XML Schema is commonly known as XML Schema Definition (XSD). It is used to describe and
validate the structure and the content of XML data. XML schema defines the elements, attributes
and data types. Schema element supports Namespaces. It is similar to a database schema that
describes the data in a database.
Syntax
You need to declare a schema in your XML document as follows − The following example
shows how to use schema −
employee.xsd
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://ww
w.javatpoint.com"
xmlns="http://www.javatpoint.com"
elementFormDefault="qualified">
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Let's see the xml file using XML schema or XSD file.
employee.xml
<employee
xmlns="http://www.javatpoint.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.javatpoint.com employee.xsd">
<firstname>vimal</firstname>
<lastname>jaiswal</lastname>
<email>vimal@javatpoint.com</email>
</employee>
Description of XML Schema
<xs:element name="employee"> : It defines the element name employee.
<xs:complexType> : It defines that the element 'employee' is complex type.
<xs:sequence> : It defines that the complex type is a sequence of elements.
<xs:element name="firstname" type="xs:string"/> : It defines that the element 'firstname' is

of string/text type.
<xs:element name="lastname" type="xs:string"/> : It defines that the element 'lastname' is
of string/text type.
<xs:element name="email" type="xs:string"/> : It defines that the element 'email' is of
string/text type.
XML Schema Data types

There are two types of data types in XML schema.
1. simpleType
2. complexType
simpleType
The simpleType allows you to have text-based elements. It contains less attributes, child
elements, and cannot be left empty.
complexType
The complexType allows you to hold multiple attributes and elements. It can contain additional
sub elements and can be left empty.
CDATA vs PCDATA
CDATA
CDATA: (Unparsed Character data): CDATA contains the text which is not parsed further in an
XML document. Tags inside the CDATA text are not treated as markup and entities will not be
expanded.
Let's take an example for CDATA:
<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
<![CDATA[
]]>
</employee>
In the above CDATA example, CDATA is used just after the element employee to make the
data/text unparsed, so it will give the value of employee:
<firstname>vimal</firstname><lastname>jaiswal</lastname><email>vimal@javatpoint.com<
/email>
PCDATA: (Parsed Character Data):

XML parsers are used to parse all the text in an XML document. PCDATA stands for Parsed
Character data. PCDATA is the text that will be parsed by a parser. Tags inside the PCDATA will
be treated as markup and entities will be expanded.
In other words you can say that a parsed character data means the XML parser examine the
data and ensure that it doesn't content entity if it contains that will be replaced.
Let's take an example:

<!DOCTYPE employee SYSTEM "employee.dtd">
<employee>
</employee>
In the above example, the employee element contains 3 more elements 'firstname', 'lastname',
and 'email', so it parses further to get the data/text of firstname, lastname and email to give the
value of employee as:
vimal jaiswal vimal@javatpoint.com
The basic idea behind XML Schemas is that they describe the legitimate format that an XML
document can take.
Defining XML Elements

Elements are the main building block of all XML documents, containing the data and determine
the structure of the instance document.
An element can be defined within an XSD as follows:

<xs:element name="x" type="y" />
Each element definition within the XSD must have a 'name' property, which is the tag name that
will appear in the XML document. The 'type' property provides the description of what type of
data can be contained within the element when it appears in the XML document.
There are a number of predefined simple types, such as xs:string, xs:integer, xs:boolean and
xs:date. Elements of these simple data types are said to have a 'simple content model', whereas
elements that can contain other elements are said to have a 'complex content model' and
elements that can contain both have a 'mixed content model'.
If we have set the type property for an element in the XSD, then the corresponding value in the
XML document must be in the correct format for its given type otherwise this will cause a
validation error when a validating parser attempts to parse the data from the XML document.
Examples of simple elements and their XML data are shown below:
Sample XSD Sample XML
<xs:element name="Customer_dob" <Customer_dob>

type="xs:date" /> 2000-01-12T12:13:14Z
</Customer_dob>
<xs:element name="Customer_address" <Customer_address>

type="xs:string" /> 99 London Road
</Customer_address>
<xs:element name="OrderID" <OrderID>

type="xs:int" /> 5756
</OrderID>
<xs:element name="Body" <Body></Body>

type="xs:string" />
Defining Simple Types

It is possible to create a new simpleType by restricting an existing simpleType, allowing you to
define your own data types. It is possible to restrict a built in type (xs:string, xs:integer, xs:date
etc) or one of your own previously defined simpleType's
Examples uses:
 Defining an ID, this may be an integer with a maximum value limit.
 A Postcode or Zip code could be restricted to ensure it is the correct length and complies
with a regular expression.
 Defining a field to have a maximum length.
Defining Complex Types

A xs:complexType provides the definition for an XML Element, it's specifies which element and
attributes are permitted and the rules regarding where they can appear and how many times.
They can be used inplace within an element definition or named and defined.
Here are some simple element definitions:
<xs:element name="Customer_dob" type="xs:date" />
<xs:element name="Customer_address" type="xs:string" />
<xs:element name="Supplier_phone" type="xs:integer" />
<xs:element name="Supplier_address" type="xs:string" />
We can see that some of these elements should really be represented as child elements,
"Customer_dob" and "Customer_address" should belong to a parent element – "Customer".
While "Supplier_phone" and "Supplier_address" should belong to a parent element "Supplier".
We can therefore re-write this in a more structured way:
<xs:element name="Customer">
<xs:complexType>
<xs:sequence>
<xs:element name="Dob" type="xs:date" />
<xs:element name="Address" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Supplier">
<xs:complexType>
<xs:sequence>
<xs:element name="Phone" type="xs:integer" />
<xs:element name="Address" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
Attributes
Attributes in XSD provide extra information within an element. Attributes
have name and type property as shown below −
<xs:attribute name = "x" type = "y"/>
An attribute is specified within a xs:complexType, the type information for the attribute comes
from a xs:simpleType (either defined inline or via a reference to a built in or user defined
xs:simpleType definition). The Type information describes the data the attribute can contain in
the XML document, i.e. string, integer, date etc. Attributes can also be specified globally and
then referenced (but more about this later).
<xs:element name="Order"> <Order OrderID="6" />

<xs:complexType> - or no attribute -
<xs:attribute name="OrderID" <Order />
type="xs:int" />
</xs:complexType>
</xs:element>

<xs:complexType> - or no attribute -
<xs:attribute name="OrderID" <Order />
type="xs:int"
use="optional" />
</xs:complexType>
</xs:element>

<xs:complexType>
<xs:attribute name="OrderID"
type="xs:int"
use="required" />
</xs:complexType>
</xs:element>
1.4 XML PARSER

An XML parser is a software library or package that provides interfaces for client applications
to work with an XML document. The XML Parser is designed to read the XML and create a way
for programs to use XML.
XML parser validates the document and check that the document is well formatted.
Let's understand the working of XML parser by the figure given below:
Types of XML Parsers

These are the two main types of XML Parsers:
1. DOM
2. SAX
DOM (Document Object Model)

A DOM document is an object which contains all the information of an XML document. It is
composed like a tree structure. The DOM Parser implements a DOM API. This API is very simple
to use.
Features of DOM Parser
A DOM Parser creates an internal structure in memory which is a DOM document object and the
client applications get information of the original XML document by invoking methods on this
document object.
DOM Parser has a tree based structure.
Advantages
1) It supports both read and write operations and the API is very simple to use.
2) It is preferred when random access to widely separated parts of a document is required.
Disadvantages
1) It is memory inefficient. (consumes more memory because the whole XML document needs
to loaded into memory).
2) It is comparatively slower than other parsers.
SAX (Simple API for XML)

A SAX Parser implements SAX API. This API is an event based API and less intuitive.
Features of SAX Parser
It does not create any internal structure.
Clients does not know what methods to call, they just overrides the methods of the API and
place his own code inside method.
It is an event based parser, it works like an event handler in Java.
Advantages
1) It is simple and memory efficient.
2) It is very fast and works for huge documents.
Disadvantages
1) It is event-based so its API is less intuitive.
2) Clients never know the full information because the data is broken into pieces.
Demo Example
Here is the input xml file that we need to parse −
<?xml version = "1.0"?>
<class>
<student rollno = "393">
<firstname>dinkar</firstname>
<lastname>kad</lastname>
<nickname>dinkar</nickname>
<marks>85</marks>
</student>

<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>vinni</nickname>
<marks>95</marks>
</student>

<firstname>jasvir</firstname>
<lastname>singn</lastname>
<nickname>jazz</nickname>
<marks>90</marks>
</student>
</class>
DomParserDemo.java
package com.tutorialspoint.xml;
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
public class DomParserDemo {

public static void main(String[] args) {
try {
File inputFile = new File("input.txt");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("student");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {

Node nNode = nList.item(temp);
System.out.println("\nCurrent Element :" + nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("Student roll no : "
+ eElement.getAttribute("rollno"));
System.out.println("First Name : "
+ eElement
.getElementsByTagName("firstname")
.item(0)
.getTextContent());
System.out.println("Last Name : "
+ eElement
.getElementsByTagName("lastname")
.item(0)
.getTextContent());
System.out.println("Nick Name : "
+ eElement
.getElementsByTagName("nickname")
.item(0)
.getTextContent());
System.out.println("Marks : "
+ eElement
.getElementsByTagName("marks")
.item(0)
.getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
This would produce the following result −

Root element :class
----------------------------
Current Element :student

Student roll no : 393
First Name : dinkar
Last Name : kad
Nick Name : dinkar
Marks : 85

First Name : Vaneet
Last Name : Gupta
Nick Name : vinni
Marks : 95

First Name : jasvir
Last Name : singn
Nick Name : jazz
Marks : 90
1.5 XPATH:
XPath is an important and core component of XSLT standard. It is used to traverse the elements
and attributes in an XML document.
XPath is a W3C recommendation. XPath provides different types of expressions to retrieve

relevant information from the XML document. It is syntax for defining parts of an XML
document.
Following are the main parts of XSL −

 XSLT − used to transform XML documents into various other types of document.
 XPath − used to navigate XML documents.
 XSL-FO − used to format XML documents.
XPath provides various types of expressions which can be used to enquire relevant
information from the XML document.
 Structure Definitions − XPath defines the parts of an XML document like element,
attribute, text, namespace, processing-instruction, comment, and document nodes
 Path Expressions − XPath provides powerful path expressions select nodes or list of
nodes in XML documents.
 Standard Functions − XPath provides a rich library of standard functions for
manipulation of string values, numeric values, date and time comparison, node and
QName manipulation, sequence manipulation, Boolean values etc.
 Major part of XSLT − XPath is one of the major elements in XSLT standard and is must
have knowledge in order to work with XSLT documents.
 W3C recommendation − XPath is an official recommendation of World Wide Web
Consortium (W3C).
XPath - Expression
An XPath expression generally defines a pattern in order to select a set of nodes. These
patterns are used by XSLT to perform transformations or by XPointer for addressing purpose.
XPath specification specifies seven types of nodes which can be the output of execution of the
XPath expression.
 Root
 Element
 Text
 Attribute
 Comment
 Processing Instruction
 Namespace
XPath uses a path expression to select node or a list of nodes from an XML document.
Following is the list of useful paths and expression to select any node/ list of nodes from an
XML document.
S.No. Expression & Description
node-name
1
Select all nodes with the given name "nodename"
/
2
Selection starts from the root node
//
3
Selection starts from the current node that match the selection
.
4
Selects the current node
..
5
Selects the parent of the current node
@
6
Selects attributes
Example
In this example, we've created a sample XML document, students.xml and its stylesheet
document students.xsl which uses the XPath expressions under select attribute of various
XSL tags to get the values of roll no, firstname, lastname, nickname and marks of each student
node.
students.xml
<?xml version = "1.0"?>
<?xml-stylesheet type = "text/xsl" href = "students.xsl"?>
<class>
<firstname>Dinkar</firstname>
<lastname>Kad</lastname>
<nickname>Dinkar</nickname>
<marks>85</marks>
</student>
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>Vinni</nickname>
<marks>95</marks>
</student>
<firstname>Jasvir</firstname>
<lastname>Singh</lastname>
<nickname>Jazz</nickname>
<marks>90</marks>
</student>
</class>
students.xsl
<?xml version = "1.0" encoding = "UTF-8"?>
<xsl:stylesheet version = "1.0"
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:template match = "/">

<html>
<body>
<h2>Students</h2>
<table border = "1">
<tr bgcolor = "#9acd32">
<th>Roll No</th>
<th>First Name</th>
<th>Last Name</th>
<th>Nick Name</th>
<th>Marks</th>
</tr>
<xsl:for-each select = "class/student">
<tr>
<td> <xsl:value-of select = "@rollno"/></td>
<td><xsl:value-of select = "firstname"/></td>
<td><xsl:value-of select = "lastname"/></td>
<td><xsl:value-of select = "nickname"/></td>
<td><xsl:value-of select = "marks"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
XPath - Absolute Path
Location path specifies the location of node in XML document. This path can be absolute or
relative. If location path starts with root node or with '/' then it is an absolute path. Following
are few of the example locating the elements using absolute path.
/class/student − select student nodes within class root node.
<xsl:for-each select = "/class/student">
/class/student/firstname − select firstname of a student node within class root node.
<p><xsl:value-of select = "/class/student/firstname"/></p>
XPath - Relative Path

If location path starts with the node that we've selected then it is a relative path.
Following are few examples locating the elements using relative path.
firstname − select firstname related to student nodes.
<p><xsl:value-of select = "firstname"/></p>
XPath - Axes
Following is the list of various Axis values.
S.No. Axis & Description
1 ancestor
Represents the ancestors of the current node which include the parents up to the
root node.
ancestor-or-self
2
Represents the current node and it's ancestors.
attribute
3
Represents the attributes of the current node.
child
4
Represents the children of the current node.
descendant
5 Represents the descendants of the current node. Descendants include the node's
children upto the leaf node(no more children).
descendant-or-self
6
Represents the current node and it's descendants.
following
7
Represents all nodes that come after the current node.
following-sibling
8 Represents the following siblings of the context node. Siblings are at the same
level as the current node and share it's parent.
namespace
9
Represents the namespace of the current node.
parent
10
Represents the parent of the current node.
preceding
11 Represents all nodes that come before the current node (i.e. before it's opening
tag).
self
12
Represents the current node.
1.6 XML Transformation
Besides CSS, XSL was invented just for displaying XML. The eXtensible Stylesheet Language
(XSL) is far more sophisticated than CSS.
XSL consists of three parts:

XSLT (XSL Transformations) is a language for transforming XML documents. XSLT is the most
important part of the XSL Standards.
It is the part of XSL that is used to transform an XML document into another XML document, or
another type of document that is recognized by a browser, like HTML and XHTML.
Normally XSLT does this by transforming each XML element into an (X)HTML element. XSLT
can also add new elements into the output file, or remove elements. It can rearrange and sort
elements, and test and make decisions about which elements to display, and a lot more. A
common way to describe the transformation process is to say that XSLT transforms an XML
source tree into an XML result tree.
XSLT uses XPath to define the matching patterns for transformations. In the transformation
process, XSLT uses XPath to define parts of the source document that match one or more
predefined templates.
When a match is found, XSLT will transform the matching part of the source document into the
result document. The parts of the source document that do not match a template will end up
unmodified in the result document.
XPath is a language for defining parts of an XML document.
XPath uses path expressions to identify nodes in an XML document. These path expressions
look very much like the expressions you see when you work with a computer file system:
w3schools/xpath/default.asp
Look at the following simple XML document:
<?xml version="1.0" encoding="ISO-8859-1"?> <catalog>
<cd country="USA">
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<price>10.90</price>
</cd>
<cd country="UK">
<title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<price>9.90</price>
</cd>
<cd country="USA">
<title>Greatest Hits</title>
<artist>Dolly Parton</artist>
<price>9.90</price>
</cd>
</catalog>
The XPath expression below selects the ROOT element catalog:

/catalog
The XPath expression below selects all the cd elements of the catalog element:
/catalog/cd
The XPath expression below selects all the price elements of all the cd elements of the catalog
element:
/catalog/cd/price
Note: If the path starts with a slash (‘/’) it represents an absolute path to an element! XPath also
defines a library of standard functions for working with strings, numbers and Boolean
expressions. The XPath expression below selects all the cd elements that have a price element
with a value larger than 10.80:
/catalog/cd[price>10.80]
XSL-FO is a language for formatting XML documents.

Think of XSL as set of languages that can transform XML into XHTML, filter and sort XML data,
define parts of an XML document, format XML data based on the data value, like displaying
negative numbers in red, and output XML data to different media, like screens, paper, or voice.
One way to use XSL is to transform XML into HTML before it is displayed by the browser. Below
is a fraction of the XML file, with an added XSL reference. The second line, <?xml-stylesheet
type="text/xsl" href="simple.xsl"?>, links the XML file to the XSL file:
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>
two of our famous Belgian Waffles
</description>
<calories>650</calories>
</food>
</breakfast_menu>
And the corresponding XSL file is as follows:

<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/TR/xhtml1/strict">
<body style="font-family:Arial,helvetica,sans-serif;font-size:12pt;
background-color:#EEEEEE">
<xsl:for-each select="breakfast_menu/food">
<div style="background-color:teal;color:white;padding:4px">
<span style="font-weight:bold;color:white">
<xsl:value-of select="name"/></span>
- <xsl:value-of select="price"/>
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
<xsl:value-of select="description"/>
<span style="font-style:italic">
(<xsl:value-of select="calories"/> calories per serving)
</span>
</div>
</xsl:for-each>
</body>
</html>

XML Unit 2 Notes

Diunggah oleh

Informasi Dokumen

Judul Asli

Hak Cipta

Format Tersedia

Bagikan dokumen Ini

Bagikan atau Tanam Dokumen

Opsi Berbagi

Apakah menurut Anda dokumen ini bermanfaat?

Apakah konten ini tidak pantas?

Hak Cipta:

Format Tersedia

XML Unit 2 Notes

Diunggah oleh

Hak Cipta:

Format Tersedia

UNIT – II

1.1 XML Introduction

An Example XML Document

<?xml version="1.0" encoding="ISO-8859-1"?>

• Improper nesting of tags makes no sense to XML.

XML Elements are Extensible

XML Elements have Relationships

Imagine that this is a description of a book:

Elements have Content

Element Naming Rules:

In HTML (and in XML) attributes provide additional information about elements:

1.2 XML DTD

1.3 XML Schema

<xs:element name="firstname" type="xs:string"/> : It defines that the element 'firstname' is

XML Schema Data types

Let's take an example for CDATA:

PCDATA: (Parsed Character Data):

Let's take an example:

Defining XML Elements

An element can be defined within an XSD as follows:

<xs:element name="Customer_dob" <Customer_dob>

<xs:element name="Customer_address" <Customer_address>

<xs:element name="OrderID" <OrderID>

<xs:element name="Body" <Body></Body>

Defining Simple Types

Defining Complex Types

<xs:element name="Order"> <Order OrderID="6" />

<xs:element name="Order"> <Order OrderID="6" />

<xs:element name="Order"> <Order OrderID="6" />

1.4 XML PARSER

Types of XML Parsers

DOM (Document Object Model)

SAX (Simple API for XML)

<student rollno = "493">

<student rollno = "593">

public class DomParserDemo {

for (int temp = 0; temp < nList.getLength(); temp++) {

This would produce the following result −

Current Element :student

Current Element :student

Current Element :student

XPath is a W3C recommendation. XPath provides different types of expressions to retrieve

Following are the main parts of XSL −

<xsl:template match = "/">

/class/student − select student nodes within class root node.

<xsl:for-each select = "/class/student">

/class/student/firstname − select firstname of a student node within class root node.

<p><xsl:value-of select = "/class/student/firstname"/></p>

XPath - Relative Path

firstname − select firstname related to student nodes.

<p><xsl:value-of select = "firstname"/></p>

XSL consists of three parts:

XPath is a language for defining parts of an XML document.

The XPath expression below selects the ROOT element catalog:

XSL-FO is a language for formatting XML documents.

Anda mungkin juga menyukai