XML Classes
The eXtensible Markup Language (XML) is a subset of the Standard Generalized Markup
Language (SGML), which is an ISO standard numbered ISO 8879. SGML was perceived to be
too colossal and too convoluted to be put to any pragmatic use. Thus, a subset of this
language, XML, was developed to work seamlessly with both SGML and HTML. XML may be
considered a restricted form of SGML, since every XML document conforms to the rules of
an SGML document.
XML was created in the year 1996 under the auspices of the World Wide Web Consortium
(W3C), by a group working under the chairmanship of Jon Bosak. This group spelt out 10
ground rules for XML, with 'ease of use' as its fundamental philosophy. From thereon,
the expectations reached a threshold wherein XML was expected to eradicate world poverty
and generally rid the world of all its tribulations.
To be precise, XML was overvalued, way beyond realistic levels. There are people who
appear to be extremely infatuated with XML, even though they may not have read through a
single rule or specification of the language.
The specifications of XML, laid down by its three primary authors, Tim Bray, Jean Paoli
and C. M. Sperberg-McQueen, are accessible at the web site http://www.w3.org/XML.
An XML document consists of entities, which are made up of characters that form either
character data or markup. An XML file is made up of a myriad components, which shall be
unravelled one at a time, after we have discerned the basic concepts of this language.
We commence this chapter by introducing a program that generates an XML file.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.Flush();
a.Close();
}
}
In this program, we use a class called XmlTextWriter, which comes from the
System.Xml namespace.
An instance 'a' of the XmlTextWriter class is created, by passing two parameters
to the
constructor:
• The first parameter, b.xml, is a string and represents the name of the file to
be
created. If the file exists in the current directory, it gets deleted and then
recreated, but
with zero bytes.
• The second parameter is null. It represents the Encoding type used.
Unicode is a standard whereby each character is assigned a unique number, known as a
code point. All the languages in the world can now be easily represented by this
standard. In the .Net world, a char holds one 16-bit unit of a Unicode string, and we
are furnished with classes whose methods facilitate conversion of arrays and strings
made up of Unicode characters, to and from arrays made up of bytes alone.
The System.Text namespace has a large number of Encoding implementations, such
as the following:
• The ASCII Encoding encodes the Unicode characters as 7-bit ASCII.
• The UTF8 Encoding class encodes Unicode characters using UTF-8 encoding.
UTF-8 stands for UCS Transformation Format 8 bit. It supports all Unicode characters. It
is normally accessed as code page 65001. In UTF-8, the letters of the English alphabet
occupy a single byte each, exactly as they do in ASCII. Here, since we have specified
the second parameter as null, the file is written out as UTF-8 and no encoding attribute
is placed in the XML declaration.
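The conversion between strings and bytes described above can be tried out with a small
sketch. The class name EncodingDemo and the helper ByteCount are our own inventions for
this illustration; only the Encoding classes come from System.Text.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Returns the number of bytes the given encoding needs for the text.
    public static int ByteCount(Encoding encoding, string text)
    {
        return encoding.GetBytes(text).Length;
    }

    public static void Main()
    {
        string text = "vijay";
        // ASCII and UTF-8 both need one byte per English letter.
        Console.WriteLine(ByteCount(Encoding.ASCII, text));
        Console.WriteLine(ByteCount(Encoding.UTF8, text));
        // Encoding.Unicode is UTF-16, which needs two bytes per English letter.
        Console.WriteLine(ByteCount(Encoding.Unicode, text));
        // The code page number of UTF-8.
        Console.WriteLine(Encoding.UTF8.CodePage);
        // The bytes can be converted back into the original string.
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```

The same string thus occupies a different number of bytes on disk, depending on the
Encoding that is chosen.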
If we execute the program at this stage, the file b.xml will be created, but it will be
empty, since nothing has been written to it as yet. Whatever we write later is not
guaranteed to reach the file until a function named Flush is called.
Each time we ask the class XmlTextWriter to write to a file, it may not oblige
immediately, but
may place the output in a buffer. Only when the buffer becomes full, will it
write to the file.
This approach is pursued to avoid the overhead of accessing the file on the disk
repetitively.
This improves efficiency. The Flush function flushes the buffer to the file
stream, but it does
not close the file.
The Close function has to be employed to execute the twin tasks of flushing the
buffer to the
file, and closing the file. It is sagacious to call Flush, and then call Close,
even though
Close is adequate to carry out both these tasks.
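The effect of the buffer can be observed with the following sketch. It is only an
illustration under our own assumptions: it writes to a MemoryStream instead of the file
b.xml, so that the number of bytes that have actually arrived can be measured, and the
names FlushDemo and Measure are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FlushDemo
{
    // Returns the length of the stream before and after the call to Flush.
    public static long[] Measure()
    {
        MemoryStream ms = new MemoryStream();
        XmlTextWriter a = new XmlTextWriter(ms, null);
        a.WriteStartDocument();
        long before = ms.Length;  // the declaration is still held in the buffer
        a.Flush();
        long after = ms.Length;   // the buffer has now been written to the stream
        a.Close();
        return new long[] { before, after };
    }

    public static void Main()
    {
        long[] r = Measure();
        Console.WriteLine("Before Flush: " + r[0]);
        Console.WriteLine("After Flush: " + r[1]);
    }
}
```

The stream receives the XML declaration only after Flush has been called, even though
WriteStartDocument was called earlier.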
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
Here, we have called a function called WriteStartDocument from the XmlTextWriter class,
which does not take any parameters. It produces the line <?xml version="1.0"?> in the
file b.xml.
Any line that begins with <?xml is called an XML declaration. Every item in an XML file
is described as a node. An XML file should begin with an XML declaration node. There can
be only one such node in our XML file, and it must be placed at the very start of the
first line. Within it is an attribute called version, which is initialized to a value of
1.0.
The version attribute is mandatory in an XML declaration. The XML 1.0 specification
makes no commitment about how later versions, if any, would be numbered. Hence, the
WriteStartDocument function always writes version="1.0".
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.WriteDocType("vijay", null, null ,null);
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?><!DOCTYPE vijay>
The next vital declaration is the DOCTYPE declaration. An XML file may have at most one
DOCTYPE declaration. It must be placed before the root tag, which it names. In our case,
the root tag would be 'vijay'.
An XML file is made up of tags, which are words enclosed within angle brackets. The file
also contains rules, which bind the tags. The next three parameters of the function
WriteDocType, i.e. the public identifier, the system identifier and the internal subset,
are presently specified as null. You may refer to the documentation to decipher the
values that may be used in place of null. If this does not appeal to you, you may have
to hold your horses, till we furnish the explanation at an appropriate time.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
In the earlier example, all the nodes were displayed on the same line. We would
indubitably
desire that every node be displayed on a new line. The property Formatting in
XmlTextWriter, is
used to accomplish this task. Formatting can be assigned only one of the
following two values:
Indented or None. By default, the value assigned is None.
The Indented option indents the child elements by 2 spaces. The magnitude of the indent
may be altered by stipulating a new value for the Indentation property. In our program,
we want the indent to be 3 spaces deep. Hence, we stipulate the value as 3. As is
evident, not all nodes get indented. For example, the DOCTYPE node lies at the top level
of the document, so it does not get indented; instead, it is merely placed on a new
line.
The IndentChar property may be supplied with the character that is to be
employed for
indentation. By default, a space character is used for this purpose.
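As a small sketch of the IndentChar property, the program below indents with a single
tab per level. To keep it self-contained, it writes to a StringWriter rather than to
b.xml; the names IndentCharDemo and Build are ours and purely illustrative.

```csharp
using System;
using System.IO;
using System.Xml;

public static class IndentCharDemo
{
    // Builds a small document that is indented with tabs instead of spaces.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.Formatting = Formatting.Indented;
        a.Indentation = 1;     // one indent character per level
        a.IndentChar = '\t';   // a tab in place of the default space
        a.WriteStartElement("vijay");
        a.WriteElementString("surname", "mukhi");
        a.WriteEndElement();
        a.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

The Indentation property counts indent characters, not spaces. Thus, a value of 1
combined with a tab yields one tab for every level of nesting.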
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay />
The function WriteStartElement accepts a single parameter, which is the tag name to be
written to the XML file. This is an oft-repeated instruction, to be iterated in almost
every program, since an XML file basically consists of tags. A tag normally has a start
point and an end point, and it confines its contents within these two extremities.
However, there are tags that do not have any contents. Such tags end with a / symbol.
Here, we never close the tag 'vijay' ourselves; the Close function closes any tag that
is still open, and since 'vijay' is empty, it is written as <vijay />.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString ("wife","sonal");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay wife="sonal" />
The newly added function WriteAttributeString accepts two parameters, which it
writes in the
form of a name-value pair. Thus, along with 'vijay', we see the attribute named
'wife', having a
value of 'sonal'. An attribute is analogous to an adjective of the English
language, in that, it
describes the object. In our case, it describes the tag 'vijay'. It divulges
additional
information about the properties of a tag.
XML does not interpret the contents of these tags. The word 'wife' or the value
'sonal', have no
special significance for XML, which is absolutely unconcerned about the
information provided
within the tags.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString ("wife","sonal");
a.WriteElementString("surname", "mukhi");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay wife="sonal">
   <surname>mukhi</surname>
</vijay>
The function WriteElementString writes a complete element, i.e. a start tag, the text
within it and an end tag. We have a tag surname containing the value 'mukhi'. We can
have multiple tags within the root tag.
We have been reiterating the fact that we need to adhere to specific rules. You may
steer clear of the beaten path and interchange the last two functions as follows:
a.WriteElementString("surname", "mukhi");
a.WriteAttributeString ("wife","sonal");
As a fallout of this interchange, the following exception will be thrown:
Unhandled Exception: System.InvalidOperationException: Token StartAttribute in state
Content would result in an invalid XML document.
This exception is triggered off due to the fact that the attributes of a tag must be
specified first. Then, and only then, should the child tags within the tag be specified.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString ("wife","sonal");
a.WriteAttributeString ("friend","two");
a.WriteElementString("surname", "mukhi");
a.WriteElementString("books", "67");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay wife="sonal" friend="two">
   <surname>mukhi</surname>
   <books>67</books>
</vijay>
To summarize, the WriteDocType function specifies the root tag, the WriteStartElement
function starts a tag, the WriteAttributeString function writes an attribute for the
active tag, and the WriteElementString function writes a complete tag within the active
tag. We can enumerate as many attributes as we desire. They will all be clustered
together in the start tag. The WriteElementString function is also capable of creating
as many tags as are needed under a tag.
In the file b.xml, we see two attributes and two tags under the root tag 'vijay'.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString ("friend","two");
a.WriteStartElement("mukhi");
a.WriteAttributeString ("wife","sonal");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay friend="two">
   <mukhi wife="sonal" />
</vijay>
In the above example, 'vijay' is the root tag, with the attribute 'friend', which is
assigned a value of 'two'. It also has a child tag 'mukhi' having the attribute 'wife'
initialized to 'sonal'. Both the tags, 'vijay' and 'mukhi', are created using the
function WriteStartElement. Unlike the function WriteElementString, which creates a
start and an end tag, WriteStartElement creates only a start tag.
A child tag too can be endowed with attributes. The active tag is the one last inserted
by the WriteStartElement function. Functions such as WriteAttributeString act on the
active tag. Thus, we notice that the attribute 'wife' belongs to the tag 'mukhi' and not
to 'vijay'. Finally, since the tag 'mukhi' is devoid of any contents, it ends with a /
symbol on the same line.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString ("friend","two");
a.WriteStartElement("mukhi");
a.WriteAttributeString ("wife","sonal");
a.WriteFullEndElement();
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay friend="two">
   <mukhi wife="sonal">
   </mukhi>
</vijay>
The function WriteFullEndElement marks the end of the active tag. Therefore, the
single tag
'mukhi', does not end with a / symbol on the same line. It has an ending tag
instead. Both these
possibilities are equally valid in this case. But, if the tags embody any
contents, then both
the start and the end tags are mandatory. In such situations, a single empty tag
would just not suffice.
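The two ways of ending an empty tag can be compared side by side with a short sketch. It
uses WriteEndElement, a function of XmlTextWriter that our programs have not called so
far, alongside WriteFullEndElement. The class EndElementDemo and its Write helper are
names made up for this illustration, and the output goes to a StringWriter.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EndElementDemo
{
    // Writes an empty tag 'mukhi' and ends it in one of the two possible ways.
    public static string Write(bool full)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("mukhi");
        if (full)
            a.WriteFullEndElement();  // always writes a separate end tag
        else
            a.WriteEndElement();      // writes the short form for an empty tag
        a.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write(false));  // <mukhi />
        Console.WriteLine(Write(true));   // <mukhi></mukhi>
    }
}
```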
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
//a.WriteComment("comment 1");
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteStartDocument();
a.WriteComment("comment 1");
a.WriteDocType("vijay", null, null ,null);
a.WriteComment("comment 2");
a.WriteStartElement("vijay");
a.WriteAttributeString ("wife","sonal");
a.WriteComment("comment 3");
a.WriteElementString("surname", "mukhi");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!--comment 1-->
<!DOCTYPE vijay>
<!--comment 2-->
<vijay wife="sonal">
   <!--comment 3-->
   <surname>mukhi</surname>
</vijay>
Every programming language extends the facility of writing comments, even though it may
be a seldom used feature. Programmers insert comments amidst their code to document or
explain the functioning of their programs. At times, comments assist in deciphering the
code from the programmer's perspective. Practically, it may be easier to teach an
elephant how to tap-dance, than to convince a programmer to write comments.
In the XML world, comments begin with <!-- and end with -->. This is the same as the
HTML syntax, since both languages inherit it from SGML.
Comments are like a liquid, since they can be moulded to fit in almost anywhere, except
ahead of the XML declaration. The first line in an XML file has to be the declaration.
If you uncomment the first call to the function WriteComment in the program, an
exception will be thrown with the following message:
Unhandled Exception: System.InvalidOperationException: WriteStartDocument should be the
first call.
Thus, functions such as WriteComment can be used to insert comments almost anywhere in
the XML file, primarily for the purpose of documentation, which would enable even an
alien from outer space to decipher the file better.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteProcessingInstruction ("sonal", "mukhi=no");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay>
   <?sonal mukhi=no?>
</vijay>
A line beginning with <? is called a Processing Instruction (PI). This line is inserted
using the function WriteProcessingInstruction, which is passed two parameters:
• the first is the name of the processing instruction.
• the second is the text that is to be inserted for the processing instruction.
A Processing Instruction is used by XML to communicate with other programs during the
performance of certain tasks. XML does not have the wherewithal to execute instructions.
It therefore delegates this task to the program that processes the XML file. When this
program encounters the processing instruction, and if it is able to understand it, it
executes it. In cases where it cannot comprehend it, the program simply ignores the
instruction. This is the methodology by which XML communicates with external programs.
In our program, the instruction 'sonal' would be ignored, as no program recognises it.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteString("mukhi");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay>mukhi</vijay>
An XML file mainly consists of strings and tags. The WriteString function is very
extensively exploited, since it writes content/strings between tags.
In the above example, the text 'mukhi' is enclosed within the tags of 'vijay'. Even
though we have not explicitly asked the XmlTextWriter class to close the tag, the Close
function has written the ending tag for us. A full ending tag is used, rather than the /
symbol, because there exists some content after the opening tag.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString ("friend","two");
a.WriteString("hi");
//a.WriteAttributeString ("friend","three");
a.WriteStartElement("mukhi");
a.WriteAttributeString ("friend","two");
a.WriteString("bye");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay friend="two">hi<mukhi friend="two">bye</mukhi></vijay>
The function WriteString can be inserted almost anywhere in the program. The
first WriteString
function writes 'hi' between the tags of 'vijay', while the second WriteString
function writes
'bye' between the tags of 'mukhi'. The WriteString is aware of the active tag.
Therefore, it
inserts the text accordingly. Here also, if we uncomment the line,
a.WriteAttributeString("friend","three"), the following exception will be
generated.
Unhandled Exception: System.InvalidOperationException: Token StartAttribute in
state Content
would result in an invalid XML document.
The XmlTextWriter class is very strict and meticulous in the sense that it expects a
certain order to be maintained, or else, it throws an exception. For instance, an
element or a tag has to be created first. Only then can all its attributes be written;
and finally, the text or content has to be supplied. We are not permitted to write the
text first and enter the attributes later. In the XmlTextWriter class, there is no going
back. It is a one-way path, which only moves in the forward direction.
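A program need not crash when this order is violated, since the exception can be caught.
The following is a minimal sketch; the class OrderDemo and the function
AttributeAfterContentFails are hypothetical names, and the output is sent to a
StringWriter, since the contents of the file are of no interest here.

```csharp
using System;
using System.IO;
using System.Xml;

public static class OrderDemo
{
    // Returns true if writing an attribute after the content is refused.
    public static bool AttributeAfterContentFails()
    {
        XmlTextWriter a = new XmlTextWriter(new StringWriter());
        a.WriteStartElement("vijay");
        a.WriteString("hi");
        try
        {
            // Too late: the content of 'vijay' has already been started.
            a.WriteAttributeString("friend", "three");
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(AttributeAfterContentFails());
    }
}
```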
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteCharEntity ('A');
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay>&#x41;</vijay>
During our exploratory journey of XML, we shall discuss a number of characters that are
'reserved'. They have a special significance and cannot be used literally. Such
characters can instead be written as a character reference, i.e. the Unicode value of
the character in a hex format. The function WriteCharEntity performs this task. It
accepts a char as a parameter and writes its value in hex, prefaced with the &#x symbol
and followed by a semi-colon.
For those who do not understand hexadecimal and consider it Greek and Latin, 41 hex is
equal to decimal 65, which is the ASCII value for the capital letter A. You can pass
different characters to this function and see their equivalent hex values.
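To see the hex values of a few characters without opening b.xml each time, a sketch
along the following lines may be used. The class CharEntityDemo, the helper Entity and
the tag name 'c' are all arbitrary choices of ours.

```csharp
using System;
using System.IO;
using System.Xml;

public static class CharEntityDemo
{
    // Writes the character as a character entity inside a tag named 'c'.
    public static string Entity(char ch)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("c");
        a.WriteCharEntity(ch);
        a.Close();  // Close also writes the ending tag for 'c'
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Entity('A'));  // <c>&#x41;</c>
        Console.WriteLine(Entity('<'));  // <c>&#x3C;</c>
    }
}
```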
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteCData("mukhi & <sonal>");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay><![CDATA[mukhi & <sonal>]]></vijay>
The above program introduces a new function called WriteCData, which creates a node
called CDATA. The parameter passed to this function is placed as it is, but is enclosed
within <![CDATA[ and ]]>.
A CDATA section is used whenever we want to use characters such as <, >, & and the like
in their literal sense, which would otherwise be mistaken for markup characters. Thus,
in the above program, the symbol & within the CDATA section is interpreted as the
literal character &, and not as a special character. Also, <sonal> is not recognized as
a tag in this section. A CDATA section cannot be nested within another CDATA section.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteString("<A>&");
a.WriteCData("<A>&");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay>&lt;A&gt;&amp;<![CDATA[<A>&]]></vijay>
This program illustrates certain characters that are special to XML. These are the
obvious characters, such as <, > and &, since they are used as markup whilst an XML file
is being created. Thus, whenever the WriteString function comes across the following
symbols, it replaces them with the entities depicted against each:
• < is replaced with '&lt;'
• > is replaced with '&gt;'
• & is replaced with '&amp;'.
If the same string that contains the above mentioned special characters is placed within
a CDATA section, it gets written verbatim, without any conversions.
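Although the two forms look different in the file, a program that reads the file gets
back the same text from both. The sketch below writes the string in both ways and then
reads it back using the XmlDocument class, which the text above does not cover; it is
used here only as a convenient reader. The names EscapeDemo, Write and ReadBack are
hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EscapeDemo
{
    // Writes the same string once with WriteString and once with WriteCData.
    public static string Write()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("vijay");
        a.WriteString("<A>&");  // special characters are converted
        a.WriteCData("<A>&");   // special characters are written verbatim
        a.Close();
        return sw.ToString();
    }

    // Reads the XML back and returns the text held by the root tag.
    public static string ReadBack(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        return d.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string xml = Write();
        Console.WriteLine(xml);
        Console.WriteLine(ReadBack(xml));
    }
}
```

The text read back is <A>&<A>&, i.e. the original string twice over, which shows that
the conversions affect only the way the characters are stored in the file.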
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteEntityRef("Hi");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay>&Hi;</vijay>
The entity ref is very straightforward to understand. The string passed to the function
WriteEntityRef is placed in the XML file, preceded by a '&' sign and followed by a
semi-colon. An entity ref in XML is equivalent to a variable. It is included to provide
flexibility to the document.
Thus in the above code, a variable called 'Hi' is referred to. The task of stating what
'Hi' signifies is left to an ENTITY declaration, which can be placed within the DOCTYPE
declaration of the XML file.
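One possible way of giving the entity a meaning is sketched below. It assumes that the
entity is declared in the internal subset of the DOCTYPE, which is passed as the fourth
parameter to WriteDocType. The replacement text "Hello", the class EntityDemo and the
function Build are all inventions of ours.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EntityDemo
{
    // Declares the entity Hi in the DOCTYPE and then refers to it.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartDocument();
        // The fourth parameter is written between [ and ] in the DOCTYPE.
        a.WriteDocType("vijay", null, null, "<!ENTITY Hi \"Hello\">");
        a.WriteStartElement("vijay");
        a.WriteEntityRef("Hi");
        a.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

A program that reads such a file and expands entities would substitute the declared
text wherever &Hi; occurs.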
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteRaw("<A>&");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay><A>&</vijay>
The WriteRaw function writes the characters passed to it, without carrying out
any conversions.
The above XML file is obviously erroneous, as no end tag has been specified for
the tag A. Also,
no name has been specified after the & sign.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
Boolean b = true;
a.WriteElementString("Logical", XmlConvert.ToString(b));
Int32 c = -2147483648;
a.WriteElementString("SmallInt", XmlConvert.ToString(c));
Int64 d = 9223372036854775807;
a.WriteElementString("Largelong", XmlConvert.ToString(d));
Single e = ((Single)22)/((Single)7);
a.WriteElementString("Single", XmlConvert.ToString(e));
Double f = 1.79769313486231570E+308;
a.WriteElementString("Double", XmlConvert.ToString(f));
DateTime h = new DateTime(2001, 07, 08 ,22, 0, 30, 500);
a.WriteElementString("DateTime", XmlConvert.ToString(h));
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay>
   <Logical>true</Logical>
   <SmallInt>-2147483648</SmallInt>
   <Largelong>9223372036854775807</Largelong>
   <Single>3.142857</Single>
   <Double>1.7976931348623157E+308</Double>
   <DateTime>2001-07-08T22:00:30.5000000+05:30</DateTime>
</vijay>
The above example contains a plethora of data types such as Boolean, Int32, Int64,
Single, Double and DateTime. The XmlConvert class has a large number of static functions
that help us convert one data type to another. One such function is the ToString
function. For types such as int or long, the smallest and the largest values are used,
in order to check the veracity of the ToString function. The time zone offset at the end
of the DateTime value, +05:30 in our case, depends on the settings of the machine on
which the program is run.
The ToString function is overloaded to handle many more data types than we have shown.
The point here is that it is possible for us to convert any of these data types into a
string and write it to disk. This factor gains immense importance when data is being
received from a database, and is required to be converted into a string in an XML file.
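The XmlConvert class also works in the reverse direction, which is what a program that
reads the XML file would need. The sketch below is only an illustration; the class
ConvertDemo and the function RoundTrip are hypothetical names.

```csharp
using System;
using System.Xml;

public static class ConvertDemo
{
    // Converts an int to its XML string form and back again.
    public static int RoundTrip(int value)
    {
        string text = XmlConvert.ToString(value);
        return XmlConvert.ToInt32(text);
    }

    public static void Main()
    {
        // XML expects 'true' in lower case, which XmlConvert supplies.
        Console.WriteLine(XmlConvert.ToString(true));
        // The ToString function of the bool type itself returns 'True'.
        Console.WriteLine(true.ToString());
        // The smallest int survives the journey to a string and back.
        Console.WriteLine(RoundTrip(Int32.MinValue) == Int32.MinValue);
    }
}
```

This is why XmlConvert is preferred over the ToString function of the individual types,
whenever the string is destined for an XML file.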
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");
a.WriteString("1-861003-78");
a.WriteEndAttribute();
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy" />
In the above example, we have introduced the WriteStartAttribute function. As is
apparent from its name, it starts an attribute. The first parameter to this function is
'hi', which is the namespace prefix of the attribute. The second parameter 'mukhi' is
the name of the attribute.
The names assigned to attributes and tags may not always be unique. A programmer may
inadvertently create a tag or an attribute with a name that already exists. How then
does XML decide what the tag denotes?
To help resolve such potential conflicts, each tag or attribute may be prefaced with a
name known as the namespace prefix. This is followed by a colon sign. Normally,
meaningful names are assigned, rather than words like 'hi'. Prefixes like xmlns are
reserved by XML. The concept of namespaces in XML is similar in purpose to the concept
of namespaces in C#.
The third parameter is a Uniform Resource Identifier (URI), which is the actual name of
the namespace. The prefix 'hi' is merely a shorthand for it. The URI only has to be
unique; nothing need exist at the location that it appears to point to. The writer adds
the attribute xmlns:hi="xxx:yyy" to the tag on its own, in order to bind the prefix to
the URI. As the WriteStartAttribute function does not specify any value for the
attribute, the WriteString function is employed to assign the value 1-861003-78 to the
attribute 'mukhi' in the namespace 'hi'.
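How a reader splits up such an attribute can be examined with the sketch given below. It
loads the line that was written to b.xml into an XmlDocument, a class used here merely
as a reader, and asks for the attribute by its name and namespace URI. The names
NamespaceDemo and Describe are our own.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Returns the prefix, name, namespace URI and value of the attribute.
    public static string Describe()
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml("<vijay hi:mukhi=\"1-861003-78\" xmlns:hi=\"xxx:yyy\" />");
        // The attribute is looked up by its name and its namespace URI.
        XmlAttribute at = d.DocumentElement.GetAttributeNode("mukhi", "xxx:yyy");
        return at.Prefix + "|" + at.LocalName + "|" + at.NamespaceURI + "|" + at.Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());  // hi|mukhi|xxx:yyy|1-861003-78
    }
}
```

The attribute is located using the URI xxx:yyy and not the prefix, which shows that it
is the URI that really identifies the namespace.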
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");
string p = a.LookupPrefix("sonal:wife");
a.WriteStartAttribute(p, "mukhi", "sonal:wife");
a.WriteString("sonal");
a.WriteEndAttribute();
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay xmlns:bk="sonal:wife" bk:mukhi="sonal" />
Here, the function WriteAttributeString is called with four parameters. The first, as
always, is the namespace prefix, i.e. xmlns. The second is the name of the attribute,
i.e. bk, which is suffixed to the prefix, as xmlns:bk. The third parameter is the
namespace URI. In the earlier program, we had specified the value of xxx:yyy for the
URI. For this program, since xmlns is a reserved prefix, the URI parameter is specified
as null. The last parameter is the value of the attribute.
As a consequence, the above function writes an attribute of the form
xmlns:bk="sonal:wife", which declares bk as the prefix for the namespace URI sonal:wife.
The next function, LookupPrefix, accepts a namespace URI and returns the prefix. As the
parameter supplied to this function is sonal:wife, the prefix returned is bk, which is
stored in p.
The WriteStartAttribute then uses the following:
• 'bk' as the namespace prefix,
• 'mukhi' as the name of the attribute, and
• 'sonal:wife' as the namespace URI.
Thus, the attribute 'mukhi' is prefaced with the prefix 'bk'. Finally, the WriteString
function assigns the value of 'sonal' to the attribute bk:mukhi.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");
a.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay xmlns:bk="sonal:wife" jjj:bk="sonal:wife" xmlns:jjj="kkk" />
In this version of the WriteAttributeString function, the prefix is jjj and the
attribute name is bk, with the value sonal:wife. Thus, the attribute becomes
jjj:bk="sonal:wife". The third parameter to the function is the namespace URI, which is
now assigned a value of kkk, instead of null.
Thus, one more attribute xmlns:jjj gets added, which indicates that the namespace URI
for the prefix jjj is kkk. We notice that no such attribute gets added for the xmlns
prefix. We have chosen the attribute name 'bk' again, just to demonstrate that the two
belong to different namespaces. Therefore, this bk is considered to be a different
attribute from the earlier bk.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteStartAttribute(null,"sonal", null);
a.WriteQualifiedName("mukhi", "http://vijaymukhi.com");
a.WriteEndAttribute();
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay sonal="n1:mukhi" xmlns:n1="http://vijaymukhi.com" />
In the WriteStartAttribute function, only the second of the three parameters has a
value, 'sonal', which is the name of the attribute. The first parameter, which is the
namespace prefix, and the third parameter, which is the URI of the namespace, are both
assigned null values.
The next function, WriteQualifiedName, assigns a value to the attribute 'sonal'. This
function takes two parameters, the value 'mukhi' and the namespace URI for the value.
The method WriteQualifiedName first looks up the prefix that is in scope for the given
namespace URI. Since no prefix has been declared for http://vijaymukhi.com, the value
'mukhi' gets prefaced by a prefix n1, which is created dynamically by the XmlTextWriter
class. The prefix n1 is declared using the reserved xmlns prefix, and its URI is the one
specified in the second parameter, http://vijaymukhi.com.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.WriteStartElement("vijay");
a.WriteAttributeString("xmlns","mukhi",null,"xxx:yyy");
a.WriteString("Hi ");
a.WriteQualifiedName("sonal","xxx:yyy");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<vijay xmlns:mukhi="xxx:yyy">Hi mukhi:sonal</vijay>
In this example, we first declare the prefix 'mukhi' by writing an attribute in the
reserved xmlns namespace. This attribute is given the value xxx:yyy, which is the
namespace URI. The WriteString function writes 'Hi ' as the content and then, the
WriteQualifiedName function writes the string 'sonal'. However, since 'sonal' is a
qualified name, it is prefaced by the prefix 'mukhi' and not by the URI xxx:yyy,
because 'mukhi' is equated to xxx:yyy.
A prefix that is already in scope for the namespace is given precedence over a newly
generated one.
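The lookup that WriteQualifiedName relies on can also be observed directly. The sketch below is only an illustration: the class name PrefixSketch and the method Lookup are our own inventions, and a StringWriter stands in for b.xml. It uses the LookupPrefix function of XmlTextWriter, which should hand back the prefix 'mukhi' for the URI xxx:yyy.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class PrefixSketch
{
    // Declares xmlns:mukhi="xxx:yyy" on <vijay> and asks the writer
    // which prefix is in scope for that URI.
    public static string Lookup()
    {
        XmlTextWriter a = new XmlTextWriter(new StringWriter());
        a.WriteStartElement("vijay");
        a.WriteAttributeString("xmlns", "mukhi", null, "xxx:yyy");
        string prefix = a.LookupPrefix("xxx:yyy");
        a.Close();
        return prefix;
    }

    public static void Main()
    {
        Console.WriteLine(Lookup());
    }
}
```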
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.WriteStartElement("vijay");
a.WriteElementString("vijay","mukhi");
a.WriteElementString("vijay","sonal","mukhi");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<vijay>
<vijay>mukhi</vijay>
<vijay xmlns="sonal">mukhi</vijay>
</vijay>
The WriteElementString function has only two parameters in its first call. However, in
the second call it has three parameters. The first and the last parameters carry the
same meaning as before, i.e. the tag name and the value. The newly inducted second
parameter indicates the namespace URI 'sonal'. The tag in the first parameter, 'vijay',
belongs to the namespace 'sonal'. Thus, the XML file contains the tag with the
attribute xmlns="sonal".
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter (Console.Out);
a.WriteStartDocument();
a.WriteStartElement("vijay");
a.Close();
}
}
Output
<?xml version="1.0" encoding="IBM437"?><vijay />
The XmlTextWriter class can write to different destinations, using the constructor that
accepts a single parameter. The Console class has a static property Out of datatype
TextWriter that represents the console. Thus, the output is now displayed on the
console. The encoding attribute is taken from the encoding of the TextWriter, which for
the console on our machine is the code page IBM437.
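Since the constructor accepts any TextWriter, the XML can also be collected in memory. The following is a minimal sketch under our own assumptions: the class name StringSketch and the method Build are hypothetical, and a StringWriter is used in place of Console.Out. As a StringWriter reports its encoding as UTF-16, the declaration carries encoding="utf-16" instead of IBM437.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class StringSketch
{
    // Writes a one-tag document into a string instead of a file.
    public static string Build()
    {
        StringWriter sw = new StringWriter();   // a TextWriter backed by a string
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartDocument();
        a.WriteStartElement("vijay");
        a.WriteEndElement();
        a.Close();                              // flushes the XML into sw
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```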
One of the primary reasons for designing XML was to allow validation of the tags, so
that an XML file can be checked against a set of rules.
There are a few validations that need to be performed in an XML file, such as:
• It should be ensured that the basic rules of XML as well as our indigenous rules are
followed.
• Certain tags should be placed only within specified tags and cannot be used
independently.
• The number of times a tag is used can be regulated, since it cannot be used
infinite times.
• A check should be placed on the name and the number of times an attribute is used
within a tag.
All such rules that need to be enforced are enunciated in XML parlance and then,
placed in a DTD or a Document Type Definition. The DTD may either be placed in a
separate file or may be made part of the DOCTYPE declaration. In the XML file shown
below, the DTD is internal.
Thus, a DTD stores the grammar that is permissible in an XML file. Entities are also
declared in a DTD. XHTML, which is HTML reformulated as XML, is a case in point: its
rules are available in the form of DTDs.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
String s = "<!ELEMENT vijay (#PCDATA)>";
a.WriteDocType("vijay", null, null, s);
a.WriteStartElement("vijay");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay[<!ELEMENT vijay (#PCDATA)>]>
<vijay />
The WriteDocType function accepts four parameters. The first parameter is the name of
the starting or root tag, 'vijay'. Hence, it must contain a value. The last parameter
is the internal subset (as referred to by the documentation), which is written after
the root name 'vijay'. If you observe the DOCTYPE statement carefully, you will notice
that the writer has enclosed the subset in a pair of square brackets [].
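The rules placed in an internal DTD can be put to use when the file is read back. The sketch below is an assumption on our part and not one of the original programs: it relies on the XmlReaderSettings class with its DtdProcessing and ValidationType properties, which are available in .NET Framework 4 and later, and the class name DtdSketch, the method CountErrors and the tags aa and zz are hypothetical. It merely counts the validation problems reported for a document.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class, not part of the original programs.
public class DtdSketch
{
    // Returns the number of validation problems reported for the XML text.
    public static int CountErrors(string xml)
    {
        int errors = 0;
        XmlReaderSettings s = new XmlReaderSettings();
        s.DtdProcessing = DtdProcessing.Parse;   // allow the DOCTYPE to be read
        s.ValidationType = ValidationType.DTD;   // check the content against it
        s.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e) { errors++; };
        using (XmlReader r = XmlReader.Create(new StringReader(xml), s))
        {
            while (r.Read()) { }                 // validation happens while reading
        }
        return errors;
    }

    public static void Main()
    {
        // The internal subset says: vijay must contain exactly one aa tag.
        string dtd = "<!DOCTYPE vijay [<!ELEMENT vijay (aa)><!ELEMENT aa (#PCDATA)>]>";
        Console.WriteLine(CountErrors(dtd + "<vijay><aa>bb</aa></vijay>"));
        Console.WriteLine(CountErrors(dtd + "<vijay><zz>bb</zz></vijay>"));
    }
}
```

The first document obeys the grammar, so no problems should be reported for it. The second uses the undeclared tag zz, so at least one problem should be reported.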
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.WriteDocType("vijay", null, "a.dtd", null);
a.WriteStartElement("vijay");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay SYSTEM "a.dtd">
<vijay />
The third parameter to the WriteDocType function specifies the name of the DTD file. In
other words, it states the URI of the DTD. The second parameter is assigned the value
of null. Hence, the word SYSTEM is written before the name of the file, in the XML
file.
Whenever a validating parser wishes to ensure the validity of this XML file, it
ascertains the rules from a.dtd. If both internal and external DTDs are present, both
of them are checked. However, the internal DTD is accorded priority.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.WriteDocType("vijay", "mmm", "a.dtd", null);
a.WriteStartElement("vijay");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay PUBLIC "mmm" "a.dtd">
<vijay />
In the earlier program, SYSTEM was added in the XML file, since the second parameter
had been specified as null. However, in this program, the second parameter is not null.
Hence, the word PUBLIC gets added. Thereafter, the string or the id specified in the
second parameter is added. And then, the DTD in the third parameter is specified.
Therefore, either the keyword PUBLIC or the keyword SYSTEM is present, never both. The
processor scanning the XML file may use the PUBLIC identifier to locate the external
DTD. If it fails, it falls back upon the SYSTEM literal, i.e. the URI.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument(false);
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0" standalone="no"?>
The WriteStartDocument function can take a boolean parameter that adds an attribute,
which is either standalone="yes" or standalone="no", depending upon the value
specified. This attribute states whether the document depends on declarations in an
external DTD. If standalone has a value of 'yes', it is suggestive of the fact that no
external DTD is needed, and therefore, all the grammatical rules have to be placed
within the XML file itself.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
System.Console.WriteLine(a.WriteState);
a.WriteStartDocument();
System.Console.WriteLine(a.WriteState);
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
System.Console.WriteLine(a.WriteState);
a.WriteStartElement("vijay");
System.Console.WriteLine(a.WriteState);
a.WriteAttributeString ("wife","sonal");
System.Console.WriteLine(a.WriteState);
a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");
System.Console.WriteLine(a.WriteState);
a.WriteString("1-861003-78");
a.WriteElementString("surname", "mukhi");
a.Flush();
System.Console.WriteLine(a.WriteState);
a.Close();
System.Console.WriteLine(a.WriteState);
}
}
Output
Start
Prolog
Prolog
Element
Element
Attribute
Content
Closed
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay wife="sonal" hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy">
<surname>mukhi</surname>
</vijay>
The XmlTextWriter object can be in any one of six different states. The WriteState
property reveals its current state. When an XmlTextWriter object is created, it is in
the Start state, since no write method has been called so far. After the Close
function, the writer is in the Closed state. When the WriteStartDocument and
WriteDocType functions are called, the writer reaches the Prolog state, because the
prolog is being written.
The WriteStartElement function writes the start tag 'vijay', which begins the content
of the XML file, thereby morphing the writer to the Element state. The next function,
WriteAttributeString, does not change the state, since it writes a complete attribute
and the element in focus still is 'vijay'. The WriteStartAttribute function needs the
WriteString to supply the value of the attribute. Thus, after the WriteStartAttribute
function executes, the writer assumes the Attribute state. The surname element becomes
the content of 'vijay' in the XML file. Hence, the state changes to Content.
This goes on to prove that the XmlTextWriter can be in any one of the above six states,
depending upon what has been written to the file. Note that WriteElementString is
called here while the writer is still in the Attribute state, without a call to
WriteEndAttribute. As the output shows, no exception is thrown; the writer closes the
pending attribute on its own before it writes the element.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.Namespaces = false;
a.WriteStartDocument();
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");
a.Flush();
a.Close();
}
}
Output
Unhandled Exception: System.ArgumentException: Cannot set the namespace if
Namespaces is
'false'.
at System.Xml.XmlTextWriter.WriteStartAttribute(String prefix, String localName,
String ns)
at System.Xml.XmlWriter.WriteAttributeString(String prefix, String localName,
String ns, String
value)
at zzz.Main()
The XmlTextWriter class has a Namespaces property that is read-write, and it has a
default value of true. Namespace support is turned off by setting this property to
false. The above runtime exception is thrown because we have attempted to introduce a
namespace, with the prefix jjj and the URI kkk, in the WriteAttributeString function.
a.cs
using System;
using System.Xml;
public class zzz {
public static void Main() {
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.QuoteChar = '\'';
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteAttributeString("jjj", "bk");
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay jjj='bk' />
Various facets of the output can be modified. By using the property QuoteChar, we can
change the quoting character for attribute values from the default double quotes to
single quotes. Since a single quote cannot be placed as is within the single quotes of
a C# character literal, we use the backslash to escape it. All attributes written from
then on are placed in single quotes instead of double quotes.
a.cs
using System;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextWriter a = new XmlTextWriter ("b.xml", null);
a.WriteStartDocument();
a.Formatting = Formatting.Indented;
a.Indentation = 3;
a.WriteDocType("vijay", null, null ,null);
a.WriteStartElement("vijay");
a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");
a.WriteString("1-861003-78");
a.WriteEndAttribute();
a.WriteEndElement();
a.WriteEndDocument();
a.Flush();
a.Close();
}
}
b.xml
<?xml version="1.0"?>
<!DOCTYPE vijay>
<vijay hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy" />
Good programming style necessitates every 'open' to have a corresponding 'close'. Thus,
the Start functions for an Element, Attribute and Document have corresponding End
functions too. However, if we do not call them, the Close function ends whatever is
still open, and no major calamity befalls us. We are using them in the above program
out of an abundance of caution.
The WriteEndDocument function closes any open elements or attributes and puts the
writer back in the Start state.
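To see the closing behaviour in isolation, consider the sketch below. It is our own illustration: the class name CloseSketch and the method Build are hypothetical, and a StringWriter replaces b.xml. Two tags are opened and neither is ended explicitly; WriteEndDocument is left to end both of them.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class CloseSketch
{
    // Opens two nested tags and lets WriteEndDocument close them.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartDocument();
        a.WriteStartElement("vijay");
        a.WriteStartElement("sonal");
        a.WriteEndDocument();        // ends sonal first, then vijay
        a.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

The text returned should end with the nested pair of tags properly closed, even though WriteEndElement was never called.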
Reading an XML file
b.xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE vijay SYSTEM "a.dtd" [<!ENTITY baby "No">]>
<vijay aa="no">
<!--comment 2--><?sonal mukhi=no?>
Hi&baby;
<![CDATA[,mukhi>]]><aa>bb</aa>
</vijay>
> copy con a.dtd
Enter
^Z
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextReader r;
r = new XmlTextReader("b.xml");
while (r.Read())
{
Console.Write("{0} D={1} L={2} P={3} ", r.NodeType, r.Depth, r.LineNumber,
r.LinePosition );
Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);
Console.WriteLine();
}
}
}
Output
XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2
Whitespace D=0 L=1 P=39 name= value= AC=0
DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1
Whitespace D=0 L=2 P=54 name= value= AC=0
Element D=0 L=3 P=2 name=vijay value= AC=1
Whitespace D=1 L=3 P=16 name= value= AC=0
Comment D=1 L=4 P=5 name= value=comment 2 AC=0
ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0
Text D=1 L=4 P=36 name= value=Hi AC=0
EntityReference D=1 L=5 P=4 name=baby value= AC=0
Whitespace D=1 L=5 P=9 name= value= AC=0
CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0
Element D=1 L=6 P=21 name=aa value= AC=0
Text D=2 L=6 P=24 name= value=bb AC=0
EndElement D=1 L=6 P=28 name=aa value= AC=0
Whitespace D=1 L=6 P=31 name= value= AC=0
EndElement D=0 L=7 P=3 name=vijay value= AC=0
In this program, we read an XML file and display all the nodes contained therein. To
avoid any errors from being displayed, you should create an empty file by the name of
a.dtd.
We have a class called XmlTextReader that accepts a filename as a parameter. We pass
the filename b.xml to it. This file contains most of the entities present in an XML
file. The Read function in this class picks up a single node or XML entity at a time.
It returns true if a node has been read, or else, it returns false. Thus, when there
are no more nodes to be read from the file, the while loop ends. Each call to Read
makes the next node the active node, and the body of the loop displays its properties.
The NodeType property displays the name of the node type. As an XML file normally
starts with a declaration, the NodeType property displays the node type as
XmlDeclaration, using the ToString function.
The Depth property reveals how deeply the current node is nested. At the declaration
statement and at the root tag, the depth is 0. The nodes placed within a tag are one
level deeper than the tag itself, and at the EndElement or at the end of the tag, the
value returns to that of the tag. Thus, the Depth property reveals the number of open
tags that enclose the node and it can be used for indentation.
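The use of Depth for indentation can be sketched as follows. The class name DepthSketch, the method Outline and the tags in the sample string are hypothetical, and the XML is read from a string through a StringReader rather than from b.xml. Each tag name is indented by two spaces for every level of depth.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class DepthSketch
{
    // Lists the tag names, indented by two spaces per level of Depth.
    public static string Outline(string xml)
    {
        StringBuilder sb = new StringBuilder();
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        while (r.Read())
        {
            if (r.NodeType == XmlNodeType.Element)
                sb.Append(new string(' ', 2 * r.Depth)).Append(r.Name).Append('\n');
        }
        r.Close();
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(Outline("<vijay><aa><bb /></aa><cc /></vijay>"));
    }
}
```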
The LineNumber property indicates the line on which the node is positioned, while the
LinePosition property displays the position on the line at which the node begins. The
Name property reveals the name of the node, which is xml for the declaration. The
output displayed by this property depends upon the active node type. On close
observation, you shall notice that the word xml is not preceded by the symbol <? in the
output.
The Value property holds the text of the active node, in this case, of the
XmlDeclaration. It displays the entire gamut of attributes of the node. As there exist
two attributes, version and standalone, the property AttributeCount displays a value
of 2.
If the Enter key is pressed after the declaration, it is interpreted as a Whitespace
node. Whitespace characters are separators, which could consist of an enter, a space
et al. The LinePosition property specifies the character position as 39.
The XmlDeclaration has to be the first node in an XML file, and it cannot have
any children. The
DOCTYPE declaration, which is known as a DocumentType Node, displays the name as
vijay, which is
the root node. The value is displayed as <!ENTITY baby "No">, which includes
everything except
the SYSTEM and a.dtd. Thus, in the case of a DocumentType Node, value is the
internal DTD.
We shall encounter the Whitespace node very frequently. Hence, we shall not discuss it
hereinafter. The DocumentType node shows an attribute count of 1; its attribute will be
displayed in the next program. This node can have the Notation and Entity as child
nodes.
The next node in sequence is our very first element or tag 'vijay', which is the same
value that was displayed earlier by the Name property for the DocumentType node. The
Value property for this element is empty, since tags are devoid of values. Instead,
they have attributes.
The AttributeCount displays a value of one. At the following Whitespace node, the Depth
property gets incremented by one, which tells us that we are now within the root node.
We now stumble upon a comment, which has no name. The value displayed is the text of
the comment. And yet again, the <!-- characters are not displayed along with the value.
Thereafter, a processing instruction (PI) is encountered. No Whitespace node is
displayed between the comment and the PI, since we have not pressed the Enter key.
'sonal', the target of the PI, becomes its name. The rest turns into the Value
property, and the node has no attributes. A Text node is displayed next, because the
text 'Hi' is present in the XML file. This node too is not assigned any name and the
value is depicted as 'Hi'.
What follows the text is an EntityReference. It is assigned the name 'baby', devoid of
the ampersand sign and the semicolon. Its value is empty and it does not have any
attributes. The CDATA section has an empty name. The value is assigned the content of
the CDATA section, after stripping away the <![CDATA[ and ]]> markers.
At the element aa, the Depth is still 1, and for the Text node that follows within it,
the value of the Depth property is incremented to 2. This node does not have any name
and it displays the value as 'bb'. In the following program, we explore the various
attributes.
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextReader r;
r = new XmlTextReader("b.xml");
r.WhitespaceHandling = WhitespaceHandling.None;
while (r.Read())
{
Console.Write("{0} D={1} L={2} P={3}",r.NodeType,r.Depth,r.LineNumber,r.LinePosition);
Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);
Console.WriteLine();
if (r.HasAttributes)
{
for ( int i =0; i < r.AttributeCount; i++)
{
r.MoveToAttribute(i);
System.Console.WriteLine("Att {0}={1}",r.Name,r[i]);
}
}
}
}
}
Output
XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2
Att version=1.0
Att standalone=yes
DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1
Att SYSTEM=a.dtd
Element D=0 L=3 P=2 name=vijay value= AC=1
Att aa=no
Comment D=1 L=4 P=5 name= value=comment 2 AC=0
ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0
Text D=1 L=4 P=36 name= value=
Hi AC=0
EntityReference D=1 L=5 P=4 name=baby value= AC=0
CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0
Element D=1 L=6 P=21 name=aa value= AC=0
Text D=2 L=6 P=24 name= value=bb AC=0
EndElement D=1 L=6 P=28 name=aa value= AC=0
EndElement D=0 L=7 P=3 name=vijay value= AC=0
A property called WhitespaceHandling is set to WhitespaceHandling.None, as a result of
which, the Whitespace nodes are not visible in the output.
The XmlTextReader has a member HasAttributes, which returns a True value if the
node has
attributes and False otherwise. Alternatively, we could also have used the
property
AttributeCount to obtain the number of attributes that the node contains.
If the node has attributes, a 'for statement' is used to display all of them. In
the loop, we
first use the function MoveToAttribute to initially activate the attribute. This
is achieved by
passing the number as a parameter to the function. Bear in mind that the index
starts from Zero
and not One.
Thereafter, the Name property is used to display the name of the attribute. If
the attribute is
not activated, the Name property displays the name of the node. This explains
the significance
of the MoveToAttribute function.
As you would recall, the XmlTextReader class has an indexer for the attributes,
and like all
indexers, it is zero based, i.e. r[0] accesses the value of the first attribute.
This is how we
display the details of all attributes of the node.
For the node DOCTYPE, the SYSTEM becomes the name of the attribute and the value
becomes the
name of the DTD file. For an element, the attributes are specified in name-value
pairs.
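The attributes can also be walked without an index. The sketch below is offered only as an alternative under our own assumptions: the class name AttrSketch and the method Describe are hypothetical, and the XML comes from a string. It uses the MoveToFirstAttribute and MoveToNextAttribute functions of XmlTextReader, and then calls MoveToElement so that the reader returns to the tag that owns the attributes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class AttrSketch
{
    // Lists name=value for each attribute of the first tag, then shows
    // that MoveToElement brings the reader back to the tag itself.
    public static string Describe(string xml)
    {
        StringBuilder sb = new StringBuilder();
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        if (r.MoveToFirstAttribute())
        {
            do
            {
                sb.Append(r.Name).Append('=').Append(r.Value).Append(';');
            } while (r.MoveToNextAttribute());
            r.MoveToElement();       // back from the last attribute to the tag
        }
        sb.Append(r.NodeType).Append(' ').Append(r.Name);
        r.Close();
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Describe("<vijay aa=\"hi\" bb=\"bye\" cc=\"no\" />"));
    }
}
```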
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz
{
public static XmlTextReader r;
public static void Main()
{
r = new XmlTextReader("b.xml");
int declaration=0, pi=0, doc=0, comment=0, element=0, attribute=0, text=0,
whitespace=0,cdata=0,endelement=0,
entityr=0,entitye=0,entity=0,swhitespace=0,notation=0;
while (r.Read())
{
Console.Write("{0} D={1} L={2} P={3}",r.NodeType,r.Depth,r.LineNumber,r.LinePosition);
Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);
Console.WriteLine();
if (r.HasAttributes)
{
for ( int i =0; i < r.AttributeCount; i++)
{
r.MoveToAttribute(i);
System.Console.WriteLine("Att {0}={1}",r.Name,r[i]);
}
}
switch (r.NodeType)
{
case XmlNodeType.XmlDeclaration:
declaration++;
break;
case XmlNodeType.ProcessingInstruction:
pi++;
break;
case XmlNodeType.DocumentType:
doc++;
break;
case XmlNodeType.Comment:
comment++;
break;
case XmlNodeType.Element:
element++;
if (r.HasAttributes)
attribute += r.AttributeCount;
break;
case XmlNodeType.Text:
text++;
break;
case XmlNodeType.CDATA:
cdata++;
break;
case XmlNodeType.EndElement:
endelement++;
break;
case XmlNodeType.EntityReference:
entityr++;
break;
case XmlNodeType.EndEntity:
entitye++;
break;
case XmlNodeType.Notation:
notation++;
break;
case XmlNodeType.Entity:
entity++;
break;
case XmlNodeType.SignificantWhitespace:
swhitespace++;
break;
case XmlNodeType.Whitespace:
whitespace++;
break;
}
}
Console.WriteLine ();
Console.WriteLine("XmlDeclaration: {0}",declaration);
Console.WriteLine("ProcessingInstruction: {0}",pi);
Console.WriteLine("DocumentType: {0}",doc);
Console.WriteLine("Comment: {0}",comment);
Console.WriteLine("Element: {0}",element);
Console.WriteLine("Attribute: {0}",attribute);
Console.WriteLine("Text: {0}",text);
Console.WriteLine("Cdata: {0}",cdata);
Console.WriteLine("EndElement: {0}",endelement);
Console.WriteLine("Entity Reference: {0}",entityr);
Console.WriteLine("End Entity: {0}",entitye);
Console.WriteLine("Entity: {0}",entity);
Console.WriteLine("Whitespace: {0}",whitespace);
Console.WriteLine("Notation: {0}",notation);
Console.WriteLine("Significant Whitespace: {0}",swhitespace);
}
}
Output
XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2
Att version=1.0
Att standalone=yes
Whitespace D=0 L=1 P=39 name= value=
AC=0
DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1
Att SYSTEM=a.dtd
Whitespace D=0 L=2 P=54 name= value=
AC=0
Element D=0 L=3 P=2 name=vijay value= AC=1
Att aa=no
Whitespace D=1 L=3 P=16 name= value=
AC=0
Comment D=1 L=4 P=5 name= value=comment 2 AC=0
ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0
Text D=1 L=4 P=36 name= value=
Hi AC=0
EntityReference D=1 L=5 P=4 name=baby value= AC=0
Whitespace D=1 L=5 P=9 name= value=
AC=0
CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0
Element D=1 L=6 P=21 name=aa value= AC=0
Text D=2 L=6 P=24 name= value=bb AC=0
EndElement D=1 L=6 P=28 name=aa value= AC=0
Whitespace D=1 L=6 P=31 name= value=
AC=0
EndElement D=0 L=7 P=3 name=vijay value= AC=0
Whitespace D=0 L=7 P=9 name= value=
AC=0
XmlDeclaration: 0
ProcessingInstruction: 1
DocumentType: 0
Comment: 1
Element: 1
Attribute: 0
Text: 2
Cdata: 1
EndElement: 2
Entity Reference: 1
End Entity: 0
Entity: 0
Whitespace: 6
Notation: 0
Significant Whitespace: 0
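The above program is a continuation from where we left off in the previous program. The
initial portion of the code is identical. A colossal switch statement is introduced in
the program to check the NodeType.
For each node type, there is a corresponding variable, whose value is incremented by 1
whenever the node type matches. Then, the values contained in these variables are
displayed.
The totals for XmlDeclaration, DocumentType and Attribute are 0, and only one Element
is counted, although the file has two. The reason is that the for loop leaves the
reader positioned on the last attribute of the node, so the switch statement sees the
NodeType Attribute, for which there is no case, instead of the node that owns the
attributes. Calling the MoveToElement function after the for loop would bring the
reader back to the owning node.
The XmlTextReader never returns the following node types from the NodeType property:
Document, DocumentFragment, Entity, EndEntity and Notation.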
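A shorter way of keeping such totals is sketched below. This is our own variation and not the program above: the class name CountSketch and the method Count are hypothetical, a Dictionary replaces the individual variables, and the XML is read from a string. After visiting the attributes, it calls MoveToElement, so that the node type counted is that of the node which owns the attributes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class CountSketch
{
    // Returns how many nodes of each type the reader encounters.
    public static Dictionary<XmlNodeType, int> Count(string xml)
    {
        Dictionary<XmlNodeType, int> counts = new Dictionary<XmlNodeType, int>();
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        while (r.Read())
        {
            for (int i = 0; i < r.AttributeCount; i++)
                r.MoveToAttribute(i);            // visit each attribute in turn
            r.MoveToElement();                   // return to the owning node
            int n;
            counts.TryGetValue(r.NodeType, out n);   // n stays 0 for a new type
            counts[r.NodeType] = n + 1;
        }
        r.Close();
        return counts;
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\"?><vijay aa=\"no\"><aa>bb</aa></vijay>";
        foreach (KeyValuePair<XmlNodeType, int> p in Count(xml))
            Console.WriteLine("{0}: {1}", p.Key, p.Value);
    }
}
```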
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextReader r = new XmlTextReader("b.xml");
r.WhitespaceHandling = WhitespaceHandling.None;
while (r.Read())
{
if (r.HasValue)
Console.WriteLine("{0} {1}={2}", r.NodeType, r.Name, r.Value);
else
Console.WriteLine("{0} {1}", r.NodeType, r.Name);
}
}
}
Output
XmlDeclaration xml=version="1.0" standalone="yes"
DocumentType vijay=<!ENTITY baby "No">
Element vijay
Comment =comment 2
ProcessingInstruction sonal=mukhi=no
Text =
Hi
EntityReference baby
CDATA =,mukhi>
Element aa
Text =bb
EndElement aa
EndElement vijay
The HasValue property simply identifies whether the current node can contain a value or
not. There are nine node types that can possess values. These are Attribute, CDATA,
Comment, DocumentType, ProcessingInstruction, SignificantWhitespace, Whitespace, Text
and XmlDeclaration. All the above nodes can have a value, but they need not necessarily
have a name.
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz
{
public static void Main()
{
XmlTextReader r = new XmlTextReader("b.xml");
r.MoveToContent();
string s = r["mukhi"];
Console.WriteLine(s);
s = r.GetAttribute("sonal");
Console.WriteLine(s);
s = r[2];
Console.WriteLine(s);
}
}
b.xml
<vijay mukhi="no" sonal="yes" aaa="bad" />
Output
no
yes
bad
The MoveToContent function skips over nodes such as whitespace and comments and moves
to the first content node, which here is the element 'vijay'.
In this program, we display the attributes using different methods. In the first
approach, the indexer is passed a string, which is the name of the attribute, 'mukhi'.
It receives 'no' as the return value.
In the second approach, the GetAttribute function is given the string 'sonal' as a
parameter, to return the value of the attribute as 'yes'.
In the third approach, the indexer is passed the integer value 2 as a parameter, to
access the value of the third attribute, which is 'bad'. Thus, there are multiple means
to achieving the same objective.
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz
{
public static void Main() {
XmlTextReader r = new XmlTextReader("b.xml");
r.MoveToContent();
string s ;
s = r.GetAttribute("aa:bb");
Console.WriteLine(s);
s = r.GetAttribute("bb");
Console.WriteLine(s);
s = r.GetAttribute("bb","sonal:mukhi");
Console.WriteLine(s);
s = r.GetAttribute("bb","sonal:mukhi");
Console.WriteLine(s);
s = r.GetAttribute("bb","aa");
Console.WriteLine(s);
s = r.GetAttribute("xmlns:aa");
Console.WriteLine(s);
}
}
b.xml
<vijay xmlns:aa="sonal:mukhi" aa:bb="no" />
Output
no
no
no
sonal:mukhi
The MoveToContent function is used in this program, instead of the Read function. In
the file b.xml, we have an attribute bb with the prefix aa. It is initialized to a
value of 'no'. The prefix aa is bound to the namespace URI sonal:mukhi by the xmlns
declaration. Thus, the full name of the attribute becomes aa:bb, i.e. the prefix,
followed by the colon, followed by the local name. As a result, specifying aa:bb
results in the display of 'no', but specifying only bb as a parameter to GetAttribute
results in a null value, which WriteLine prints as an empty line. These empty lines are
not reproduced in the output above.
The GetAttribute function also has an overload with two parameters, where the first
parameter is the local name and the second parameter is the namespace URI and not the
prefix. Hence, it is acceptable to call the function with the URI sonal:mukhi, but if
we use the prefix aa, null is returned again.
The last GetAttribute utilizes the full name xmlns:aa to retrieve the URI that is bound
to the prefix aa. Thus, an attribute in a namespace can be fetched either by its
prefix:name or by its local name together with the URI.
a.cs
using System;
using System.IO;
using System.Xml;
public class zzz {
public static void Main() {
XmlTextReader r = new XmlTextReader("b.xml");
r.WhitespaceHandling=WhitespaceHandling.None;
r.MoveToContent();
r.MoveToAttribute("cc");
Console.WriteLine(r.Name + " " + r.Value);
Console.WriteLine(r.ReadAttributeValue());
Console.WriteLine(r.Name + " " + r.Value);
}
}
b.xml
<vijay aa="hi" bb="bye" cc="no" />
Output
cc no
True
 no
In this example, we directly focus on the attribute that we are interested in, i.e. cc.
The Name and Value properties in XmlTextReader display 'cc' and 'no' respectively. The
ReadAttributeValue function returns True, since there is a node to be read within the
value of the attribute. This function is normally used to read the text or entity
reference nodes that constitute the value of the attribute.
After the function ReadAttributeValue is called, the reader is positioned on the Text
node within the attribute value. A Text node has no name, so the Name property becomes
an empty string, while the Value property still shows 'no'.
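Since an attribute value may be made up of more than one node, ReadAttributeValue is usually called in a loop. The following sketch shows this under our own assumptions: the class name ValueSketch and the method Parts are hypothetical, and the XML is supplied as a string. For the attribute cc, the loop runs once and finds a single Text node.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class, not part of the original programs.
public class ValueSketch
{
    // Lists the nodes that make up the value of the named attribute.
    public static string Parts(string xml, string attribute)
    {
        StringBuilder sb = new StringBuilder();
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        if (r.MoveToAttribute(attribute))
        {
            while (r.ReadAttributeValue())       // false once the value is used up
                sb.Append(r.NodeType).Append('=').Append(r.Value).Append(';');
        }
        r.Close();
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Parts("<vijay aa=\"hi\" bb=\"bye\" cc=\"no\" />", "cc"));
    }
}
```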