Web solution, Websites help, Java help, C & C# Language help

Wednesday, December 26, 2007

C Language Help, C Language Tutorials, C Language Programming, C Language Tricks { The C# Language }

XML Documentation




XML is used extensively to document program code. XML, the eXtensible Markup
Language, has many far grander uses. However, in this chapter, we shall utilize
it for the mundane task of documenting our code.



a.cs

class zzz

{

public static void Main() {

}

}



Run the compiler as



>csc a.cs /doc:a.xml



a.xml

<?xml version="1.0"?>

<doc>

<assembly>

<name>a</name>

</assembly>

<members>

</members>

</doc>



This is the smallest possible XML file that can be effectively generated by the
compiler. The compiler with the /doc option, followed by a colon and filename,
creates the XML file, which documents the C# program.



Anything placed within the <> brackets is called a tag. The file starts with the
XML declaration, which begins with <?xml and ends with ?>. The declaration
carries an attribute called version, and the compiler always writes the value
1.0. A version 1.1 of XML has since been published, but it is rarely used, and
every example in this chapter uses 1.0.




Subsequent to this, a root tag called 'doc' is specified, which envelops all
other tags within the file. We have a tag called 'assembly' which owns another
tag called 'name' that encloses our assembly name 'a', taken from the name of
our program file. There are no members to be documented; therefore, the members
tag is empty. This is the basic structure of our XML file.




It is the bare minimum, no frills structure, created by the compiler. Now, let
us refurbish and garnish it through our own contribution.



a.cs

class zzz

{

/// <vijay>

public static void Main()

{

}

}



We have added an XML tag in the .cs file. The documentation clearly states that
an XML tag in a .cs file cannot begin with two slashes, since they are reserved
for ordinary comments. It must instead begin with three slashes. Our tag 'vijay'
is opened but never closed, which generates the following warning:





Output

a.cs(4,3): warning CS1570: XML comment on 'zzz.Main()' has badly formed XML --
'End tag 'member' does not match the start tag 'vijay'.'



From now on, we shall only display the pertinent portions of the XML file, while
ignoring the portion that is invariant, such as the doc and the assembly tags.
So in the file created as:



a.xml

<?xml version="1.0"?>

<doc>

<assembly>

<name>a</name>

</assembly>

<members>

<!-- Badly formed XML comment ignored for member "M:zzz.Main" -->

</members>

</doc>



we will display only



xml fragment:

<members>

<!-- Badly formed XML comment ignored for member "M:zzz.Main" -->

</members>



In the next program, a closing tag of 'vijay' is specified.



a.cs

class zzz

{

/// <vijay>

/// Function Main gets called first

/// </vijay>

public static void Main()

{

}

}



xml

<members>

<member name="M:zzz.Main">

<vijay>

Function Main gets called first

</vijay>

</member>

</members>



The earlier program generated a warning, since there was no corresponding end
tag to the tag 'vijay'.



Every start tag in XML must necessarily have a matching end tag. We are at
liberty to choose any name for the tag. Therefore, we have chosen the name
'vijay'.



The compiler creates a tag called 'member' under the tag members, and adds an
attribute called 'name'. This attribute is initialized to an ID string that
identifies the C# entity on which the comment is placed. The value assigned to
name is as follows:

• The letter M,

• Followed by a colon separator,

• Followed by the name of the class zzz,

• Followed by a dot separator

• Finally, followed by the name of the function i.e. Main.



Thus, it displays M:zzz.Main.



The tag 'vijay' and the text enclosed between the start and end tags of 'vijay'
subsequently get enclosed within this member tag.



a.cs

/// <mukhi>

/// Our first class

/// </mukhi>

class zzz

{

/// <vijay>

/// Function Main gets called first

/// </vijay>

public static void Main()

{

}

}



xml

<members>

<member name="T:zzz">

<mukhi>

Our first class

</mukhi>

</member>

<member name="M:zzz.Main">

<vijay>

Function Main gets called first

</vijay>

</member>

</members>



Now, we have added a tag called 'mukhi' to the class, while the function keeps
its tag 'vijay'. So presently, there are two member tags within the members tag,
and each has been assigned a name. The value assigned to the name of the member
that encloses 'mukhi' is T:zzz. This is created in a manner akin to that of a
function, with the exception of the prefix M, which has now been replaced by T.




The rationale behind introducing XML tags is that, a program known as an XML
Parser can interpret the above XML file, and subsequently, create legible
output. But, the prime impediment that we have to confront and surmount is, the
reluctance of programmers to document code in the first place.
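
As a minimal sketch of what such a program might do, the code below loads
documentation XML of the kind shown above and lists each member ID together with
its text. The names DocLister and ListMembers are our own inventions, and the
XML is embedded as a string instead of being read from a.xml, merely to keep the
sample self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class DocLister
{
    // Returns one "ID => text" line for every member tag in the XML.
    public static List<string> ListMembers(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        List<string> lines = new List<string>();
        foreach (XmlNode m in d.SelectNodes("/doc/members/member"))
        {
            string id = m.Attributes["name"].Value;
            lines.Add(id + " => " + m.InnerText.Trim());
        }
        return lines;
    }

    public static void Main()
    {
        // Assumed content, shaped like the a.xml created by csc /doc.
        string xml =
            "<doc><assembly><name>a</name></assembly><members>" +
            "<member name='T:zzz'><mukhi> Our first class </mukhi></member>" +
            "<member name='M:zzz.Main'><vijay> Function Main gets called first </vijay></member>" +
            "</members></doc>";
        foreach (string line in ListMembers(xml))
            Console.WriteLine(line);
    }
}
```

A real documentation tool would go on to format these entries, but the principle
of reading the name attribute of each member tag remains the same.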



a.cs

class zzz

{

/// <mukhi>

/// Our Instance Variable

/// </mukhi>

public int i;

public static void Main()

{

/// <vijay>

/// Our first variable ii

/// </vijay>

int ii;

}

}



Output

a.cs(12,5): warning CS0168: The variable 'ii' is declared but never used

a.cs(9,1): warning CS1587: XML comment is not placed on a valid language element

a.cs(6,12): warning CS0649: Field 'zzz.i' is never assigned to, and will always
have its default value 0



xml

<members>

<member name="F:zzz.i">

<mukhi>

Our Instance Variable

</mukhi>

</member>

</members>



Oops! The tags that we give within a function have not been reflected in the XML
file. The instance variable i is flagged with an F, but the local variable ii is
not reflected in the XML file at all.



We have carried this out on purpose, with the intention of demonstrating that it
is not permissible to write XML tags within functions. Our preceding action does
not produce an error, only the warning CS1587, and the compiler, in any case,
ignores our work of art.



We will now dwell upon the names or string ids, which the compiler generates for
names of entities. They are as follows:

• N stands for a Namespace.

• T represents a type: class, interface, struct, enum or delegate.

• F denotes a field.

• P is indicative of either properties or indexers.

• M indicates all methods, including constructors and operators.

• E represents an event.

• ! is indicative of an error.
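
The prefixes above can be seen together in one small sketch. The class below is
purely illustrative; its members Twice, Changed and Add are hypothetical, and
each comment states the ID string that we would expect the compiler to generate
for that member when the file is compiled with the /doc option. We use the
summary tag here, although any well-formed tag would do.

```csharp
using System;

namespace aaa
{
    /// <summary>Expected ID string: T:aaa.zzz</summary>
    public class zzz
    {
        /// <summary>Expected ID string: F:aaa.zzz.i</summary>
        public int i;

        /// <summary>Expected ID string: P:aaa.zzz.Twice</summary>
        public int Twice { get { return i * 2; } }

        /// <summary>Expected ID string: E:aaa.zzz.Changed</summary>
        public event EventHandler Changed;

        /// <summary>Expected ID string: M:aaa.zzz.#ctor(System.Int32)</summary>
        public zzz(int start) { i = start; }

        /// <summary>Expected ID string: M:aaa.zzz.Add(System.Int32)</summary>
        public void Add(int n)
        {
            i += n;
            // Raise the event only if somebody has subscribed to it.
            if (Changed != null) Changed(this, EventArgs.Empty);
        }

        /// <summary>Expected ID string: M:aaa.zzz.Main</summary>
        public static void Main()
        {
            zzz z = new zzz(1);
            z.Add(2);
            Console.WriteLine(z.Twice);
        }
    }
}
```

Note that the namespace becomes part of every ID, and that the types of the
parameters are appended to the name of a method that accepts parameters.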



a.cs

/// <vijay>

/// Our first variable ii

/// </vijay>

namespace aaa

{

class zzz

{

public static void Main()

{

}

}

}



Output

a.cs(1,1): warning CS1587: XML comment is not placed on a valid language element



We do not enjoy the license to place tags wherever we desire. Thus, a warning is
issued when a tag is placed on a namespace, although no error is generated. In
the earlier case too, a similar warning was issued, when we had placed the tags
in a function.



a.cs

class zzz

{

/// <vijay>

/// Our function <c>Main</c> is the starting point

/// </vijay>

public static void Main()

{

}

}



xml

<member name="M:zzz.Main">

<vijay>

Our function <c>Main</c> is the starting point

</vijay>

</member>



If we place the tag <c> and </c> around some text in a tag, the tag gets copied
verbatim, into the XML file. It is indicative of the fact that the text
represents some code. To mark multiple lines of code, the code tag may be used
instead. The example tag also works in a similar manner.



a.cs

/// <exception cref="System.Exception"> Our exception class </exception>

class zzz

{

public static void Main()

{

}

}



xml

<member name="T:zzz">

<exception cref="T:System.Exception"> Our exception class </exception>

</member>



The exception tag works as demonstrated above, with a single difference in that,
the attribute cref is inspected for accuracy. This attribute should represent a
valid C# entity, otherwise, the following warning is displayed:



/// <exception cref="System.Exceptions"> Our exception class </exception>



Output

a.cs(2,22): warning CS1574: XML comment on 'zzz' has cref attribute
'System.Exceptions' that could not be found.



xml

<member name="T:zzz">

<exception cref="!:System.Exceptions"> Our exception class </exception>

</member>



All the tags expounded by us so far, have a similar functionality. This surely
must have kindled your curiosity to discover the rationale behind employing so
many tags.



This is so, because each of these tags, documents different features of the
language. Let us assume that we yearn for a bulleted list in our final
documentation. We would have to use the following tags to accomplish it:



a.cs

/// <list type="bullet">

/// <item>

/// <description>One</description>

/// </item>

/// <item>

/// <description>Two</description>

/// </item>

/// </list>

class zzz

{

public static void Main() {

}

}



xml fragment

<member name="T:zzz">

<list type="bullet">

<item>

<description>One</description>

</item>

<item>

<description>Two</description>

</item>

</list>

</member>



The attribute type can be assigned one of the values bullet, number or table.

Bear in mind that an external program will interpret the XML tags. Thus, the
designers of the language standardized tags for describing different things.
Because everyone uses the same tags, a single program suffices to produce
documentation for any C# code. We have employed tags such as 'vijay' and 'mukhi'
to document our classes, but chaos would reign supreme if everyone decided to
use his or her own names. Appended below is a brief summary of the various
standard tags that can be employed in the XML file:

• The param tag is used to document a parameter of a function.

• The paramref tag marks a word in a comment as the name of a parameter.

• The permission tag is used to document the access allowed to a member.

• The remarks tag is used to add supplementary information about a type or a
member.

• The returns tag documents the return value of a function.

• The seealso tag helps in implementing the 'see also' section at the bottom of
help sections.

• The summary tag gives a short description of a type or of a member of a type.

• The value tag describes the value that a property represents.
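
To see how several of these tags fit together, consider the following sketch.
The class Calc and its members Last and Divide are hypothetical and exist only
to carry the comments; the tags themselves are the standard ones summarized
above.

```csharp
using System;

/// <summary>Simple arithmetic helpers.</summary>
/// <remarks>This class exists only to illustrate the standard tags.</remarks>
public class Calc
{
    private int last;

    /// <summary>The result of the most recent division.</summary>
    /// <value>Zero until Divide has been called.</value>
    public int Last { get { return last; } }

    /// <summary>Divides <paramref name="a"/> by <paramref name="b"/>.</summary>
    /// <param name="a">The dividend.</param>
    /// <param name="b">The divisor.</param>
    /// <returns>The integer quotient.</returns>
    /// <exception cref="System.DivideByZeroException">
    /// Thrown when <paramref name="b"/> is zero.
    /// </exception>
    public int Divide(int a, int b)
    {
        last = a / b;   // integer division; throws if b is zero
        return last;
    }

    public static void Main()
    {
        Calc c = new Calc();
        Console.WriteLine(c.Divide(7, 2));
    }
}
```

Compiling this file with the /doc option should produce one member tag for the
class, one for the property and one for each method, each holding the tags
exactly as we have written them.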

XML Data Document



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

XmlTextReader r = new XmlTextReader("books.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

//r.MoveToContent();

d.Load(r);

d.Save(Console.Out);

}

}



books.xml

<?xml version='1.0'?>

<!-- This file represents a fragment of a book store inventory database -->

<bookstore>

<book genre="autobiography" publicationdate="1981" ISBN="1-861003-11-0">

<title>The Autobiography of Benjamin Franklin</title>

<author>

<first-name>Benjamin</first-name>

<last-name>Franklin</last-name>

</author>

<price>8.99</price>

</book>

<book genre="novel" publicationdate="1967" ISBN="0-201-63361-2">

<title>The Confidence Man</title>

<author>

<first-name>Herman</first-name>

<last-name>Melville</last-name>

</author>

<price>11.99</price>

</book>

<book genre="philosophy" publicationdate="1991" ISBN="1-861001-57-6">

<title>The Gorgias</title>

<author>

<name>Plato</name>

</author>

<price>9.99</price>

</book>

</bookstore>



Output

<?xml version="1.0"?>

<!-- This file represents a fragment of a book store inventory database -->

<bookstore>

..



The XmlDocument class represents an XML document. The W3C has two relevant
standards, the Document Object Model (DOM) Level 1 Core and the DOM Level 2
Core. The XmlDocument class implements these standards. This class is derived
from an abstract class called XmlNode.



The XML Document is loaded into memory as a tree structure, thereby facilitating
the navigation within and the editing of the document. The representation of the
document in memory is known as a DOM.

The XML file books.xml represents a bookstore, which stores a number of books.
This file shall not be displayed in future since it is pretty lengthy. This file
is utilised by most of the XML samples supplied by Microsoft.



The tag 'book' represents a book, and has attributes such as the genre, the
publication date and, finally, a unique number, ISBN, assigned to the book. Each
book has a title as well as an author. The element author contains a few tags
that disclose details such as the first name and the last name of the author.
The price of the book is the last element. As no schema or DTD constrains this
file, it is not imperative for these elements of a book to be present in a
specific order, i.e. price can come before the title, and so on.



The object r of type XmlTextReader represents the XML file books.xml. As always,
the WhitespaceHandling property is set to None. The object d is of type
XmlDocument. The Load function of this class has four overloads, one of which
accepts an XmlReader. As XmlTextReader is derived from XmlReader, we can supply
our XmlTextReader to this function.



The Load function is responsible for loading the XML file into the XmlDocument
object. If the property PreserveWhitespace is set to True, then and only then,
are the Whitespace nodes created. In this case, since we have set
WhitespaceHandling to None on the reader, no white space nodes can ever be
created, regardless of the property PreserveWhitespace being set to True. The
Load function cannot change the WhitespaceHandling present in the reader. The
Load method does not perform any validations. In case validations are
imperative, the XmlValidatingReader class must be utilised in place of the plain
Jane XmlTextReader class. The same argument holds good when entities have to be
resolved.
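
The effect of the PreserveWhitespace property can be observed with a small
sketch. It uses the LoadXml function on a string, so that no reader of ours
interferes with the white space; the names WhitespaceDemo and CountRootChildren
are hypothetical.

```csharp
using System;
using System.Xml;

public class WhitespaceDemo
{
    // Counts the child nodes of the root element after loading the XML.
    public static int CountRootChildren(string xml, bool preserve)
    {
        XmlDocument d = new XmlDocument();
        d.PreserveWhitespace = preserve;   // must be set before loading
        d.LoadXml(xml);
        return d.DocumentElement.ChildNodes.Count;
    }

    public static void Main()
    {
        string xml = "<book> <title>None</title> <author>Me</author> </book>";
        Console.WriteLine(CountRootChildren(xml, false));
        Console.WriteLine(CountRootChildren(xml, true));
    }
}
```

With the property set to False, only the two elements title and author are
counted. With the property set to True, the three runs of spaces around them
become nodes as well, and the count rises to five.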



The Save function writes the XML document associated with the XmlDocument,
either to disk or to the console. We have displayed only a section of the
document, since the entire books.xml shall occupy space needlessly. We acquire
the entire file because the ReadState is set to Initial. This is so because the
state has not been changed yet. Let us now position the reader and see what
happens.



We remove the comment marks associated with MoveToContent, thereby making it
available. As a consequence, both the XmlDeclaration and the comment are
skipped, and the reader is positioned at the first element 'bookstore'. The
output is as displayed below:



Output

<bookstore>

<book genre="autobiography" publicationdate="1981" ISBN="1-861003-11-0">



</bookstore>



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

XmlTextReader r = new XmlTextReader("books.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

r.MoveToContent();

r.Read();

r.Skip();

r.Skip();

d.Load(r);

d.Save(Console.Out);

}

}



Output

<book genre="philosophy" publicationdate="1991" ISBN="1-861001-57-6">

<title>The Gorgias</title>

<author>

<name>Plato</name>

</author>

<price>9.99</price>

</book>



The Read function moves the reader from the bookstore tag to the first book tag,
the first Skip function skips over the first book, and the second Skip function
skips over the second book. The Load function of the XmlDocument now loads the
node from the reader's current position, i.e. from the third book onwards. Thus,
the XmlDocument contains the third book, which is about Philosophy, and not the
other two books. The Skip function moves past the current element together with
all its children, and lands on the next node at the same level.
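
The same technique can be tried without the file books.xml. The sketch below
feeds a small string of our own to the reader through a StringReader; the names
SkipDemo and LoadThird, as well as the XML itself, are hypothetical. Note that
the third child is also the last one, so that the document ends up with a single
root element.

```csharp
using System;
using System.IO;
using System.Xml;

public class SkipDemo
{
    // Loads only the third child of the root element into an XmlDocument.
    public static XmlDocument LoadThird(string xml)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.WhitespaceHandling = WhitespaceHandling.None;
        r.MoveToContent();   // positioned on the root element
        r.Read();            // moves to the first child
        r.Skip();            // moves past the first child, to the second
        r.Skip();            // moves past the second child, to the third
        XmlDocument d = new XmlDocument();
        d.Load(r);           // loads from the reader's current position
        return d;
    }

    public static void Main()
    {
        string xml = "<shop><book id='1'>A</book><book id='2'>B</book>" +
                     "<book id='3'>C</book></shop>";
        XmlDocument d = LoadThird(xml);
        Console.WriteLine(d.DocumentElement.GetAttribute("id"));
    }
}
```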



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<aa bb='hi'> bad </aa>");

d.Save("d.xml");

}

}



d.xml

<aa bb="hi"> bad </aa>



There are different ways of associating XML with an XmlDocument object. The
XmlDocument class has a function called LoadXml that converts a string
representing XML into an XmlDocument object. The Save function writes the
contents to a file on disk named d.xml.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.CreateElement("vijay");

d.Save("d.xml");

}

}



Output

Unhandled Exception: System.Xml.XmlException: This an invalid XML document. The
document does not have a root element.



The CreateElement function, as the name suggests, creates an element with the
name passed as a parameter. We encounter an impediment when we attempt to write
it to disk. An exception is thrown, since we have not added the element to the
XmlDocument. Had we used the Save overload that writes to Console.Out, no
exceptions would have been thrown, but neither would any output have been
displayed on the screen.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlDocument d = new XmlDocument();

XmlElement e;

e = d.CreateElement("vijay");

d.AppendChild(e);

d.Save("d.xml");

}

}



d.xml

<vijay />



The CreateElement function returns an XmlElement object, which is stored in e.
This object is then passed on to the AppendChild function, which adds it to the
document as a child node. Thus, when we write the XmlDocument to disk, d.xml
will display a single tag of 'vijay'.

We shall now examine a series of programs, which shall create each and every
type of node and shall write them all to disk.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

XmlElement e;

e = d.CreateElement("vijay");

XmlAttribute a = d.CreateAttribute("a1");

e.SetAttributeNode(a);

e.SetAttribute("a1", "howdy");

d.AppendChild(e);

d.Save("d.xml");

}

}



d.xml

<vijay a1="howdy" />



CreateElement creates an element called 'vijay'. Thereafter, a function called
CreateAttribute creates an attribute called a1. This function returns an
XmlAttribute object that is associated with the element e, using the function
SetAttributeNode of the XmlElement class. If we comment out the line
e.SetAttribute("a1", "howdy"); we shall see the file d.xml containing the
following:



d.xml

<vijay a1="" />



The attribute gets created, but it is devoid of any value at this stage. It is
the SetAttribute function from the XmlElement class that will assign the value
to the attribute. Hence, it accepts two parameters.



The first parameter is a1, which is the name of the attribute, and the second
parameter 'howdy' is the value of the attribute.
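
As a brief sketch, the SetAttribute function may also be used on its own, since
it creates the attribute when the element does not yet possess one by that name.
The names AttrDemo and Build are hypothetical.

```csharp
using System;
using System.Xml;

public class AttrDemo
{
    // Builds the element 'vijay' with a single attribute a1.
    public static XmlElement Build()
    {
        XmlDocument d = new XmlDocument();
        XmlElement e = d.CreateElement("vijay");
        e.SetAttribute("a1", "howdy");   // a1 does not exist yet, so it is created
        e.SetAttribute("a1", "hello");   // a1 exists now, so only its value changes
        d.AppendChild(e);
        return e;
    }

    public static void Main()
    {
        XmlElement e = Build();
        Console.WriteLine(e.Attributes.Count + " " + e.GetAttribute("a1"));
    }
}
```

The element ends up with one attribute and not two, because the second call
merely overwrites the value assigned by the first.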

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

XmlElement e;

e = d.CreateElement("vijay");

d.AppendChild(e);

XmlCDataSection c;

c = d.CreateCDataSection("sonal mukhi");

d.AppendChild(c);

d.Save("d.xml");

}

}



Output

Unhandled Exception: System.InvalidOperationException: The specified node cannot
be inserted as the valid child of this node, because the specified node is the
wrong type.



We first create an element 'vijay' by using the CreateElement function.
Thereafter, using the AppendChild function, we add the element to the
XmlDocument. No errors are generated up to this point.



Now, we wish to add a CDATA section to the file. To achieve this, the
CreateCDataSection is used with a string parameter 'sonal mukhi', which returns
an XmlCDataSection object. Employing the familiar AppendChild function, the
section is added to the file.



Since a CDATA section cannot be a direct child of the XML document, but only of
an element, the above exception is generated.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

XmlElement e;

e = d.CreateElement("vijay");

d.AppendChild(e);

XmlCDataSection c;

c = d.CreateCDataSection("sonal mukhi");

XmlElement r = d.DocumentElement;

r.AppendChild(c);

d.Save("d.xml");

}

}



d.xml

<vijay><![CDATA[sonal mukhi]]></vijay>



The error now disappears, as the AppendChild function is called from the
XmlElement class, and not from the XmlDocument class.



The DocumentElement property represents the root or the first element of the XML
document. The CDATA section now gets added to the element 'vijay', since the
root element is 'vijay'.
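
Both attempts can be placed side by side in one sketch, which catches the
exception thrown by the document and then falls back on the root element. The
names CDataDemo and Build are hypothetical.

```csharp
using System;
using System.Xml;

public class CDataDemo
{
    // Returns the markup of a document whose root holds a CDATA section.
    public static string Build()
    {
        XmlDocument d = new XmlDocument();
        d.AppendChild(d.CreateElement("vijay"));
        XmlCDataSection c = d.CreateCDataSection("sonal mukhi");
        try
        {
            d.AppendChild(c);   // the document itself rejects a CDATA section
        }
        catch (InvalidOperationException)
        {
            d.DocumentElement.AppendChild(c);   // an element accepts it
        }
        return d.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```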



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlDocument d = new XmlDocument();

d.LoadXml("<vijay> hi </vijay>");

XmlCDataSection c;

c = d.CreateCDataSection("sonal mukhi");

XmlElement r = d.DocumentElement;

r.AppendChild(c);

d.Save("d.xml");

}

}

d.xml

<vijay> hi <![CDATA[sonal mukhi]]></vijay>



An alternative approach would be to use the trustworthy LoadXml function, in
lieu of CreateElement. Any of the two approaches may be employed to add a root
node, thereby, displaying the d.xml file. As we have been harping repeatedly,
there are many ways to skin a cat.



a.cs

using System.Xml;

public class Sample

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<!DOCTYPE book > <book /> ");

XmlDocumentType t;

t = d.DocumentType;

System.Console.WriteLine(t.OuterXml);

System.Console.WriteLine(t.Name);

}

}



Output

<!DOCTYPE book[]>

book



This program demonstrates our ability to access a specific node while processing
an XML file.



The DocumentType property returns an XmlDocumentType t, which represents the
singular DOCTYPE node present in an XML file. This object t has a large number
of properties that facilitate access to all the details regarding the node. In
the program, only two properties are displayed:

• The first property OuterXml represents the entire node as a string. Since no
external DTD file is present, by default, the internal DTD, which does not have
any contents, is displayed.

• The second property Name presents the name of the document type, which is also
the name of the root node.

a.cs

using System;

using System.Xml;

public class Sample

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<book> <title>None</title> <author>Me</author> </book>");

XmlNode r = d.FirstChild;

XmlNode n = r.FirstChild;

Console.WriteLine(r);

Console.WriteLine(n.OuterXml);

Console.WriteLine(r.OuterXml);

}

}



Output

System.Xml.XmlElement

<title>None</title>

<book><title>None</title><author>Me</author></book>



The FirstChild property of the XmlDocument class or XmlNode class retrieves the
first child of the current node. For the document d, this is the XmlElement
'book', which is stored in r. Passing r to the WriteLine function displays only
the name of its class. The OuterXml property contains the tags of a node,
including its child nodes.



The FirstChild property of r is an XmlNode, stored in n, whose OuterXml is the
first child node or tag title within the tag 'book'. If no child node exists,
the value is null.
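
The null value makes a convenient loop terminator. The following sketch starts
at FirstChild and follows the NextSibling property until it runs out of nodes;
the names WalkDemo and ChildNames are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class WalkDemo
{
    // Collects the names of all children of the root element, in order.
    public static List<string> ChildNames(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        List<string> names = new List<string>();
        // FirstChild is null for an empty element, so the loop is then skipped.
        for (XmlNode n = d.DocumentElement.FirstChild; n != null; n = n.NextSibling)
            names.Add(n.Name);
        return names;
    }

    public static void Main()
    {
        string xml = "<book> <title>None</title> <author>Me</author> </book>";
        foreach (string name in ChildNames(xml))
            Console.WriteLine(name);
    }
}
```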



a.cs

using System;

using System.Xml;

public class Sample

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<book> <title>None</title> <author>Me</author> </book>");

XmlNode r = d.LastChild;

XmlNode n = r.LastChild;

Console.WriteLine(r);

Console.WriteLine(n.OuterXml);

Console.WriteLine(r.OuterXml);

}

}



Output

System.Xml.XmlElement

<author>Me</author>

<book><title>None</title><author>Me</author></book>



Comparable to the FirstChild property is the LastChild property, which merely
returns the last child in the node. Therefore, as the 'author' tag is the last
tag within 'book', the LastChild property of r returns the author element
containing 'Me'. Since book is the only top-level node in the document, the use
of FirstChild or LastChild on d achieves the same outcome, in this case.



a.cs

using System;

using System.Xml;

public class Sample

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<aa><book> hi<title>None</title> </book> </aa>");

XmlDocument de = (XmlDocument) d.CloneNode(true);

Console.WriteLine(de.ChildNodes.Count);

for (int i=0; i<de.ChildNodes.Count; i++)

Console.WriteLine(de.ChildNodes[i].OuterXml);

Console.WriteLine(de.Name + de.OuterXml);

Console.WriteLine("Shallow");

XmlDocument sh = (XmlDocument) d.CloneNode(false);

Console.WriteLine(sh.ChildNodes.Count);

for (int i=0; i<sh.ChildNodes.Count; i++)

Console.WriteLine(sh.ChildNodes[i].InnerXml);

Console.WriteLine(sh.Name + sh.OuterXml);

}

}

Output

1

<aa><book> hi<title>None</title></book></aa>

#document<aa><book> hi<title>None</title></book></aa>

Shallow

0

#document



We start with an XML fragment that has a root node aa, which has a child node of
book, containing another child node called title. The CloneNode function in
XmlDocument takes a boolean parameter, where True refers to cloning the nodes
within the node aa, which includes the nodes 'book' and 'title'. It behaves akin
to a copy constructor for nodes.



The cloned node does not have a parent. Therefore, when we exploit the property
ParentNode to ascertain the value of the parent node, it returns a value of
null. The ParentNode property returns a handle to the Parent node of the node.



The return value of the CloneNode function is an XmlNode class. As the
XmlDocument derives from XmlNode, the 'cast' operator is used. The XmlNode class
has a property called ChildNodes, which returns an XmlNodeList object that
refers to all the child nodes. This object is of a Collection data type. The
Count property in the collection can be used to render a count of the number of
entities present in the clone.



The Name property returns the qualified name of the node, which is #document for
a document, and the OuterXml property provides the entire markup of the node.
For other node types, Name returns, for instance, the name of an attribute,
#cdata-section for the CDATA section, and so on. Thus, with the help of this
property, we can identify the different names for different node types.



The CloneNode function is called again, but with a value of false. Thus, a
shallow node with no ChildNodes is returned. This is confirmed using the count
property, which shows a count of zero. This proves that only the node has been
cloned and not the content.



Using a 'for' statement, we iterate through all the child nodes, and display the
OuterXml property using the indexer. The value returned is the content of the
node, which includes text such as 'hi'. The documentation of the CloneNode
function describes what happens when nodes of different types are cloned.
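
As one illustration of cloning a different node type, the sketch below clones an
element instead of the whole document. A shallow clone of an element retains the
attributes but drops the children. The names CloneDemo and CloneBook, and the
attribute id, are hypothetical.

```csharp
using System;
using System.Xml;

public class CloneDemo
{
    // Clones the 'book' element, either with or without its children.
    public static XmlNode CloneBook(bool deep)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml("<aa><book id='1'> hi<title>None</title></book></aa>");
        XmlNode book = d.DocumentElement.FirstChild;
        return book.CloneNode(deep);
    }

    public static void Main()
    {
        XmlNode deep = CloneBook(true);
        XmlNode shallow = CloneBook(false);
        Console.WriteLine(deep.ChildNodes.Count);      // the text ' hi' and the title
        Console.WriteLine(shallow.ChildNodes.Count);   // no children are copied
        Console.WriteLine(shallow.Attributes["id"].Value);
        Console.WriteLine(deep.ParentNode == null);    // a clone has no parent
    }
}
```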



a.cs

using System;

using System.IO;

using System.Xml;

public class Sample

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<!DOCTYPE book [<!ENTITY bb 'vijay'>]> <book> <misc /> </book>");

XmlEntityReference e = d.CreateEntityReference("bb");

Console.WriteLine(e.ChildNodes.Count);

d.DocumentElement.LastChild.AppendChild(e);

Console.WriteLine(e.FirstChild.InnerText + ". " + e.ChildNodes.Count + " " +
e.Name);

d.Save(Console.Out);

XmlEntityReference e1 = d.CreateEntityReference("bbb");

Console.WriteLine(e1.ChildNodes.Count);

d.DocumentElement.LastChild.AppendChild(e1);

Console.WriteLine(e1.FirstChild.InnerText + ". " + e1.ChildNodes.Count + " " +
e1.Name);

d.Save(Console.Out);

}

}



Output

0

vijay. 1 bb

<!DOCTYPE book[<!ENTITY bb 'vijay'>]>

<book>

<misc>&bb;</misc>

</book>0

. 1 bbb

<!DOCTYPE book[<!ENTITY bb 'vijay'>]>

<book>

<misc>&bb;&bbb;</misc>

</book>



The string passed to the LoadXml function has a DOCTYPE declaration, which
declares an entity called bb having a value of 'vijay'. Now, using the function
CreateEntityReference, we create an entity reference called bb. This function
returns an XmlEntityReference object, which is a class derived from
XmlLinkedNode, which, in turn, is derived from XmlNode. The number of child
nodes is zero at this stage, because the node has not yet been expanded.



The function AppendChild is then used to append the child to the node, returned
by the LastChild property.



Since the node has been appended to the document, its parent node is set and the
entity reference bb is expanded to 'vijay'. Thus, the file has a child node that
contains 'vijay', the entity reference text. The InnerText property of the
FirstChild gives the replacement text. The Count of the child nodes is one,
since we have added one entity, and the Name property of the Entity Reference is
the name bb, which we have created.



Finally, the Save function, which prints out the XML fragment, displays the
entity reference bb, starting with an ampersand symbol & and ending with a
semi-colon, within the LastChild node named misc.



We now add another entity reference bbb, but lay it aside, undefined. The rules,
which applied earlier to the entity reference, are also relevant now. The only
significant difference is that the InnerText property does not have any value
when the reference node is expanded. Thus, the child is an empty text node. The
entity refs of bb and bbb are placed one after the other.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlDocument d = new XmlDocument();

d.LoadXml("<book></book>");

XmlNode n=d.CreateNode("element", "vijay", "");

n.InnerText = "hi";

XmlElement r = d.DocumentElement;

r.AppendChild(n);

//d.LastChild.AppendChild(n);

//d.AppendChild(n);

Console.WriteLine(d.OuterXml);

d.Save(Console.Out);

}

}



Output

<book><vijay>hi</vijay></book>

<book>

<vijay>hi</vijay>

</book>



The XML fragment shows the root tag as 'book' because of the LoadXml function.
The CreateNode function creates a new node or element called 'vijay'. This
function returns an object of type XmlNode and accepts the following parameters:




The first parameter decides on the type of node. The types of nodes that can be
created with this function include element, attribute, text, cdatasection,
entityreference, processinginstruction, comment, document, documenttype and
documentfragment. This parameter is case sensitive.



The second parameter is the name of the new node. If there is a colon in the
name, it is parsed into a prefix and a LocalName.



The third parameter refers to the namespace URI. The InnerText property is
another mechanism for adding content between tags. The tag 'vijay' now embodies
the word 'hi'. This node in memory is added to the XML fragment with the help of
the AppendChild function, which is called on the XmlElement returned by the
DocumentElement property.

Calling AppendChild on the XmlDocument itself throws an exception, since a
DocumentElement node already exists within the document. If we uncomment the
line d.AppendChild(n), it will result in the following exception:



Output

Unhandled Exception: System.InvalidOperationException: This document already has
a DocumentElement node.



If we uncomment the line d.LastChild.AppendChild(n), it would be similar to
first gaining a handle to the last child and then adding the node. In this case,
whether the node is the first child or the last child, it does not make any
difference at all. The documentation for the CreateNode function, offers a table
that very distinctly specifies as to which nodes can be placed within other
nodes.
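
The CreateNode function also has an overload that accepts a value of the
XmlNodeType enum in place of the string. The sketch below uses this overload
with a name that contains a colon, in order to show how the name is split into a
prefix and a LocalName. The prefix p and the namespace URI urn:example are
hypothetical, as are the names CreateNodeDemo and Build.

```csharp
using System;
using System.Xml;

public class CreateNodeDemo
{
    // Creates a prefixed element and adds it to the root element 'book'.
    public static XmlNode Build()
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml("<book></book>");
        // "p:vijay" is parsed into the prefix "p" and the LocalName "vijay".
        XmlNode n = d.CreateNode(XmlNodeType.Element, "p:vijay", "urn:example");
        n.InnerText = "hi";
        d.DocumentElement.AppendChild(n);
        return n;
    }

    public static void Main()
    {
        XmlNode n = Build();
        Console.WriteLine(n.Prefix);
        Console.WriteLine(n.LocalName);
        Console.WriteLine(n.NamespaceURI);
    }
}
```

Using the enum has the advantage that a misspelt node type is caught by the
compiler, and not at run time.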



a.cs

using System;

using System.Xml;

public class zzz {

public static void Main()

{

XmlDocument d = new XmlDocument();

d.Load("b.xml");

XmlImplementation i;

i = d.Implementation;

XmlDocument d1 = i.CreateDocument();

d.Save(Console.Out);

d1.Save(Console.Out);

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay />



Output

<?xml version="1.0"?>

<!DOCTYPE vijay[]>

<vijay />

Every XmlDocument has an XmlImplementation object associated with it, which can
be accessed using the Implementation property. XmlDocument objects created from
the same XmlImplementation, share the same name table. Thus, it is permissible
to compare attribute and element names as objects, instead of strings.



We create an empty XmlDocument d by using the constructor as before, and
thereafter, use the Load function to associate an XML file named b.xml with the
XmlDocument. Earlier, we had used the Reader object for this purpose.



To create an XmlDocument object that shares the same XmlNameTable, we use the
function CreateDocument from the XmlImplementation object. Then, the Save
function is used to write out both the XmlDocument objects to the Console. The
first XmlDocument object d, displays a replica of the file b.xml, while the
second one is empty because we have not associated any XML content with it.



Thus, the CreateDocument function creates an empty XML document, but it does not
copy any content to it. It ensures that the names are shared, and not
duplicated.
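
The sharing of the name table can be checked with a reference comparison. This is a small sketch, not part of the original program; it loads the XML from a string instead of b.xml so that it runs on its own, and the class and method names are our own.

```csharp
using System;
using System.Xml;

public class NameTableDemo
{
    // True when a document created through Implementation.CreateDocument
    // uses the very same XmlNameTable object as the original document.
    public static bool SharedWithCreateDocument()
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml("<vijay />");
        XmlDocument d1 = d.Implementation.CreateDocument();
        return Object.ReferenceEquals(d.NameTable, d1.NameTable);
    }

    // For contrast: a document created with 'new' gets its own name table.
    public static bool SharedWithNewDocument()
    {
        XmlDocument d = new XmlDocument();
        XmlDocument d2 = new XmlDocument();
        return Object.ReferenceEquals(d.NameTable, d2.NameTable);
    }

    public static void Main()
    {
        Console.WriteLine(SharedWithCreateDocument()); // True
        Console.WriteLine(SharedWithNewDocument());    // False
    }
}
```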



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r = new XmlTextReader ("books.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

XmlDocument d = new XmlDocument();

d.Load (r);

XmlNode n;

n = d.DocumentElement;

Console.WriteLine(n.Name + "<" + n.Value + ">");

n = n.FirstChild;

Console.WriteLine(n.Name + "<" + n.Value + ">");

while (n != null)

{

XmlNamedNodeMap m = n.Attributes;

foreach (XmlNode a in m)

Console.Write(" " + a.Name + "<" + a.Value + "> ");

n = n.NextSibling;

Console.WriteLine();

}

}

}



Output

bookstore<>

book<>

genre<autobiography> publicationdate<1981> ISBN<1-861003-11-0>

genre<novel> publicationdate<1967> ISBN<0-201-63361-2>

genre<philosophy> publicationdate<1991> ISBN<1-861001-57-6>



First, we load the XML file books.xml and use the DocumentElement property to
obtain the root tag bookstore. Then, using the FirstChild property, we make the
first child node 'book' the active node and display it. As we have no clue about
the number of child book nodes that are present in the file, we use a while loop
that repeats itself until the XmlNode object n becomes null.



The Attributes property returns an XmlNamedNodeMap object that represents a
collection of nodes. These nodes can be accessed by name or by index. So, using
the foreach loop, we fetch one attribute at a time in an XmlNode object and
display the Name and the Value contained in it. The <> signs merely act as
delimiters around each value.



Once this is accomplished, we move to the next book at the same level. Tags at
the same level are called siblings. The property NextSibling moves to the next
book or tag at the same level, thereby, displaying one book tag after another.
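
The same walk over siblings can also be written as a for loop. The sketch below is a variation of the above program under two assumptions of ours: the XML is supplied as a string rather than read from books.xml, and a comment has been placed among the children to show why a null check on Attributes is useful (the property is null for nodes that are not elements).

```csharp
using System;
using System.Text;
using System.Xml;

public class SiblingWalk
{
    // Returns one line per child element of the root, listing its attributes.
    public static string Describe(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        StringBuilder sb = new StringBuilder();
        for (XmlNode n = d.DocumentElement.FirstChild; n != null; n = n.NextSibling)
        {
            // Attributes is null for non-element nodes such as comments.
            if (n.Attributes == null) continue;
            foreach (XmlNode a in n.Attributes)
                sb.Append(a.Name + "<" + a.Value + "> ");
            sb.Append("\n");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical sample data, loosely modelled on books.xml.
        string xml = "<bookstore><!-- a note -->"
                   + "<book genre='novel' ISBN='0-201-63361-2'/>"
                   + "<book genre='philosophy'/></bookstore>";
        Console.Write(Describe(xml));
    }
}
```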



a.cs

using System;

using System.Xml;

public class Sample

{

public static void Main()

{

XmlTextReader r = new XmlTextReader ("books.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

XmlDocument d = new XmlDocument();

d.Load (r);

XmlNode n = d.DocumentElement;

Console.WriteLine(n.Name);

XmlNode b = n.LastChild;

Console.WriteLine(b.Name);

Console.WriteLine(b.OuterXml + "\n");

b = n.LastChild.PreviousSibling;

Console.WriteLine(b.OuterXml);

}

}



Output

bookstore

book

<book genre="philosophy" publicationdate="1991" ISBN="1-861001-57-6"><title>The
Gorgias</title><author><name>Plato </name></author><price>9.99</price></book>



In this example, we first display the root node bookstore, followed by the
LastChild and not the FirstChild node. Both display the same name, since the
first and last child tags are both book elements, even though they are different
nodes. The XML associated with the last child is displayed using the OuterXml
property, which shows the last book, the one that deals with Philosophy. You may
recollect that the OuterXml property displays the child tags also.



Here, we proceed backwards. Hence, we use the PreviousSibling property to
display the second last book. Thus, we can either proceed in the forward
direction, like we did in the last example, or in the backward direction, as
shown in the current example.



a.cs

using System;

using System.IO;

using System.Xml;

public class Sample

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<book> hi </book>");

XmlComment c;

c = d.CreateComment("Comment 1");

XmlElement r = d.DocumentElement;

d.InsertBefore(c, r);

d.Save(Console.Out);

Console.WriteLine("\n---");

XmlComment c1;

c1 = d.CreateComment("Comment 2");

d.InsertAfter(c1, r);

d.Save(Console.Out);

}

}



Output

<!--Comment 1-->

<book> hi </book>

---

<!--Comment 1-->

<book> hi </book>

<!--Comment 2-->



LoadXml creates a node called book, which encompasses 'hi'. Thereafter, a
Comment object is created by calling the CreateComment function. The function
merely requires the string to be displayed as a comment. The extra comment
characters are placed by the function in the XML file.



The next dilemma is with regard to the position of the comment i.e. should the
comment be placed before or after the node book. The InsertBefore function
inserts the required node. It takes two parameters, i.e. a comment, followed by
the node before which the comment is to be inserted. As we want it to be
inserted before the book node, we use the handle returned by DocumentElement
property, which contains the handle to the book node. The Save function then
displays the comment before the book node.



The InsertAfter function inserts the comment node after the reference node,
which has been passed as the second parameter. Thus, the second comment comes
after the book node.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz {

public static void Main()

{

XmlDocument d = new XmlDocument();

d.LoadXml("<?xml version='1.0' ?><book> <title>Vijay</title> </book>");

XmlNode n = d.DocumentElement;

n.RemoveChild(n.FirstChild);

d.Save(Console.Out);

d.RemoveAll();

Console.WriteLine("\n=========");

d.Save(Console.Out);

}

}



Output

<?xml version="1.0"?>

<book>

</book>

=========



The XML fragment that is loaded using the LoadXml function has an XML
declaration, followed by a tag book, which contains an element title enclosing
'Vijay'.



The DocumentElement property returns a handle to the node book, which is stored
in n. Then, the RemoveChild function of node n is called with a single
parameter, which is the node title, since the FirstChild returns this value.
This function removes the child from the tag.

Saving to the Console displays the remaining XML fragment. Thus, we can see the
XML declaration and the tag book, but without any content, since the tag title
has been removed.



The next function that is called is RemoveAll, from the XmlDocument class. On
displaying the XML file, we witness no output, since the RemoveAll function
erases everything from the document.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz {

public static void Main()

{

XmlNode n;

XmlDocument d = new XmlDocument();

d.LoadXml("<zz xml:space=\"preserve\"><aa>Vijay</aa><bb>Mukhi</bb></zz>");

Console.WriteLine(d.DocumentElement.InnerText);

n=d.DocumentElement;

XmlSignificantWhitespace s=d.CreateSignificantWhitespace(" ");

d.Save(Console.Out);

Console.WriteLine();

n.InsertAfter(s, n.FirstChild);

Console.WriteLine(d.DocumentElement.InnerText);

d.Save(Console.Out);

}

}



Output

VijayMukhi

<zz xml:space="preserve">

<aa>Vijay</aa>

<bb>Mukhi</bb>

</zz>

Vijay Mukhi

<zz xml:space="preserve">

<aa>Vijay</aa> <bb>Mukhi</bb></zz>



The DocumentElement property returns a handle to the zz tag, whereas, the
InnerText property in the DocumentElement, refers to the Text present within the
tags aa and bb. Thus, we see 'VijayMukhi' displayed without any spaces displayed
between the two words.



We also create significant whitespace, using the function
CreateSignificantWhitespace, after initializing the node n to tag zz. This
function accepts only one string parameter, which may contain only the following
four characters: space (&#x20;), line feed (&#10;), carriage return (&#13;) and
tab (&#9;). Creating the node does not, by itself, add it to the XML fragment.
Next, we write the XML fragment using the Save function. The attribute
xml:space="preserve" is visible. This attribute is optional.



Using the InsertAfter function, we add significant whitespace after the
FirstChild node. Thus, we see significant space between the nodes aa and bb. The
InnerText also displays the spaces between the words 'Vijay' and ' Mukhi'.



Output

VijayMukhi

<zz xml:space="preserve">

<aa>Vijay</aa>

<bb>Mukhi</bb>

</zz>

VijayMukhi

<zz xml:space="preserve"> <aa>Vijay</aa><bb>Mukhi</bb></zz>



The above output is obtained on adding the significant space before the first
child, using the InsertBefore function, in lieu of the InsertAfter function. The
spaces are inserted before the FirstChild aa, i.e. before the text 'Vijay' and
not after.



a.cs

using System;

using System.Collections;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlDocument d = new XmlDocument();

d.Load("books.xml");

XmlNode n = d.DocumentElement;

IEnumerator i = n.GetEnumerator();

XmlNode b;

while (i.MoveNext())

{

b = (XmlNode)i.Current;

Console.WriteLine(b.OuterXml);

}

}

}



Output

<book genre="autobiography" publicationdate="1981"
ISBN="1-861003-11-0"><title>The Autobiography of Benjamin
Franklin</title><author><first-name>Benjamin</first-name><last-name>
Franklin</last-name></author><price>8.99</price></book>

<book genre="novel" publicationdate="1967" ISBN="0-201-63361-2"><title>The
Confidence Man</title><author><first-name>
Herman</first-name><last-name>Melville</last-name></author>
<price>11.99</price></book><book genre="philosophy" publicationdate="1991"
ISBN="1-861001-57-6"><title>The

Gorgias</title><author><name>Plato</name></author><price> 9.99</price></book>



The last example in the series, displays various parts of an XML file in a
different manner.



Every XML node object has a function called GetEnumerator, which returns an
IEnumerator interface object. This interface has a MoveNext function that moves
from one node to the next. In our program, we move from one book to another,
since the XmlNode is on a tag book.



The Current property returns the node at which the enumerator is presently
positioned. After the first call to MoveNext, this is the first book node. Then,
we use the OuterXml property to display the entire contents of this node.





The MoveNext function then activates the next node. If no more nodes exist, it
returns false; otherwise, it returns true. In this manner, we can iterate
through all the nodes. The foreach instruction offers the same functionality.
Whenever we want to navigate through a collection, the IEnumerator interface is
used. This is the conventional way of moving sequentially through a list of
objects or a collection in C#.
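
For comparison, here is a minimal sketch of the same iteration written with foreach. It relies on the fact that XmlNode can be enumerated directly; the XML is supplied as a string, and the class and method names are our own, so that the sketch does not depend on books.xml.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class ForeachDemo
{
    // Collects the names of all child nodes of the root element.
    public static List<string> ChildNames(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        List<string> names = new List<string>();
        // foreach calls GetEnumerator, MoveNext and Current behind the scenes.
        foreach (XmlNode b in d.DocumentElement)
            names.Add(b.Name);
        return names;
    }

    public static void Main()
    {
        // Hypothetical sample data.
        foreach (string s in ChildNames("<bookstore><book/><book/><magazine/></bookstore>"))
            Console.WriteLine(s);
    }
}
```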



 

The DTD



Validations in XML



So far, we have only read an XML file, without catering to special cases where
either an entity has been used, or data has to be validated against the rules
for an element. The XmlTextReader class is the best choice for reading an XML
file, barring the cases where data has to be validated, or where an entity has
to be replaced with a value. For such purposes, the XmlValidatingReader class is
better suited. This class is derived from XmlReader, and it conducts three types
of validation: DTD, XDR and XSD schema validation.




This class is used when the primary task is to conduct data validation, to
resolve general entities, or to provide support for default attributes.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlValidatingReader r = null;

XmlParserContext p;

p = new XmlParserContext(null, null, "vijay", null, null,
"<!ENTITY pr '100'>", "", "", XmlSpace.None);

r = new XmlValidatingReader ("<vijay mukhi='great' price='Rs &pr;'></vijay>",
XmlNodeType.Element, p);

r.ValidationType = ValidationType.None;

r.MoveToContent();

while (r.MoveToNextAttribute())

{

Console.WriteLine("{0} = {1}", r.Name, r.Value);

}

r.Close();

}

}



Output

mukhi = great

price = Rs 100



To create the object p of type XmlParserContext, the constructor with nine
parameters of XmlParserContext class is called. The nine parameters are as
follows:

• The first parameter refers to the NameTable type. It has a value of null.

• The second parameter refers to NamespaceManager type. It also has a value of
null.

• The third parameter is the DocType name, i.e. the root tag 'vijay'.

• The fourth parameter is the pubid for the external DTD file.

• The fifth parameter is the sysid for the external DTD file.

• The sixth parameter is the internal DTD, where an ENTITY declaration <!ENTITY
pr '100'> has been created. This simply states that the word 'pr', when preceded
by a '&' and followed by a semi-colon, must be replaced with the string '100'.

• The seventh parameter in sequence is the location from where the fragment is
to be loaded, i.e. the base URI.

• The eighth parameter stands for the xml:lang scope.

• The ninth parameter stands for the xml:space scope.

The parameters to the constructor of XmlValidatingReader class are similar to
those of the XmlTextReader, which we had encountered earlier. This class is
derived from XmlReader and also implements the IXmlLineInfo interface.



There are five different values that the ValidationType property can be set to:



1. The first is Auto, which validates only when the DTD or schema information is
found. This is the default value.



2. The second is DTD, which validates based on the instructions found in the DTD.



3. The third is None, which creates an XML 1.0 non-validating parser. It reports
default attributes and resolves entities, but does not use the DOCTYPE for
validation. Thus, if the root tag is changed from 'vijay' to 'vijay1', no errors
will be generated. In our program, we have set the property to None. Placing the
ValidationType statement within comments restores the default of Auto, and
generates the following exception:



"Unhandled Exception: System.Xml.Schema.XmlSchemaException: The root element
name must match the DocType name. An error occurred at (1, 2)."



4. The fourth option is Schema, which validates as per the XSD schemas.



5. The fifth option is XDR, which validates as per the XDR schemas.



Once the required properties are set, the MoveToContent function is used to move
to the first element, 'vijay'. The next function, MoveToNextAttribute, returns a
value of true when there are attributes remaining to be read. Otherwise, it
returns a value of false. On its first call, it behaves like the
MoveToFirstAttribute function.



The while loop repeats twice, since there are two attributes. The Name and Value
properties for the first attribute are displayed as 'mukhi' and 'great'. This is
very similar to what we have observed in the earlier program. The name for the
second attribute is displayed as 'price'. However, its value is not displayed as
typed, because it contains the entity reference &pr;. The XmlValidatingReader
replaces the entity pr with the string '100', prior to displaying the value.
Therefore, the output is displayed as 'price' and 'Rs 100'.



a.cs

using System;

using System.IO;

using System.Xml;

using System.Xml.Schema;

class zzz

{

public static void Main()

{

XmlTextReader r = new XmlTextReader("b.xml");

XmlValidatingReader v = new XmlValidatingReader(r);

v.ValidationType = ValidationType.DTD;

v.ValidationEventHandler += new ValidationEventHandler (abc);

while(v.Read());

}

public static void abc(object s, ValidationEventArgs a)

{

Console.WriteLine("Severity:{0}", a.Severity);

Console.WriteLine("Message:{0}", a.Message);

}

}



b.xml

<?xml version="1.0" ?>

<!DOCTYPE vijay1 >

<vijay>

</vijay>



Output

Severity:Error

Message:The root element name must match the DocType name. An error occurred at
file:///c:/csharp/b.xml(3, 2).

Severity:Error

Message:The 'vijay' element is not declared. An error occurred at file:///c:/csharp/b.xml(3,
2).

In the above program, to begin with, an object r of type XmlTextReader is
created, and then, it is passed to the constructor of XmlValidatingReader, while
object v is being created. The ValidationType of the object v is set to DTD. The
function abc is attached to the ValidationEventHandler event, so that it gets
called whenever a validation error occurs. The while loop calls the Read
function repeatedly until the entire XML file has been read and validated, and
the function abc is notified whenever an error is chanced upon.



In the function abc, the values contained in the properties - Severity and
Message, of the ValidationEventArgs parameter 'a', are printed. The Severity
property reveals whether it is an error or warning, whereas, the Message
property contains the precise text of the error or warning.



In the above case, an error is generated because the DOCTYPE expects the root
element to be 'vijay1', whereas, it has been specified as 'vijay'. When no error
message is displayed, it may be inferred that no errors have been found.
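
In .NET Framework 2.0 and later, the XmlValidatingReader class is marked obsolete in favour of a reader obtained from XmlReader.Create with suitable XmlReaderSettings. The following is a hedged sketch of the same DTD check using that approach; it is not the program used in this chapter. The class name DtdCheck and the method Validate are our own, the DTD is supplied as an internal subset within the string so that no files are needed, and the DtdProcessing property assumes .NET Framework 4.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class DtdCheck
{
    // Validates the XML text against its own DTD and returns the messages reported.
    public static List<string> Validate(string xml)
    {
        List<string> messages = new List<string>();
        XmlReaderSettings s = new XmlReaderSettings();
        s.DtdProcessing = DtdProcessing.Parse;   // allow the DOCTYPE to be read
        s.ValidationType = ValidationType.DTD;   // validate against the DTD
        s.ValidationEventHandler += delegate(object sender, ValidationEventArgs a)
        {
            messages.Add(a.Severity + ": " + a.Message);
        };
        using (XmlReader r = XmlReader.Create(new StringReader(xml), s))
        {
            while (r.Read()) { }                 // reading drives the validation
        }
        return messages;
    }

    public static void Main()
    {
        // The DOCTYPE names 'vijay1', but the root element is 'vijay'.
        string xml = "<!DOCTYPE vijay1 [<!ELEMENT vijay1 EMPTY>]><vijay />";
        foreach (string m in Validate(xml))
            Console.WriteLine(m);
    }
}
```

As in the original program, an empty list of messages means that no validation errors were found.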



The DTD



Using the above C# program, we shall now create our own DTD file. Therefore, we
shall modify only the b.xml and b.dtd files.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE vijay SYSTEM "b.dtd" >

<vijay />



b.dtd

<!ELEMENT vijay >



A DTD is generally very lengthy. So, an internal DTD is rarely used. If it is
used, its contents have to be placed within [] brackets inside the DOCTYPE
declaration. To use an external DTD, we use the keyword SYSTEM, followed by the
name of the DTD file, which is b.dtd, in this case.



In b.dtd, an element 'vijay' is created by inserting the reserved characters
'<!', followed by ELEMENT, and finally by the element name 'vijay'. When we run
the C# program 'a', the following error is generated:



Output

Unhandled Exception: System.Xml.XmlException: This is an invalid content model.
Line 1, position 17.



An error in the DTD file has resulted in the generation of an un-handled
exception. The error occurred due to an incomplete ELEMENT statement.



b.dtd

<!ELEMENT vijay EMPTY>



The addition of the word EMPTY salvages the situation. By specifying the word
EMPTY, it is amply clear that the element named 'vijay' is an empty element.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE vijay SYSTEM "b.dtd" >

<vijay>

</vijay>



Output

Severity:Error

Message:Element 'vijay' has invalid child element '#PCDATA'. An error occurred
at file:///c:/csharp/b.xml(3, 8).



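The DTD file states, with absolute clarity, that the ELEMENT 'vijay' is EMPTY.
However, an open tag <vijay> and a close tag </vijay> have been added to the XML
file, with a line break between them. The validator treats this line break as
character data (#PCDATA) placed inside an element that must have no content.
Therefore, an error message is generated, which, as usual, is rather cryptic.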



Instead of using tags such as 'vijay', let us consider a DTD that has been
implemented in real life. This one is used for the WML, or the Wireless Markup
Language. The rules or syntax of WML are available as a DTD.



In our book titled 'WML and WMLScript', we have endeavoured to elucidate the
concept of a DTD. You are at liberty to refer to the book. However, we must
caution you that, the approach and the explanation used here is entirely at
variance with the one used in the earlier book.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

</wml>



b.dtd

<!ELEMENT wml EMPTY>



Output

Severity:Error

Message:Element 'wml' has invalid child element '#PCDATA'. An error occurred at
file:///C:/csharp/b.xml(3, 6).



The word 'vijay' has merely been replaced by the word 'wml'. The error generated
is akin to the earlier one. At this juncture, we introduce a 'card' into the DTD
file.



b.dtd

<!ELEMENT wml (card)>



Output

Severity:Error

Message:Element 'wml' has incomplete content. Expected 'card'. An error occurred
at file:///c:/csharp/b.xml(4, 3).



Every WML document must commence with the root tag 'wml'. In the DTD file, we
have placed the word 'card' within round brackets, along with wml. This
signifies that the wml tag must contain a tag or an element called 'card'. Since
there is no card in the XML file, an error is reported, stating that a card is
expected, and on account of its unavailability, the wml element is incomplete.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card />

</wml>



Output

Severity:Error

Message:The 'card' element is not declared. An error occurred at file:///c:/csharp/b.xml(4,
2)



We add the card tag as a single tag to our XML file, in an endeavour to
eliminate the error. But, as we have not specified 'card' as a valid element in
the DTD file, yet another error message is displayed. Unless 'card' appears as
an ELEMENT in the DTD file, it is not possible to use it in the XML file.
Therefore, we now include 'card' as an EMPTY element in b.dtd.



b.dtd

<!ELEMENT wml (card)>

<!ELEMENT card EMPTY>



Now, all the errors just vanish. In the DTD file, we had affirmed that the
element 'card' shall be empty i.e. it will not have any content.



The XML file depicted below displays an error, because the 'card' tag is not a
single tag any longer.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

</card>

</wml>



Output

Severity:Error

Message:Element 'card' has invalid child element '#PCDATA'. An error occurred at
file:///C:/csharp/b.xml(4, 7).



The error message displayed here is very similar to the one seen earlier with
the wml tag, which read: Element 'wml' has invalid child element '#PCDATA'.

A slight modification to the XML file is desirable, before we endeavour to
eliminate the error.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

hi

</card>

</wml>



Output

Severity:Error

Message:Element 'card' has invalid child element 'Text'. An error occurred at
file:///c:/csharp/b.xml(4, 7).



Inserting the word 'hi' between the card tags results in a slightly altered
error message. In place of #PCDATA, we get to see Text. By making the following
modifications to the DTD file, both the error messages can be eliminated.



b.dtd

<!ELEMENT wml (card)>

<!ELEMENT card (#PCDATA)>



To eradicate the errors, the word EMPTY is replaced with #PCDATA, enclosed
within round brackets. PCDATA stands for Parsed Character Data. In plain
English, it represents text that can be entered from the keyboard. Thus, we are
at liberty to write as many lines of text as we want, within the card tag. Even
if the word 'hi' is removed from within the tags, no error is generated.



Our DTD expects a root tag or starting tag of wml. Only a card tag can be
inserted within this tag, and the card tag, in turn, is capable of containing
any amount of text. Insertion of anything else in the wml tag is a sure recipe
for disaster.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

</card>

<card>

</card>

</wml>



Output

Severity:Error

Message:Element 'wml' has invalid content. Expected ''. An error occurred at
file:///c:/csharp/b.xml(6, 2).



The above error has occurred because the DTD clearly specifies that the root tag
wml must have one, and only one, occurrence of the tag called 'card' within it.
Here, we have created two card tags, thereby causing the error.



b.dtd

<!ELEMENT wml (card)*>

<!ELEMENT card (#PCDATA)>



The * symbol, placed after the round brackets, indicates that the content within
the brackets may occur any number of times, from zero upwards. Thus, the XML
file can now have either zero or countless card elements. If you do not give
credence to this statement of ours, you may either delete all the card elements
from the XML file, or add numerous cards. Either way, no error will be
generated.



b.dtd

<!ELEMENT wml (card)+>

<!ELEMENT card (#PCDATA)>



Replacing the symbol * with a + transforms the meaning from 'zero to infinity'
to 'one to infinity'. The only difference between the * symbol and the + symbol
is that the + sign mandates at least one occurrence of the element, whereas the
* sign makes it optional. Thus, in the above XML file, at least a single card
element is required.







b.dtd

<!ELEMENT wml (card)?>

<!ELEMENT card (#PCDATA)>



The last of the special characters is the symbol ?, which restricts the number
of occurrences to 'zero or one'. Thus, in the XML file, we may have either one
card element or none at all. The presence of two or more cards will generate an
error. You should try out various possible combinations for each of the symbols
*, + and ?.
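
The combinations can also be tried out in one go. The sketch below is our own test harness, not part of the chapter's program: it places the DTD in an internal subset inside a string, varies the content model of wml and the number of card elements, and reports whether validation errors occur. It uses XmlReader.Create with XmlReaderSettings (assuming .NET Framework 4.0 or later for the DtdProcessing property) instead of XmlValidatingReader; the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class OccurrenceDemo
{
    // Builds a wml document with the given content model and number of cards,
    // and returns the number of validation errors reported.
    public static int CountErrors(string wmlModel, int cards)
    {
        string xml = "<!DOCTYPE wml [<!ELEMENT wml " + wmlModel + ">"
                   + "<!ELEMENT card (#PCDATA)>]><wml>";
        for (int i = 0; i < cards; i++)
            xml += "<card>hi</card>";
        xml += "</wml>";

        int errors = 0;
        XmlReaderSettings s = new XmlReaderSettings();
        s.DtdProcessing = DtdProcessing.Parse;
        s.ValidationType = ValidationType.DTD;
        s.ValidationEventHandler += delegate(object o, ValidationEventArgs a) { errors++; };
        using (XmlReader r = XmlReader.Create(new StringReader(xml), s))
        {
            while (r.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        string[] models = { "(card)*", "(card)+", "(card)?" };
        foreach (string m in models)
            for (int n = 0; n <= 2; n++)
                Console.WriteLine("{0} with {1} card(s): {2}", m, n,
                    CountErrors(m, n) == 0 ? "valid" : "invalid");
    }
}
```

With these rules, (card)* accepts zero, one or two cards, (card)+ rejects only the case with no card, and (card)? rejects only the case with two cards.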



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

<p> hi </p>

</card>

</wml>



b.dtd

<!ELEMENT wml (card)*>

<!ELEMENT card (p)>

<!ELEMENT p (#PCDATA)>



No error is generated because, in the DTD file, we have now stated that, the
card element can have a tag p, which can contain any text. We have, however,
done away with the provision of placing any text within the card tag.



Let us now make one more modification to the files.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

<p> <b/> </p>

</card>

</wml>





b.dtd

<!ELEMENT wml (card)*>

<!ELEMENT card (p)>

<!ELEMENT p (br | b)>

<!ELEMENT br EMPTY>

<!ELEMENT b EMPTY>



The DTD now appears considerably more complicated. The p tag is now capable of
containing only two tags, br and b. Text is not allowed any more. The | sign
signifies the OR condition, which implies that either tag b or tag br is
allowed. The two aforesaid tags are defined as EMPTY tags. To summarise, our DTD
states that the p tag can contain a single tag of either b or br.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

<p> <b/> <br/></p>

</card>

</wml>



Output

Severity:Error

Message:Element 'p' has invalid content. Expected ''. An error occurred at
file:///c:/csharp/b.xml(5, 11).



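All is not well, because we are allowed to place either a 'b' or a 'br' within
the p tag, but not both together. To remedy the situation, we place a * symbol
after the group (br | b) in the declaration of the p tag, and also after the (p)
in the declaration of the card tag.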



b.dtd

<!ELEMENT wml (card)*>

<!ELEMENT card (p)*>

<!ELEMENT p (br | b)*>

<!ELEMENT br EMPTY>

<!ELEMENT b EMPTY>







The above DTD provides us the flexibility of having multiple p tags within any
number of cards. These, in turn, may have as many b or br tags as desired.



By replacing the b tag with #PCDATA, a p tag is in a position to accommodate
multiple br tags, as well as an indefinite amount of text. In such a mixed
content declaration, #PCDATA must be listed first.



<!ELEMENT p (#PCDATA | br)*>



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card />

<head />

</wml>



b.dtd

<!ELEMENT wml (card,head)>

<!ELEMENT card EMPTY>

<!ELEMENT head EMPTY>



The above DTD file permits the wml tag to contain a card tag, which is then to
be strictly followed by a head tag. The comma signifies that one tag is to be
followed by the other. If we refrain from using the head tag in the XML file,
the following error message will be generated:



Output

Severity:Error

Message:Element 'wml' has incomplete content. Expected 'head'. An error occurred
at file:///C:/csharp/b.xml(5, 3).



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<head />

<card />

</wml>

Output

Severity:Error

Message:Element 'wml' has invalid content. Expected 'card'. An error occurred at
file:///c:/csharp/b.xml(4, 2).



If the order of the tags is interchanged, an error is thrown. The card tag must
be followed by the head tag. Besides, there is a restriction imposed that there
can be only one insertion of each tag. If there are multiple insertions, it will
result in an error.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card />

<card />

<head />

</wml>



b.dtd

<!ELEMENT wml (card+,head?)>

<!ELEMENT card EMPTY>

<!ELEMENT head EMPTY>



When the plus sign is inserted after the card, it allows the use of one or more
card tags in the file. The ? sign denotes 'zero or one' insertions of the head
tag. Thus, we can have one or more card tags, and have either a single head tag
or none at all. If the head tag is present, it must be placed after the last
card tag, since the order of the tags is sacrosanct.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card />

<head />

<card />

</wml>





Output

Severity:Error

Message:Element 'wml' has invalid content. Expected ''. An error occurred at
file:///c:/csharp/b.xml(6, 2).



The Draconian restrictions imposed by the DTD file prohibit us from altering the
sequence of the above tags. The card tag has to come first, followed by the head
tag. We cannot interchange a head tag with a card tag. So, the only solution to
this problem is to abide by the stipulated sequence.
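
The effect of the comma can be checked against several orderings at once. The following sketch is our own harness and assumes the content model (card+,head?) shown above; the DTD is supplied as an internal subset in a string, the children of wml are given as a space-separated list of tag names, and validation is done with XmlReader.Create and XmlReaderSettings (assuming .NET Framework 4.0 or later) rather than XmlValidatingReader. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class SequenceDemo
{
    // 'children' is a space-separated list of empty tags to place inside wml.
    // Returns true when the document is valid against (card+,head?).
    public static bool IsValid(string children)
    {
        string xml = "<!DOCTYPE wml [<!ELEMENT wml (card+,head?)>"
                   + "<!ELEMENT card EMPTY><!ELEMENT head EMPTY>]><wml>";
        foreach (string tag in children.Split(' '))
            if (tag.Length > 0)
                xml += "<" + tag + "/>";
        xml += "</wml>";

        int errors = 0;
        XmlReaderSettings s = new XmlReaderSettings();
        s.DtdProcessing = DtdProcessing.Parse;
        s.ValidationType = ValidationType.DTD;
        s.ValidationEventHandler += delegate(object o, ValidationEventArgs a) { errors++; };
        using (XmlReader r = XmlReader.Create(new StringReader(xml), s))
        {
            while (r.Read()) { }
        }
        return errors == 0;
    }

    public static void Main()
    {
        string[] cases = { "card", "card card head", "head card", "card head card" };
        foreach (string c in cases)
            Console.WriteLine("{0,-16} {1}", c, IsValid(c) ? "valid" : "invalid");
    }
}
```

Only the first two orderings satisfy the DTD; a head before the first card, or a card after the head, is rejected.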



b.dtd

<!ELEMENT wml (card+,head?,template*)*>

<!ELEMENT card EMPTY>

<!ELEMENT head EMPTY>

<!ELEMENT template EMPTY>



In the DTD file, we have added a * symbol to the entire set of tags that make
up the wml element. The set consists of the following individual elements in
sequential order:

• One or more card tags.

• Zero or one head tag.

• Zero to many template tags.



This set can be repeated any number of times, in numerous permutations and
combinations of the above conditions, always in the specified order. Thus, the
card and head can appear together, or the card can appear by itself without the
head tag, or the template tag may not be present at all, and so on. Every
occurrence of the set, however, needs to begin with a card tag.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card aa="hi"/>

</wml>









b.dtd

<!ELEMENT wml (card)>

<!ELEMENT card EMPTY>

<!ATTLIST card aa CDATA #IMPLIED>



In the above example, the card tag has an attribute called aa initialized to
'hi'. To implement an attribute, we include the word ATTLIST, which is a short
form for 'a list of attributes', in the DTD file. This is followed by the name
of the tag that the attribute is associated with. Then, the actual name of the
attribute aa is specified, followed by the datatype it will hold, which is
character data, in our case. The last parameter, #IMPLIED permits the attribute
aa to be optional. Therefore, even if you remove it, no error will be generated.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card />

</wml>



b.dtd

<!ELEMENT wml (card)>

<!ELEMENT card EMPTY>

<!ATTLIST card aa CDATA #IMPLIED bb CDATA #REQUIRED>



Output

Severity:Error

Message:The required attribute 'bb' is missing. An error occurred at file:///c:/csharp/b.xml(4,
2).



The error message clearly mentions that the attribute bb is missing. The
#REQUIRED demands the presence of attribute bb, along with the card, whenever
the card tag is used. Further, the attributes are to be placed one after the
other. However, the order of placement is not significant.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card bb="no"/>

</wml>



No errors are generated, since the attribute bb, which is mandatory, has been
specified. You can omit aa, since it has been declared as #IMPLIED, i.e.
optional.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card aa="no"/>

</wml>



b.dtd

<!ELEMENT wml (card)>

<!ELEMENT card EMPTY>

<!ATTLIST card aa (hi | bye ) "bye">



Output

Severity:Error

Message:'no' is not in the enumeration list. An error occurred at file:///c:/csharp/b.xml(4,
7).



The values assigned to attributes can be restricted to specific values. This can
be achieved by specifying the values along with ATTLIST in the DTD file and
using the OR sign (|) as the separator. The attribute aa can only be assigned
the value of either 'hi' or 'bye'. Specifying any other value would result in an
error.



If the attribute is not initialized, it assumes the default value of 'bye'.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card aa="hi"/>

</wml>





The error disappears because the attribute has been assigned a value of 'hi'.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card aa="hi"/>

</wml>



b.dtd

<!ELEMENT wml (card)*>

<!ELEMENT card EMPTY>

<!ATTLIST card aa ID #IMPLIED>



We have created an attribute aa, with a data type of ID. This does not result in
any error.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card aa="hi"/>

<card aa="hi"/>

</wml>



Output

Severity:Error

Message:'hi' is already used as an ID. An error occurred at file:///c:/csharp/b.xml(5, 7).



The card tag can be used multiple times, due to the presence of the * sign in
the DTD file. An attribute of type ID must have a value that is unique within
the whole document. The error message conveys that 'hi' has already been used
as an ID by the first card tag, and hence, it cannot be used again.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card aa="hi"/>

<card aa="hi1"/>

</wml>



If we assign a different value to the attribute, the error is dispensed with.
Thus, a data type of ID guarantees that the attribute shall never have a
duplicate value.



b.xml

<?xml version="1.0" ?>

<!DOCTYPE wml SYSTEM "b.dtd" >

<wml>

<card>

Hi &sonal;

</card>

</wml>



b.dtd

<!ELEMENT wml (card)*>

<!ELEMENT card (#PCDATA)*>

<!ENTITY sonal "hi" >



Entities have been touched upon earlier. Here, the text &sonal; will be replaced
with 'hi'. This is called an Entity Reference. The DTD file declares the entity
with the keyword ENTITY, followed by the name 'sonal' and the value 'hi'.
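
To watch the replacement take place, the document can be read back through an
XmlReader. This is a small sketch and not part of the original programs: the
entity is declared in an internal subset so that no b.dtd file is required, and
the names EntityDemo and ReadCard are our own.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EntityDemo
{
    // Returns the text inside the card tag after entity replacement.
    public static string ReadCard()
    {
        string xml =
            "<!DOCTYPE wml [<!ENTITY sonal \"hi\">]>" +
            "<wml><card>Hi &sonal;</card></wml>";

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;   // DTDs are prohibited by default

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.ReadToFollowing("card");
            // A reader obtained from XmlReader.Create expands the reference.
            return reader.ReadElementContentAsString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadCard());   // prints "Hi hi"
    }
}
```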



 


XML Classes



eXtensible Markup Language i.e. XML is a subset of the Standard Generalized
Markup Language (SGML), which is an ISO standard numbered ISO 8879. SGML was
perceived to be too colossal and too convoluted to be put to any pragmatic use.
Thus, a subset of this language, XML, was developed to work seamlessly with both
SGML and HTML. XML may be considered as a restricted form of SGML, since every
XML document conforms to the rules of an SGML document.

Work on XML began in the year 1996 under the auspices of the World Wide Web
Consortium (W3C), under the chairmanship of Jon Bosak, and version 1.0 of the
specification was published in 1998. This group spelt out 10 ground rules for
XML, with 'ease of use' as its fundamental philosophy. From thereon, the
expectations reached a threshold wherein XML was expected to eradicate world
poverty and generally rid the world of all its tribulations. To be precise, XML
was overvalued, way beyond realistic levels. There are people who appear to be
extremely infatuated by XML, even though they may not have read through a single
rule or specification of the language.


The specifications of XML laid down by its three primary authors- Tim Bray, Jean
Paoli and C. M.
Sperberg-McQueen, are accessible at the web site http://www.w3.org/XML.





An XML document consists of entities, which are made up of characters. Some of
these characters form the markup, and the rest form the character data. An XML
file is made up of a myriad components, which shall be unravelled one at a time,
after we have discerned the basic concepts of this language. We commence this
chapter by introducing a program that generates an XML file.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.Flush();

a.Close();

}

}



In this program, we use a class called XmlTextWriter, which comes from the
System.Xml namespace.
An instance 'a' of the XmlTextWriter class is created, by passing two parameters
to the

constructor:

• The first parameter, b.xml, is a string and represents the name of the file to
be created. If the file already exists in the current directory, it is
overwritten and starts out with zero bytes.

• The second parameter is null. It represents the Encoding type used.



Unicode is a standard that assigns a unique number to every character of every
language in the world. In the .Net world, a string stores these characters as
16-bit units, and we are furnished with classes whose methods facilitate
conversion of arrays and strings made up of Unicode characters, to and from
arrays made up of bytes alone.



The System.Text namespace has a large number of Encoding implementations, such
as the following:

• The ASCII Encoding encodes the Unicode characters as 7-bit ASCII.

• The UTF8 Encoding class encodes Unicode characters using UTF-8 encoding.

UTF-8 stands for UCS Transformation Format 8 bit. It supports all Unicode
characters, using a single byte for each letter of the English alphabet and two
to four bytes for the other characters. On Windows, it is known as code page
65001. Here, since we have specified the second parameter as null, the output is
written as UTF-8, and no encoding attribute is placed in the XML declaration.
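
The effect of the encoding parameter can be observed with a short sketch. It
writes to a MemoryStream instead of b.xml so that the result can be examined in
memory, and it uses WriteStartDocument, a function of XmlTextWriter that emits
the XML declaration. The class and function names are our own.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingDemo
{
    // Number of bytes that UTF-8 needs for the given text.
    public static int Utf8Length(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    // Writes only the XML declaration and returns it as text.
    public static string Declaration(Encoding encoding)
    {
        MemoryStream stream = new MemoryStream();
        XmlTextWriter w = new XmlTextWriter(stream, encoding);
        w.WriteStartDocument();
        w.Flush();
        stream.Position = 0;
        // StreamReader skips a byte order mark, if one was written.
        string text = new StreamReader(stream).ReadToEnd();
        w.Close();
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(Utf8Length("A"));        // 1 byte
        Console.WriteLine(Utf8Length("\u00e9"));   // 2 bytes (e with an acute accent)
        Console.WriteLine(Utf8Length("\u20ac"));   // 3 bytes (the euro sign)
        Console.WriteLine(Declaration(null));           // no encoding attribute
        Console.WriteLine(Declaration(Encoding.UTF8));  // carries encoding="utf-8"
    }
}
```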



If we execute the program at this stage, the file b.xml is created, but it is
empty, since nothing has been written to it. Whatever we write from now on is
not guaranteed to reach the file until a function named Flush is called.



Each time we ask the class XmlTextWriter to write to a file, it may not oblige
immediately, but
may place the output in a buffer. Only when the buffer becomes full, will it
write to the file.
This approach is pursued to avoid the overhead of accessing the file on the disk
repetitively.
This improves efficiency. The Flush function flushes the buffer to the file
stream, but it does
not close the file.



The Close function has to be employed to execute the twin tasks of flushing the
buffer to the
file, and closing the file. It is sagacious to call Flush, and then call Close,
even though
Close is adequate to carry out both these tasks.
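
An alternative that is common in C# code is to place the writer in a using
block, which calls Close for us when the block ends. The sketch below is an
illustration under our own assumptions: it writes to a StringWriter rather than
to b.xml, it uses two functions, WriteStartElement and WriteString, which are
explained later in this chapter, and the names CloseDemo and Write are made up.

```csharp
using System;
using System.IO;
using System.Xml;

public static class CloseDemo
{
    public static string Write()
    {
        StringWriter text = new StringWriter();
        using (XmlTextWriter w = new XmlTextWriter(text))
        {
            w.WriteStartElement("vijay");
            w.WriteString("mukhi");
            // No Flush, no Close and no end tag are written by hand.
        }   // leaving the block disposes the writer, which closes it
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write());   // prints <vijay>mukhi</vijay>
    }
}
```

The end tag appears because Close also completes any tag that is still open.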



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>



Here, we have called a function called WriteStartDocument from the XmlTextWriter
class, which
does not take any parameters. It produces the line <?xml version="1.0"?>, in the
file b.xml.
Any line that begins with <?xml is called an XML declaration. Every item in an
XML file is described as a node. An XML file normally begins with an XML
Declaration node. The declaration is optional in XML 1.0 but, when it is
present, there can be only one such node and it must be the very first thing in
the file. Within it is an attribute called version, which is initialized to a
value of 1.0.



The version attribute is mandatory within the XML declaration, and the
XmlTextWriter class always writes it as 1.0. A version 1.1 of the XML
specification has since been published, but it is rarely used. In other words,
in the foreseeable future, the value that you will encounter is version="1.0".



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.WriteDocType("vijay", null, null ,null);

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?><!DOCTYPE vijay>



The next vital declaration is the DOCTYPE declaration. It is optional, but an
XML file can have at most one DOCTYPE declaration, and it must name the root
tag. In our case, the root tag would be 'vijay'.



An XML file is made up of tags, which are words enclosed within angular
brackets. The file also contains rules, which bind the tags. The next three
parameters of the function WriteDocType are the public identifier, the system
identifier and the internal subset of the DTD. They are presently specified as
null. If this does not appeal to you, you may have to hold your horses, till we
furnish the explanation at an appropriate time.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>



In the earlier example, all the nodes were displayed on the same line. We would
indubitably
desire that every node be displayed on a new line. The property Formatting in
XmlTextWriter, is
used to accomplish this task. Formatting can be assigned only one of the
following two values:
Indented or None. By default, the value assigned is None.



The Indented option indents the child elements by 2 spaces. The magnitude of the
indent may be altered, by stipulating a new value for the Indentation property.
In our program, we want the indent to be 3 spaces deep. Hence, we stipulate the
value as 3. As is evident, not every node gets indented. For example, the
DOCTYPE node sits at the top level of the file, so it is merely placed on a new
line.



The IndentChar property may be supplied with the character that is to be
employed for
indentation. By default, a space character is used for this purpose.
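
As an illustration of these three properties working together, the sketch below
indents with a single tab per level instead of spaces. It writes to a
StringWriter so that the result can be inspected, it borrows the functions
WriteStartElement and WriteElementString that are introduced a little later, and
the names IndentDemo and Write are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class IndentDemo
{
    public static string Write()
    {
        StringWriter text = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(text);
        w.Formatting = Formatting.Indented;
        w.Indentation = 1;      // one indent character per level...
        w.IndentChar = '\t';    // ...and that character is a tab
        w.WriteStartElement("vijay");
        w.WriteElementString("surname", "mukhi");
        w.WriteEndElement();
        w.Close();
        return text.ToString();
    }

    public static void Main()
    {
        // The child tag surname is expected on its own line, preceded by a tab.
        Console.WriteLine(Write());
    }
}
```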



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay />



The function WriteStartElement accepts a single parameter, which is the name of
the tag to be written to the XML file. This is an oft-repeated instruction, to
be iterated in almost every program, since an XML file basically comprises tags.
A tag normally has a start point and an end point, and it confines its content
within these two extremities. However, a tag that has no content may be written
as a single empty tag, which ends with a / symbol. We never closed the tag
'vijay' ourselves; the Close function closes any tag that is still open.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal" />



The newly added function WriteAttributeString accepts two parameters, which it
writes in the
form of a name-value pair. Thus, along with 'vijay', we see the attribute named
'wife', having a
value of 'sonal'. An attribute is analogous to an adjective of the English
language, in that, it
describes the object. In our case, it describes the tag 'vijay'. It divulges
additional
information about the properties of a tag.



XML does not interpret the contents of these tags. The word 'wife' or the value
'sonal', have no
special significance for XML, which is absolutely unconcerned about the
information provided
within the tags.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.WriteElementString("surname", "mukhi");

a.Flush();

a.Close();

}

}

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal">

<surname>mukhi</surname>

</vijay>



An element is a tag together with everything that it encloses. We have a tag
surname containing the value 'mukhi'. We can have multiple tags within the root
tag.



We have been reiterating the fact that we need to adhere to specific rules. You
may steer clear
of the beaten path and interchange the following two newly added functions as
follows:



a.WriteElementString("surname", "mukhi");

a.WriteAttributeString ("wife","sonal");



As a fallout of this interchange, the following exception will be thrown:



Unhandled Exception: System.InvalidOperationException: Token StartAttribute in
state Content
would result in an invalid XML document.



This exception is triggered off due to the fact that the attribute must be
specified first.
Then, and only then, should the child tags within the tag, be specified.
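
Should a program need to survive such a mistake, the exception can be caught
like any other. The following sketch merely reports the type of the exception.
It writes to a StringWriter instead of a file, and the names OrderDemo and Try
are our own.

```csharp
using System;
using System.IO;
using System.Xml;

public static class OrderDemo
{
    // Returns the name of the exception raised by the late attribute.
    public static string Try()
    {
        XmlTextWriter w = new XmlTextWriter(new StringWriter());
        try
        {
            w.WriteStartElement("vijay");
            w.WriteElementString("surname", "mukhi");
            w.WriteAttributeString("wife", "sonal");   // too late: content has begun
            return "no exception";
        }
        catch (InvalidOperationException e)
        {
            return e.GetType().Name;
        }
        finally
        {
            w.Close();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Try());   // prints InvalidOperationException
    }
}
```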



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.WriteAttributeString ("friend","two");

a.WriteElementString("surname", "mukhi");

a.WriteElementString("books", "67");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal" friend="two">

<surname>mukhi</surname>

<books>67</books>

</vijay>



To summarize, the WriteDocType function specifies the root tag, the
WriteStartElement function opens a tag, the WriteAttributeString function adds
an attribute to the active tag, and the WriteElementString function writes a
complete tag, with its content, within the active tag. We can enumerate as many
attributes as we desire. They will all be clustered together in the start tag.
The WriteElementString function is also capable of creating as many tags as are
needed under a tag.



In the file b.xml, we see two attributes and two tags, under the root tag 'vijay'.




a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("friend","two");

a.WriteStartElement("mukhi");

a.WriteAttributeString ("wife","sonal");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay friend="two">

<mukhi wife="sonal" />

</vijay>



In the above example, 'vijay' is the root tag, with the attribute 'friend',
which is assigned a value of 'two'. It also has a child tag 'mukhi' having the
attribute of 'wife' initialized to 'sonal'. Both the tags, 'vijay' and 'mukhi',
are created using the function WriteStartElement. Unlike the function
WriteElementString, which creates a start and end tag, WriteStartElement creates
only a start tag.



A child tag too can be endowed with attributes. The active tag is the one last
opened by the WriteStartElement function. Functions such as
WriteAttributeString, act on the active tag. Thus, we notice that the attribute
'wife' belongs to the tag 'mukhi' and not 'vijay'. Finally, since the tag
'mukhi' is devoid of any contents, it ends with a / symbol on the same line.




a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("friend","two");

a.WriteStartElement("mukhi");

a.WriteAttributeString ("wife","sonal");

a.WriteFullEndElement();

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay friend="two">

<mukhi wife="sonal">

</mukhi>

</vijay>



The function WriteFullEndElement marks the end of the active tag. Therefore, the
single tag
'mukhi', does not end with a / symbol on the same line. It has an ending tag
instead. Both these
possibilities are equally valid in this case. But, if the tags embody any
contents, then both
the start and the end tags are mandatory. In such situations, a single empty tag
would just not suffice.
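
The two ways of ending a tag can be compared side by side. The sketch below
leaves the Formatting property at its default of None, so that each result
occupies a single line. It uses WriteEndElement, the counterpart of
WriteFullEndElement in the XmlTextWriter class, and the names EndTagDemo and
Write are invented for the purpose.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EndTagDemo
{
    public static string Write(bool full)
    {
        StringWriter text = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(text);
        w.WriteStartElement("vijay");
        w.WriteStartElement("mukhi");
        w.WriteAttributeString("wife", "sonal");
        if (full)
            w.WriteFullEndElement();   // always writes a separate end tag
        else
            w.WriteEndElement();       // an empty tag ends with a / symbol
        w.WriteEndElement();           // ends the tag vijay
        w.Close();
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write(false));  // <vijay><mukhi wife="sonal" /></vijay>
        Console.WriteLine(Write(true));   // <vijay><mukhi wife="sonal"></mukhi></vijay>
    }
}
```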



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

//a.WriteComment("comment 1");

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteStartDocument();

a.WriteComment("comment 1");

a.WriteDocType("vijay", null, null ,null);

a.WriteComment("comment 2");

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.WriteComment("comment 3");

a.WriteElementString("surname", "mukhi");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!--comment 1-->

<!DOCTYPE vijay>

<!--comment 2-->

<vijay wife="sonal">

<!--comment 3-->

<surname>mukhi</surname>

</vijay>



Every programming language extends the facility of writing comments, even though
it may be a
seldom used feature. Programmers insert comments amidst their code to document
or explain the
functioning of their programs. At times, comments assist in deciphering the code
from the
programmer's perspective. Practically, it may be easier to teach an elephant how
to tap-dance,
than to convince a programmer to write comments.



In the XML world, comments begin with <!--, and end with -->. This is the same
as the HTML syntax, since both languages inherit it from SGML.



Comments are like a liquid, since they can be moulded to fit in almost anywhere,
except before the XML declaration. When a declaration is present, it has to be
the first thing in the file. If you uncomment the first call to WriteComment in
the program above, which is placed before WriteStartDocument, an exception will
be thrown with the following message:



Unhandled Exception: System.InvalidOperationException: WriteStartDocument should
be the first
call.





Thus, functions such as WriteComment, can be used to insert comments anywhere in
the code,
primarily for the purpose of documentation, which would enable even an alien
from outer space to
decipher the code better.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteProcessingInstruction ("sonal", "mukhi=no");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>

<?sonal mukhi=no?>

</vijay>



A line beginning with <?, other than the XML declaration, is called a Processing
Instruction (PI). This line is inserted using the function
WriteProcessingInstruction, which is passed two parameters:

• the first is the name of the processing instruction.


• the second is the text that is to be inserted for the processing instruction.




A Processing Instruction is used by an XML file to communicate with other
programs during the performance of certain tasks. XML does not have the
wherewithal to execute instructions. It therefore delegates this task to the
program that processes the XML file. When this program encounters the processing
instruction, and if it is able to understand it, it executes it. In cases where
it cannot comprehend it, the program simply ignores the instruction. This is the
methodology by which XML communicates with external programs.



In our program, the instruction 'sonal' is ignored, as it does not provide any
meaningful input

to the processor.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteString("mukhi");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>mukhi</vijay>



An XML file mainly consists of strings and tags. The WriteString function is
very extensively
exploited, since it writes content/strings between tags.



In the above example, the text 'mukhi' is enclosed within the tags of 'vijay'.
Even though we have not explicitly asked the XmlTextWriter class to close the
tag, the Close function automatically closes every tag that is still open. An
ending tag, rather than a / symbol, has been used because there exists some
content after the opening tag.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("friend","two");

a.WriteString("hi");

//a.WriteAttributeString ("friend","three");

a.WriteStartElement("mukhi");

a.WriteAttributeString ("friend","two");

a.WriteString("bye");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay friend="two">hi<mukhi friend="two">bye</mukhi></vijay>



The function WriteString can be inserted almost anywhere in the program. The
first WriteString

function writes 'hi' between the tags of 'vijay', while the second WriteString
function writes

'bye' between the tags of 'mukhi'. The WriteString is aware of the active tag.
Therefore, it

inserts the text accordingly. Here also, if we uncomment the line,

a.WriteAttributeString("friend","three"), the following exception will be
generated.



Unhandled Exception: System.InvalidOperationException: Token StartAttribute in
state Content

would result in an invalid XML document.



The XmlTextWriter class is very strict and meticulous in the sense that, it
expects a certain order to be maintained, or else, it throws an exception. For
instance, an element or a tag has to be created first. Only then, can all the
attributes be written; and finally, the text or content has to be supplied. We
are not permitted to write the text first and enter the attributes later. In the
XmlTextWriter class, there is no going back. It is a one-way path, which only
moves in the forward direction.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteCharEntity ('A');

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>&#x41;</vijay>



During our exploratory journey of XML, we shall discuss a large number of
characters that are 'reserved'. They have a special significance and cannot be
used literally. Such characters can be written as a character entity, i.e. the
Unicode value of the character in hex format. The function WriteCharEntity
performs this task. It accepts a char as a parameter and writes its value in
hex, prefaced with the &#x symbol and followed by a semi-colon.



For those who do not understand hexadecimal and consider it Greek and Latin, 41
hex is equal to decimal 65, which is the ASCII value for the capital letter A.
You can pass different characters to this function and see their equivalent hex
values.
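
The arithmetic can be verified with a few lines of C#. This sketch writes to a
StringWriter rather than to b.xml, and the names CharEntityDemo, Hex and Write
are our own.

```csharp
using System;
using System.IO;
using System.Xml;

public static class CharEntityDemo
{
    // The Unicode value of a character, formatted in hex.
    public static string Hex(char c)
    {
        return ((int)c).ToString("X");
    }

    public static string Write(char c)
    {
        StringWriter text = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(text);
        w.WriteStartElement("vijay");
        w.WriteCharEntity(c);
        w.Close();
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine((int)'A');     // 65
        Console.WriteLine(Hex('A'));     // 41
        Console.WriteLine(Write('A'));   // <vijay>&#x41;</vijay>
    }
}
```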

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteCData("mukhi & <sonal>");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay><![CDATA[mukhi & <sonal>]]></vijay>



The above program introduces a new function called WriteCData, which creates a
node called CDATA. The parameter passed to this function is placed as it is, but
is enclosed within the markers <![CDATA[ and ]]>.



A CDATA section is used whenever we want to use characters such as <, >, & and
the likes, in
their literal sense, which would otherwise be mistaken for Markup characters.
Thus, in the above
program, the CDATA section that contains the symbol &, interprets it as the
literal character &,
and not as a special character. Also, <sonal> is not recognized as a tag in this
section. A
CDATA section cannot be nested within another CDATA section.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteString("<A>&");

a.WriteCData("<A>&");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>&lt;A&gt;&amp;<![CDATA[<A>&]]></vijay>



This program illustrates certain characters that are special to XML. These are
the obvious characters, such as <, > and &, since they are used to write the
markup of an XML file. Thus, whenever the WriteString function comes across the
following symbols, it replaces them with the symbols depicted against each:

• < is replaced with '&lt;'

• > is replaced with '&gt;'

• & is replaced with '&amp;'.



If the same string that contains the above mentioned special characters is
placed within a CDATA section, it gets written verbatim, without any
conversions.
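
The difference between the two functions is easiest to see when the same string
is passed to each in turn. In this sketch the Formatting property is left at
None and the output goes to a StringWriter. The names EscapeDemo, AsText and
AsCData are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EscapeDemo
{
    // Writes the string as ordinary text; special characters are escaped.
    public static string AsText(string s)
    {
        StringWriter text = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(text);
        w.WriteStartElement("vijay");
        w.WriteString(s);
        w.Close();
        return text.ToString();
    }

    // Writes the string as a CDATA section; nothing is converted.
    public static string AsCData(string s)
    {
        StringWriter text = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(text);
        w.WriteStartElement("vijay");
        w.WriteCData(s);
        w.Close();
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(AsText("<A>&"));    // <vijay>&lt;A&gt;&amp;</vijay>
        Console.WriteLine(AsCData("<A>&"));   // <vijay><![CDATA[<A>&]]></vijay>
    }
}
```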





a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteEntityRef("Hi");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>&Hi;</vijay>



The entity ref is very straightforward to understand. The string passed to the
function WriteEntityRef is placed in the XML file, preceded by a '&' sign and
followed by a semi-colon. An entity ref in XML is equivalent to a variable. It
is included to provide flexibility to the document.

Thus in the above code, a reference to a variable called 'Hi' is written. The
task of stating what 'Hi' signifies is left to an ENTITY declaration in the DTD.
Since our file has no such declaration, a parser reading b.xml would report an
error.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteRaw("<A>&");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay><A>&</vijay>



The WriteRaw function writes the characters passed to it, without carrying out
any conversions.
The above XML file is obviously erroneous, as no end tag has been specified for
the tag A. Also,
no name has been specified after the & sign.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

Boolean b = true;

a.WriteElementString("Logical", XmlConvert.ToString(b));

Int32 c = -2147483648;

a.WriteElementString("SmallInt", XmlConvert.ToString(c));

Int64 d = 9223372036854775807;

a.WriteElementString("Largelong", XmlConvert.ToString(d));

Single e = ((Single)22)/((Single)7);

a.WriteElementString("Single", XmlConvert.ToString(e));

Double f = 1.79769313486231570E+308;

a.WriteElementString("Double", XmlConvert.ToString(f));

DateTime h = new DateTime(2001, 07, 08 ,22, 0, 30, 500);

a.WriteElementString("DateTime", XmlConvert.ToString(h));

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>

<Logical>true</Logical>

<SmallInt>-2147483648</SmallInt>

<Largelong>9223372036854775807</Largelong>

<Single>3.142857</Single>

<Double>1.7976931348623157E+308</Double>

<DateTime>2001-07-08T22:00:30.5000000+05:30</DateTime>

</vijay>



The above example contains a plethora of data types such as boolean, int, long,
float, double and DateTime.



The XmlConvert class has a large number of static functions that help us convert
one data type

to another. One such function is the ToString function. For types such as int or
long, the

smallest and the largest values are used, in order to check the veracity of the
ToString

function.



The ToString function is overloaded to handle many more data types than we have
shown. The point
here is that, it is possible for us to convert any data type into a string and
write it to disk.
This factor gains immense importance when data is being received from a
database, and requires
to be converted into a string in an XML file.
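
The reverse journey is equally simple, since XmlConvert also has functions that
turn the strings back into data types. The sketch below is only an illustration
with names of our own choosing. For the DateTime, it uses the overload that
accepts an XmlDateTimeSerializationMode, on the understanding that the
single-parameter overload used above is marked obsolete in later versions of the
framework.

```csharp
using System;
using System.Xml;

public static class ConvertDemo
{
    // Converts a DateTime to its XML form and back again.
    public static DateTime RoundTrip(DateTime h)
    {
        string s = XmlConvert.ToString(h, XmlDateTimeSerializationMode.RoundtripKind);
        return XmlConvert.ToDateTime(s, XmlDateTimeSerializationMode.RoundtripKind);
    }

    public static void Main()
    {
        // The usual ToString of a bool is capitalised; XML expects lower case.
        Console.WriteLine(true.ToString());             // True
        Console.WriteLine(XmlConvert.ToString(true));   // true

        // From a string back to the data type.
        Console.WriteLine(XmlConvert.ToBoolean("true"));
        Console.WriteLine(XmlConvert.ToInt32("-2147483648") == Int32.MinValue);

        DateTime h = new DateTime(2001, 7, 8, 22, 0, 30, 500, DateTimeKind.Utc);
        Console.WriteLine(RoundTrip(h) == h);
    }
}
```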



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");

a.WriteString("1-861003-78");

a.WriteEndAttribute();

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy" />



In the above example, we have introduced the WriteStartAttribute function. As is
apparent from its name, it starts an attribute. The first parameter to this
function is 'hi', which is the namespace prefix of the attribute. The second
parameter 'mukhi' is the name of the attribute.



The names assigned to attributes and tags may not always result in a unique
name. A programmer
may inadvertently create a tag or an attribute with a name that already exists.
How then does
XML decide what the tag denotes?



To help resolve such potential conflicts, each tag or attribute can be prefaced
with a short name known as the namespace prefix. This is followed by a colon
sign. Normally, meaningful names are assigned, rather than words like 'hi'. The
prefix xmlns is reserved by XML for declaring namespaces. The concept of
namespaces in XML is similar to the concept of namespaces in C#.



The third parameter is a Uniform Resource Identifier (URI), which is the actual
name of the namespace. The prefix 'hi' is merely a shorthand for it. The URI
only has to be unique; XML never tries to fetch anything from it. The writer
adds the attribute xmlns:hi to tie the prefix 'hi' to the URI, which in this
case is xxx:yyy. As the WriteStartAttribute function does not specify any value
for the attribute, the WriteString function is employed to assign the value
1-861003-78, to the attribute 'mukhi' in the namespace 'hi'.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");

string p = a.LookupPrefix("sonal:wife");

a.WriteStartAttribute(p, "mukhi", "sonal:wife");

a.WriteString("sonal");

a.WriteEndAttribute();

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay xmlns:bk="sonal:wife" bk:mukhi="sonal" />



Here, the function WriteAttributeString is called with four parameters. The
first, as always, is the prefix, i.e. xmlns. The second is the name of the
attribute i.e. bk, which is suffixed to the prefix, as xmlns:bk. The third
parameter is the namespace URI. In the earlier program, we had specified the
value of xxx:yyy for the URI. For this program, since the prefix xmlns is
reserved, the URI parameter is specified as null. The last parameter is the
value of the attribute.

As a consequence, the above function writes the attribute xmlns:bk="sonal:wife",
which declares bk as the prefix for the namespace URI sonal:wife. The next
function LookupPrefix, accepts a namespace URI and returns the prefix. As the
parameter supplied to this function is sonal:wife, the prefix returned is bk,
which is stored in p.



The WriteStartAttribute then uses the following:

• 'bk' as the prefix,

• 'mukhi' as the name of the attribute, and

• 'sonal:wife' as the namespace URI.



Thus, the attribute 'mukhi' is prefaced with the prefix 'bk'. Finally, the
WriteString function assigns the value of 'sonal' to the attribute bk:mukhi.
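
What a reader of the file eventually sees is the pairing of the attribute name
with the namespace URI; the prefix is only a shorthand. The sketch below writes
a prefixed attribute in a single call, using the four-parameter form of
WriteAttributeString, and then looks it up again by URI with the XmlDocument
class. The names NamespaceDemo, Write and ReadBack are our own, and the output
goes to a StringWriter rather than to b.xml.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NamespaceDemo
{
    public static string Write()
    {
        StringWriter text = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(text);
        w.WriteStartElement("vijay");
        // prefix, attribute name, namespace URI, value
        w.WriteAttributeString("bk", "mukhi", "sonal:wife", "sonal");
        w.Close();
        return text.ToString();
    }

    // Finds the attribute by its name and namespace URI, not by its prefix.
    public static string ReadBack(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("mukhi", "sonal:wife");
    }

    public static void Main()
    {
        string xml = Write();
        Console.WriteLine(xml);             // contains bk:mukhi="sonal" and xmlns:bk="sonal:wife"
        Console.WriteLine(ReadBack(xml));   // prints sonal
    }
}
```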



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");

a.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay xmlns:bk="sonal:wife" jjj:bk="sonal:wife" xmlns:jjj="kkk" />



In this version of the WriteAttributeString function, the prefix is jjj and the attribute
name is bk, with the value sonal:wife. Thus, the attribute becomes jjj:bk="sonal:wife".
The third parameter to the function is the namespace URI, which is now assigned a value
of kkk, instead of null.


Thus, one more attribute xmlns:jjj gets added, which binds the prefix jjj to the
namespace URI kkk. We notice that no such declaration gets added for the reserved xmlns
prefix. We have chosen the attribute name 'bk' again, just to demonstrate that the two
names belong to different namespaces.

Therefore, this bk is considered to be a different attribute from the earlier bk.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteStartAttribute(null,"sonal", null);

a.WriteQualifiedName("mukhi", "http://vijaymukhi.com");

a.WriteEndAttribute();

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay sonal="n1:mukhi" xmlns:n1="http://vijaymukhi.com" />



In the WriteStartAttribute function, only the second of the three parameters has a value,
'sonal', which is the name of the attribute. The first parameter, which is the prefix,
and the third parameter, which is the URI of the namespace, are both assigned null
values.



The next function, WriteQualifiedName assigns a value to the attribute 'sonal'.
This function
takes two parameters, the value 'mukhi' and the namespace URI for the value.



The value 'mukhi' gets prefaced by a prefix n1, which the writer generates on its own,
because no prefix is in scope for the given URI. The writer also emits the declaration
xmlns:n1, which binds n1 to the URI specified in the second parameter,
http://vijaymukhi.com. The exact name that gets generated is an implementation detail of
the writer. In general, the method WriteQualifiedName first looks up a prefix that is
already in scope for the given namespace.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteStartElement("vijay");

a.WriteAttributeString("xmlns","mukhi",null,"xxx:yyy");

a.WriteString("Hi ");

a.WriteQualifiedName("sonal","xxx:yyy");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<vijay xmlns:mukhi="xxx:yyy">Hi mukhi:sonal</vijay>



In this example, we first declare the prefix 'mukhi' by means of the reserved xmlns
prefix, and bind it to the URI xxx:yyy. The WriteString function writes 'Hi ' as the
content and then, the WriteQualifiedName writes the string 'sonal'. However, since
'sonal' is a qualified name, it is prefaced by 'mukhi' and not by xxx:yyy, because
'mukhi' is equated to xxx:yyy.



The prefix in the scope for the namespace is given precedence.
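
The scope of a prefix extends to nested elements as well. The sketch below assumes the hypothetical element names root and child and a hypothetical class QualifiedNameSketch, and writes to a StringWriter instead of a file. The prefix is declared on the outer element and is picked up by WriteQualifiedName inside the inner one.

```csharp
using System;
using System.IO;
using System.Xml;

public class QualifiedNameSketch
{
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(sw);
        w.WriteStartElement("root");
        // Bind the prefix mukhi to the URI xxx:yyy on the outer element.
        w.WriteAttributeString("xmlns", "mukhi", null, "xxx:yyy");
        w.WriteStartElement("child");
        // The URI is still in scope here, so its prefix is reused.
        w.WriteQualifiedName("sonal", "xxx:yyy");
        w.WriteEndElement();
        w.WriteEndElement();
        w.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        // <root xmlns:mukhi="xxx:yyy"><child>mukhi:sonal</child></root>
        Console.WriteLine(Build());
    }
}
```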



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteStartElement("vijay");

a.WriteElementString("vijay","mukhi");

a.WriteElementString("vijay","sonal","mukhi");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<vijay>

<vijay>mukhi</vijay>

<vijay xmlns="sonal">mukhi</vijay>

</vijay>



In its first call, the WriteElementString function has only two parameters, as in the
earlier program. In the second call, it has three parameters. The first and the third
parameters are the same, i.e. the tag name and the value. The newly inducted second
parameter indicates the namespace 'sonal'. The tag in the first parameter, 'vijay', has
the namespace of sonal. Thus, the XML file contains the tag with the attribute of
xmlns="sonal".



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter (Console.Out);

a.WriteStartDocument();

a.WriteStartElement("vijay");

a.Close();

}

}



Output

<?xml version="1.0" encoding="IBM437"?><vijay />

The XmlTextWriter class can write to different destinations, using the constructor that
accepts a single parameter. The Console class has a static property Out, of datatype
TextWriter, that represents the console. Thus, the output is now displayed on the
console. The encoding attribute is taken from the encoding of the TextWriter that is
passed in. On our console, it has a value of IBM437.
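
Any TextWriter will do in place of Console.Out. As a sketch, with a hypothetical class named EncodingSketch, the same two calls can be directed to a StringWriter. Since a StringWriter reports a UTF-16 encoding, the declaration should carry utf-16 instead of the console code page.

```csharp
using System;
using System.IO;
using System.Xml;

public class EncodingSketch
{
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        // The writer picks up the encoding reported by the TextWriter.
        XmlTextWriter w = new XmlTextWriter(sw);
        w.WriteStartDocument();
        w.WriteStartElement("vijay");
        // Close completes the open element before closing the writer.
        w.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        // <?xml version="1.0" encoding="utf-16"?><vijay />
        Console.WriteLine(Build());
    }
}
```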



One of the primary reasons for designing XML was to introduce validation of the
tags in order to
produce a well-evolved XML file.



There are a few validations that need to be performed in an XML file, such as:

• It should be ensured that the basic rules of XML as well as our own rules are followed.

• Certain tags should be placed only within specified tags and cannot be used
independently.

• The number of times a tag is used can be regulated, since it cannot be used infinite
times.

• A check should be placed on the name and the number of times an attribute is used
within a tag.



All such rules that need to be enforced are enunciated in XML parlance and then, placed
in a DTD or a Document Type Definition. The DTD may either be placed in a separate file
or may be made part of the DOCTYPE declaration. In the XML file shown below, the DTD is
internal.



Thus, a DTD stores the grammar that is permissible in an XML file. The entity references
are also defined in a DTD. XHTML, for instance, is HTML reformulated as XML, and its
rules are available in the form of a DTD.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

String s = "<!ELEMENT vijay (#PCDATA)>";

a.WriteDocType("vijay", null, null, s);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay[<!ELEMENT vijay (#PCDATA)>]>

<vijay />



The WriteDocType function accepts four parameters. The first parameter is the
starting or root
tag 'vijay'. Hence, it must contain a value. The last parameter is the subset
(as referred to by
the documentation), which follows the root tag 'vijay'. If you observe the
DOCTYPE statement
carefully, you will notice that an extra pair of square brackets [], have been
added.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteDocType("vijay", null, "a.dtd", null);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay SYSTEM "a.dtd">

<vijay />



The third parameter to WriteDocType function specifies the name of the DTD file.
In other words,
it states the URI of the DTD. The second parameter is assigned the value of
null. Hence, the
word SYSTEM is displayed before the name of the file, in the XML file.



Whenever a validating parser wishes to ensure the validity of the XML file, it ascertains
the rules from a.dtd. If both internal and external DTDs are present, both of them are
checked. However, the internal DTD is accorded priority.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteDocType("vijay", "mmm", "a.dtd", null);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}





b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay PUBLIC "mmm" "a.dtd">

<vijay />



In the earlier program, SYSTEM was added in the XML file, since the second parameter had
been specified as null. However, in this program, the second parameter is not null.
Hence, the word PUBLIC gets added. Thereafter, the string or the id specified in the
second parameter is added, and then, the DTD in the third parameter is specified.

Therefore, it is either the PUBLIC keyword or the SYSTEM keyword, which would be present.
The XML program or the processor scanning the XML file uses the PUBLIC identifier to
retrieve the content for the entities that use the URI. If it fails, it falls back upon
the SYSTEM literal.



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument(false);

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0" standalone="no"?>



The WriteStartDocument can take a boolean parameter that adds an attribute which
could either be
'standalone = yes' or 'standalone=no', depending upon the value specified. This
attribute
determines whether the DTD is in an external file or it is internal to the XML
file. If the
standalone has a value of 'yes', it is suggestive of the fact that there is no
external DTD, and
therefore, all the grammatical rules have to be placed within the XML file
itself.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

System.Console.WriteLine(a.WriteState);

a.WriteStartDocument();

System.Console.WriteLine(a.WriteState);

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

System.Console.WriteLine(a.WriteState);

a.WriteStartElement("vijay");

System.Console.WriteLine(a.WriteState);

a.WriteAttributeString ("wife","sonal");

System.Console.WriteLine(a.WriteState);

a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");

System.Console.WriteLine(a.WriteState);

a.WriteString("1-861003-78");

a.WriteElementString("surname", "mukhi");

a.Flush();

System.Console.WriteLine(a.WriteState);

a.Close();

System.Console.WriteLine(a.WriteState);

}

}



Output

Start

Prolog

Prolog

Element

Element

Attribute

Content

Closed



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal" hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy">

<surname>mukhi</surname>

</vijay>



The XmlTextWriter object can be in any one of six different states. The
WriteState property
reveals its current state. When an XmlTextWriter Object is created, it is in the
Start state, as
may be evident from the fact that, no write method has been called so far. After
the Close
function, the Writer is in the Closed state. When the WriteStartDocument and
WriteDocType
functions are called, they reach the Prolog state, because the prolog is being
written.



The WriteStartElement function actually starts writing to the XML file, thereby morphing
to the Element state. The element start tag 'vijay' begins the XML file. The next
function, WriteAttributeString, does not change the state, since the element in focus is
still 'vijay'. The WriteStartAttribute function needs the WriteString to complete the
attribute. Thus, after the WriteStartAttribute function executes, the XmlTextWriter
assumes the Attribute state. The surname element becomes the content in the XML file.
Hence, the state changes to Content.



This goes on to prove that the XmlTextWriter can be in any one of the above six states,
depending upon the entities written to the file. Note that the writer was still in the
Attribute state when WriteElementString was called. As b.xml shows, the writer completed
the open attribute itself before it started the surname element.
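
The state transitions can also be collected in a list and examined one by one. This is a sketch under our own assumptions: the class WriteStateSketch and its Trace method are hypothetical, and the output goes to a throwaway StringWriter. Here the attribute is ended explicitly, so the writer returns to the Element state before the child element is written.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public class WriteStateSketch
{
    public static List<WriteState> Trace()
    {
        List<WriteState> states = new List<WriteState>();
        XmlTextWriter w = new XmlTextWriter(new StringWriter());
        states.Add(w.WriteState);                   // Start
        w.WriteStartDocument();
        states.Add(w.WriteState);                   // Prolog
        w.WriteStartElement("vijay");
        states.Add(w.WriteState);                   // Element
        w.WriteStartAttribute(null, "wife", null);
        states.Add(w.WriteState);                   // Attribute
        w.WriteString("sonal");
        w.WriteEndAttribute();
        states.Add(w.WriteState);                   // Element
        w.WriteElementString("surname", "mukhi");
        states.Add(w.WriteState);                   // Content
        w.Close();
        states.Add(w.WriteState);                   // Closed
        return states;
    }

    public static void Main()
    {
        foreach (WriteState s in Trace())
            Console.WriteLine(s);
    }
}
```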



a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.Namespaces = false;

a.WriteStartDocument();

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");

a.Flush();

a.Close();

}

}



Output

Unhandled Exception: System.ArgumentException: Cannot set the namespace if
Namespaces is

'false'.

at System.Xml.XmlTextWriter.WriteStartAttribute(String prefix, String localName,
String ns)

at System.Xml.XmlWriter.WriteAttributeString(String prefix, String localName,
String ns, String

value)

at zzz.Main()



The XmlTextWriter class has a Namespaces property that is read-write, and it has a
default value of true. Namespace support is turned off by setting this property to false.
The above runtime exception is thrown because we have attempted to introduce a prefix
jjj and a namespace URI kkk in the WriteAttributeString function.
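
If a program cannot be sure how the writer has been configured, the exception can be trapped. The sketch below uses a hypothetical class NamespacesOffSketch and an in-memory StringWriter. It merely reports whether the four-parameter call was rejected, and makes no attempt to go on using the writer after the failure.

```csharp
using System;
using System.IO;
using System.Xml;

public class NamespacesOffSketch
{
    public static bool PrefixedAttributeIsRejected()
    {
        XmlTextWriter w = new XmlTextWriter(new StringWriter());
        // The property is set before anything has been written.
        w.Namespaces = false;
        w.WriteStartElement("vijay");
        try
        {
            // A prefix and a URI are not permitted when Namespaces is false.
            w.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(PrefixedAttributeIsRejected());  // True
    }
}
```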



a.cs

using System;

using System.Xml;

public class zzz {

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.QuoteChar = '\'';

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("jjj", "bk");

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay jjj='bk' />

Various facets of XML can be modified. By using the property QuoteChar, we can
modify the
default quoting character, from double inverted commas to single inverted
commas. Since a single
quote cannot be enclosed within a set of single quotes, we use the backslash to
escape it. All
attributes can now be placed in single quotes instead of double quotes.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");

a.WriteString("1-861003-78");

a.WriteEndAttribute();

a.WriteEndElement();

a.WriteEndDocument();

a.Flush();

a.Close();

}

}



b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy" />



Good programming style necessitates every 'open' to have a corresponding 'close'. Thus,
the Start functions for an Element, Attribute and Document have corresponding End
functions too. However, if we do not End them, they close by default and no major
calamity befalls them. We are using them in the above program out of an abundance of
caution.
The WriteEndDocument function puts the XmlTextWriter back in the Start state.
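
The closing by default can be observed with nested elements. In this sketch, the class EndDocumentSketch and the element names a and b are hypothetical, and a StringWriter replaces the file. Neither element is ended explicitly, and WriteEndDocument is left to close both.

```csharp
using System;
using System.IO;
using System.Xml;

public class EndDocumentSketch
{
    public static string Build(out WriteState after)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(sw);
        w.WriteStartDocument();
        w.WriteStartElement("a");
        w.WriteStartElement("b");
        // No WriteEndElement calls: WriteEndDocument closes b and then a.
        w.WriteEndDocument();
        after = w.WriteState;
        w.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        WriteState after;
        string xml = Build(out after);
        Console.WriteLine(xml);    // the text ends with <a><b /></a>
        Console.WriteLine(after);  // Start
    }
}
```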



Reading an XML file



b.xml

<?xml version="1.0" standalone="yes"?>

<!DOCTYPE vijay SYSTEM "a.dtd" [<!ENTITY baby "No">]>

<vijay aa="no">

<!--comment 2--><?sonal mukhi=no?>

Hi&baby;

<![CDATA[,mukhi>]]><aa>bb</aa>

</vijay>



> copy con a.dtd

Enter

^Z



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextReader r;

r = new XmlTextReader("b.xml");

while (r.Read())

{

Console.Write("{0} D={1} L={2} P={3} ", r.NodeType, r.Depth, r.LineNumber,
r.LinePosition );

Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);

Console.WriteLine();

}

}

}



Output

XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2

Whitespace D=0 L=1 P=39 name= value= AC=0

DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1

Whitespace D=0 L=2 P=54 name= value= AC=0

Element D=0 L=3 P=2 name=vijay value= AC=1

Whitespace D=1 L=3 P=16 name= value= AC=0

Comment D=1 L=4 P=5 name= value=comment 2 AC=0

ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0

Text D=1 L=4 P=36 name= value=Hi AC=0

EntityReference D=1 L=5 P=4 name=baby value= AC=0

Whitespace D=1 L=5 P=9 name= value= AC=0

CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0

Element D=1 L=6 P=21 name=aa value= AC=0

Text D=2 L=6 P=24 name= value=bb AC=0

EndElement D=1 L=6 P=28 name=aa value= AC=0

Whitespace D=1 L=6 P=31 name= value= AC=0

EndElement D=0 L=7 P=3 name=vijay value= AC=0



In this program, we read an XML file and display all the nodes contained
therein. To avoid any
errors from being displayed, you should create an empty file by the name of
a.dtd.



We have a class called XmlTextReader that accepts a filename as a parameter. We
pass the
filename b.xml to it. This file contains most of the entities present in an XML
file. The Read
function in this class picks up a single node or XML entity at a time. It
returns true, if there
are more nodes to be read, or else, it returns false. Thus, when there are no
more nodes to be
read from the file, the while loop ends. The Read function scans the active node
and displays
its contents in the loop.

 


The NodeType property displays the name of the node type. As an XML file normally starts
with a declaration, the NodeType property displays the NodeType as XmlDeclaration, using
the ToString function.



The Depth property gets incremented by one, every time an element or a tag is
encountered. At
the Declaration statement, the depth is 0. At the EndElement or at the end of
the tag, its value
reduces by one. Thus, the Depth property reveals the number of open tags in the
file and it can
be used for indentation.

The LineNumber property indicates the line on which the statement is positioned, while
the LinePosition property displays the position on the line at which the statement
begins. The Name property in the class reveals the name of the node, xml. The output
displayed by this property depends upon the active node type. On acute observation, you
shall notice that the word xml is not preceded by the symbol <? in the output.



The value property relates to the name property, in this case, to
XmlDeclaration. It displays
the entire gamut of attributes to the node. As there exist two attributes,
version and
standalone, the property AttributeCount displays a value of 2.



If the Enter key is pressed after the node declaration, it is interpreted as a Whitespace
node. Whitespace characters are separators, which could consist of an enter, space et al.
The LinePosition property specifies the character position as 39.



The XmlDeclaration has to be the first node in an XML file, and it cannot have
any children. The
DOCTYPE declaration, which is known as a DocumentType Node, displays the name as
vijay, which is
the root node. The value is displayed as <!ENTITY baby "No">, which includes
everything except
the SYSTEM and a.dtd. Thus, in the case of a DocumentType Node, value is the
internal DTD.


We shall encounter the Whitespace node very frequently. Hence, we shall not discuss it
hereinafter. The attributes of the DocumentType node will be displayed in the next
program. A DocumentType node can have the Notation and Entity nodes as child nodes.



The next node in sequence is our very first element or tag 'vijay', which is the same
value that was displayed earlier, with the Name property for the DocumentType node. The
Value property for this element shows an empty string, since tags are devoid of values.
Instead, they have attributes.



The AttributeCount property displays a value of one. At the following Whitespace node,
the Depth property gets incremented by one. This is the only way to ascertain whether we
are at the root node or not. We now stumble upon a comment, which has no name. The value
displayed is the value of the comment. And yet again, the <!-- and --> characters are not
displayed along with the value.



Thereafter, a processing instruction (PI) is encountered. No whitespace is displayed
between the comment and the PI, since we have not pressed the Enter key. 'sonal' becomes
the name, i.e. the target of the PI. The rest turns into the Value property, and a PI has
no attributes. A Text node is displayed next, because the text 'Hi' is present in the XML
file. This node too is not assigned any name and the value is depicted as 'Hi'.



What follows the text is an Entity Reference. It is assigned the name 'baby' and is
devoid of the ampersand sign. Its value is empty and it does not have any attributes. The
CDATA section has an empty name. The value is assigned the content of the CDATA, after
stripping away the <![CDATA[ and ]]> delimiters.



The value of the Depth property is incremented by 1. The Text Node follows the
element aa. This
node does not have any name and it displays the value as 'bb'. In the following
program, we
explore the various attributes.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r;

r = new XmlTextReader("b.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

while (r.Read())

{

Console.Write("{0} D={1} L={2} P={3}",r.NodeType,r.Depth,r.LineNumber,
r.LinePosition);

Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);

Console.WriteLine();

if (r.HasAttributes)

{

for ( int i =0; i < r.AttributeCount; i++)

{

r.MoveToAttribute(i);

System.Console.WriteLine("Att {0}={1}",r.Name,r[i]);

}

}

}

}

}



Output

XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2

Att version=1.0

Att standalone=yes

DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1

Att SYSTEM=a.dtd

Element D=0 L=3 P=2 name=vijay value= AC=1

Att aa=no

Comment D=1 L=4 P=5 name= value=comment 2 AC=0

ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0

Text D=1 L=4 P=36 name= value=

Hi AC=0

EntityReference D=1 L=5 P=4 name=baby value= AC=0

CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0

Element D=1 L=6 P=21 name=aa value= AC=0

Text D=2 L=6 P=24 name= value=bb AC=0

EndElement D=1 L=6 P=28 name=aa value= AC=0

EndElement D=0 L=7 P=3 name=vijay value= AC=0



A property called WhitespaceHandling is initialized to None, as a result of which, the
Whitespace node is not visible in the output.



The XmlTextReader has a member HasAttributes, which returns a True value if the
node has
attributes and False otherwise. Alternatively, we could also have used the
property
AttributeCount to obtain the number of attributes that the node contains.





If the node has attributes, a 'for statement' is used to display all of them. In
the loop, we
first use the function MoveToAttribute to initially activate the attribute. This
is achieved by
passing the number as a parameter to the function. Bear in mind that the index
starts from Zero
and not One.



Thereafter, the Name property is used to display the name of the attribute. If
the attribute is
not activated, the Name property displays the name of the node. This explains
the significance
of the MoveToAttribute function.



As you would recall, the XmlTextReader class has an indexer for the attributes,
and like all
indexers, it is zero based, i.e. r[0] accesses the value of the first attribute.
This is how we
display the details of all attributes of the node.



For the node DOCTYPE, the SYSTEM becomes the name of the attribute and the value
becomes the
name of the DTD file. For an element, the attributes are specified in name-value
pairs.
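
Besides the index-based loop, the reader can walk the attributes with MoveToFirstAttribute and MoveToNextAttribute, and then return to the owning node with MoveToElement. The following is a sketch only: the class AttributeWalkSketch is hypothetical, and the XML is supplied as a string through a StringReader rather than read from b.xml.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public class AttributeWalkSketch
{
    public static List<string> Collect(string xml, out XmlNodeType afterWalk)
    {
        List<string> pairs = new List<string>();
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        if (r.MoveToFirstAttribute())
        {
            do
            {
                // Name and Value now describe the attribute, not the element.
                pairs.Add(r.Name + "=" + r.Value);
            } while (r.MoveToNextAttribute());
            // Step back from the last attribute to the element itself.
            r.MoveToElement();
        }
        afterWalk = r.NodeType;
        r.Close();
        return pairs;
    }

    public static void Main()
    {
        XmlNodeType t;
        List<string> pairs = Collect("<vijay aa=\"no\" bb=\"yes\" />", out t);
        foreach (string p in pairs)
            Console.WriteLine("Att " + p);  // Att aa=no, then Att bb=yes
        Console.WriteLine(t);               // Element
    }
}
```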



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static XmlTextReader r;

public static void Main()

{

r = new XmlTextReader("b.xml");

int declaration=0, pi=0, doc=0, comment=0, element=0, attribute=0, text=0,

whitespace=0,cdata=0,endelement=0,

entityr=0,entitye=0,entity=0,swhitespace=0,notation=0;

while (r.Read())

{

Console.Write("{0} D={1} L={2} P={3}",r.NodeType,r.Depth,r.LineNumber,
r.LinePosition);

Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);

Console.WriteLine();

if (r.HasAttributes)

{

for ( int i =0; i < r.AttributeCount; i++)

{

r.MoveToAttribute(i);

System.Console.WriteLine("Att {0}={1}",r.Name,r[i]);

}

}

switch (r.NodeType)

{

case XmlNodeType.XmlDeclaration:

declaration++;

break;

case XmlNodeType.ProcessingInstruction:

pi++;

break;

case XmlNodeType.DocumentType:

doc++;

break;

case XmlNodeType.Comment:

comment++;

break;

case XmlNodeType.Element:

element++;

if (r.HasAttributes)

attribute += r.AttributeCount;

break;

case XmlNodeType.Text:

text++;

break;

case XmlNodeType.CDATA:

cdata++;

break;

case XmlNodeType.EndElement:

endelement++;

break;

case XmlNodeType.EntityReference:

entityr++;

break;

case XmlNodeType.EndEntity:

entitye++;

break;

case XmlNodeType.Notation:

notation++;

break;

case XmlNodeType.Entity:

entity++;

break;

case XmlNodeType.SignificantWhitespace:

swhitespace++;

break;

case XmlNodeType.Whitespace:

whitespace++;

break;

}

}

Console.WriteLine ();

Console.WriteLine("XmlDeclaration: {0}",declaration);

Console.WriteLine("ProcessingInstruction: {0}",pi);

Console.WriteLine("DocumentType: {0}",doc);

Console.WriteLine("Comment: {0}",comment);

Console.WriteLine("Element: {0}",element);

Console.WriteLine("Attribute: {0}",attribute);

Console.WriteLine("Text: {0}",text);

Console.WriteLine("Cdata: {0}",cdata);

Console.WriteLine("EndElement: {0}",endelement);

Console.WriteLine("Entity Reference: {0}",entityr);

Console.WriteLine("End Entity: {0}",entitye);

Console.WriteLine("Entity: {0}",entity);

Console.WriteLine("Whitespace: {0}",whitespace);

Console.WriteLine("Notation: {0}",notation);

Console.WriteLine("Significant Whitespace: {0}",swhitespace);

}

}



Output

XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2

Att version=1.0

Att standalone=yes

Whitespace D=0 L=1 P=39 name= value=

AC=0

DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1

Att SYSTEM=a.dtd

Whitespace D=0 L=2 P=54 name= value=

AC=0

Element D=0 L=3 P=2 name=vijay value= AC=1

Att aa=no

Whitespace D=1 L=3 P=16 name= value=

AC=0

Comment D=1 L=4 P=5 name= value=comment 2 AC=0

ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0

Text D=1 L=4 P=36 name= value=

Hi AC=0

EntityReference D=1 L=5 P=4 name=baby value= AC=0

Whitespace D=1 L=5 P=9 name= value=

AC=0

CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0

Element D=1 L=6 P=21 name=aa value= AC=0

Text D=2 L=6 P=24 name= value=bb AC=0

EndElement D=1 L=6 P=28 name=aa value= AC=0

Whitespace D=1 L=6 P=31 name= value=

AC=0

EndElement D=0 L=7 P=3 name=vijay value= AC=0

Whitespace D=0 L=7 P=9 name= value=

AC=0



XmlDeclaration: 0

ProcessingInstruction: 1

DocumentType: 0

Comment: 1

Element: 1

Attribute: 0

Text: 2

Cdata: 1

EndElement: 2

Entity Reference: 1

End Entity: 0

Entity: 0

Whitespace: 6

Notation: 0

Significant Whitespace: 0



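The above program is a continuation from where we left off in the previous program. The
initial portion of the code is identical. A colossal switch statement is introduced in
the program to check the NodeType.



For each node type, there is a corresponding variable, whose value is incremented by 1
whenever the node type matches. Then, the values contained in these variables are
displayed.



The counts for XmlDeclaration, DocumentType and Attribute are 0, and only one Element is
counted. This is because the attribute loop leaves the reader positioned on an attribute.
For any node that has attributes, the NodeType property therefore reports Attribute by
the time the switch statement executes, and no case matches it. Calling the
MoveToElement function after the loop would return the reader to the owning node.



Also, the NodeType property of the XmlTextReader never returns the following node
types - Document, DocumentFragment, Entity, EndEntity, or Notation.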
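
A more compact way of tallying the nodes is to keep the counts in a dictionary and to call MoveToElement once the attributes have been visited. This is a sketch under our own assumptions: the class NodeCountSketch, the Bump helper and the sample XML string are hypothetical, and the sample has no DTD and no whitespace between its nodes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public class NodeCountSketch
{
    // Adds one to the count stored for the given node type.
    static void Bump(Dictionary<XmlNodeType, int> counts, XmlNodeType type)
    {
        int n;
        counts.TryGetValue(type, out n);
        counts[type] = n + 1;
    }

    public static Dictionary<XmlNodeType, int> Count(string xml)
    {
        Dictionary<XmlNodeType, int> counts = new Dictionary<XmlNodeType, int>();
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        while (r.Read())
        {
            if (r.HasAttributes)
            {
                while (r.MoveToNextAttribute())
                    Bump(counts, XmlNodeType.Attribute);
                // Return to the node that owns the attributes.
                r.MoveToElement();
            }
            // NodeType now describes the node itself, not its last attribute.
            Bump(counts, r.NodeType);
        }
        r.Close();
        return counts;
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\"?><vijay aa=\"no\"><!--c--><aa>bb</aa></vijay>";
        foreach (KeyValuePair<XmlNodeType, int> kv in Count(xml))
            Console.WriteLine("{0}: {1}", kv.Key, kv.Value);
    }
}
```

With this arrangement, the declaration and the element that carries attributes are counted under their own node types.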



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r = new XmlTextReader("b.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

while (r.Read())

{

if (r.HasValue)

Console.WriteLine("{0} {1}={2}", r.NodeType, r.Name, r.Value);

else

Console.WriteLine("{0} {1}", r.NodeType, r.Name);

}

}

}



Output

XmlDeclaration xml=version="1.0" standalone="yes"

DocumentType vijay=<!ENTITY baby "No">

Element vijay

Comment =comment 2

ProcessingInstruction sonal=mukhi=no

Text =

Hi

EntityReference baby

CDATA =,mukhi>

Element aa

Text =bb

EndElement aa

EndElement vijay



The HasValue property simply identifies whether a Node can contain a value or
not. There are
nine nodes that can possess values. These nodes are Attribute, CDATA, Comment,
DocumentType,
ProcessingInstruction, Significant Whitespace, Whitespace, Text and
XmlDeclaration. All the
above nodes must have a value, but they need not necessarily have a name.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r = new XmlTextReader("b.xml");

r.MoveToContent();

string s = r["mukhi"];

Console.WriteLine(s);

s = r.GetAttribute("sonal");

Console.WriteLine(s);

s = r[2];

Console.WriteLine(s);

}

}



b.xml

<vijay mukhi="no" sonal="yes" aaa="bad" />



Output

no

yes

bad



The MoveToContent function moves to the first element in the XML file.





In this program, we display the attributes using different methods. In the first
approach, the indexer is passed a string, which is the name of the attribute 'mukhi'. It
receives 'no' as the return value.



In the second approach, the GetAttribute function is given the string 'sonal' as a
parameter, and it returns the value of the attribute as 'yes'.



In the third approach, the indexer is passed the integer value 2 as a parameter, to
access the value of the third attribute, which is 'bad'. Thus, there are multiple means
of achieving the same objective.



a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextReader r = new XmlTextReader("b.xml");

r.MoveToContent();

string s ;

s = r.GetAttribute("aa:bb");

Console.WriteLine(s);

s = r.GetAttribute("bb");

Console.WriteLine(s);

s = r.GetAttribute("bb","sonal:mukhi");

Console.WriteLine(s);

s = r.GetAttribute("bb","sonal:mukhi");

Console.WriteLine(s);

s = r.GetAttribute("bb","aa");

Console.WriteLine(s);

s = r.GetAttribute("xmlns:aa");

Console.WriteLine(s);

}

}



b.xml

<vijay xmlns:aa="sonal:mukhi" aa:bb="no" />



Output

no



no

no



sonal:mukhi

The MoveToContent function is used in this program, instead of the Read function. In the
file b.xml, we have an attribute bb with the prefix aa. It is initialized to a value of
'no'. The prefix aa is bound to the namespace URI sonal:mukhi by the xmlns declaration.
Thus, the full name of the attribute becomes aa:bb i.e. the prefix, followed by the
colon, followed by the actual name. As a result, specifying aa:bb results in the display
of 'no', but only specifying bb as a parameter to GetAttribute results in a null value.



The full name of an attribute includes the prefix too. Alternatively, we can use the
second form of the GetAttribute function, an overload with two parameters, where the
second parameter is the namespace URI and not the prefix. Hence, it is acceptable to call
the function with the URI sonal:mukhi, but if we use the prefix aa, no output will be
produced.

The last GetAttribute utilizes the full name xmlns:aa to retrieve the URI that is bound
to the prefix aa. Thus, an attribute can be looked up either by its prefix:name, or by
its name together with the namespace URI.
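
The lookups can be gathered into an array, so that the null results become visible instead of appearing as blank lines. In this sketch, the class GetAttributeSketch is hypothetical, the XML is fed in through a StringReader, and a call to LookupNamespace is added to show the URI that the prefix aa resolves to.

```csharp
using System;
using System.IO;
using System.Xml;

public class GetAttributeSketch
{
    public static string[] Lookups()
    {
        string xml = "<vijay xmlns:aa=\"sonal:mukhi\" aa:bb=\"no\" />";
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        string[] results = new string[]
        {
            r.GetAttribute("aa:bb"),              // by qualified name
            r.GetAttribute("bb"),                 // local name alone
            r.GetAttribute("bb", "sonal:mukhi"),  // local name and URI
            r.GetAttribute("bb", "aa"),           // a prefix is not a URI
            r.LookupNamespace("aa")               // URI bound to the prefix
        };
        r.Close();
        return results;
    }

    public static void Main()
    {
        // Prints: no, (null), no, (null), sonal:mukhi
        foreach (string s in Lookups())
            Console.WriteLine(s ?? "(null)");
    }
}
```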

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz {

public static void Main() {

XmlTextReader r = new XmlTextReader("b.xml");

r.WhitespaceHandling=WhitespaceHandling.None;

r.MoveToContent();

r.MoveToAttribute("cc");

Console.WriteLine(r.Name + " " + r.Value);

Console.WriteLine(r.ReadAttributeValue());

Console.WriteLine(r.Name + " " + r.Value);

}

}



b.xml

<vijay aa="hi" bb="bye" cc="no" />



Output

cc no

True

 no



In this example, we directly focus on the attribute that we are interested in, i.e. cc.
The Name and Value properties in XmlTextReader display 'cc' and 'no' respectively. The
ReadAttributeValue function parses the value of the attribute into one or more Text or
EntityReference nodes. It returns True here, because the value of cc contains a text
node that is to be read.



After the function ReadAttributeValue is called, the reader is positioned on that text
node. Hence, the Name property of the XmlTextReader becomes empty, while the Value
property still displays 'no'.
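
Since ReadAttributeValue returns False once the value has been consumed, it is normally placed in a loop. The sketch below assumes a hypothetical class AttributeValueSketch and reads the XML from a string. For a plain attribute such as cc, the loop executes exactly once and yields a single Text node.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public class AttributeValueSketch
{
    public static List<string> Parts()
    {
        List<string> parts = new List<string>();
        string xml = "<vijay aa=\"hi\" bb=\"bye\" cc=\"no\" />";
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        r.MoveToAttribute("cc");
        // Each pass positions the reader on one node of the attribute value.
        while (r.ReadAttributeValue())
            parts.Add(r.NodeType + ":" + r.Value);
        r.Close();
        return parts;
    }

    public static void Main()
    {
        foreach (string p in Parts())
            Console.WriteLine(p);  // Text:no
    }
}
```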