I suggest using Eclipse as our IDE during lab classes. It handles XML, DTD, XML Schema, XSLT, and XPath, and it also lets us write and run Java programs. That is enough for most of our classes.
Of course, you may use other tools if you prefer, but we won't be able to provide extra support for them.
The best package to use is Eclipse for Java EE Developers, because it handles XML without any additional plugins. The release is less important; I like to use Kepler at the moment.
The most reliable way to get a proper installation of Eclipse is to download and unpack it in your personal account. Yes, it takes some MBs...:(
We will also use
Some instructions, e.g. installation procedures, will be given for Linux. It will usually be possible to work under Windows if you prefer, but then you will have to adapt the scenarios accordingly. The same applies if you use your own Mac.
Recommendations:
History and overview:
We'll try not to repeat too much; see the lecture slides for an introduction if you were not present. You may also look at the old scenario (in Polish, no longer maintained).
The following examples contain well-formed XML documents.
<a/>
<?xml version="1.0" encoding="UTF-8" ?>
<俄语>данные</俄语>
<!-- Example from Wikipedia -->
<?xml version="1.0" encoding="utf-8"?>
<example>
  The same text fragment written in 3 ways:
  <option>x &gt; 0 &amp; x &lt; 100</option>
  <option>x &#62; 0 &#38; x &#60; 100</option>
  <option><![CDATA[x > 0 & x < 100]]></option>
</example>
<?xml version="1.0" encoding="iso-8859-2"?>
<!DOCTYPE main_element [
  <!ENTITY entity1 "This is the content of a simple entity">
  <!ENTITY entity2 "<element>This is the content of a <subelement>complex</subelement> one</element>">
]>
<!-- Here (before the main element) a processing instruction or a comment
     may occur, but not an element nor any text other than whitespace. -->
<main_element>
  <?instruction attribute="value" but it can also look like this?>
  <subelement attribute='Attribute value'
              inny-atrybut="References to simple entities: &entity1; &quot;">
    Text content <elem>mixed model</elem> żółty żółw.
    <!-- Comments and PIs allowed -->
    <empty_element may_have="attributes"/>
  </subelement>
  Text content &entity1; &#502; &entity2;
  <![CDATA[x < 5 && x > -5]]>
</main_element>
<!-- Here also a PI or a comment may occur, but not an element nor text. -->
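The example with the three option elements can also be checked from Java. The sketch below is only an illustration: the class name EscapingDemo and the method optionText are made up for this note, and it assumes that the first two variants use the predefined entities and numeric character references respectively. It parses the document with the standard JAXP DOM parser and returns the text of each option element, so you can see that all three notations carry the same characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EscapingDemo {
    // Hypothetical test document: the same text escaped in three ways.
    static final String XML =
        "<example>"
        + "<option>x &gt; 0 &amp; x &lt; 100</option>"
        + "<option>x &#62; 0 &#38; x &#60; 100</option>"
        + "<option><![CDATA[x > 0 & x < 100]]></option>"
        + "</example>";

    // Parses the document and returns the text content of the i-th <option>.
    static String optionText(int i) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            NodeList options = doc.getElementsByTagName("option");
            return options.item(i).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(optionText(i));
        }
    }
}
```

Each of the three calls should yield the same string, x > 0 & x < 100, because the parser resolves the references and unwraps the CDATA section before the application sees the text.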
Create files with the .xml extension in your workspace (not necessarily in Eclipse), copy the contents of the examples into them, and open the files in your web browser.
Correct the syntax errors in the document dok1.xml. Use a web browser or the xmllint program to check the file.
Eclipse would also find the errors, but that would be too easy...
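If you are curious how such a check looks from a program, here is a minimal sketch in Java. It is not part of the course materials: the class WellFormedCheck and the method isWellFormed are hypothetical names, and the sample strings are invented. It relies only on the standard JAXP parser, which reports a violation of well-formedness as a fatal error.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the given text parses as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness error
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a/>"));          // minimal document
        System.out.println(isWellFormed("<a><b></a>"));    // unclosed element
        System.out.println(isWellFormed("<a>x & y</a>"));  // unescaped ampersand
    }
}
```

The first string is accepted and the other two are rejected. The exception message (e.getMessage()) would tell you what the parser complained about, much like the xmllint output.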
In the following, add the fragments to XML documents you have already created.
Write the character sequence ]]> in a document body in several different ways.
Write the expression "x > -5" & 'x < 5' as an attribute value.
Download and unpack XML01.zip. Import it as a project into your Eclipse workspace, although Eclipse is not required in this part of the class.
The folder entities contains examples of entity definitions and usage. Use this command to print a document with all entities resolved:
xmllint -loaddtd -noent file1.xml
Run the above xmllint command for documents file1.xml, file2.xml, and file3.xml.
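For comparison, entity resolution can also be observed from Java. The following sketch does not use the files from XML01.zip; the class EntityDemo, the method resolvedText, and the small inline document are assumptions made for illustration. It shows that the JAXP DOM parser, which expands entity references by default, hands the application the replacement text rather than the reference.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {
    // Hypothetical document with one internal entity declared in the DTD subset.
    static final String XML =
        "<!DOCTYPE main_element [\n"
        + "  <!ENTITY entity1 \"This is the content of a simple entity\">\n"
        + "]>\n"
        + "<main_element>Text: &entity1;</main_element>";

    // Returns the text of the root element with entity references expanded.
    static String resolvedText() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // true is already the default; it is set here only to be explicit.
            factory.setExpandEntityReferences(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolvedText());
    }
}
```

The returned text no longer contains &entity1; but the entity's replacement text, which is roughly what the xmllint command above prints for a whole document.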
Try to do the following things and verify the file with xmllint. Are they correct?
Designing the structure of XML documents is an important and non-trivial activity, comparable to domain analysis (something which usually results in a UML class diagram).
In a typical application of XML, semantic (aka descriptive) markup is preferred over presentational markup. Tags should denote the logical, tree-like structure of a document, names should be descriptive but not too long ;) and the same names should be used for elements having the same role.
Sometimes it is good to start modelling a new XML application with a concrete example. Let's do it having the above hints in mind.
Create an XML document – your visit card.
Instead of a visit card, you can prepare an example document (or a fragment) related to your assessment project.
The mechanism of namespaces is presented in the following examples.
The example from the lecture without any namespace.
<?xml version="1.0"?>
<article code="A1250" xmlns:pre="http://xml.mimuw.edu.pl/ns/article">
  <title>Assignment in Pascal and C</title>
  <author>
    <fname>Jan</fname>
    <surname>Mądralski</surname>
    <address xmlns:pre="urn:addresses">...
      <code>01-234</code>
    </address>
  </author>
  <body>
    <paragraph xmlns:pre="http://xml.mimuw.edu.pl/ns/text-document">
      Assignment is written as <code>x = 5</code> in C
      and <code>x := 5</code> in Pascal.
    </paragraph>
  </body>
</article>
Canonical use of namespaces with prefixes. All prefixes are declared in the root element and used consistently throughout the document.
<?xml version="1.0"?>
<art:article code="A1250"
    xmlns:art="http://xml.mimuw.edu.pl/ns/article"
    xmlns:t="http://xml.mimuw.edu.pl/ns/text-document"
    xmlns:ad="urn:addresses">
  <art:title>Assignment in Pascal and C</art:title>
  <art:author>
    <fname>Jan</fname>
    <surname>Mądralski</surname>
    <ad:address>...
      <ad:code>01-234</ad:code>
    </ad:address>
  </art:author>
  <art:body>
    <t:paragraph>
      Assignment is written as <t:code>x = 5</t:code> in C
      and <t:code>x := 5</t:code> in Pascal.
    </t:paragraph>
  </art:body>
</art:article>
Overriding of prefixes in parts of the document tree.
BTW, this artificial example shows a bad practice: the same prefix pre is used for different namespaces.
<?xml version="1.0"?>
<pre:article code="A1250" xmlns:pre="http://xml.mimuw.edu.pl/ns/article">
  <pre:title>Assignment in Pascal and C</pre:title>
  <pre:author>
    <fname>Jan</fname>
    <surname>Mądralski</surname>
    <pre:address xmlns:pre="urn:addresses">...
      <pre:code>01-234</pre:code>
    </pre:address>
  </pre:author>
  <pre:body>
    <pre:paragraph xmlns:pre="http://xml.mimuw.edu.pl/ns/text-document">
      Assignment is written as <pre:code>x = 5</pre:code> in C
      and <pre:code>x := 5</pre:code> in Pascal.
    </pre:paragraph>
  </pre:body>
</pre:article>
The default namespace is used in the following example:
<?xml version="1.0"?>
<article code="A1250" xmlns="http://xml.mimuw.edu.pl/ns/article">
  <title>Assignment in Pascal and C</title>
  <author>
    <fname>Jan</fname>
    <surname>Mądralski</surname>
    <address xmlns:pre="urn:addresses">...
      <code>01-234</code>
    </address>
  </author>
  <body>
    <paragraph xmlns:pre="http://xml.mimuw.edu.pl/ns/text-document">
      Assignment is written as <code>x = 5</code> in C
      and <code>x := 5</code> in Pascal.
    </paragraph>
  </body>
</article>
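To see how a parser resolves a prefix that is redeclared in parts of the tree, consider the following sketch. It is an illustration only: the class PrefixScopeDemo and the method codeNamespaces are invented names, and the document is a shortened version of the example in which the prefix pre is overridden. The program collects the namespace name of every element whose local name is code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrefixScopeDemo {
    // Shortened version of the example where the prefix "pre" is redeclared.
    static final String XML =
        "<pre:article xmlns:pre='http://xml.mimuw.edu.pl/ns/article'>"
        + "<pre:address xmlns:pre='urn:addresses'>"
        + "<pre:code>01-234</pre:code>"
        + "</pre:address>"
        + "<pre:paragraph xmlns:pre='http://xml.mimuw.edu.pl/ns/text-document'>"
        + "<pre:code>x = 5</pre:code> <pre:code>x := 5</pre:code>"
        + "</pre:paragraph>"
        + "</pre:article>";

    // Returns the namespace names of all elements with local name "code",
    // in document order.
    static List<String> codeNamespaces() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getNamespaceURI()
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList codes = doc.getElementsByTagNameNS("*", "code");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < codes.getLength(); i++) {
                result.add(codes.item(i).getNamespaceURI());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        codeNamespaces().forEach(System.out::println);
    }
}
```

Although all three elements are written as pre:code, the first one belongs to urn:addresses and the other two to the text-document namespace. A prefix is only a local abbreviation; what identifies an element is the pair of namespace name and local name.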
Open the document dok2.xml in a browser. Correct the namespace-related errors.
Run the PrintAllTags program (provided in the XML01.zip project mentioned earlier), passing different files as arguments.
Experiment with the namespaceAware setting and with namespace declarations in documents to see how the parser interprets the declarations.
Try to parse dok2.xml with different settings of namespace awareness.
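The effect of namespace awareness can be sketched as follows. This is not the PrintAllTags program from XML01.zip, whose source is not reproduced here; the class NamespaceAwareDemo, the method describeTitle, and the tiny inline document are assumptions made for illustration. The sketch parses the same document twice with the JAXP DOM parser, once with setNamespaceAware(true) and once with false, and reports what the parser knows about the title element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceAwareDemo {
    // Hypothetical two-element document using a prefixed namespace.
    static final String XML =
        "<art:article xmlns:art='http://xml.mimuw.edu.pl/ns/article'>"
        + "<art:title>Assignment in Pascal and C</art:title>"
        + "</art:article>";

    // Returns "node name | namespace name | local name" for the title element.
    static String describeTitle(boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            // The root has no whitespace children, so its first child is the title.
            Node title = doc.getDocumentElement().getFirstChild();
            return title.getNodeName() + " | " + title.getNamespaceURI()
                + " | " + title.getLocalName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeTitle(true));
        System.out.println(describeTitle(false));
    }
}
```

With namespace awareness switched on, the element is reported with its namespace name and the local name title. With it switched off, the parser treats art:title as an opaque name and xmlns:art as an ordinary attribute, so no namespace information is available.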
Create a rolodex document visit-cards-set, the root element of which belongs to a separate namespace.
The rolodex should contain one or more visit-cards in their own namespace (in summary: two namespaces should be used in a single document). If you chose to work on a different example than visit cards, simply try to use more than one namespace in your document.