This article gives an overview of what XML is and how it can be used from Java.
What Is XML?
XML is a text-based markup language that is fast becoming the standard for data interchange on the Web. Here data is identified using tags (identifiers enclosed in angle brackets, like this: <…>). Collectively, the tags are known as “markup”.
XML tags tell what the data means, rather than how to display it. A tag puts a label on a piece of data that identifies it (for example: <name>…</name>).
As with the field names of a data structure, any names that make sense for a given application can be used for XML tags. For multiple applications to use the same XML data, however, they must agree on the names used.
Here is an example of some XML data:
<personal_info>
<first_name>Chaitanya</first_name>
<last_name>Singh</last_name>
<address>Agra</address>
<message>
This is a sample XML.
</message>
</personal_info>
The tags in this example identify the personal information as a whole, the first name, last name, address and the message. For every tag <tag> there is a matching end tag: </tag>. The data between the tag and its matching end tag defines an element of the XML data.
It is this ability for one tag to contain others that lets XML represent hierarchical data structures.
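As a rough illustration of that hierarchy, the sketch below parses the sample document and reads values out of the nested elements. It uses the DOM parser that ships with the JDK, not JDOM, so that it runs without extra libraries. The class name PersonalInfoDemo and the helpers parse and textOf are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class PersonalInfoDemo {

    // The sample document from the article, written as one string.
    static final String XML =
        "<personal_info>"
        + "<first_name>Chaitanya</first_name>"
        + "<last_name>Singh</last_name>"
        + "<address>Agra</address>"
        + "<message>This is a sample XML.</message>"
        + "</personal_info>";

    // Hypothetical helper: parses an XML string with the JDK's built-in DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: text of the first descendant element with the given tag name.
    static String textOf(Document doc, String tagName) {
        Element root = doc.getDocumentElement();
        return root.getElementsByTagName(tagName).item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // The root element contains the other elements.
        System.out.println("Root: " + doc.getDocumentElement().getTagName());
        System.out.println("First name: " + textOf(doc, "first_name"));
        System.out.println("Address: " + textOf(doc, "address"));
    }
}
```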
Tags and Attributes
Tags can also contain attributes: additional information included as part of the tag itself, within the tag’s angle brackets. The following example shows a personal information structure that uses attributes for the “first_name”, “last_name”, and “address” fields:
<personal_information first_name="Chaitanya" last_name="Singh"
address="Agra">
</personal_information>
The attribute name is followed by an equal sign and the attribute value, and multiple attributes are separated by spaces.
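To make the attribute form concrete, here is a minimal sketch that reads those three attributes back. It relies on the JDK's own DOM parser (an assumption made so the code is self-contained; it is not JDOM), and the names AttributeDemo and parseRoot are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class AttributeDemo {

    // The attribute-based structure from the article.
    static final String XML =
        "<personal_information first_name=\"Chaitanya\" last_name=\"Singh\" "
        + "address=\"Agra\">"
        + "</personal_information>";

    // Hypothetical helper: parses the string and returns the root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(XML);

        // Look up a single attribute by name.
        System.out.println("first_name = " + root.getAttribute("first_name"));

        // Or walk every attribute on the element.
        NamedNodeMap attributes = root.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            System.out.println(attribute.getNodeName() + " = " + attribute.getNodeValue());
        }
    }
}
```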
Empty Tags
Sometimes data is optional and the user may not supply it, for example a phone number.
An empty tag without data can be represented as follows:
<Phone_Number/>
The empty tag saves you from having to code <Phone_Number></Phone_Number> in order to have a well-formed document.
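The short empty-element form <Phone_Number/> and the open/close pair <Phone_Number></Phone_Number> should look the same to a parser. The sketch below checks this with the JDK's built-in DOM parser (not JDOM); EmptyTagDemo and hasNoContent are names chosen only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class EmptyTagDemo {

    // Hypothetical helper: true when the root element has no child nodes at all.
    static boolean hasNoContent(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return !root.hasChildNodes();
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Both spellings describe an element without content.
        System.out.println(hasNoContent("<Phone_Number/>"));
        System.out.println(hasNoContent("<Phone_Number></Phone_Number>"));
        // An element with text is not empty.
        System.out.println(hasNoContent("<Phone_Number>555-0100</Phone_Number>"));
    }
}
```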
Comments in XML Files
XML comments look as follows:
<!-- This is a comment statement -->
The XML Prolog
To complete this journeyman’s introduction to XML, note that an XML file always starts with a prolog. The minimal prolog contains a declaration that identifies the document as an XML document:
<?xml version="1.0"?>
The declaration may also contain additional information:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
version
Identifies the version of the XML markup language used in the data. This attribute is not optional.
encoding
Identifies the character set used to encode the data. “ISO-8859-1” is “Latin-1”, the Western European and English language character set. (The default is compressed Unicode: UTF-8.)
standalone
Tells whether or not this document references an external entity or an external data type specification. If there are no external references, then “yes” is appropriate.
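The effect of the prolog can be seen from Java. The following sketch encodes a small document as ISO-8859-1 bytes and lets the JDK's built-in DOM parser (not JDOM) read the declaration. The class PrologDemo, the helper parseLatin1, and the sample <city> element are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class PrologDemo {

    // A document whose prolog declares version, encoding, and standalone.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" standalone=\"yes\"?>"
        + "<city>M\u00fcnchen</city>";

    // Hypothetical helper: encodes the text as Latin-1 bytes and parses them,
    // so the parser has to rely on the encoding named in the prolog.
    static Document parseLatin1(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseLatin1(XML);
        System.out.println("version:    " + doc.getXmlVersion());
        System.out.println("encoding:   " + doc.getXmlEncoding());
        System.out.println("standalone: " + doc.getXmlStandalone());
        // The non-ASCII character survives because the declared encoding was used.
        System.out.println("content:    " + doc.getDocumentElement().getTextContent());
    }
}
```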
Why Is XML Important?
There are a number of reasons for XML’s surging acceptance. This section lists a few of the most prominent.
Hierarchical
XML documents benefit from their hierarchical structure. Hierarchical document structures are, in general, faster to access because you can drill down to the part you need, like stepping through a table of contents. They are also easier to rearrange, because each piece is delimited.
Stylability
When display is important, the stylesheet standard, XSL, dictates how to portray the data. More importantly, since XML is inherently style-free, you can use a completely different stylesheet to produce output in PostScript, TeX, PDF, or some new format.
Data Identification
XML tells what kind of data is present, not how to display it. Because the markup tags identify the information and break up the data into parts, a search program can look for specific information. In short, because the different parts of the information have been identified, they can be used in different ways by different applications.
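One way a program might look for specific information is an XPath query. The sketch below is only an illustration under that assumption: it uses the javax.xml.xpath package from the JDK, and the class SearchDemo and the helper find are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class SearchDemo {

    // The personal information sample used earlier in the article.
    static final String XML =
        "<personal_info>"
        + "<first_name>Chaitanya</first_name>"
        + "<last_name>Singh</last_name>"
        + "<address>Agra</address>"
        + "</personal_info>";

    // Hypothetical helper: evaluates an XPath expression against an XML string
    // and returns the matching text, or an empty string when nothing matches.
    static String find(String xml, String expression) {
        try {
            return XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Because the data is labelled, it can be addressed by name.
        System.out.println(find(XML, "/personal_info/last_name"));
        System.out.println(find(XML, "/personal_info/address"));
    }
}
```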
Plain Text
Since XML is not a binary format, you can create and edit files with anything from a standard text editor to a visual development environment. At the other end of the spectrum, an XML front end to a database makes it possible to efficiently store large amounts of XML data as well. So XML provides scalability for anything from small configuration files to a company-wide data repository.
Inline Reusability
One of the nicer aspects of XML documents is that they can be composed from separate entities. Unlike HTML, XML entities can be included “in line” in a document.
Easily Processed
Regular and consistent notation makes it easier to build a program to process XML data. In XML, a <dt> tag must always have a matching </dt> terminator, or else be written as an empty <dt/> tag. That restriction is a critical part of the constraints that make an XML document well-formed.
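A parser enforces this rule for you. The following sketch, which assumes the JDK's built-in DOM parser and uses the made-up names WellFormedDemo and isWellFormed, reports whether a fragment is well-formed by checking whether parsing succeeds.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedDemo {

    // Hypothetical helper: true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document, e.g. because of a missing end tag.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not be used", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<dl><dt>term</dt></dl>")); // matching end tag
        System.out.println(isWellFormed("<dl><dt/></dl>"));         // empty tag
        System.out.println(isWellFormed("<dl><dt>term</dl>"));      // <dt> never closed
    }
}
```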
How Can You Use XML?
There are several basic ways to make use of XML:
· Document-driven programming, where XML documents are containers that build interfaces and applications from existing components
· Archiving — the foundation for document-driven programming, where the customized version of a component is saved (archived) so it can be used later
· Traditional data processing, where XML encodes the data for a program to process
· Binding, where the DTD or schema that defines an XML data structure is used to automatically generate a significant portion of the application that will eventually process that data
Is there a relation between Java and XML?
Yes, there is a relation. Java + XML = JDOM
What is JDOM?
JDOM is the Java Document Object Model. It is a way to represent an XML document for easy and efficient reading, manipulation, and writing.
It is a straightforward API that is lightweight, fast, and Java-optimized.
Do you need JDOM?
JDOM is a lightweight API. Benchmarks of “load and print” show performance on par with SAX. Manipulation and output are also lightning fast.
JDOM can represent a full document, though not all must be in memory at once.
JDOM supports document modification and document creation from scratch, no “factory”.
It doesn’t require in-depth XML knowledge.
The Document class
Documents are represented by the org.jdom.Document class, a lightweight object holding a DocType, ProcessingInstructions, a root Element, and Comments.
• It can be constructed from scratch:
Document doc = new Document(new Element("rootElement"));
• Or it can be constructed from a file, stream, or URL:
Builder builder = new SAXBuilder();
Document doc = builder.build(url);
The Build Process
A Document can be constructed using any build tool. The SAX build tool uses a SAX parser to create a JDOM document.
• Current builders are SAXBuilder and DOMBuilder
– org.jdom.input.SAXBuilder is fast and recommended
– org.jdom.input.DOMBuilder is useful for reading an existing DOM tree
– A builder can be written that lazily constructs the Document as needed
– Other possible builders: LDAPBuilder, SQLBuilder
Builders have optional parameters to specify implementation classes and whether DTD-based validation should occur.
SAXBuilder(String parserClass, boolean validate);
DOMBuilder(String adapterClass, boolean validate);
The Output Process
A Document can be written using any output tool:
– org.jdom.output.XMLOutputter tool writes the document as XML
– org.jdom.output.SAXOutputter tool generates SAX events
– org.jdom.output.DOMOutputter tool creates a DOM document
– Any custom output tool can be used
To output a Document as XML:
// Default formatting
XMLOutputter outputter = new XMLOutputter();
outputter.output(doc, System.out);
// Empty indent string and no added newlines
outputter = new XMLOutputter("", false);
outputter.output(doc, System.out);
The DocType class
A Document may have a DocType. This specifies the DTD of the document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
DocType docType = doc.getDocType();
System.out.println("Element: " + docType.getElementName());
System.out.println("Public ID: " + docType.getPublicID());
System.out.println("System ID: " + docType.getSystemID());
doc.setDocType( new DocType("html", "-//W3C...", "http://..."));
The Element class
A Document has a root Element:
<web-app id="demo">
<description>
Gotta fit servlets in somewhere!
</description>
<distributable/>
</web-app>
Get the root as an Element object:
Element webapp = doc.getRootElement();
An Element represents something like web-app
– Has access to everything from the open <web-app> to the closing </web-app>
Accessing the Children
An element may contain child elements.
getChild() may throw NoSuchElementException.
// Get a List of direct children as Elements
List allChildren = element.getChildren();
// A raw List returns Object, so cast back to Element
out.println("First kid: " +
((Element) allChildren.get(0)).getName());
// Get all direct children with a given name
List namedChildren = element.getChildren("name");
// Get the first kid with a given name
Element kid = element.getChild("name");
// Namespaces are supported
kid = element.getChild("nsprefix:name");
kid = element.getChild("nsprefix", "name");
Grandkids can be retrieved easily:
<linux-config>
<gui>
<window-manager>
<name>Enlightenment</name>
<version>0.16.2</version>
</window-manager>
<!-- etc -->
</gui>
</linux-config>
String manager = root.getChild("gui").getChild("window-manager").getChild("name").getContent();
Children can be added and removed through List manipulation or convenience methods:
List allChildren = element.getChildren();
// Remove the fourth child
allChildren.remove(3);
// Remove all children named "jack"
allChildren.removeAll( element.getChildren("jack"));
element.removeChildren("jack");
// Add a new child
allChildren.add(new Element("jane"));
element.addChild(new Element("jane"));
// Add a new child in the second position
allChildren.add(1, new Element("second"));
Elements are constructed directly, no factory method needed:
Element element = new Element("kid");
Some prefer a nesting shortcut, possible since addChild() returns the Element on which the child was added:
Document doc = new Document( new Element("family").addChild(new Element("mom"))
.addChild(new Element("dad").addChild("kidOfDad")));
A subclass of Element can be made, already containing child elements and content:
root.addChild(new FooterElement());
Getting Element Attributes
Elements often contain attributes:
<table width="100%" border="0"> </table>
Attributes can be retrieved several ways. getAttribute() may throw NoSuchAttributeException.
String width = table.getAttribute("width").getValue();
// Get "border" as an int, default of 2
int border = table.getAttribute("border").getIntValue(2);
// Get "border" as an int, no default
try {
border = table.getAttribute("border").getIntValue();
}
catch (DataConversionException e) { /* "border" is not a valid int */ }
Setting Element Attributes
Element attributes can easily be added or removed:
// Add an attribute
table.addAttribute("vspace", "0");
// Add an attribute more formally
table.addAttribute( new Attribute("prefix", "name", "value"));
// Remove an attribute
table.removeAttribute("border");
// Remove all attributes
table.getAttributes().clear();
Element Content
Elements can contain text content. The content is directly available and can easily be changed:
<description>A cool demo</description>
String content = element.getContent();
// This blows away all current content
element.setContent("A new description");
Mixed Content
Sometimes an element may contain comments, text content, and children:
<table>
<!-- Some comment -->
Some text
<tr>Some child</tr>
</table>
Text and children can be retrieved as always, which keeps the standard uses simple:
String text = table.getContent();
Element tr = table.getChild("tr");
Reading Mixed Content
To get all content within an Element, use getMixedContent()
– Returns a List containing Comment, String, and Element objects
List mixedContent = table.getMixedContent();
Iterator i = mixedContent.iterator();
while (i.hasNext()) {
Object o = i.next();
if (o instanceof Comment) {
// Comment has a toString()
out.println("Comment: " + o);
}
else if (o instanceof String) {
out.println("String: " + o);
}
else if (o instanceof Element) {
out.println("Element: " +
((Element)o).getName());
}
}
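For comparison, the same kind of walk can be sketched with the DOM API that ships with the JDK, where child nodes are told apart by node type instead of instanceof checks. This is not JDOM code; MixedContentDemo and describeChildren are hypothetical names, and the labels mirror the JDOM loop above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MixedContentDemo {

    // The mixed-content table from the article, written without line breaks.
    static final String XML =
        "<table><!-- Some comment -->Some text<tr>Some child</tr></table>";

    // Hypothetical helper: one description per direct child of the root element.
    static List<String> describeChildren(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }

        List<String> descriptions = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            switch (child.getNodeType()) {
                case Node.COMMENT_NODE:
                    descriptions.add("Comment: " + child.getNodeValue().trim());
                    break;
                case Node.TEXT_NODE:
                    descriptions.add("String: " + child.getNodeValue().trim());
                    break;
                case Node.ELEMENT_NODE:
                    descriptions.add("Element: " + child.getNodeName());
                    break;
                default:
                    break; // other node types are ignored in this sketch
            }
        }
        return descriptions;
    }

    public static void main(String[] args) {
        for (String line : describeChildren(XML)) {
            System.out.println(line);
        }
    }
}
```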
Exceptions
JDOMException is the root exception
– Thrown for build errors
– Always includes a useful error message
– May include a “root cause” exception
Subclasses include:
NoSuchAttributeException
NoSuchElementException
NoSuchProcessingInstructionException
DataConversionException