Extensible Markup Language (XML)
Explaining Extensible Markup Language
The Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C and in several other related specifications, all of which are free open standards.
The design goals of XML emphasize simplicity, generality, and usability over the Internet. It is a textual data format with strong support, via Unicode, for the languages of the world. Although the design of XML focuses on documents, it is widely used to represent arbitrary data structures, for instance in web services.
Various application programming interfaces (APIs) have been developed that software developers use to process XML data, and several schema systems exist to aid in the definition of XML-based languages. Many XML-based languages have been developed, including RSS, Atom, SOAP, and XHTML. XML-based formats have become the default for many office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org (OpenDocument), and Apple’s iWork.
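As an illustration of processing XML data through an API, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample document, its element names (catalog, book, title), and the helper book_titles are invented for this example and do not come from any particular XML-based language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are invented.
SAMPLE = """<catalog>
  <book id="b1"><title>XML Basics</title></book>
  <book id="b2"><title>Schemas in Practice</title></book>
</catalog>"""


def book_titles(xml_text):
    """Return a dict mapping each book's id attribute to its title text."""
    root = ET.fromstring(xml_text)  # parse the string into an element tree
    return {book.get("id"): book.findtext("title")
            for book in root.findall("book")}


titles = book_titles(SAMPLE)
print(titles)
```

Other APIs (event-based or streaming ones, for example) expose the same information in a different shape; the tree style is used here only because it is compact.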
By definition, an XML document is a string of characters. Almost every legal Unicode character may appear in an XML document.
Processor and Application
The ‘processor’ analyzes the markup and passes structured information to an ‘application’. The specification places requirements on what an XML processor must and must not do, but the application is outside its scope. The processor, as the specification calls it, is often referred to colloquially as an XML parser.
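The division of labour between processor and application can be sketched with Python's standard xml.sax module. In this sketch the SAX parser plays the processor, and a handler class (EventCollector, a name invented here) plays the application, which sees only the structured events the processor reports. This is one possible mapping of the two roles, not the only one.

```python
import xml.sax


class EventCollector(xml.sax.ContentHandler):
    """The 'application': it records whatever the processor reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Skip whitespace-only chunks to keep the event list short.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


def run_processor(xml_text):
    """Feed the document to the SAX parser (the 'processor')."""
    handler = EventCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


events = run_processor('<note lang="en"><to>Ana</to></note>')
for event in events:
    print(event)
```

The handler never inspects the raw characters of the document; recognizing tags and attributes is entirely the parser's job.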
Markup and Content
The characters forming an XML document are divided into ‘markup’ and ‘content’. Markup and content can be distinguished by the application of simple syntactic rules. All strings that constitute markup either begin with the character “<” and end with a “>”, or begin with the character “&” and end with a “;”. Strings of characters that are not markup are content.
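The syntactic rule above can be approximated with a short regular expression. The following is a deliberately simplified sketch: it ignores comments, CDATA sections, and attribute values that contain “>”, all of which a real processor must handle. The names MARKUP and classify are invented for this example.

```python
import re

# Markup: "<" ... ">" or "&" ... ";". Everything else is treated as content.
# Simplification: comments, CDATA sections and ">" inside attribute values
# are not handled.
MARKUP = re.compile(r"<[^<>]*>|&[^&;\s]*;")


def classify(xml_text):
    """Split text into ('markup', s) and ('content', s) pieces, in order."""
    pieces = []
    pos = 0
    for match in MARKUP.finditer(xml_text):
        if match.start() > pos:
            pieces.append(("content", xml_text[pos:match.start()]))
        pieces.append(("markup", match.group()))
        pos = match.end()
    if pos < len(xml_text):
        pieces.append(("content", xml_text[pos:]))
    return pieces


pieces = classify("<p>Fish &amp; chips</p>")
for kind, text in pieces:
    print(kind, repr(text))
```

For the sample input, the tags and the entity reference are classified as markup, while "Fish " and " chips" are classified as content.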
XML allows the direct use of almost any Unicode character in element names, attributes, comments, character data, and processing instructions, other than those that have special symbolic meaning in XML itself, such as the less-than sign, “<”.
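A small sketch of both points, using Python's standard library: non-ASCII characters appear directly in an element name, an attribute value, and character data, while a literal “<” in character data has to be escaped. The sample document and the helper is_well_formed are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Non-ASCII characters used directly in a name, an attribute and the text.
doc = '<prénom note="café">José &lt; Zoë</prénom>'
elem = ET.fromstring(doc)
print(elem.tag, elem.get("note"), elem.text)


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "<a>1 < 2</a>"                     # unescaped "<" in character data
safe = "<a>" + escape("1 < 2") + "</a>"  # "<" written as "&lt;"
print(is_well_formed(raw), is_well_formed(safe))
```

The parser hands the escaped character back to the caller as an ordinary “<”, so escaping affects only the serialized form, not the data itself.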