What is XML and what can it be used for?




What is XML?

XML is an abbreviation for Extensible Markup Language. Like HTML, it is a markup language, i.e. a way of annotating text with tags. XML is not a replacement for HTML but a complement to it: where HTML is made for structuring text, XML is made for structuring data. These are the most important differences:


HTML | XML
Structures text. | Structures data.
Comes with predefined tags, e.g. <B> and <DIV>. | No predefined tags. Tags have to be defined by the programmer.
Tags are not case sensitive. <DIV> is the same as <div> or <Div>. | Tags are case sensitive. <TEST> and <test> are treated as two different tags.
A document needs DOCTYPE, HTML, HEAD and BODY to work properly. | A document needs exactly one root element enclosing everything else. A DOCTYPE is optional.
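
The XML side of the comparison can be checked with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree module purely for illustration; the tag names library, book, title and TITLE are made up for this example and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: none of these tag names are predefined by XML.
SAMPLE = """<library>
  <book>
    <title>Lower-case title tag</title>
    <TITLE>Upper-case title tag</TITLE>
  </book>
</library>"""

root = ET.fromstring(SAMPLE)
book = root.find("book")

# <title> and <TITLE> are two different elements because XML is case sensitive.
lower = book.find("title").text
upper = book.find("TITLE").text
print(lower)
print(upper)

# For the same reason, a closing tag must match the case of its opening tag.
try:
    ET.fromstring("<test>text</TEST>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)
```

The parser accepts the made-up tags without any prior definition, keeps title and TITLE apart, and rejects the last snippet because TEST does not close test.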


You can format the content of an XML file so that you get something resembling a web page. This works well for news feeds, but it is not a good solution for ordinary web sites, where a combination of HTML and XML is an advantage. There is a combination language called XHTML, which was meant to replace HTML, but it isn't used very much, presumably because HTML5 has fixed the weaknesses XHTML was supposed to fix.

Most people can't relate to the concept of XML databases, and this type of database really does lead a rather anonymous life, but it is actually used quite a lot. A Word document in the .docx format is really a set of XML files packed into one archive. If you rename a Word document from .docx to .zip, i.e. tell your programs to treat the file as a compressed archive like those made by WinZip, you can open the document as a zip file. (This does not work for the older .doc format, which is not a zip archive.) Inside, you can see the hidden XML files that enable the Word document to record where you want the images, where the text is in bold or italics, etc.
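
The same inspection can be done without renaming anything. The following is a minimal sketch using Python's standard zipfile module; the function name list_xml_parts is hypothetical, and the archive built in memory is only a stand-in so the example runs without a real Word file.

```python
import io
import zipfile


def list_xml_parts(docx_file):
    """Return the names of the XML files stored inside a .docx archive."""
    with zipfile.ZipFile(docx_file) as archive:
        return sorted(name for name in archive.namelist() if name.endswith(".xml"))


# Build a tiny stand-in archive in memory. A real .docx contains more parts
# than these three, but it is laid out in the same way.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("word/document.xml", "<document/>")
    archive.writestr("word/media/image1.png", b"not a real image")

print(list_xml_parts(demo))
```

To try it on a real document, pass a file path instead of the in-memory archive, e.g. list_xml_parts("report.docx"), where report.docx is a placeholder name.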


Why use XML?

When it comes to web sites, it can be necessary to use databases for various purposes. Searching a dictionary automatically gives you a sense of looking something up in a database, but web forums and WordPress sites are also backed by databases. There, the search and retrieval just run in the background, where the user doesn't see them.

For web sites, XML has three advantages:
  1. It's easy to create small to medium-sized databases, with a flexible design, for fast lookups on the web site.
  2. The database is placed with the rest of the web site and can be accessed locally on the server. Separate or dedicated servers, as are typically required when using SQL databases, are not necessary.
  3. The databases are independent of third-party programs like Access, MySQL or Lotus Notes, and can be accessed from any type of browser using JavaScript.
When it comes to the choice of web host, points 2 and 3 offer some freedom, because you don't have to worry about whether the host supports the format you are using, and if you change host, you don't have to save the site from one place and the SQL databases from another and later upload them to several places.
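
The kind of lookup described in point 1 can be sketched as follows. On a web site the browser would do this with JavaScript; Python is used here only to keep the example short and runnable. The file layout (dictionary, entry, word, meaning) and the function name look_up are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary database; on a real site this would be a file
# stored next to the other pages of the web site.
DICTIONARY_XML = """<dictionary>
  <entry><word>tag</word><meaning>A marker such as &lt;word&gt;</meaning></entry>
  <entry><word>element</word><meaning>A start tag, its content and its end tag</meaning></entry>
</dictionary>"""


def look_up(root, word):
    """Return the meaning stored for word, or None if there is no entry."""
    for entry in root.iter("entry"):
        if entry.findtext("word") == word:
            return entry.findtext("meaning")
    return None


root = ET.fromstring(DICTIONARY_XML)
print(look_up(root, "element"))
print(look_up(root, "attribute"))
```

The whole database is one text file, so moving it to another host means copying that file along with the rest of the site.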

Obviously XML also has some disadvantages, e.g.:
  1. Visitors can read from the database but not write to it, as they can with an SQL database behind discussion groups, blogs with comment sections, etc.
  2. Very large databases become slow to load when the page is accessed. This, however, can be solved by splitting the database into several smaller databases.
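
One possible way to do the splitting mentioned in point 2 is sketched below. Grouping the entries by the first letter of the word is an assumption made for this example, not something the text prescribes, and the dictionary, entry and word tags and the function name split_by_first_letter are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_by_first_letter(root):
    """Group <entry> elements into one small <dictionary> tree per first letter."""
    groups = defaultdict(list)
    for entry in root.iter("entry"):
        word = entry.findtext("word", default="")
        key = word[:1].lower() or "_"  # entries without a word go under "_"
        groups[key].append(entry)

    parts = {}
    for key, entries in groups.items():
        part = ET.Element("dictionary")
        part.extend(entries)
        parts[key] = part
    return parts


BIG = ET.fromstring(
    "<dictionary>"
    "<entry><word>apple</word></entry>"
    "<entry><word>avocado</word></entry>"
    "<entry><word>banana</word></entry>"
    "</dictionary>"
)

for letter, part in sorted(split_by_first_letter(BIG).items()):
    # Each part could be written to its own file, e.g. a.xml and b.xml.
    print(letter, ET.tostring(part, encoding="unicode"))
```

A page then only has to load the small file that matches the first letter of the search term, instead of the whole database.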
XML is thus not a solution for everything, but an outstanding tool for some jobs.