Searching in XML databases from search fields




Searching in XML files

Searching and extraction of data is probably what most people associate with databases, and this can also be done with XML databases.

With JavaScript it is a relatively simple task to search and return information. The solution shown here is not the one true way of doing it; it is simply one way that works.


The XML file

The XML file is an edited excerpt from one of the XML databases for the EU project called SAMANCTA. The original list is much longer, and contains links to other pages. The structure is like this:

<?xml version="1.0" encoding="UTF-8"?>
<HS_Numbers>

<HS_Number_text>
<HS_Chapter>0101</HS_Chapter>
<HS_SubNumber>0101.21</HS_SubNumber>
<HS_SubNumber_Title_EN><![CDATA[Horses; live pure-bred breeding animals]]></HS_SubNumber_Title_EN>
<Products_EN><![CDATA[animals; mammals; horses]]></Products_EN>
</HS_Number_text>

<HS_Number_text>
<HS_Chapter>0101</HS_Chapter>
<HS_SubNumber>0101.29</HS_SubNumber>
<HS_SubNumber_Title_EN><![CDATA[Horses; other than live pure-bred breeding animals]]></HS_SubNumber_Title_EN>
<Products_EN><![CDATA[animals; mammals; horses]]></Products_EN>
</HS_Number_text>



<HS_Number_text>
<HS_Chapter>0106</HS_Chapter>
<HS_SubNumber>0106.49</HS_SubNumber>
<HS_SubNumber_Title_EN><![CDATA[Live insects not specified elsewhere in chapter 1]]></HS_SubNumber_Title_EN>
<Products_EN><![CDATA[animals; insects]]></Products_EN>
</HS_Number_text>

<HS_Number_text>
<HS_Chapter>0106</HS_Chapter>
<HS_SubNumber>0106.90</HS_SubNumber>
<HS_SubNumber_Title_EN><![CDATA[Live animals not specified elsewhere in chapter 1]]></HS_SubNumber_Title_EN>
<Products_EN><![CDATA[animals]]></Products_EN>
</HS_Number_text>

</HS_Numbers>

The list is saved as HS_Codes.xml and placed in the same directory as the HTML file containing the JavaScript that searches through it.


The JavaScript

When doing a search like this, splitting the JavaScript into multiple parts is a sound approach. If, for some reason, you want to do it differently, you are most welcome to do so.

The JavaScript consists of five parts:
  1. Accessing the XML file
  2. Fetching the text to be searched for
  3. Parsing the XML file's Products_EN field
  4. If the Products_EN field contains the search criteria, the content of the fields HS_SubNumber and HS_SubNumber_Title_EN (the two pieces of information the user needs from the search) is added to an array for each field, i.e. a list.
  5. If one or more matches are found, the content of the two arrays is written to a designated field for the results. Otherwise a message is written saying that no match was found.
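
Steps 3 to 5 can be sketched without the browser parts, assuming the XML records have already been turned into plain objects. The names records, searchProducts and formatResults are hypothetical and only used for illustration, and a simple substring test stands in for the regular expression used later on this page.

```javascript
// Hypothetical stand-in for the parsed XML: one object per <HS_Number_text> record.
var records = [
  { subNumber: "0101.21", title: "Horses; live pure-bred breeding animals", products: "animals; mammals; horses" },
  { subNumber: "0101.29", title: "Horses; other than live pure-bred breeding animals", products: "animals; mammals; horses" },
  { subNumber: "0106.49", title: "Live insects not specified elsewhere in chapter 1", products: "animals; insects" },
  { subNumber: "0106.90", title: "Live animals not specified elsewhere in chapter 1", products: "animals" }
];

// Steps 3 and 4: test each products field against the search text (ignoring case)
// and collect the sub number and title of every matching record.
function searchProducts(records, searchTerm) {
  var needle = searchTerm.toLowerCase();
  var subNumbers = [];
  var titles = [];
  for (var i = 0; i < records.length; i++) {
    if (records[i].products.toLowerCase().indexOf(needle) !== -1) {
      subNumbers.push(records[i].subNumber);
      titles.push(records[i].title);
    }
  }
  return { subNumbers: subNumbers, titles: titles };
}

// Step 5: turn the two arrays into lines of text, or into a "not found" message.
function formatResults(result, searchTerm) {
  if (result.subNumbers.length === 0) {
    return ["No match found for \"" + searchTerm + "\""];
  }
  var lines = [];
  for (var i = 0; i < result.subNumbers.length; i++) {
    lines.push(result.subNumbers[i] + ": " + result.titles[i]);
  }
  return lines;
}
```

With the four records from the excerpt, searchProducts(records, "horses") collects the sub numbers 0101.21 and 0101.29.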

In this example, we split the task into two functions in the same file. This isn't strictly necessary, but it helps maintain an overview of the task, especially during development. We call the file SearchInProduccts.js and place it in the directory JavaScripts, as described on the page about internal and external JavaScripts here.

The two functions we call SearchProductIndex() and ShowProductResults(). So far, the JavaScript looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() { }


function ShowProductResults() { }

</SCRIPT>

The first thing to do is access the XML file HS_Codes.xml. For this we use the JavaScript function loadXMLDoc(). How this is done can be read here. The JavaScript then looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");
}


function ShowProductResults() { }

</SCRIPT>
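
For readers who do not have loadXMLDoc() at hand, the following is a minimal sketch of what such a helper might look like. It is an assumption, not the helper described on the linked page. It uses a synchronous XMLHttpRequest, which current browsers discourage, and the second parameter RequestType is a hypothetical addition that only exists so the sketch can be tried outside a browser.

```javascript
// Hypothetical sketch of loadXMLDoc(); the helper actually used is described elsewhere.
// RequestType is an assumption added for testing; in a page you would simply
// call loadXMLDoc("HS_Codes.xml") and the browser's XMLHttpRequest is used.
function loadXMLDoc(filename, RequestType) {
  var request = new (RequestType || XMLHttpRequest)();
  // Third argument false: the call waits until the file has been fetched.
  request.open("GET", filename, false);
  request.send(null);
  // responseXML holds the parsed XML document, or null if it could not be parsed.
  return request.responseXML;
}
```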

Now we need four variables: one for reading the content of the search field, which has the ID SearchProduct, and three for reading the fields Products_EN, HS_SubNumber and HS_SubNumber_Title_EN in the XML file. The first of the three is the field we parse in the search, and the other two are the fields to be returned; hence the names of the variables. The code then looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

}


function ShowProductResults() { }

</SCRIPT>

As it makes no sense to search for an empty string, the first thing we do is check whether the search field, i.e. SearchTerm, is empty. If it is, an error message pops up; otherwise the process continues. The code then looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

if (SearchTerm.length < 1) {
alert("You forgot to enter a search term!");
}

else {
}

}


function ShowProductResults() { }

</SCRIPT>
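
The length check above treats a field containing only spaces as a valid search term. If that is not wanted, the check could be moved into a small helper along these lines. The name isEmptySearch is hypothetical and not part of the original script.

```javascript
// Hypothetical helper: true when the search field holds no usable text.
function isEmptySearch(searchTerm) {
  // Stripping leading and trailing whitespace makes "   " count as empty too.
  return searchTerm.replace(/^\s+|\s+$/g, "").length < 1;
}
```

In SearchProductIndex() the condition would then read if (isEmptySearch(SearchTerm)) instead of comparing the length directly.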

In the else branch we create two arrays, which we call Results1 and Results2. The variable AllItems, i.e. the Products_EN fields in the XML file, is then parsed. Wherever the routine finds the text string from SearchTerm, the content of the fields HS_SubNumber and HS_SubNumber_Title_EN is saved in the corresponding arrays. Now the code looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

if (SearchTerm.length < 1) {
alert("You forgot to enter a search term!");
}

else {
Results1 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name = AllItems[i].lastChild.nodeValue;
var exp = new RegExp(SearchTerm,"i");
if (name.match(exp) != null) {
Results1.push(ReturnItemSubNumber[i]);
}
}

Results2 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name2 = AllItems[i].lastChild.nodeValue;
var exp2 = new RegExp(SearchTerm,"i");
if (name2.match(exp2) != null) {
Results2.push(ReturnItemTitle[i]);
}
}
}

}


function ShowProductResults() { }

</SCRIPT>
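
The code above passes the search term directly to new RegExp(). That means characters with a special meaning in regular expressions are interpreted: a single "(" makes the constructor throw an error, and "." matches any character. If a plain text search is what you want, one option is to escape the term first. The sketch below shows the idea; the names escapeRegExp and matchesTerm are hypothetical and not part of the original script.

```javascript
// Hypothetical helper: puts a backslash in front of every character that has a
// special meaning in a regular expression, so the term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: case-insensitive test of a field's text against the term.
function matchesTerm(fieldText, searchTerm) {
  var exp = new RegExp(escapeRegExp(searchTerm), "i");
  return exp.test(fieldText);
}
```

Inside the loops, the test name.match(exp) != null could then be replaced by matchesTerm(name, SearchTerm).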

We now have the two lists we need and can show them, using the function named ShowProductResults(). To do this, we need to send the variables Results1, Results2 and SearchTerm to ShowProductResults(). This is done like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

if (SearchTerm.length < 1) {
alert("You forgot to enter a search term!");
}

else {
Results1 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name = AllItems[i].lastChild.nodeValue;
var exp = new RegExp(SearchTerm,"i");
if (name.match(exp) != null) {
Results1.push(ReturnItemSubNumber[i]);
}
}

Results2 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name2 = AllItems[i].lastChild.nodeValue;
var exp2 = new RegExp(SearchTerm,"i");
if (name2.match(exp2) != null) {
Results2.push(ReturnItemTitle[i]);
}
}
// Show the results only when a search term was entered.
ShowProductResults(Results1, Results2, SearchTerm);
}
}


function ShowProductResults() { }

</SCRIPT>


ShowProductResults() now needs to receive the three variables, and we need to see whether the search found anything. To do so, we check whether the array Results1 is empty. Then the code looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

if (SearchTerm.length < 1) {
alert("You forgot to enter a search term!");
}

else {
Results1 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name = AllItems[i].lastChild.nodeValue;
var exp = new RegExp(SearchTerm,"i");
if (name.match(exp) != null) {
Results1.push(ReturnItemSubNumber[i]);
}
}

Results2 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name2 = AllItems[i].lastChild.nodeValue;
var exp2 = new RegExp(SearchTerm,"i");
if (name2.match(exp2) != null) {
Results2.push(ReturnItemTitle[i]);
}
}
// Show the results only when a search term was entered.
ShowProductResults(Results1, Results2, SearchTerm);
}
}


function ShowProductResults(Results1, Results2, SearchTerm) {
if (Results1.length > 0) {
}

else {
}

}


</SCRIPT>

If a match is found, returning a readable answer consists of grabbing the element where you want to write the answer, using getElementById(), and inserting the content using appendChild(). To make it look nice, you build it from elements made with createElement() and style these using style.cssText. Be aware that when adding text that contains HTML tags, you need to use innerHTML instead of createTextNode(), because otherwise the tags are shown as literal text instead of being used for formatting. Now the code looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

if (SearchTerm.length < 1) {
alert("You forgot to enter a search term!");
}

else {
Results1 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name = AllItems[i].lastChild.nodeValue;
var exp = new RegExp(SearchTerm,"i");
if (name.match(exp) != null) {
Results1.push(ReturnItemSubNumber[i]);
}
}

Results2 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name2 = AllItems[i].lastChild.nodeValue;
var exp2 = new RegExp(SearchTerm,"i");
if (name2.match(exp2) != null) {
Results2.push(ReturnItemTitle[i]);
}
}
// Show the results only when a search term was entered.
ShowProductResults(Results1, Results2, SearchTerm);
}
}


function ShowProductResults(Results1, Results2, SearchTerm) {
if (Results1.length > 0) {

var ResultsProduct = document.getElementById("ResultsProduct");
while(ResultsProduct.firstChild)ResultsProduct.removeChild(ResultsProduct.firstChild)
var Header = document.createElement("H5");
var List = document.createElement("DIV");
List.style.cssText = 'margin-left:20px';
var SearchedFor = document.createTextNode("Documents with the keyword \""+SearchTerm+"\":");
ResultsProduct.appendChild(Header);
Header.appendChild(SearchedFor);
ResultsProduct.appendChild(List);
for (var i=0;i<Results1.length;i++) {
var ListItem = document.createElement("DIV");
ListItem.style.cssText = 'font-weight:bold';
var ChapterItem = document.createTextNode(Results1[i].lastChild.nodeValue);
var TextItem = document.createTextNode(": ");
var SubtitleDiv = document.createElement("SPAN");
SubtitleDiv.innerHTML = Results2[i].lastChild.nodeValue;
SubtitleDiv.style.cssText = "font-weight:normal";
List.appendChild(ListItem);
ListItem.appendChild(ChapterItem);
ListItem.appendChild(TextItem);
ListItem.appendChild(SubtitleDiv);
}
}

else {
}

}


</SCRIPT>
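
Because innerHTML interprets whatever tags the text contains, it is best used only with content you control, such as your own XML file. If a title should be shown literally, tags included, one option is to escape the markup characters before assigning the text. The sketch below shows the idea; the name escapeHtml is hypothetical and not part of the original script.

```javascript
// Hypothetical helper: replaces the characters that HTML treats as markup,
// so the text is displayed as written instead of being interpreted as tags.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must come first, so the "&" added below is left alone
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
```

Assigning the escaped string to innerHTML displays the text roughly as createTextNode() would.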

If the search found nothing, we need to return a message saying that nothing was found. This is done in the same manner as when something was found, only in a simpler form. The finished code looks like this:

<SCRIPT TYPE="text/javascript">

function SearchProductIndex() {
xmlDoc=loadXMLDoc("HS_Codes.xml");

var SearchTerm = document.getElementById("SearchProduct").value;
var AllItems = xmlDoc.getElementsByTagName("Products_EN");
var ReturnItemSubNumber = xmlDoc.getElementsByTagName("HS_SubNumber");
var ReturnItemTitle = xmlDoc.getElementsByTagName("HS_SubNumber_Title_EN");

if (SearchTerm.length < 1) {
alert("You forgot to enter a search term!");
}

else {
Results1 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name = AllItems[i].lastChild.nodeValue;
var exp = new RegExp(SearchTerm,"i");
if (name.match(exp) != null) {
Results1.push(ReturnItemSubNumber[i]);
}
}

Results2 = new Array;
for (var i=0;i<AllItems.length;i++) {
var name2 = AllItems[i].lastChild.nodeValue;
var exp2 = new RegExp(SearchTerm,"i");
if (name2.match(exp2) != null) {
Results2.push(ReturnItemTitle[i]);
}
}
// Show the results only when a search term was entered.
ShowProductResults(Results1, Results2, SearchTerm);
}
}


function ShowProductResults(Results1, Results2, SearchTerm) {
if (Results1.length > 0) {

var ResultsProduct = document.getElementById("ResultsProduct");
while(ResultsProduct.firstChild)ResultsProduct.removeChild(ResultsProduct.firstChild)
var Header = document.createElement("H5");
var List = document.createElement("DIV");
List.style.cssText = 'margin-left:20px';
var SearchedFor = document.createTextNode("Documents with the keyword \""+SearchTerm+"\":");
ResultsProduct.appendChild(Header);
Header.appendChild(SearchedFor);
ResultsProduct.appendChild(List);
for (var i=0;i<Results1.length;i++) {
var ListItem = document.createElement("DIV");
ListItem.style.cssText = 'font-weight:bold';
var ChapterItem = document.createTextNode(Results1[i].lastChild.nodeValue);
var TextItem = document.createTextNode(": ");
var SubtitleDiv = document.createElement("SPAN");
SubtitleDiv.innerHTML = Results2[i].lastChild.nodeValue;
SubtitleDiv.style.cssText = "font-weight:normal";
List.appendChild(ListItem);
ListItem.appendChild(ChapterItem);
ListItem.appendChild(TextItem);
ListItem.appendChild(SubtitleDiv);
}
}

else {

var ResultsProduct = document.getElementById("ResultsProduct");
while(ResultsProduct.firstChild)ResultsProduct.removeChild(ResultsProduct.firstChild)
var Para = document.createElement("P");
var NotFound = document.createTextNode("Sorry, no documents with the keyword \""+SearchTerm +"\" available!");
ResultsProduct.appendChild(Para);
Para.appendChild(NotFound);
}

}


</SCRIPT>
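
Both branches of ShowProductResults() start by emptying the results field with the same while loop. If you prefer to write that loop only once, it could be moved into a small helper like the one sketched here. The name clearChildren is hypothetical and not part of the original script.

```javascript
// Hypothetical helper: removes every child node from an element,
// using the same loop as the two branches of ShowProductResults().
function clearChildren(element) {
  while (element.firstChild) {
    element.removeChild(element.firstChild);
  }
}
```

Each branch would then call clearChildren(ResultsProduct) right after getElementById().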

Now we have covered all the possible outcomes of the search, and we are ready to read, search and write.


Fields for reading and writing

For reading and writing, we need a FORM with a field for input, here given ID="SearchProduct", and a button for executing the JavaScript SearchProductIndex(). After this, we need a field for the output, here a DIV with ID="ResultsProduct".

As code it looks like this:

<B>Product type</B><BR>

<FORM ONSUBMIT="SearchProductIndex(); return false;" ACTION="">
<INPUT TYPE="text" ID="SearchProduct">
<INPUT TYPE="submit" VALUE="Search" ONCLICK="SearchProductIndex(); return false;">
</FORM>
<BR><BR>

<B>List of procedures for sampling</B><BR>
<DIV ID="ResultsProduct"></DIV>
<BR><BR>


On the screen it looks like this:

Product type


List of procedures for sampling