Parsing XML in Perl

Today I started designing a config engine to remove the hard-coded settings from my Camel Web Server (see my previous posts for more details about this lightweight, pure-Perl web container). The goal is for all configuration to come from the config engine, which reads a context.xml file.

In short, Perl provides three common ways to interact with XML files.

Method 1,

Parsing XML into data structures. With this approach, Perl represents the XML data as a combination of hashes and arrays. For example, you can use $data->{author}{book}{ISBN} to access the content of the ISBN element.

For method 1, you will need to use XML::Simple.
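To make the hash access above concrete, here is a minimal sketch. The sample document, the author name and the ISBN value are placeholders invented for illustration; only XMLin and its KeyAttr option come from XML::Simple itself.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical sample document; every value in it is a placeholder.
my $xml = <<'XML';
<library>
  <author>
    <name>Jane Doe</name>
    <book>
      <title>Example Book</title>
      <ISBN>0-000-00000-0</ISBN>
    </book>
  </author>
</library>
XML

# XMLin accepts a string of XML as well as a file name.
# KeyAttr => [] switches off the default folding on "name", "key" and "id".
my $data = XMLin($xml, KeyAttr => []);

# Nested elements become nested hash references.
print $data->{author}{book}{ISBN}, "\n";
```

Note that the root element (library here) is dropped by default, so the path starts at author.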

Method 2,

Parsing XML into a DOM Tree.

DOM stands for Document Object Model. Each element is a node in the tree, on which you can perform operations such as finding its child nodes, e.g. $dom->getElementsByTagName("book");

For method 2, you will need to use XML::LibXML.
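A short sketch of the DOM approach, assuming a made-up document with two book elements. The XML content is a placeholder; load_xml, getElementsByTagName and findvalue are XML::LibXML methods.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, invented for this illustration.
my $xml = <<'XML';
<library>
  <book><title>First Example</title></book>
  <book><title>Second Example</title></book>
</library>
XML

# Build the whole tree in memory from the string.
my $dom = XML::LibXML->load_xml(string => $xml);

my @titles;
# getElementsByTagName returns every matching descendant element.
for my $book ($dom->getElementsByTagName('book')) {
    # findvalue evaluates an XPath expression relative to this node.
    push @titles, $book->findvalue('./title');
}

print "$_\n" for @titles;
```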

Method 3,

Parsing XML into SAX Events.

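SAX (Simple API for XML) is event-based: the parser reports each element to your handler as it reads it instead of building a tree, so it is generally faster and uses less memory than parsers that build a DOM tree.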

In method 3, you will need to use XML::SAX::ParserFactory.
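As a rough sketch of the event style, the handler below counts how often each element name occurs. The ElementCounter package and the one-line document are hypothetical names made up for this example, and it assumes at least one SAX parser is installed (XML::SAX ships with a pure-Perl one).

```perl
use strict;
use warnings;

# Hypothetical handler class: the parser calls its methods while reading.
package ElementCounter;
use base 'XML::SAX::Base';

# Called once for every opening tag the parser encounters.
sub start_element {
    my ($self, $element) = @_;
    $self->{count}{ $element->{Name} }++;
}

package main;
use XML::SAX::ParserFactory;

my $handler = ElementCounter->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# Placeholder document; no tree is built while it is parsed.
$parser->parse_string('<library><book/><book/></library>');

for my $name (sort keys %{ $handler->{count} }) {
    printf "%s: %d\n", $name, $handler->{count}{$name};
}
```

Because nothing is kept unless the handler stores it, this style suits large files where only a few values are needed.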

In my case, I used method 1, as my configuration file is very simple. Below is the recipe.

XML file:

<?xml version='1.0' encoding='utf-8'?>
<!-- Camel Web Server -->
<!-- The contents of this file will be loaded for each web application -->
<Context>
    <!-- Server Host -->
    <ServerHost>localhost</ServerHost>
    <!-- Define a non-SSL HTTP/1.1 port -->
    <PortNumber>8080</PortNumber>
    <!-- root dir for all of the web application -->
    <BaseDir>../webapps</BaseDir>
    <!-- log dir -->
    <LogDir>../logs</LogDir>
    <!-- temp dir -->
    <TemporaryDir>../temp</TemporaryDir>
    <!-- debug level -->
    <DebugLevel>2</DebugLevel>
    <WelcomeList>
        <WeclomeFile>index.html</WeclomeFile>
        <WeclomeFile>index.htm</WeclomeFile>
        <WeclomeFile>index.pl</WeclomeFile>
    </WelcomeList>
</Context>

.pl file:

use strict;
use warnings;
use XML::Simple;

my $ctxfile = "../conf/context.xml";

my $parser = XML::Simple->new();
# ForceArray keeps WeclomeFile an array even if only one entry is present;
# KeyAttr => [] turns off XML::Simple's default folding on name/key/id.
my $data = $parser->XMLin($ctxfile, ForceArray => ['WeclomeFile'], KeyAttr => []);
debug();

# Print every setting read from context.xml.
sub debug {
    print $data->{ServerHost} . "\n";
    print $data->{PortNumber} . "\n";
    print $data->{BaseDir} . "\n";
    print $data->{TemporaryDir} . "\n";
    print $data->{LogDir} . "\n";
    print $data->{DebugLevel} . "\n";
    print $data->{WelcomeList}->{WeclomeFile}->[0] . "\n";
    print $data->{WelcomeList}->{WeclomeFile}->[1] . "\n";
    print $data->{WelcomeList}->{WeclomeFile}->[2] . "\n";
}

# This is a deliberately hard-coded way to demonstrate parsing XML with Perl.

A more flexible approach is to collect the values into a hash with the snippet below:


sub readConfig {
    my %config;
    # Copy the simple settings straight from the parsed data.
    my @element = qw(ServerHost PortNumber BaseDir TemporaryDir LogDir DebugLevel);
    foreach my $key (@element) {
        $config{$key} = $data->{$key};
    }
    # Store the welcome files as WelcomeFile1, WelcomeFile2, ...
    my $count = 1;
    for (@{ $data->{WelcomeList}->{WeclomeFile} }) {
        my $id = "WelcomeFile" . $count++;
        $config{$id} = $_;
    }
    return %config;
}

# This is a bit more elegant, but it can be improved further, for example by reading the element names from the tags themselves.
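One possible shape for that improvement is sketched below. It is not the version used in the server: read_config_generic is a hypothetical name, and the XML is a trimmed copy of context.xml inlined so the example runs on its own. Instead of listing the element names, it walks the keys that XML::Simple produced.

```perl
use strict;
use warnings;
use XML::Simple;

# Trimmed copy of context.xml, inlined so the sketch is self-contained.
my $xml = <<'XML';
<Context>
  <ServerHost>localhost</ServerHost>
  <PortNumber>8080</PortNumber>
  <WelcomeList>
    <WeclomeFile>index.html</WeclomeFile>
    <WeclomeFile>index.htm</WeclomeFile>
  </WelcomeList>
</Context>
XML

# Hypothetical helper: takes the parsed data, returns a flat config hash.
sub read_config_generic {
    my ($data) = @_;
    my %config;
    for my $key (keys %$data) {
        my $value = $data->{$key};
        # Plain text elements arrive as scalars; nested ones are references.
        $config{$key} = $value unless ref $value;
    }
    # Number the welcome files the same way readConfig does.
    my $count = 1;
    $config{ 'WelcomeFile' . $count++ } = $_
        for @{ $data->{WelcomeList}{WeclomeFile} || [] };
    return %config;
}

my $data   = XMLin($xml, ForceArray => ['WeclomeFile'], KeyAttr => []);
my %config = read_config_generic($data);
print "$_ = $config{$_}\n" for sort keys %config;
```

With this, adding a new simple setting to context.xml needs no change in the Perl code. Empty elements are skipped, because XML::Simple returns them as empty hash references by default.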