mdibb.net

Getting started with PHP and XML parsing

More and more sites and companies are offering XML data on their websites, such as the team at Audioscrobbler, and of course there has been a huge proliferation of blogs, which all seem to offer their own RSS feeds - even I've got one!

For those of us who are technically curious and like playing around with this kind of thing, there are all sorts of tools and things you can download to play with XML, but what particularly interests me is using a server-side language such as ASP.NET or PHP to do some interesting stuff on a website. So, having got my own PHP hosting I thought I'd have a play around with some of the excellent data Audioscrobbler are offering.

The trouble was that, although there is a lot of information lurking out there on the web about PHP and XML, there doesn't seem to be very much in the way of "getting started" guides. The usually very good official PHP docs are useful as a reference here, but I thought they lacked decent examples. Other examples I found on the net tended to skip over the basics - instead they'd usually bombard you with source listings from the author's latest attempt at World Domination through their pet Genetic Algorithm powered RSS Collation & Summation project (ok, that's not a real one!) as an example... great applications maybe, but not great if you just want the absolute minimum to get you started.

So here is a quick and simple bit of PHP code that will load an XML file and step through it with a parser. Hopefully this example will act as a decent framework that others can hang their own work off, so feel free to take this and poke, tweak and otherwise modify it how you like. At the end I'll give you an example of using this code as a basis for doing something "real" - in this case using the Audioscrobbler data to find similar music artists to your favourite band - why not try an example now with a search for artists similar to the Sneaker Pimps?

If you just want the code (I don't blame you) then you can download the code & example right now.

Creating the XML Parser

Ok, first things first: we need to create the XML parser object - it's pretty straightforward.

<?PHP

// Setup XML load and parse functions
// ==================================

// Create parser
$parser = xml_parser_create();

Setting up the Element and Character Data Handlers/call-backs

The next stage is setting up the handlers the parser will use when stepping through the XML file. Here we're just passing the XML parser we created above, along with the names of two functions - elementStart and elementEnd - for the element handlers, and the name of one function - cdata - for the character data handler.

// Setup end/start handlers
xml_set_element_handler($parser,"elementStart","elementEnd");

// Ok we need to handle character data
xml_set_character_data_handler($parser,"cdata");

Loading the XML file

This is where I've seen some crazy stuff done on other websites. As far as I know the method I've used below is the most sensible one, as it just reads the XML in as plain data. Other sites I have seen have tried to include the XML file, which is a bit odd, not least because PHP (with short open tags enabled) tries to parse the XML declaration as PHP - they claimed this is not a problem as you can "just leave the XML declaration off"... not a good way of doing things.

Anyway - this code will attempt to load the document named in the "src" querystring value and feed it to the parser 4KB at a time. If it fails for whatever reason I've put some basic error reporting in there - you'll probably want to change this for a proper application. Finally, at the end, remember to close the file and free the parser.

// Get the data
$source = $_GET['src'];
$file = fopen($source, "r");

// See if we got a file or not.
if ($file === false) {
    // Escape the value before echoing it back into the page
    echo "Error: Couldn't open XML file (" . htmlspecialchars($source) . ").";
} else {
    // Ok we're ready to read. Feed the parser 4KB at a time;
    // feof() tells it when the final chunk has been passed in.
    while ($data = fread($file, 4096)) {
        xml_parse($parser, $data, feof($file)) or die(
            "Error: Couldn't parse the XML:<br/><blockquote>" .
            xml_error_string(xml_get_error_code($parser)) .
            "</blockquote>At line " . xml_get_current_line_number($parser)
        );
    }
    fclose($file);
}
// Ok now free the parser.
xml_parser_free($parser);
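
If you'd rather not call die() in the middle of a page, one option is to wrap the same read loop in a function that hands back an error message for the caller to deal with. This is only a rough sketch, not part of the download: parseXmlFile() is a made-up helper name, and it assumes you have already created the parser and registered your handlers on it.

```php
<?php
// Sketch only: parseXmlFile() is a hypothetical helper, not a built-in.
// Returns null on success, or a string describing what went wrong.
function parseXmlFile($parser, $source) {
    // @ suppresses PHP's own warning so we can report the failure ourselves
    $file = @fopen($source, "r");
    if ($file === false) {
        return "Couldn't open XML file.";
    }

    $error = null;
    while (!feof($file)) {
        $data = fread($file, 4096);
        if ($data === false) {
            $error = "Couldn't read from XML file.";
            break;
        }
        // The third argument tells the parser whether this is the last chunk
        if (!xml_parse($parser, $data, feof($file))) {
            $error = xml_error_string(xml_get_error_code($parser))
                   . " at line " . xml_get_current_line_number($parser);
            break;
        }
    }

    fclose($file);
    return $error;
}
```

The calling page can then decide what to do with a non-null result - show a friendly message, log it, or fall back to cached data - rather than stopping dead.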

Creating the Element and Character Data Handler/call-back functions

Each time the parser comes across a new element, or exits an element, the elementStart and elementEnd functions will get called. A similar thing happens whenever the parser comes across some character data - the cdata function is called. For now we're just going to echo out what is going on, but you can do anything here really.

// Callback functions from reader
// ==============================

function elementStart($parser, $tag, $attributes) {
    echo "Started tag " . $tag . "<br/>";
}

function elementEnd($parser, $tag) {
    echo "Ended tag " . $tag . "<br/>";
}

function cdata($parser, $data) {
    echo "Data " . $data . "<br/>";
}
?>

And that's all there is to it! Put together, this PHP file will try to open the XML file named in the querystring and feed the XML to the parser, echoing out the names of the tags and the values of the character data as it goes.
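
The handlers above only echo what they see, so here is a rough sketch of one way to collect the data instead. Everything named here (collectStart, collectEnd, collectData, $buffer, $items and the sample XML string) is made up for illustration and isn't in the download. Two things to be aware of: the parser upper-cases tag names by default (case folding), and it may hand you a single piece of text in several calls, which is why the character data handler appends rather than assigns.

```php
<?php
// Sketch only: gather (tag, text) pairs into an array instead of echoing.
$buffer = '';       // character data seen since the last start/end tag
$items  = array();  // collected results, each entry is array(tag, text)

function collectStart($parser, $tag, $attributes) {
    global $buffer;
    $buffer = '';   // a new element starts, so begin with an empty buffer
}

function collectEnd($parser, $tag) {
    global $buffer, $items;
    $text = trim($buffer);
    if ($text !== '') {
        $items[] = array($tag, $text);  // only keep elements that had text
    }
    $buffer = '';
}

function collectData($parser, $data) {
    global $buffer;
    $buffer .= $data;   // text can arrive in pieces, so append each one
}

// Illustrative sample data, parsed from a string in one go
$xml = '<?xml version="1.0"?>'
     . '<artists><artist>Sneaker Pimps</artist><artist>Portishead</artist></artists>';

$parser = xml_parser_create();
xml_set_element_handler($parser, "collectStart", "collectEnd");
xml_set_character_data_handler($parser, "collectData");

if (!xml_parse($parser, $xml, true)) {
    echo "Parse error: " . xml_error_string(xml_get_error_code($parser));
}
// On PHP 8 the parser is freed automatically; on older versions
// you would call xml_parser_free($parser) here.

print_r($items);
```

With the sample data, $items ends up holding two entries, both tagged ARTIST (upper-cased by the parser), with "Sneaker Pimps" and "Portishead" as their text. This simple approach assumes elements contain either text or child elements, not a mixture of both.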

Download code & example

I've thrown together a quick example of how to use this framework in a "real" situation and included it in the download along with the basic framework shown above, so feel free to have a look and a play. If you have any comments or suggestions please feel free to leave a comment!

20.04.2006.

very good article for novice people

> raju mutyala > 10.06.2006

votsorg, pefrect!

> Loan > 06.06.2007