Getting started with PHP and XML parsing
More and more sites and companies are offering XML data on their websites, such as the team at Audioscrobbler, and of course there has been a huge proliferation of blogs which all seem to offer their own RSS feeds - even I've got one!
For those of us who are technically curious and like playing around with this kind of thing, there are all sorts of tools you can download to play with XML, but what particularly interests me is using a server-side language such as ASP.NET or PHP to do some interesting stuff on a website. So, having got my own PHP hosting, I thought I'd have a play around with some of the excellent data Audioscrobbler are offering.
The trouble was, I found that although there is a lot of information lurking out there on the web about PHP and XML, there doesn't seem to be very much in the way of "getting started" guides. The usually very good official PHP docs are useful as a reference here, but I thought they lacked decent examples. Other examples I found on the net tended to skip over the basics - instead they'd usually bombard you with source listings from the author's latest attempts at World Domination through their pet Genetic Algorithm powered RSS Collation & Summation project (ok, that's not a real one!). Great applications maybe, but not great if you just want the absolute minimum to get you started.
So here is a quick and simple bit of PHP code that will:
- show you how to set up the XML parser
- show you how to load in an XML document (either locally or from a URL)
- show you how to set up the call-backs for the XML parsing
If you just want the code (I don't blame you) then you can download the code & example right now.
Creating the XML Parser
Ok, first things first: we need to create the XML parser object - it's pretty straightforward.
<?php
// Setup XML load and parse functions
// ==================================
// Create parser
$parser = xml_parser_create();
Setting up the Element and Character Data Handlers/call-backs
The next stage is setting up the handlers the parser will use when stepping through the XML file. Here we're just passing the parser we created above, along with the names of two functions - elementStart and elementEnd - for the element handlers, and the name of one function - cdata - for the character data handler.
// Setup end/start handlers
xml_set_element_handler($parser,"elementStart","elementEnd");
// Ok we need to handle character data
xml_set_character_data_handler($parser,"cdata");
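The handlers do not have to be passed as function names; any PHP callable should work. As a minimal, standalone sketch (the `$seen` array and the `<artist>` element are made up for illustration and are not part of the script in this article), closures can record exactly what the parser hands to each handler:

```php
<?php
// Illustrative only: $seen is a hypothetical array that records handler calls.
$seen = [];

$parser = xml_parser_create();

xml_set_element_handler(
    $parser,
    // Start handler: receives the parser, the tag name and an attribute array.
    function ($parser, $tag, $attributes) use (&$seen) {
        $seen[] = ['start', $tag, $attributes];
    },
    // End handler: receives the parser and the tag name.
    function ($parser, $tag) use (&$seen) {
        $seen[] = ['end', $tag];
    }
);

// The final "true" tells the parser this is the last (and only) chunk.
xml_parse($parser, '<artist name="Example"/>', true);

print_r($seen);
```

With the parser's default settings the tag and attribute names arrive upper-cased (ARTIST, NAME), because case folding is switched on unless you turn it off.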
Loading the XML file
This is where I've seen some crazy stuff done on other websites. As far as I know, the method I've used below is the most sensible one, as it just reads the XML in as data. Other sites I have seen have tried to include the XML file, which is a bit odd, not least because PHP tries to parse the XML declaration as PHP - they claimed this is not a problem as you can "just leave the XML declaration off"... not a good way of doing things.
Anyway - this code will attempt to load in the document provided in the "src" querystring value, and feed it to the parser 4KB at a time. If it fails for whatever reason I've put some basic error reporting in there - you'll probably want to change this for a proper application, and you'll also want to check what "src" points at rather than opening whatever the visitor sends. Finally at the end, remember to close the file and free the parser.
// Get the data. Note that "src" is unvalidated user input.
$source = $_GET['src'];
$file = fopen($source, "r");
// See if we got a file or not.
if ($file === false) {
    echo "Error: Couldn't open XML file (" . htmlspecialchars($source) . ").";
} else {
    // Ok we're ready to read. Feed the parser 4KB at a time.
    while ($data = fread($file, 4096)) {
        // feof() tells the parser whether this is the final chunk.
        xml_parse($parser, $data, feof($file)) or die(
            "Error: Couldn't parse the XML:<br/><blockquote>" .
            xml_error_string(xml_get_error_code($parser)) .
            "</blockquote>At line " . xml_get_current_line_number($parser)
        );
    }
    fclose($file);
}
// Ok now free the parser.
xml_parser_free($parser);
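If the XML is already in a string rather than a file, the same error-reporting calls can be used without the read loop. The following is only a sketch: `checkXml` is a hypothetical helper name, not something from the download, and it assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper: returns null if $xml is well-formed,
// otherwise a short human-readable error message.
function checkXml(string $xml): ?string
{
    $parser = xml_parser_create();

    // Passing true as the third argument marks this as the final chunk.
    // xml_parse() returns 1 on success and 0 on failure.
    if (xml_parse($parser, $xml, true) === 1) {
        return null;
    }

    return sprintf(
        '%s at line %d',
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    );
}

var_dump(checkXml('<feed><item/></feed>'));    // well-formed input
echo checkXml("<feed>\n<item></feed>"), "\n";  // mismatched tags
```

Returning the message instead of calling die() leaves it to the caller to decide how to report the problem.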
Creating the Element and Character Data Handler/call-back functions
Each time the parser comes across a new element, or exits an element, the elementStart and elementEnd functions will get called. A similar thing happens whenever the parser comes across some character data - the cdata function is called. For now we're just going to echo out what is going on, but you can do anything here really.
// Callback functions from reader
// ==============================
function elementStart($parser, $tag, $attributes) {
    echo "Started tag " . $tag . "<br/>";
}
function elementEnd($parser, $tag) {
    echo "Ended tag " . $tag . "<br/>";
}
function cdata($parser, $data) {
    echo "Data " . $data . "<br/>";
}
?>
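The handlers above only echo what they see. As a rough sketch of doing something more useful with them, the standalone example below collects the text of every title element into an array. The `$titles` and `$buffer` variables and the sample feed are invented for illustration. Two details are worth knowing: the character data handler can be called more than once for a single piece of text, so the text is appended to a buffer rather than assigned, and case folding is switched off so tag names keep their original case.

```php
<?php
// Hypothetical example: collect the text of every <title> element.
$titles = [];    // finished titles
$buffer = null;  // text of the <title> currently open, or null if none

$parser = xml_parser_create();
// Keep tag names as written instead of upper-casing them (the default).
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $tag, $attributes) use (&$buffer) {
        if ($tag === 'title') {
            $buffer = '';  // start collecting text
        }
    },
    function ($parser, $tag) use (&$titles, &$buffer) {
        if ($tag === 'title' && $buffer !== null) {
            $titles[] = trim($buffer);
            $buffer = null;  // stop collecting
        }
    }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$buffer) {
        // May fire several times per text node, so append rather than assign.
        if ($buffer !== null) {
            $buffer .= $data;
        }
    }
);

$xml = '<rss><item><title>Fish &amp; chips</title></item>'
     . '<item><title>Second post</title></item></rss>';
xml_parse($parser, $xml, true);

print_r($titles);
```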
And that's all there is to it! Put together, this PHP file will try to open the XML file provided in the querystring and feed the XML to the parser, echoing out the names of the tags and the values of the character data as it goes.
I've thrown together a quick example of how to use this framework in a "real" situation and included it in the download along with the basic framework shown above, so feel free to have a look and a play. If you have any comments or suggestions please feel free to leave a comment!
Back
20.04.2006.
votsorg, pefrect!
Loan
06.06.2007
very good article for novice people