Simply iterate over XML with plain PHP using little memory and CPU
One of the things I have been working on lately is a simple XML parser. The XML structure in my case is simple, though the approach would handle a more complex one without much change. My solution is a quite powerful yet simple combination of XMLReader and the Iterator interface.
I started with an XML file that I needed to import into the database, and I had no control over its structure. It looks similar to this.
There can be hundreds of <item> elements but the structure stays stable and doesn’t get deeper. Each element will always contain those three elements <v0> to <v2>.
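To make that structure concrete, here is a small made-up sample together with a plain XMLReader pass that counts the items. Only the <item> and <v0> to <v2> names come from the description above; the root element name <items> and all values are assumptions for illustration.

```php
<?php

// Hypothetical sample document: the root name <items> and the values are
// invented; only <item> and <v0>..<v2> are taken from the description.
$xml = <<<XML
<items>
    <item>
        <v0>1</v0>
        <v1>First product</v1>
        <v2>9.99</v2>
    </item>
    <item>
        <v0>2</v0>
        <v1>Second product</v1>
        <v2>19.50</v2>
    </item>
</items>
XML;

// Load the string and walk the document node by node.
$reader = new XMLReader();
$reader->XML($xml);

$itemCount = 0;
while ($reader->read()) {
    // Count opening <item> tags only; text and closing tags are skipped.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        $itemCount++;
    }
}
$reader->close();

echo $itemCount, " items\n";
```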
To import them into the database, it seemed plausible to iterate over them so that, using Doctrine 2.0, I could persist each one inside the loop and flush them all in one go.
As the structure is very straightforward and there is no need to traverse the document, XMLReader was the obvious choice: it works on a stream and keeps nothing but the current element in memory.
This is what it looks like.
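A minimal sketch of such a class might look like the following. The class name ItemReader and the target keys id, name and price are assumptions made up for illustration, not taken from the real import. Note that the Iterator method next() overrides XMLReader::next() here, so the sketch relies on read() internally.

```php
<?php

/**
 * Sketch of an iterable reader for <item> elements.
 * The class name and the keys in $fieldMap are hypothetical.
 */
class ItemReader extends XMLReader implements Iterator
{
    /** Assumed mapping from the generic element names to readable keys. */
    private $fieldMap = array('v0' => 'id', 'v1' => 'name', 'v2' => 'price');

    private $uri;
    private $item = null; // current mapped item, or null once exhausted
    private $index = -1;

    public function __construct($uri)
    {
        $this->uri = $uri;
    }

    #[\ReturnTypeWillChange]
    public function rewind()
    {
        // XMLReader is forward-only, so rewinding means reopening the source.
        if (!$this->open($this->uri)) {
            throw new RuntimeException('Cannot open ' . $this->uri);
        }
        $this->index = -1;
        $this->next();
    }

    #[\ReturnTypeWillChange]
    public function valid()
    {
        return $this->item !== null;
    }

    #[\ReturnTypeWillChange]
    public function current()
    {
        return $this->item;
    }

    #[\ReturnTypeWillChange]
    public function key()
    {
        return $this->index;
    }

    // Overrides XMLReader::next(); the optional parameter only keeps the
    // signature compatible with the parent and is ignored.
    #[\ReturnTypeWillChange]
    public function next($name = null)
    {
        $this->item = $this->readItem();
        $this->index++;
    }

    /** Advance to the next <item> and return it as a mapped array, or null. */
    private function readItem()
    {
        while ($this->read()) {
            if ($this->nodeType !== XMLReader::ELEMENT || $this->name !== 'item') {
                continue;
            }
            $item = array();
            if ($this->isEmptyElement) {
                return $item; // <item/> has no children and no closing tag
            }
            // Walk the children until the closing </item> tag.
            while ($this->read()) {
                if ($this->nodeType === XMLReader::END_ELEMENT && $this->name === 'item') {
                    break;
                }
                if ($this->nodeType === XMLReader::ELEMENT && isset($this->fieldMap[$this->name])) {
                    $item[$this->fieldMap[$this->name]] = $this->readString();
                }
            }
            return $item;
        }
        return null;
    }
}
```

All values come back as strings, so any casting to integers or floats is left to the caller.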
You see that XMLReader can easily be extended to implement the Iterator interface. The rather ugly nested readItem() method could probably be improved, but for this kind of structure it suffices. It returns a mapped array that is far more meaningful than the v0, v1 and v2 fields.
The usage is also very simple.
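As a rough sketch of the import loop, assuming each iteration yields an array with the keys id, name and price: the FakeEntityManager and Product classes below are hypothetical stand-ins so the example runs without Doctrine, and an ArrayIterator takes the place of the XML iterator. With a real Doctrine EntityManager, only the persist() and flush() calls would carry over.

```php
<?php

/** Hypothetical stand-in for Doctrine's EntityManager (persist and flush only). */
class FakeEntityManager
{
    public $pending = array();
    public $flushed = array();

    public function persist($entity)
    {
        $this->pending[] = $entity;
    }

    public function flush()
    {
        $this->flushed = array_merge($this->flushed, $this->pending);
        $this->pending = array();
    }
}

/** Hypothetical entity; the property names are assumptions. */
class Product
{
    public $id;
    public $name;
    public $price;
}

/** Persist every mapped item inside the loop, then flush once at the end. */
function importItems($items, $entityManager)
{
    $count = 0;
    foreach ($items as $row) {
        $product = new Product();
        $product->id = (int) $row['id'];
        $product->name = $row['name'];
        $product->price = (float) $row['price'];
        $entityManager->persist($product);
        $count++;
    }
    $entityManager->flush();

    return $count;
}

// An ArrayIterator stands in for the XML iterator here.
$items = new ArrayIterator(array(
    array('id' => '1', 'name' => 'First product', 'price' => '9.99'),
    array('id' => '2', 'name' => 'Second product', 'price' => '19.50'),
));

$entityManager = new FakeEntityManager();
$imported = importItems($items, $entityManager);

echo $imported, " items imported\n";
```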
As you see, the XML can now simply be iterated over. This will probably also work with simple RSS feeds, and with a little more code you should be able to adapt it to more deeply nested structures.
I’ve tested this on XML files with more than 40 value elements in each of about 10,000 item elements. It does run for a while, but CPU usage doesn’t go up and memory usage stays low as well.
Maybe this can be of help to some of you?