<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: XML is a huge mess</title>
	<atom:link href="http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/feed/" rel="self" type="application/rss+xml" />
	<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/</link>
	<description>Marty was here!</description>
	<lastBuildDate>Mon, 21 Mar 2011 14:26:38 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Beating down the XML &#8212; バカな火星人</title>
		<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/comment-page-1/#comment-93164</link>
		<dc:creator>Beating down the XML &#8212; バカな火星人</dc:creator>
		<pubDate>Tue, 06 Oct 2009 18:04:42 +0000</pubDate>
		<guid isPermaLink="false">http://martian.org/marty/?p=196#comment-93164</guid>
		<description>[...] &#8592; XML is a huge mess [...]</description>
		<content:encoded><![CDATA[<p>[...] &larr; XML is a huge mess [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nperez</title>
		<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/comment-page-1/#comment-93125</link>
		<dc:creator>nperez</dc:creator>
		<pubDate>Tue, 29 Sep 2009 23:03:53 +0000</pubDate>
		<guid isPermaLink="false">http://martian.org/marty/?p=196#comment-93125</guid>
		<description>I think you should have gone with the SAX approach in the first place with such a large document. Then you can build your own simple data structure that is likely much more efficient (and not as featureful) in terms of memory usage.

Now, you talk about processing, but not exactly what you are doing. It is very rare that people need the whole DOM. If this is merely transforming the data from XML to something else, then take a look at XSLT. If the data is structured as a root node with many top-level children, then I would approach it the way many XMPP developers do and treat the document as a stream.

If you go that route, you can make use of POE::Filter::XML (outside of POE even), and it will push-parse the document (provided you feed it the way POE would), returning top-level document fragments to which you can apply XPath expressions, since the underlying nodes that PFX spits out are based on XML::LibXML::Element.</description>
		<content:encoded><![CDATA[<p>I think you should have gone with the SAX approach in the first place with such a large document. Then you can build your own simple data structure that is likely much more efficient (and not as featureful) in terms of memory usage.</p>
<p>Now, you talk about processing, but not exactly what you are doing. It is very rare that people need the whole DOM. If this is merely transforming the data from XML to something else, then take a look at XSLT. If the data is structured as a root node with many top-level children, then I would approach it the way many XMPP developers do and treat the document as a stream.</p>
<p>If you go that route, you can make use of POE::Filter::XML (outside of POE even), and it will push-parse the document (provided you feed it the way POE would), returning top-level document fragments to which you can apply XPath expressions, since the underlying nodes that PFX spits out are based on XML::LibXML::Element.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris Prather</title>
		<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/comment-page-1/#comment-93122</link>
		<dc:creator>Chris Prather</dc:creator>
		<pubDate>Tue, 29 Sep 2009 21:31:47 +0000</pubDate>
		<guid isPermaLink="false">http://martian.org/marty/?p=196#comment-93122</guid>
		<description>XML::SAX::Machines and possibly some of the modules from XML::Toolkit would probably fit too. Depends on what you want to do. LibXML has a SAX parser API too.</description>
		<content:encoded><![CDATA[<p>XML::SAX::Machines and possibly some of the modules from XML::Toolkit would probably fit too. Depends on what you want to do. LibXML has a SAX parser API too.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: gray</title>
		<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/comment-page-1/#comment-93121</link>
		<dc:creator>gray</dc:creator>
		<pubDate>Tue, 29 Sep 2009 20:11:28 +0000</pubDate>
		<guid isPermaLink="false">http://martian.org/marty/?p=196#comment-93121</guid>
		<description>Take a look at &lt;a href=&quot;http://search.cpan.org/perldoc?XML::LibXML::Reader&quot; rel=&quot;nofollow&quot;&gt;XML::LibXML::Reader&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>Take a look at <a href="http://search.cpan.org/perldoc?XML::LibXML::Reader" rel="nofollow">XML::LibXML::Reader</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: anon_anon</title>
		<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/comment-page-1/#comment-93120</link>
		<dc:creator>anon_anon</dc:creator>
		<pubDate>Tue, 29 Sep 2009 19:55:25 +0000</pubDate>
		<guid isPermaLink="false">http://martian.org/marty/?p=196#comment-93120</guid>
		<description>You probably haven&#039;t tried vtd-xml...give it a try and 512 MB may be quite ok</description>
		<content:encoded><![CDATA[<p>You probably haven&#8217;t tried vtd-xml&#8230;give it a try and 512 MB may be quite ok</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: redbrain</title>
		<link>http://martian.org/marty/2009/09/30/xml-is-a-huge-mess/comment-page-1/#comment-93119</link>
		<dc:creator>redbrain</dc:creator>
		<pubDate>Tue, 29 Sep 2009 18:40:17 +0000</pubDate>
		<guid isPermaLink="false">http://martian.org/marty/?p=196#comment-93119</guid>
		<description>I&#039;ve always hated XML. I guess it keeps everything really structured, but to be honest it can lead to a lot of over-the-top solutions in the name of standardization.

That said, I&#039;ve had a very good experience with libxml2, if you&#039;re working in C anyway. I have tried it in C++, and I&#039;ve used the Python bindings for about 10 minutes, but the C API is very nice :). You have to do your own memory management as you go, but it works very well.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve always hated XML. I guess it keeps everything really structured, but to be honest it can lead to a lot of over-the-top solutions in the name of standardization.</p>
<p>That said, I&#8217;ve had a very good experience with libxml2, if you&#8217;re working in C anyway. I have tried it in C++, and I&#8217;ve used the Python bindings for about 10 minutes, but the C API is very nice :). You have to do your own memory management as you go, but it works very well.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

