<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Schadenfreude &#187; xpath</title>
	<atom:link href="http://www.ralree.com/tag/xpath/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ralree.com</link>
	<description>Malicious enjoyment derived from observing someone else's misfortune</description>
	<lastBuildDate>Thu, 09 Feb 2012 01:49:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Ruby to generate RSS feeds for sites that don&#8217;t offer them</title>
		<link>http://www.ralree.com/2009/08/23/ruby-to-generate-rss-feeds-for-sites-that-dont-offer-them/</link>
		<comments>http://www.ralree.com/2009/08/23/ruby-to-generate-rss-feeds-for-sites-that-dont-offer-them/#comments</comments>
		<pubDate>Sun, 23 Aug 2009 14:53:41 +0000</pubDate>
		<dc:creator>Erik</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[generation]]></category>
		<category><![CDATA[gist]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[rss]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[rubyrss]]></category>
		<category><![CDATA[scraping]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[xpath]]></category>

		<guid isPermaLink="false">http://www.ralree.com/?p=22648</guid>
		<description><![CDATA[There&#8217;s this site that has an equipment exchange I wanted to keep track of. However, it seems to run on a custom PHP file rather than vBulletin, so none of the site&#8217;s usual RSS feeds cover it. So I decided to write a scraper/feed generator that fetches the latest version every 5 minutes and generates a nice RSS feed I can view in Google Reader. The volume of posting is low enough that [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s this site that has an equipment exchange I wanted to keep track of.  However, it seems to run on a custom PHP file rather than vBulletin, so none of the site&#8217;s usual RSS feeds cover it.  So I decided to write a scraper/feed generator that fetches the latest version every 5 minutes and generates a nice RSS feed I can view in Google Reader.  The volume of posting is low enough that this won&#8217;t be annoying to see in my daily feeds.</p>
<p>I usually use Ruby for this because it offers Hpricot, a fast HTML scraper with a very nice XPath interface. This time, I resolved to find something that handles RSS generation better, and I stumbled upon <a href="http://rubyrss.com/">RubyRSS</a>, which <strong>happens to ship with the standard Ruby distribution</strong>!<br />
<span id="more-22648"></span><br />
Here&#8217;s what I ended up with after about an hour:</p>
<p><script src="http://gist.github.com/173318.js"></script></p>
<p>This is more impressive once you see how few useful HTML <code>id</code> and <code>class</code> attributes the original page provides.  I had to anchor everything on the links to the items that were not images, and then work <em>up the tree</em> from there (see the liberal use of <code>.parent</code>).  I&#8217;ve rediscovered that Hpricot is awesome (_why, come back to us!), and that it truly only takes about 30 lines of code to generate a nice RSS feed in Ruby.  The resulting RSS feed for the MDShooters Classifieds site is <a href="http://www.ralree.com/mdshooters_classifieds.xml">here</a>.</p>
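<p>For readers who can&#8217;t load the gist, here is a minimal sketch of the same idea: find the links that don&#8217;t wrap an image, walk up the tree with <code>.parent</code> to reach the rest of the row, and hand the results to the <code>rss</code> library. It is not the code from the gist. REXML stands in for Hpricot here (which only works because the sample is well-formed), and the sample markup, <code>BASE_URL</code>, <code>scrape_ads</code>, and <code>build_feed</code> are all made up for illustration. It also assumes the <code>rss</code> and <code>rexml</code> libraries are available, which on newer Rubies means they are installed as gems.</p>

```ruby
require "rexml/document"
require "rss"

# Hypothetical markup: no useful id/class attributes, two links per row
# (one wrapping a thumbnail image, one wrapping the item title).
SAMPLE_HTML = <<~HTML
  <table>
    <tr>
      <td><a href="/ads/1"><img src="/thumbs/1.jpg"/></a></td>
      <td><a href="/ads/1">Spotting scope</a></td>
      <td>$100</td>
    </tr>
    <tr>
      <td><a href="/ads/2"><img src="/thumbs/2.jpg"/></a></td>
      <td><a href="/ads/2">Range bag</a></td>
      <td>$25</td>
    </tr>
  </table>
HTML

# Assumed base URL for turning relative hrefs into absolute links.
BASE_URL = "http://example.com"

# Returns one hash per listing, anchored on the non-image links.
def scrape_ads(html)
  doc = REXML::Document.new(html)
  # Keep only the links that do not contain an <img> child.
  text_links = doc.get_elements("//a").reject { |a| a.elements["img"] }
  text_links.map do |a|
    row = a.parent.parent                      # a -> td -> tr
    price = row.get_elements("td").last.text   # last cell holds the price
    { title: a.text, link: BASE_URL + a.attributes["href"], price: price }
  end
end

# Builds an RSS 2.0 feed object from the scraped listings.
def build_feed(ads)
  RSS::Maker.make("2.0") do |maker|
    maker.channel.title = "Example classifieds"
    maker.channel.link = BASE_URL
    maker.channel.description = "Scraped listings"
    ads.each do |ad|
      maker.items.new_item do |item|
        item.title = ad[:title]
        item.link = ad[:link]
        item.description = "Price: #{ad[:price]}"
      end
    end
  end
end

feed = build_feed(scrape_ads(SAMPLE_HTML))
puts feed
```

<p>Real pages are rarely well-formed XML, so for an actual site you would swap REXML for a forgiving HTML parser and keep the same walk-up-from-the-link approach.</p>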
<p>And now, yet another RSS feed generator: <a href="http://ralree.com/md_super_ads.xml">MD Super Ads</a></p>
<p>Here&#8217;s the code:</p>
<p><script src="http://gist.github.com/173623.js"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://www.ralree.com/2009/08/23/ruby-to-generate-rss-feeds-for-sites-that-dont-offer-them/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

