Finding Bad Feeds in Your Rawdog Feed List




Rawdog has been segfaulting on me for a while now.  It was definitely not Adam Sampson’s fault; it was totally mine.  I apparently copied and pasted a link to an HTML page into my feed list instead of a link to an RSS feed.  Oops!

Anyway, I’ve not been able to get my RSS fix for some time, and I finally got around to writing a few lines of Python to diagnose the problem.  In about 9 lines of code I’m able to read my config file line by line, check whether each line is a feed listing, grab the URL from the line, print the feed URL, and parse it using Mark Pilgrim’s feed parser (the same version that Rawdog is using).

The feed section of a Rawdog config file (usually in ~/.rawdog/config) can look like either of the following:

feed 60 http://postneo.com/rss.xml
feed 1h http://postneo.com/rss.xml

The top line is the format that older versions of Rawdog use; the lower line is the format that newer versions use.  Luckily each line is “feed” + <time> + <url>, so the string can just be split on whitespace and I can grab the URL with foo[2].
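To make the split concrete, here is a minimal sketch of pulling the URL out of a single config line.  The helper name feed_url and the commented-out sample line are made up for illustration; the only thing taken from Rawdog is the “feed” + <time> + <url> layout described above.

```python
def feed_url(line):
    """Return the URL from a 'feed <time> <url>' line, or None for anything else."""
    parts = line.split()
    # Require the exact directive name and all three fields before indexing.
    if len(parts) >= 3 and parts[0] == "feed":
        return parts[2]
    return None


print(feed_url("feed 60 http://postneo.com/rss.xml"))
print(feed_url("feed 1h http://postneo.com/rss.xml"))
# Hypothetical commented-out entry: the first field is '#', so it is ignored.
print(feed_url("# feed 1h http://example.com/old.xml"))
```

Both feed lines yield the same URL regardless of how the time is written, and the commented-out line yields None.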

Here is the result of my two minute hack to figure out where the segfaults are coming from:

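import feedparser

# Walk the Rawdog config and try each feed URL in turn.
with open('config', 'r') as f:
    for line in f:
        foo = line.split()
        # Feed lines look like: feed <time> <url>.  Match the directive
        # name exactly and make sure the URL field is present.
        if len(foo) >= 3 and foo[0] == 'feed':
            # Print the URL first, so the last one shown is the culprit.
            print('parsing ' + foo[2])
            data = feedparser.parse(foo[2])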

This will skip over any comments (lines that start with #) and other directives in the Rawdog config file.  I ended up making a backup, running the checker until it hit an error, commenting out the offending feed in the backup file, and then removing every feed up to and including the offending one from the file I was checking.  This way I wasn’t hammering the feeds at the beginning of the list on every run.  Rinse, lather, repeat.  It wasn’t until I got down to the bottom that I found the link that was causing the segfault: a link to one of my posts.  Sheesh.
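The trim-and-rerun loop could probably be avoided by parsing each feed in its own child process, so that a crash takes down only the child and the checker keeps going.  This is not what I actually did; it is a rough sketch that assumes Python 3.5 or later (for subprocess.run), a POSIX system (where a negative exit status means the child died from a signal), and that feedparser is importable by the same interpreter.  The names check_feed, describe, and find_bad_feeds are made up for illustration.

```python
import subprocess
import sys

# Assumption: feedparser is importable by the interpreter running this script.
CHILD_SNIPPET = "import sys, feedparser; feedparser.parse(sys.argv[1])"


def check_feed(url, snippet=CHILD_SNIPPET):
    """Parse one feed in a child interpreter and return the child's exit status."""
    result = subprocess.run([sys.executable, "-c", snippet, url])
    return result.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable verdict."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative status means the child was killed by that signal.
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


def find_bad_feeds(urls, check=check_feed):
    """Check every URL once and return (url, verdict) pairs for the failures."""
    bad = []
    for url in urls:
        status = check(url)
        if status != 0:
            bad.append((url, describe(status)))
    return bad
```

Handing find_bad_feeds the URLs pulled from the config would hit each feed only once and report every crashing feed in a single pass, at the cost of starting one interpreter per feed.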

But hey, with a few lines of Python and a few minutes, I’m back up and running.