Category: Python

  • cgi_buffer and Ivy

    Via Simon Willison, cgi-buffer takes care of the little things that can make your CGI programs go faster.  We’re talking things like gzip, ETags, and persistent connections.  Libraries for Perl, PHP, and Python are available.  At least in Python, using it is as simple as import cgi_buffer.
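
    To make the gzip and ETag part concrete, here is a minimal standard-library sketch of the general idea.  It is not cgi_buffer’s code, and the prepare_response helper and its arguments are made up for illustration.

```python
import gzip
import hashlib


def prepare_response(body, accept_encoding="", if_none_match=None):
    """Return (status, headers, payload) for a response body given as bytes.

    Hypothetical helper for illustration; not part of cgi_buffer.
    """
    headers = {"Content-Type": "text/html"}
    payload = body
    # Naive check: a real implementation would also honor q-values.
    if "gzip" in accept_encoding:
        # mtime=0 keeps the compressed bytes (and so the ETag) stable.
        payload = gzip.compress(body, mtime=0)
        headers["Content-Encoding"] = "gzip"

    # Hash the bytes actually sent, so each encoding gets its own ETag.
    etag = '"%s"' % hashlib.sha256(payload).hexdigest()
    headers["ETag"] = etag
    if if_none_match == etag:
        # The client's cached copy is still current, so send no body.
        return 304, {"ETag": etag}, b""

    # An explicit length tells a keep-alive connection where the body ends.
    headers["Content-Length"] = str(len(payload))
    return 200, headers, payload
```

    A CGI script would pass in the values of the HTTP_ACCEPT_ENCODING and HTTP_IF_NONE_MATCH environment variables and then write out the returned headers and payload.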

    The Ivy software bus also looks interesting:

    Ivy is a simple protocol and a set of open-source (LGPL) libraries and programs that allows applications to broadcast information through text messages, with a subscription mechanism based on regular expressions. Ivy libraries are available in C, C++, Java and Perl, on Windows and Unix boxes and on Macs. Several Ivy utilities and hardware drivers are available too.
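
    The regular expression subscription idea is easy to picture with a toy, in-process version.  This sketch only illustrates the concept: TinyBus and its methods are hypothetical names, there is no networking, and none of it is Ivy’s actual API.

```python
import re


class TinyBus:
    """In-process stand-in for a text message bus (hypothetical, not Ivy's API)."""

    def __init__(self):
        self._subscriptions = []  # list of (compiled pattern, callback) pairs

    def subscribe(self, pattern, callback):
        """Register a callback for messages matching the regular expression."""
        self._subscriptions.append((re.compile(pattern), callback))

    def broadcast(self, message):
        """Deliver message to every matching subscriber; return how many matched."""
        delivered = 0
        for regex, callback in self._subscriptions:
            match = regex.search(message)
            if match:
                # In this sketch, captured groups become the callback arguments.
                callback(*match.groups())
                delivered += 1
        return delivered


bus = TinyBus()
readings = []
bus.subscribe(r"^temperature=(\d+)$", lambda value: readings.append(int(value)))
bus.broadcast("temperature=21")
bus.broadcast("humidity=40")  # no subscription matches, so nothing is delivered
print(readings)
```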

  • Python At LinuxWorld

    Jeremy Hylton:

    Steve Holden and I are speaking at the LinuxWorld Expo in January at the Javits Center in New York City. Steve is giving his popular Network Programming in Python course. I am talking about Programming Weblogs with Python.

    This is good to know.  I’ll see if I can plan my schedule around one or both sessions.

  • Upgraded to Rawdog 1.6

    I upgraded my aggregator to Rawdog 1.6 today.  I was previously using 1.4.  The new version fixes some bugs and allows some global and item-level templating.  It was pretty much a drop-in replacement and it’s running via cron quite happily.

  • isbn.py

    Via PythonWare’s Daily Python-URL, isbn.py is an ISBN formatter.  It also allows you to strip non-ISBN characters, verify that a list of numbers is a valid ISBN, and verify the check digit.  See the author’s entry for other open source code dealing with ISBNs.
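
    For anyone curious what verifying a check digit involves, here is a rough sketch of the standard ISBN-10 rule: weight the ten characters from 10 down to 1, count a trailing X as 10, and require the sum to be divisible by 11.  The function names are my own and are not taken from isbn.py.

```python
def clean_isbn(text):
    """Drop hyphens, spaces, and anything else that cannot appear in an ISBN."""
    return "".join(ch for ch in text.upper() if ch in "0123456789X")


def is_valid_isbn10(text):
    """Return True if text holds a ten-character ISBN with a correct check digit."""
    isbn = clean_isbn(text)
    # X stands for the value 10 and is only legal as the final character.
    if len(isbn) != 10 or "X" in isbn[:9]:
        return False
    total = 0
    for position, ch in enumerate(isbn):
        value = 10 if ch == "X" else int(ch)
        total += (10 - position) * value
    return total % 11 == 0


print(is_valid_isbn10("0-306-40615-2"))
```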

  • CamelCase Parser Updated

    I didn’t realize it until this evening, but when Radio rendered my code, it was munching some characters.  I have updated my CamelCase Parser post.  What was incorrectly (S*?) is now (\S*?) and should behave properly.  My apologies to anyone who was led astray by the incorrect version.  The source has been correct the whole time, so if you were working from that, you’re good to go.

  • A Very Simple CamelCase Parser in Python

    In playing around with regular expressions in Python, I came up with the following very simple CamelCase parser.  I really like this style of writing out regexes.  It’s much more readable than the typical compacted regex that I am used to seeing.  Here is simplecamelcase.py:

    
    # simplecamelcase.py - a really simplistic CamelCase parser
    import re
    pattern = re.compile(r'''
        (       # Begin group
        \b      # word boundary
        [A-Z]   # Find an upper case letter
        (\S*?)  # consume non-whitespace
        [A-Z]   # Find a second upper case letter
        (\S*?)  # consume more non-whitespace
        \b      # end word boundary
        )       # end group, repeat as necessary
        ''', re.VERBOSE)
    testString = "This is a TestCase of a VerySimple CamelCaseParser."
    find_camel = lambda s: [u[0] for u in re.findall(pattern, s)]
    print(find_camel(testString))
    # Prints ['TestCase', 'VerySimple', 'CamelCaseParser']
    

    I have found that Python is a pleasure to putz around with for pretty much everything, and regexes are no exception.  You can find more information at Kuchling’s Regular Expression HOWTO and Chapter 3 of David Mertz’ Text Processing in Python.  Both are well worth reading.

    The above code is extremely naive, and of course use it at your own risk.  It would be trivial to modify this code to use re.sub in order to create a very naive wiki parser.  That might be fun.
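
    As a rough idea of what the re.sub version might look like, here is an equally naive sketch.  The link_wiki_words name and the /wiki/ URL scheme are assumptions made up for the example, and nothing is escaped, so it is not safe for real input.

```python
import re

# The same pattern as the parser above, compacted and without capture groups.
CAMEL_CASE = re.compile(r"\b[A-Z]\S*?[A-Z]\S*?\b")


def link_wiki_words(text, base_url="/wiki/"):
    """Wrap each CamelCase word in an HTML link; base_url is an assumed scheme."""
    def to_link(match):
        word = match.group(0)
        return '<a href="%s%s">%s</a>' % (base_url, word, word)
    # re.sub calls to_link once per match and splices in the returned string.
    return CAMEL_CASE.sub(to_link, text)


print(link_wiki_words("See the FrontPage for a VerySimple example."))
```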

  • Rawdog 1.5

    Moof gave me a heads up the other day that a new version of Rawdog was working its way out. Version 1.5 was released today, fixing some timeout problems. I’m still recovering from Foo, so I probably won’t upgrade my aggregator for a day or two.

  • Upgrading to Rawdog 1.3

    I recently upgraded Rawdog, my current aggregator of choice, to version 1.3.  I was able to run it at the command prompt (python2 rawdog --update --write) but it wasn’t working as the cron job that I had set up.  The simple answer is that in previous versions, you would produce output with rawdog update write.  Add those dashes in and you’re good to go.

    The config file didn’t look different to me, but I played it safe and appended my feed list to the new config file.

    Overall, Rawdog has been treating me well, and I’m extremely happy with it.  Thanks are due to Adam Sampson and of course Mark and his amazing dancing feed parser.

  • Cracking Roundup Gromit!

    Weekend Roundup:

    • Hans Nowak shares his “dead simple” options parsing system in Python.
    • You know that you’re a geek when your snooze bar is ‘snooze’ at the command line.
    • Boing Boing links to a PDF file of a paper covering the Google File System.  It’s mind boggling fun.
    • John Robb notes that Ray Ozzie is looking for web services wizards to work at Groove.  They’ve got to have something big, as there is already a great team there working on web services stuff.
    • Root Prompt points to a Linux Planet review of a turnkey MySQL server running on hardware by Pogo Linux.  Now even PHBs can run MySQL…
    • Jenny points to the new Wallace and Gromit game for PS2.  Cracking console game, Gromit!
    • Mark Pilgrim has released Dive Into Python v4.3.

  • __magic__ Variable Conventions

    Via PythonWare’s Daily Python-URL, Alan Green covers module level __magic__ variables in Python.  I’ll have to admit that I knew about some of them and was clueless about others.  For example, __dict__ is a read-only module attribute that holds the module’s global namespace as a dictionary.  You can also explicitly set the public names of a module with __all__.

    I’m guilty of using several __magic__ variables that aren’t explicitly defined in the language reference.  Most came from looking at the source code of third party Python modules, particularly those by Mark Pilgrim.  If it’s good enough for Mark, it should work for me, right?  Here’s a list of __magic__ variables that you’ll commonly find in modules written by myself and others:

    • __license__: String.  Useful for identifying the license of the code.  Examples: “GPL”, “LGPL”, “BSD”, etc.
    • __history__: String.  Excellent for keeping track of changes between versions.  Similar to a changelog.txt file.  See Mark’s feedparser for an example.  This works for me, but could get out of hand for larger modules with multiple authors.
    • __copyright__: String.  Copyright info goes here.  I’ve also seen license info here (for example xmltramp by Aaron Swartz).

    Are there other unofficial __magic__ variables that are useful?  Email me and I’ll append them to the list.  Would it be worthwhile to compile a list of commonly used but unofficial variables into a PEP?  It would be excellent if pydoc took advantage of this extra information.

  • Python 2.3.1

    Python 2.3.1 has been released:

    Python 2.3.1 is a bugfix release of Python 2.3. No new features have been added. Instead, this release is the result of two months of bug hunting. A number of obscure bugs that could cause crashes have been fixed, as well as a number of memory leaks.

    Excellent.  I remember reading on python-dev that they were trying to push 2.3.1 out the door pretty quickly, and here it is.  Thanks again to the Python developers and bug squashers for all the hard work.

  • Roundup: Athlon64, G5, Wireless, Java, Python, Storage, and Design (Oh My!)

    I really need to automate this:

    • CNet: AMD is set to unveil the Athlon64 tomorrow.  Meanwhile, Athlon XP prices are becoming more and more attractive.
    • Emmanuel needs more memory.  Don’t we all?
    • Macworld has more coverage of Virginia Tech’s G5 supercomputer.  Hopefully the cluster will be up and running by October 1 so they can make the next Top500 list.
    • Newsforge covers the ratification of SAML (Security Assertion Markup Language) 1.1.
    • They’re having problems with Linux on Opteron over at OSNews.  Hopefully Linux on x86-64 will be ready for primetime soon.  The desktop chips are coming.
    • Wi-Fi Networking News reports that Boston-Logan airport will have Wi-Fi by next summer.  When will they be done with the Big Dig?
    • Clustered JDBC 1.0beta11 is out.  There are lots of fixes in this release.
    • Python releases:
      • pyTerra allows you to download images from Terraserver the Python way.
      • Twisted 1.0.7 is out, along with 1.0.8alpha2.
      • PyTables, a “hierarchical database for Python” turns 0.7.2 today.
    • Wei-Meng Lee at the O’Reilly DevCenter shows us how to share files with Bluetooth under Windows XP.
    • Jabber news: JEP-0079: Advanced Message Processing is nearing its final version.
    • CNet notes that Network Appliance is selling cheap storage gear: starting around $10k.
    • Zeldman has a design-related roundup today.
    • CGI:IRC rocks (thanks to Frank for the link).  There’s another thing that I can do in any browser now.

  • Catchup Roundup

    I’ve lost some links over the long weekend with and without power.  Here’s what I’ve gathered this morning:

    • Linux on the WRT54G 0.2: put the penguin on your access point.
    • El Reg: G4 iBook? (via #mobitopia)
    • Rawdog, my current RSS aggregator of choice, has been updated to version 1.3.
    • Dealnews: An Intel gigabit PCI card for $33.  Not bad at all.
    • Via OSNews, informIT covers installing and using GCC under Linux.  There are examples of compiling ASM, Objective C, Java, and others with GCC.
    • Noble Ape Simulation “creates a random island environment and simulates the ape inhabitants of the island’s cognitive processes.”
    • Reuters/Washington Post: “Galileo Probe Ends in Deliberate Dive”
    • CNN covers Swen, the latest email worm.  I’ve had a few in the last couple of days.  Symantec has more information.
    • The 2004 Nissan Quest is a sweet minivan.  3.5 litre V6, aggressive styling (for a minivan), sunroofs galore, dual-screen display in the back (for $1900).  Prices range from $24k base to $38k tricked out.
    • PyBackend 0.1 is “a relational database backed object development framework written in python and released under GNU Library General Public License.”
    • Knoppix STD is based on Knoppix with a focus on security tools.

  • XML In Python and PyRSS2Gen

    Uche Ogbuji wraps up the current state of XML in Python.  There is an extensive list of XML software projects and their status, as well as a roundup of recent and current trends.

    PyRSS2Gen is a new module for producing an RSS 2.0 feed in a pythonic way.
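
    For a sense of what producing an RSS 2.0 feed involves, here is a small standard-library sketch.  It does not use PyRSS2Gen or mirror its classes; build_rss, the item dictionaries, and the example.com URLs are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Return an RSS 2.0 document as a string.

    items is a list of dicts with "title", "link", and "description" keys.
    Hypothetical helper; PyRSS2Gen's own classes are not used here.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link, and description on the channel.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for entry in items:
        item = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(item, tag).text = entry[tag]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "Example Weblog",
    "http://example.com/",
    "Placeholder feed for the sketch",
    [{"title": "First post", "link": "http://example.com/1", "description": "Hello"}],
)
print(feed)
```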

  • WebEnvironment.py

    Rick points to WebEnvironment.py by Patrick Lioi.  WebEnvironment.py is a really simple way to use Python as a CGI script.  The cgi module is easy enough to use as it is, but WebEnvironment.py allows you to worry about even less.  Writing content out to the client is as easy as server.write(content).  I also like the ability to write out the contents of a file to the user by using server.file("head.html").
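
    To illustrate the kind of convenience this buys you, here is a tiny stand-in with the same two method names.  MiniServer is hypothetical and is not Patrick’s code; how WebEnvironment.py handles headers and output internally is an assumption on my part.

```python
import io
import sys


class MiniServer:
    """Hypothetical stand-in with write() and file(); not WebEnvironment.py's code."""

    def __init__(self, out=None, content_type="text/html"):
        self.out = out if out is not None else sys.stdout
        self.content_type = content_type
        self._headers_sent = False

    def write(self, content):
        # A CGI response must begin with headers followed by a blank line.
        if not self._headers_sent:
            self.out.write("Content-Type: %s\r\n\r\n" % self.content_type)
            self._headers_sent = True
        self.out.write(content)

    def file(self, path):
        # Send the contents of a file, such as a shared page header.
        with open(path) as handle:
            self.write(handle.read())


buffer = io.StringIO()  # captures output here; a real script would use stdout
server = MiniServer(out=buffer)
server.write("<p>Hello</p>")
server.write("<p>World</p>")
print(buffer.getvalue())
```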

  • A Little Bit of This, A Little Bit of That

    It must be Monday again.  Here’s a collection of links from the weekend:

  • My PythonToolbox

    I have posted PythonToolbox, a list of modules and resources that I use quite often with Python.  It includes modules that I commonly use for database access, markup, XML input/output, templating, searching, and classification.  It also has a section on resources that I always seem to be turning to.  Russ also helped out by adding some additional resources.

    As always, it’s a wiki page, so if there’s something that you think should be in my PythonToolbox, feel free to add it.  Enjoy!

  • A Collection of Links

    Yet another weekend offline.  Here are many links that are currently cluttering the taskbar:

    I had more yesterday, but I lost a post somewhere in the process.

  • Problems with Gadfly (Python 2.3/Win32)

    Hmm, I seem to be banging my head against the wall here.  I’ve been looking at Gadfly, a Python database that supports a subset of SQL.  I haven’t tested it against other versions of Python on other platforms, but I’m having some issues with it under Python 2.3 on Win32.  Here’s a code snippet copied and pasted from the Gadfly documentation:

    
    import gadfly
    connection = gadfly.gadfly()
    connection.startup("mydatabase", "./")
    cursor = connection.cursor()
    cursor.execute("create table ph (nm varchar, ph varchar)")
    cursor.execute("insert into ph(nm, ph) values ('arw', '3367')")
    cursor.execute("select * from ph")
    for x in cursor.fetchall():
      print x
    # prints ('arw', '3367')
    connection.commit()
    
    

    The only problem is, here’s what I get as output:

     ('3367', 'arw') 

    A simple reversing wouldn’t be too hard to deal with, but in a more complex situation, I had an ID field first that was ending up somewhere in the middle during output.  It definitely wasn’t the order that I was expecting.

    Any thoughts?  At first I thought that I was doing something horribly stupid, but I’m supposed to get one answer and I’m getting another! 🙂
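
    One possible workaround is to stop relying on the column order of select * and name the columns explicitly.  The sketch below uses sqlite3 purely as a stand-in because it ships with Python; it assumes, without having verified it, that Gadfly returns columns in the order given in an explicit select list.

```python
import sqlite3

# sqlite3 is only a stand-in here; swap in the Gadfly connection to try it there.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("create table ph (nm varchar, ph varchar)")
cursor.execute("insert into ph(nm, ph) values ('arw', '3367')")

# Naming the columns pins down the order of the values in each row.
columns = ("nm", "ph")
cursor.execute("select %s from ph" % ", ".join(columns))

# Pairing names with values means later code never depends on position.
rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
print(rows)
connection.commit()
```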

  • LOAF.py

    I released version 0.3.1p1 of my LOAF project this evening.

    LOAF.py is released under a BSD license.  More information can be found at the project page.