Day: March 25, 2005

  • Q&A With Greg Stein

    My notes on the Q&A after the keynotes are a bit rougher, so I’ll summarize.

    Guido asked when they would open source their (really cool sounding) build system. Greg said that the build system, like a lot of the stuff they use, doesn’t make a whole lot of sense if you aren’t running on top of the Google platform. Later in the Q&A Greg noted that when they prepare bits of code for release as open source, the software usually gets better in the process. They also have some people working on untangling some bits from the Google platform for release, so keep an eye on code.google.com.

    Someone brought up Boost.Python, which is said to handle templated C++ code better. Greg was unaware of it, but they seem quite happy with SWIG for exposing C++ code to Python.

    Someone else wanted to know if they used Python for network monitoring and SNMP. Greg said that the ops guys keep a close watch on the traffic and that he usually has to inform them in advance when he needs to transfer “big” files.

    Alex Martelli, who starts working at Google in 3 days, wanted to know how SWIG dealt with templated code. It’s not great, but there are ways to get around simple templating.

    Another audience member wanted to know what Greg thought was missing from Python and what could be done about it, and how Google dealt with programmers who feel more comfortable in C++ or Java. The first question was answered by the fact that they hired Alex. It was also interesting to learn that they run Python 2.2 on their servers. They would like to upgrade to 2.3, but that’s a non-trivial task. Each engineering team decides what language it works in, and that’s not a huge problem because they use SWIG to cover the C++/Python bridge and they make extensive use of RPC, so it doesn’t matter what language each little bit is written in.

    When asked about how many engineers work at Google, Greg pointed to the public numbers, but wasn’t able to break it down further. A couple hundred perhaps.

    Google uses a derivative of Bugzilla for their bug tracking, but they would like something better, and are investigating other options.

    Everyone seems to equate Python with slow, but that hasn’t really been a problem for Greg or Google. When eShop got bought by Microsoft, a lot of Python code was rewritten in ASP/COM and the resulting code was slower than the Python version.

    When asked about the total number of lines of code written in various languages, Greg reckoned that there were probably more lines of C++ than anything else, followed by Python and then Java (Blogger is written in Java).

    David Ascher asked Greg about a patch to Python 1.4 a few years back and how that might be useful in the future with multiprocessors becoming so popular. Back in 1996 Greg patched Python 1.4 to remove the global interpreter lock and keep track of things that needed to be locked in other ways. His patches worked great on a single-processor machine, 2 processors was a bonus, but once you got to 3-4 processors it was slower. He hasn’t run into trouble with the global interpreter lock at Google.

    Google doesn’t have debugging tools per se, but they do extensive logging (Greg likes “print”) and have good tools to analyze those logs.

    Greg’s work projects include code.google.com as well as some internal stuff. For fun he works on Subwiki (in Python of course) along with the ton of other projects that he’s worked on over the years.

    They didn’t use an off-the-shelf web application framework for code.google.com; they built on top of the Google HTTP server, which is written in C++. GMail was written in C++, not Python.

    When asked how they stage, Greg said that they can route a small amount of traffic (say 1%) to “Canary Servers”. If these servers don’t fall over, they can slowly pump more traffic to the new version but can easily and quickly set a previous version as the live version.
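    The canary idea above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Google's implementation: the names pick_version and ramp_schedule, the doubling factor, and the "canary"/"stable" labels are all made up for the example.

```python
import random


def pick_version(canary_fraction, rng=random.random):
    """Return "canary" for roughly canary_fraction of calls, else "stable".

    canary_fraction is a hypothetical knob: 0.01 sends about 1% of
    requests to the new version, 0.0 sends everything to the old one.
    """
    if not 0.0 <= canary_fraction <= 1.0:
        raise ValueError("canary_fraction must be between 0 and 1")
    return "canary" if rng() < canary_fraction else "stable"


def ramp_schedule(start=0.01, factor=2.0, limit=1.0):
    """Yield increasing traffic fractions, always ending exactly at limit."""
    fraction = start
    while fraction < limit:
        yield fraction
        fraction *= factor
    yield limit


if __name__ == "__main__":
    # Walk through the ramp; a real system would check server health
    # between steps and fall back to fraction 0.0 if the canaries fail.
    for fraction in ramp_schedule():
        print("routing %.0f%% of traffic to the new version" % (fraction * 100))
```

    Setting the fraction back to 0.0 is the "quickly set a previous version as the live version" step: no request reaches the new code after that.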

  • PyCon Day 3: Greg Stein Keynote

    The keynote this morning was given by Greg Stein. Audio/video of his talk should be available at some point, but his slides are not available for download. Because of that I took extensive notes in Tomboy, which should be about as close to the slides as I could get with bad eyesight at the back of the room:

    GregSteinKeynote

    Notes taken during the Friday keynote at PyCon.

    Note before the keynote: Python success stories: pythonology.org.

    Quote by Peter about python. [[sorry, the text was too small for me –Ed. ]]

    Python 10 years ago, contributed to Python, authored some modules and apps. Open source, contributed (c and python). Current chairman of ASF. viewcvs.py. Subversion, Apache httpd.

    “We consider Python to be our secret sauce.” –Paul Everitt

    Python at eShop:
    1995 “What in the world is python?”
    1996 “this is great stuff”

    [[eShop gets assimilated. –Ed.]]

    Python at Microsoft:
    1996 “it’s called what?”
    1997 “You actually shipped Python code?”
    1998 “Nice prototype. We’ll rewrite it in the next version.”

    Python at CollabNet:
    2001 “No we don’t really use Python here”
    2003 “Definitely write that in Python”

    Python at Google:
    2004 “Of course we use Python. Why wouldn’t we?”

    Small companies eventually “got it” ahead of the curve
    – Champion was needed

    Larger companies follow Python’s growth curve
    – Supporting environment was needed

    Python had to grow for it to become business acceptable
    – Large enough talent pool
    – Support services
    – Books
    – Consulting
    – World wide web
    – Follow the trail-blazers

    Python passed the tipping point years ago.

    Highly adaptable
    – Changing requirements
    – Changes in computing environment

    Rapid Development
    – For new and experienced developers

    Easy to maintain

    Primary languages
    – C++
    – Java
    – Python

    Miscellaneous
    – Some perl used by Operations
    – PHP creeps in for internal webapps
    – Saw Ruby sneaking around
    – Small amount of C#

    SWIG: Simplified Wrapper and Interface Generator
    – www.swig.org
    – Started by David Beazley

    Multi-language Environment
    – SWIG pulls these “islands” together
    – Very fast mechanism for integration

    Integrated into build system

    Where do we use it?

    – Across our internal network
    – Across a system lifecycle
    – Live services

    Basic network

    development cloud to infrastructure to a whole bunch of servers. “we have quite a few servers”

    In development build system
    – wrappers for version control (they use Perforce). No, you don’t really understand Java code. Forces code reviews. Sends mail out.
    – build system (written in Python)
    – Packaging. Bundles of data packaged up and sent to servers. All built on Python. 3rd generation, it’s a complex problem on a gigantic scale.

    Some usage in the network infrastructure
    – Binary data publisher
    – package repository

    This is written in Python too of course. Push it out. They keep increasing the scale of the problem.

    Some usage on production servers
    – Monitoring (health, temp, hardware, etc)
    – Auto-restart
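
    As a rough picture of what auto-restart means, here is a minimal supervisor loop. It is a sketch under my own assumptions, not anything shown on the slides: supervise, run_command, and the max_restarts limit are hypothetical names and choices.

```python
import subprocess


def run_command(argv):
    """Run argv to completion and return its exit code."""
    return subprocess.call(argv)


def supervise(argv, max_restarts=3, run=run_command):
    """Run argv, restarting it after each nonzero exit.

    Gives up after max_restarts restarts. Returns a tuple of
    (restarts_used, last_exit_code). The run parameter exists so the
    loop can be exercised without spawning real processes.
    """
    restarts = 0
    while True:
        code = run(argv)
        if code == 0 or restarts >= max_restarts:
            return restarts, code
        restarts += 1
```

    Capping the number of restarts keeps a permanently broken service from spinning forever; a real monitor would presumably also report the failure somewhere.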

    Complete the Lifecycle
    Log reporting
    – We generate a “large” amount of log information
    – Data is pulled back from the servers
    – Analyzed using lots of Python tools
    – Easy to alter the reports based on ever-changing needs
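
    To give a feel for why Python suits this kind of log reporting, here is a tiny report function. The log format (a severity word followed by a message) and the name summarize are assumptions made for the example; nothing about Google's actual log format was shown.

```python
from collections import Counter


def summarize(lines):
    """Count log lines per severity, assuming "LEVEL message" lines.

    Blank lines are skipped. The first whitespace-separated word of
    each line is treated as the severity level.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(None, 1)
        if parts:
            counts[parts[0]] += 1
    return counts


if __name__ == "__main__":
    sample = ["INFO started", "ERROR disk full", "INFO done", ""]
    for level, count in sorted(summarize(sample).items()):
        print(level, count)
```

    Changing the report, say to group by hostname instead of severity, means changing one line, which is the "easy to alter" point from the slide.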

    Python-based services

    Google groups
    – “Python old-timers” Jeske and Long (of eGroups and ClearSilver)

    code.google.com
    – Stein and DiBona

    Others? We have so much going on...

    How code.google.com was built:
    front end stuff on top
    code.google.com server
    SWIG
    google stuff on bottom

    code.google.com

    goopy package
    – Functional stuff to start with
    – Place to put future modules

    Closing
    We have lots of Python code covering a broad range of needs

    Python has helped Google for many, many years

    SWIG is underrated

    We are now starting to open-source some of the code.