Day: February 6, 2009

  • Sphinx Search with PostgreSQL

    While I don’t plan on moving away from Apache Solr for my searching needs any time soon, Jeremy Zawodny’s post on Sphinx at craigslist made me want to take a closer look. Sphinx works with MySQL, PostgreSQL, and XML input as data sources, but MySQL seems to be the best documented. I’m a PostgreSQL guy so I ran in to a few hiccups along the way. These instructions, based on instructions on the Sphinx wiki, got me up and running on Ubuntu Server 8.10.

    Install build toolchain:

    $ sudo aptitude install build-essential checkinstall

    Install Postgres:

    $ sudo aptitude install postgresql postgresql-client \\
    postgresql-client-common postgresql-contrib \\

    Get Sphinx source:

    $ wget
    $ tar xzvf sphinx-
    $ cd sphinx-

    Configure and make:

    $ ./configure --without-mysql --with-pgsql \\
    --with-pgsql-includes=/usr/include/postgresql/ \\
    $ make

    Run checkinstall:

    $ mkdir /usr/local/var
    $ sudo checkinstall

    Sphinx is now installed in /usr/local. Check out /usr/local/etc/ for configuration info.

    Create something to index:

    $ createdb -U postgres test
    $ psql -U postgres test
    test=# create table test (id integer primary key not null, text text);
    test=# insert into test (text) values ('Hello, World!');
    test=# insert into test (text) values ('This is a test.');
    test=# insert into test (text) values ('I have another thing to test.');
    test=# -- A user with a password is required.
    test=# create user foo with password 'bar';
    test=# alter table test owner to foo;
    test=# \\q

    Configure sphinx (replace nano with your editor of choice):

    $ cd /usr/local/etc
    $ sudo cp sphinx-min.conf.dist sphinx.conf
    $ sudo nano sphinx.conf

    These values worked for me. I left configuration for indexer and searchd unchanged:

    source src1
      type = pgsql
      sql_host = localhost
      sql_user = foo
      sql_pass = bar
      sql_db = test
      sql_port = 5432
      sql_query = select id, text from test
      sql_query_info = SELECT * from test WHERE id=$id
    index test1
      source = src1
      path = /var/data/test1
      docinfo = extern
      charset_type = utf-8


    $ sudo mkdir /var/data
    $ sudo indexer --all

    Run searchd:

    $ sudo searchd


    $ search world
    Sphinx (r1533)
    Copyright (c) 2001-2008, Andrew Aksyonoff
    using config file '/usr/local/etc/sphinx.conf'...
    index 'test1': query 'world ': returned 1 matches of 1 total in 0.000 sec
    displaying matches:
    1. document=1, weight=1
    1. 'world': 1 documents, 1 hits

    Use Python:

    cd sphinx-
    >>> import sphinxapi, pprint
    >>> c = sphinxapi.SphinxClient()
    >>> q = c.Query('world')
    >>> pprint.pprint(q)
    {'attrs': [],
     'error': '',
     'fields': ['text'],
     'matches': [{'attrs': {}, 'id': 1, 'weight': 1}],
     'status': 0,
     'time': '0.000',
     'total': 1,
     'total_found': 1,
     'warning': '',
     'words': [{'docs': 1, 'hits': 1, 'word': 'world'}]}

    If you add new data and want to reindex, make sure you use the --rotate flag:

    sudo indexer --rotate --all

    This is an extremely quick and dirty installation designed to give me a sandbox
    to play with. For production use you would want to run as a non-privileged user
    and would probably want to have an /etc/init.d script for searchd or run it
    behind supervised. If you’re looking to experiment with Sphinx and MySQL,
    there should be plenty of documentation out there to get you started.