Busy making things: github, links, photos, @mc.

Sphinx Search with PostgreSQL

Posted: February 6th, 2009 | Author: | Filed under: Open Source, Python | Comments Off

While I don’t plan on moving away from Apache Solr for my searching needs any time soon, Jeremy Zawodny’s post on Sphinx at craigslist made me want to take a closer look. Sphinx works with MySQL, PostgreSQL, and XML input as data sources, but MySQL seems to be the best documented. I’m a PostgreSQL guy so I ran in to a few hiccups along the way. These instructions, based on instructions on the Sphinx wiki, got me up and running on Ubuntu Server 8.10.

Install build toolchain:

$ sudo aptitude install build-essential checkinstall

Install Postgres:

$ sudo aptitude install postgresql postgresql-client \\
postgresql-client-common postgresql-contrib \\
postgresql-server-dev-8.3

Get Sphinx source:

$ wget http://www.sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz
$ tar xzvf sphinx-0.9.8.1.tar.gz
$ cd sphinx-0.9.8.1

Configure and make:

$ ./configure --without-mysql --with-pgsql \\
--with-pgsql-includes=/usr/include/postgresql/ \\
--with-pgsql-lib=/usr/lib/postgresql/8.3/lib/
$ make

Run checkinstall:

$ mkdir /usr/local/var
$ sudo checkinstall

Sphinx is now installed in /usr/local. Check out /usr/local/etc/ for configuration info.

Create something to index:

$ createdb -U postgres test
$ psql -U postgres test
test=# create table test (id integer primary key not null, text text);
test=# insert into test (text) values ('Hello, World!');
test=# insert into test (text) values ('This is a test.');
test=# insert into test (text) values ('I have another thing to test.');
test=# -- A user with a password is required.
test=# create user foo with password 'bar';
test=# alter table test owner to foo;
test=# \\q

Configure sphinx (replace nano with your editor of choice):

$ cd /usr/local/etc
$ sudo cp sphinx-min.conf.dist sphinx.conf
$ sudo nano sphinx.conf

These values worked for me. I left configuration for indexer and searchd unchanged:

source src1
{
  type = pgsql
  sql_host = localhost
  sql_user = foo
  sql_pass = bar
  sql_db = test
  sql_port = 5432
  sql_query = select id, text from test
  sql_query_info = SELECT * from test WHERE id=$id
}

index test1
{
  source = src1
  path = /var/data/test1
  docinfo = extern
  charset_type = utf-8
}

Reindex:

$ sudo mkdir /var/data
$ sudo indexer --all

Run searchd:

$ sudo searchd

Play:

$ search world

Sphinx 0.9.8.1-release (r1533)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/etc/sphinx.conf'...
index 'test1': query 'world ': returned 1 matches of 1 total in 0.000 sec

displaying matches:
1. document=1, weight=1

words:
1. 'world': 1 documents, 1 hits

Use Python:

cd sphinx-0.9.8.1/api
python
>>> import sphinxapi, pprint
>>> c = sphinxapi.SphinxClient()
>>> q = c.Query('world')
>>> pprint.pprint(q)
{'attrs': [],
 'error': '',
 'fields': ['text'],
 'matches': [{'attrs': {}, 'id': 1, 'weight': 1}],
 'status': 0,
 'time': '0.000',
 'total': 1,
 'total_found': 1,
 'warning': '',
 'words': [{'docs': 1, 'hits': 1, 'word': 'world'}]}

If you add new data and want to reindex, make sure you use the --rotate flag:

sudo indexer --rotate --all

This is an extremely quick and dirty installation designed to give me a sandbox
to play with. For production use you would want to run as a non-privileged user
and would probably want to have an /etc/init.d script for searchd or run it
behind supervised. If you’re looking to experiment with Sphinx and MySQL,
there should be plenty of documentation out there to get you started.


Comments are closed.