Managing loosely coupled databases with SQLite and PostgreSQLOur site is made up of many smaller projects - the different game clients talk to the different game servers (or pixies as we call them) which each have their own information to store. We want to keep each game's specific data separate from that of all the other games, so that our database does not become polluted with a mass of tables and hard to understand. There is also data that is shared, such as our member accounts table, so we looked for ways to share what needed to be shared and separated what should be logically separated. Because of the shared data, our most obvious solution was to store everything in a single database and organise tables by using a prefix, so all tables used for the Set server would start with "set_". This is a portable solution which would work with any SQL database, but it has a few drawbacks. Firstly it makes writing queries that much longer, and once written the relevant details of the query are even harder to read because of all the long names (not too bad with "set_" but even "lost_cities_" is a bit cumbersome). Secondly it does little to help the developer work with the tables (and only those tables) relevant to the project they are working on. Although this method works, both SQLite and PostgreSQL offer cleaner ways to organise distinct data groups. SQLiteSQLite database are stored in a single file. When opening a database its filename is specified and then any commands and queries work with that file. SQLite connections can also be attached to other databases at the same time, effectively joining separate databases. In this example I will use two databases, the first is the members database shared between all the pixies: $ sqlite3 members.db sqlite> create table accounts (id integer primary key, name text); sqlite> .schema CREATE TABLE accounts (id integer primary key, name text); sqlite> .exitThe second stores some time trial information specific to Set: $ sqlite3 set.db sqlite> create table trials (id integer primary key, date timestamp, deck text); sqlite> create table times (id integer primary key, trial_id integer, account_id integer, seconds integer); sqlite> .schema CREATE TABLE times (id integer primary key, trial_id integer, account_id integer, seconds integer); CREATE TABLE trials (id integer primary key, date timestamp, deck text); sqlite> .exitWith these databases in place it is now possible to connect to the Set database, then attach that connection to the members database as well and perform cross-database queries:
$ sqlite3 set.db
sqlite> attach 'members.db' as members;
sqlite> insert into accounts (name) values ('bob');
sqlite> insert into accounts (name) values ('anne');
sqlite> insert into trials (date, deck) values (datetime('now'), null);
sqlite> insert into times (trial_id, account_id, seconds) values ((select max(id) from trials), (select id from accounts where name = 'bob'), 300);
sqlite> insert into times (trial_id, account_id, seconds) values ((select max(id) from trials), (select id from accounts where name = 'anne'), 299);
sqlite> select name, seconds from times inner join accounts on accounts.id = account_id;
bob|300
anne|299
sqlite> .exit
PostgreSQLPostgreSQL does not support cross database queries, however it does support a powerful way of organising a single database to the same effect by using schemas. Each database has a few schemas set up by default, including the 'public' schema that is initially in use when connecting to a database. This means that any tables, sequences, views etc. are created in the 'public' schema and visible from the 'public' schema which is why you can use PostgreSQL without even being aware that schemas exist. To see the effect of changing schemas we can start out by creating a new one and using it for a bit from a psql prompt: dgym=# \dt No relations found. dgym=# create table accounts (id bigserial unique, name text);Now we have the accounts table in the default 'public' schema and the trials and times tables in the 'set' schema, and by changing the 'search_path' we can change which tables are visible. Regardless of the search path we can still refer to any table in any schema explicitly using 'schema.table' notation: dgym=# select * from public.accouts; id | name ----+------ (0 rows) dgym=#However it is much more convenient to set the search_path to include all the schemas we are currently interested in:
dgym=# set search_path to set, public;
SET
dgym=# \dt
List of relations
Schema | Name | Type | Owner
--------+----------+-------+-------
public | accounts | table | dgym
set | times | table | dgym
set | trials | table | dgym
(3 rows)
dgym=# insert into accounts (name) values ('bob');
INSERT 34542 1
dgym=# insert into accounts (name) values ('anne');
INSERT 34543 1
dgym=# insert into trials (date, deck) values (now(), null);
INSERT 34544 1
dgym=# insert into times (trial_id, account_id, seconds) values ((select currval('trials_id_seq')), (select id from accounts where name = 'bob'), 300);
INSERT 34545 1
dgym=# insert into times (trial_id, account_id, seconds) values ((select currval('trials_id_seq')), (select id from accounts where name = 'anne'), 299);
INSERT 34546 1
dgym=# select name, seconds from times inner join accounts on accounts.id = account_id;
name | seconds
------+---------
bob | 300
anne | 299
(2 rows)
ConclusionsKeeping databases clean and orderly is an important consideration. Deciding what data should be kept together or separated in various tables is at the very core of SQL database design, and when there are distinct uses of the database it is well worth considering which tables should be kept together or separated too. By using these features we can work on different aspects of the system knowing that there will not be any conflicts, and we are better able to focus on only the parts of the system that are relevant at the time. dgym 2007-04-17 |
|