The release of the first beta of version 9.1 of the open source PostgreSQL database has opened a new era in enterprise-class reliability and data integrity that can compete with the big names, say its developers.
PostgreSQL core team member Josh Berkus said the upcoming 9.1 release introduces a certain amount of fault tolerance through the synchronous replication feature so the database can now guarantee that a write will survive a failure of the master node.
“With this version, we have completed the feature set offered by Oracle DataGuard, for example, which will allow users like NTT to migrate away from Oracle,” Berkus said.
There are more fault-tolerance and clustering features in the pipeline for 9.2, but the core team felt it was important to make the most basic synchronous replication case work as well as it could before moving on to more complex configurations.
Berkus said a number of improvements on this have been discussed for 9.2, including quorum commit (where the write needs to persist to, say, three out of five replicas), and in-memory synch, so that synch to disk needs not be a concern and you can have durability through lots of replicas at network speeds.
For “hot-hot” fault-tolerance in a cluster configuration, Berkus said there are some projects in development outside the core project.
“It’s already possible to construct a large fault-tolerant cluster of PostgreSQL using Skype's Skytools framework if you are building a new application,” he said, adding Skype has some 200 servers running PostgreSQL in production.
“EnterpriseDB and NTT are working on a project called PostgresXC which is intended to fill the use-case of Oracle RAC, although its design is somewhat different. Thirdly, the Postgres-R project, a group-communication based clustered PostgreSQL, has recently become fully funded and may produce a release version this year.”
The 9.1 release is set to include a number of features aimed at helping database administrators, which Berkus said is a direct result of the development model.
“Through the mailing lists and chat, the PostgreSQL developers probably hear from 1000 PostgreSQL DBAs a month. So yes, a lot of our development is in response to direct user feedback,” he said, adding a “good number” of code contributors also user PostgreSQL in production.
Wisconsin Courts does Serializable Snapshot Isolation
The court system in the US state of Wisconsin is using some 100 PostgreSQL databases for the core application used by 72 county circuit courts.
About 3000 users directly connect to the production databases in the various court houses and there are dozens of Web applications receiving millions of hits each day.
Kevin Grittner, database administrator, expert (actual title) at the Wisconsin Court System (WCS) said with a lot of replication between the disparate courts and the central information repository, he is tasked with making sure performance is good, managing the security and integrity of the data and work with programmers to ensure “sound database design” in new development.
“Our current application framework is over 10 years old, and we have decided to rework it using newer technology,” Grittner said. “Now that we have converted everything to PostgreSQL, we intend to eliminate some of the database portability code and use more PostgreSQL-specific features.”
In moving closer to the database, WCS is committing to PostgreSQL for the long-term and needed to deal with problems posed by snapshot isolation.
Grittner received approval to work on Serializable Snapshot Isolation (SSI) so PostgreSQL would maintain data integrity in this new environment “without excessive cost or performance problems”.
“The circuit court database schema has about 400 tables [and] we have over 20 programmers working full time on the applications which hit these databases, with frequent new releases which extend the schema and add new transaction types,” Grittner said.
“Trying to train all of these programmers in the tricks of maintaining data integrity under snapshot isolation amidst all of this flux is not really feasible.”
With SSI, Grittner says, the rule is simply “make sure this transaction will do the right thing if it is the only transaction running”.
“If they do that, it will do the right thing in any mix of transactions, or roll back with a serialisation failure,” he said. “This protects our data integrity while keeping down the development costs.”
Integrated extension management
Extension management is more associated with Web browsers than databases, but at PostgreSQL services company 2ndQuadrant, database architect and PostgreSQL contributor Dimitri Fontaine the extension problem has been an itch he has been “wanting to scratch” for a long time.
“In my usage of PostgreSQL using add-on packages has been a customary choice for years, yet dump and restore was always problematic because of that,” Fontaine said. “It was high time for me to find the time to work on solving that, and I've spent two years to talk about the problem and propose a design that would be widely accepted.”
Fontaine, who is based in France, said no two people within PostgreSQL agreed on what an extension was at first, but since the extension feature relates to many objects, it has a big impact on the source code files so it needed strong community buy-in.