Saturday, December 13, 2014

Switchover/Switchback in Slony-I while upgrading PostgreSQL major versions 8.4.x/9.3.x

Every new release of PostgreSQL comes with a packed of exciting features. To benefit new features, database server should be upgraded. Choosing traditional upgrade paths like pg_dump/pg_restore or pg_upgrade requires a significant downtime of application. Today, if you are looking for minimum downtime upgrade path among major PostgreSQL versions with perfect rollback plan, then it will be accomplished by asynchronous Slony-I replication. Since Slony-I (know more about it here) has the capability to replicate between different PostgreSQL versions,OS and bit architectures easily, so upgrades are doable without requiring a substantial downtime. In addition, it has a consistent switchover and switchback functionality in its design.

IMO, while doing major version upgrades there should be a proper fallback plan because just in case application turn out to buggy or fail to perform well on upgraded version, then we should be able to rollback to older version immediately. Slony-I provides such functionality in the way of switchback. This post demonstrates, minimum downtime upgradation including switchover/switchback steps.

Before going to demo, one important step to be noted, earlier to PG 9.0.x version bytea datatype columns use to store data in ESCAPE format and later version its in HEX format. While performing switchback (newer version to older version), this kind of bytea format differences are not support by Slony-I hence ESCAPE format should be maintained through out upgrade duration, else you may encounter with an error:
ERROR  remoteWorkerThread_1_1: error at end of COPY IN: ERROR:  invalid input syntax for type bytea
CONTEXT:  COPY sl_log_1, line 1: "1     991380  1       100001  public  foo I       0       {id,500003,name,"A         ",b,"\\\\x41"}"
ERROR  remoteWorkerThread_1: SYNC aborted
To fix, no changes required on PG 8.4.x but on PG 9.3.5 bytea_output parameter should be set from HEX to ESCAPE as shown. We can set it at cluster-level ($PGDATA/postgresql.conf) or user-level (ALTER TABLE...SET), I have preferred to go with user-level changes.
slavedb=# alter user postgres set bytea_output to escape;
ALTER ROLE
Lets proceed with upgrade steps. Below are my two versions server details used in this demo, change it accordingly as per your server setup if you are trying:
Origin Node (Master/Primary are called as Origin)                     Subscriber Node (Slave/Secondary are called as Subscriber)
-------------------------------------------------                     ----------------------------------------------------------
Host IP     : 192.168.22.130                                          192.168.22.131
OS Version  : RHEL 6.5 64 bit                                         RHEL 6.5 64 bit 
PG Version  : 8.4.22 (5432 Port)                                      9.3.5 (5432 Port)
Slony Vers. : 2.2.2                                                   2.2.2
PG Binaries : /usr/local/pg84/bin                                     /opt/PostgreSQL/9.3/
Database    : masterdb                                                slavedb 
PK Table    : foo(id int primary key, name char(20), image bytea)     ...restore PK tables structure from Origin... 
For simple understanding and easy implementation, I have divided demo in three sections

1. Compiling for Slony-I binaries against PostgreSQL versions
2. Creating Replication Scripts and executing
3. Testing Switchover/Switchback.

1. Compiling for Slony-I binaries against PostgreSQL version
Download Slony-I sources from here, and perform source installation against PostgreSQL binaries on Origin and Subscriber nodes.
On Origin Node:
# tar -xvf slony1-2.2.2.tar.bz2
# cd slony1-2.2.2
./configure --with-pgbindir=/usr/local/pg84/bin 
            --with-pglibdir=/usr/local/pg84/lib 
            --with-pgincludedir=/usr/local/pg84/include 
            --with-pgpkglibdir=/usr/local/pg84/lib/postgresql 
            --with-pgincludeserverdir=/usr/local/pg84/include/postgresql/
make 
make install

On Subscriber Node: (assuming PG 9.3.5 installed)
# tar -xvf slony1-2.2.2.tar.bz2
# cd slony1-2.2.2
./configure --with-pgconfigdir=/opt/PostgreSQL/9.3/bin 
            --with-pgbindir=/opt/PostgreSQL/9.3/bin 
            --with-pglibdir=/opt/PostgreSQL/9.3/lib 
            --with-pgincludedir=/opt/PostgreSQL/9.3/include 
            --with-pgpkglibdir=/opt/PostgreSQL/9.3/lib/postgresql 
            --with-pgincludeserverdir=/opt/PostgreSQL/9.3/include/postgresql/server/ 
            --with-pgsharedir=/opt/PostgreSQL/9.3/share
make 
make install
2. Creating Replication Scripts and executing
To setup replication, we need create few scripts that take care of replication including switchover/switchback.

1. initialize.slonik - This script holds the Origin/Subscriber nodes connection information.
2. create_set.slonik - This script holds all the Origin PK Tables that replicate to Subscriber Node.
3. subscribe_set.slonik - This script starts replicating sets data to Subscriber Node.
4. switchover.slonik - This script helps to move control from Origin to Subscriber.
5. switchback.slonik - This script helps to fallback control from Subscriber to Origin.

Finally, two more startup scripts "start_OriginNode.sh" and "start_SubscriberNode.sh" that starts slon processes according to the binaries compiled on Origin/Subscriber Nodes.

Download all scripts from here.

Here's the sample data on Origin Node(8.4.22) in Foo Table with a column of bytea datatype, that we will replicate it to Subscriber Node(9.3.5) with the help of scripts created.
masterdb=# select * from foo;
 id |         name         | image
----+----------------------+-------
  1 | Raghav               | test1
  2 | Rao                  | test2
  3 | Rags                 | test3
(3 rows)
Lets call the scripts one by one to setup replication. REMEMBER ALL SLONIK SCRIPT SHOULD BE EXECUTED ON ORIGIN NODE ONLY, EXCEPT "start_OriginNode.sh" AND "start_SubscriberNode.sh" THAT SHOULD BE EXECUTED INDIVIDUALLY.
-bash-4.1$ slonik initalize.slonik
-bash-4.1$ slonik create_set.slonik
create_set.slonik:13: Set 1 ...created
create_set.slonik:16: PKey table *** public.foo *** added.
-bash-4.1$ sh start_OriginNode.sh      
-bash-4.1$ sh start_SubscriberNode.sh  //ON SUBSCRIBER NODE   
-bash-4.1$ slonik subscribe_set.slonik
After successful execution of above script, you can notice the data on Origin(masterdb) has replicated to Subscriber(slavedb). Also not allowing any DML operation on Subscriber node:
slavedb=# select * from foo;
 id |         name         | image
----+----------------------+-------
  1 | Raghav               | test1
  2 | Rao                  | test2
  3 | Rags                 | test3
(3 rows)

slavedb=# insert into foo values (4,'PG-Experts','Image2');
ERROR:  Slony-I: Table foo is replicated and cannot be modified on a subscriber node - role=0
Cool... We have moved data to newer version of PostgreSQL 9.3.5. At this stage if you feel all data have replicated to Subscriber Node, then you can do switchover.

3. Testing Switchover/Switchback.

Let's switchover to latest version with the script and try inserting data on Subscriber/Origin Nodes.
-bash-4.1$ slonik switchover.slonik
switchover.slonik:8: Set 1 has been moved from Node 1 to Node 2

slavedb=# insert into foo values (4,'PG-Experts','Image2');
INSERT 0 1

masterdb=# select * from foo ;
 id |         name         | image
----+----------------------+-------
  1 | Raghav               | test1
  2 | Rao                  | test2
  3 | Rags                 | test3
  4 | PG-Experts           | Image2
(4 rows)

masterdb=# insert into foo values (5,'PG-Experts','Image3');
ERROR:  Slony-I: Table foo is replicated and cannot be modified on a subscriber node - role=0
Perfect... This is what we are looking, now slavedb(Subscriber Node) running PG 9.3.5 version accepting data and masterdb(Origin Node) receiving the slavedb data. Also its rejecting DMLs executed on masterdb.

Slony-I Logs shows the origin/subscriber node id movements at the time of Switchover:
2014-12-12 04:55:06 PST CONFIG moveSet: set_id=1 old_origin=1 new_origin=2
2014-12-12 04:55:06 PST CONFIG storeListen: li_origin=1 li_receiver=2 li_provider=1
2014-12-12 04:55:06 PST CONFIG remoteWorkerThread_1: update provider configuration
2014-12-12 04:55:06 PST CONFIG remoteWorkerThread_1: helper thread for provider 1 terminated
2014-12-12 04:55:06 PST CONFIG remoteWorkerThread_1: disconnecting from data provider 1
...
...
2014-12-12 04:55:11 PST INFO   start processing ACCEPT_SET
2014-12-12 04:55:11 PST INFO   ACCEPT: set=1
2014-12-12 04:55:11 PST INFO   ACCEPT: old origin=1
2014-12-12 04:55:11 PST INFO   ACCEPT: new origin=2
2014-12-12 04:55:11 PST INFO   ACCEPT: move set seq=5000006393
2014-12-12 04:55:11 PST INFO   got parms ACCEPT_SET
If you encounter any issues at this stage, you can switchback to older version. After switchback you can continue with Older version until your application or other issues fixed. This is the perfect rollback plan without wasting much of time in case of issues after switchover..
-bash-4.1$ slonik switchback.slonik
switchback.slonik:8: Set 1 has been moved from Node 2 to Node 1

slavedb=# insert into foo values (5,'PG-Experts','Image3');
ERROR:  Slony-I: Table foo is replicated and cannot be modified on a subscriber node - role=0

masterdb=# insert into foo values (5,'PG-Experts','Image3');
INSERT 0 1

slavedb=# select * from foo ;
 id |         name         | image
----+----------------------+-------
  1 | Raghav               | test1
  2 | Rao                  | test2
  3 | Rags                 | test3
  4 | PG-Experts           | Image2
  5 | PG-Experts           | Image3
(5 rows)
Very Nice...!!! Is this not the exact rollback with minimum downtime ? Yes, its a perfect switching between nodes without missing a transaction.

Logs showing the switchback from Subscriber to Origin Node:
2014-12-12 04:58:45 PST CONFIG moveSet: set_id=1 old_origin=2 new_origin=1
2014-12-12 04:58:45 PST CONFIG storeListen: li_origin=2 li_receiver=1 li_provider=2
2014-12-12 04:58:45 PST CONFIG remoteWorkerThread_2: update provider configuration
2014-12-12 04:58:45 PST CONFIG remoteWorkerThread_2: helper thread for provider 2 terminated
2014-12-12 04:58:45 PST CONFIG remoteWorkerThread_2: disconnecting from data provider 2
2014-12-12 04:58:46 PST CONFIG storeListen: li_origin=2 li_receiver=1 li_provider=2
...
...
2014-12-12 04:58:47 PST INFO   start processing ACCEPT_SET
2014-12-12 04:58:47 PST INFO   ACCEPT: set=1
2014-12-12 04:58:47 PST INFO   ACCEPT: old origin=2
2014-12-12 04:58:47 PST INFO   ACCEPT: new origin=1
2014-12-12 04:58:47 PST INFO   ACCEPT: move set seq=5000006403
2014-12-12 04:58:47 PST INFO   got parms ACCEPT_SET
2014-12-12 04:58:48 PST CONFIG moveSet: set_id=1 old_origin=2 new_origin=1
By this time you might have noticed, none of the transactions are lost during switching operation  between PostgreSQL versions. Only downtime might be your application to start/stop for connecting  to Origin and Subscriber nodes, but whereas Origin/Subscriber nodes are never taken down they are just up and running.

Remember, the method shown here is not only useful for upgrades however its the same method in Slony-I  for moving between Nodes.

Thank you for your patience :). Hope this post helps you to upgrade PostgreSQL with minimum downtime using Slony-I including proper rollback plan.

--Raghav

6 comments :

Jing Tao said...

Dear Raghav:

I have a hard time to understand this sentence:

REMEMBER ALL SLONIK SCRIPT SHOULD BE EXECUTED ON ORIGIN NODE ONLY, EXCEPT "start_OriginNode.sh" AND start_SubscriberNode.sh THAT SHOULD BE EXECUTED INDIVIDUALLY.

Base on your following code, I guess it means - all scripts should be executed on original node only, except "start_SubscriberNode.sh" which should be executed on subscriber node.
-bash-4.1$ slonik initalize.slonik
-bash-4.1$ slonik create_set.slonik
create_set.slonik:13: Set 1 ...created
create_set.slonik:16: PKey table *** public.foo *** added.
-bash-4.1$ sh start_OriginNode.sh
-bash-4.1$ sh start_SubscriberNode.sh //ON SUBSCRIBER NODE
-bash-4.1$ slonik subscribe_set.slonik

Is my understanding true?

Thank you very much!

Jing

Raghavendra said...

Sorry if my statement has confused.

"Base on your following code, I guess it means - all scripts should be executed on original node only, except "start_SubscriberNode.sh" which should be executed on subscriber node.
Is my understanding true? "

That is true. "start_SubscriberNode.sh" should be executed on subscriber node.

--Raghav

Jing Tao said...

Thank you, Raghav!

gkyshnirev said...

Thank you very match!

But I have some advice for generation *.slonik files.
First we need to configure build --with--perltools, it's make possible to use perl scripts slonik_init_cluster, slonik_create_set, ...
Second use --prefix=/opt/slon-2.2.4-pg-8.3 and install it to /opt directory.

Third need to install postgresql c-headers:

$ sudo apt-get install libpq-dev postgresql-server-dev-8.3

Configure with options, make and install to /opt/slon-2.2.4-pg-8.3:

$ cd slony1-2.2.4/
$ ./configure \
--prefix=/opt/slon-2.2.4-pg-8.3 \
--sysconfdir=/etc/slony1 \
--with-perltools \
--with-pgbindir=/usr/lib/postgresql/8.3/bin \
--with-pglibdir=/usr/lib/postgresql/8.3/lib \
--with-pgincludedir=/usr/include/postgresql \
--with-pgpkglibdir=/usr/lib/postgresql/8.3/lib \
--with-pgincludeserverdir=/usr/include/postgresql/8.3/server
$ make
$ sudo make install

Now we able to generate init.slonik, create_set.slonik, subscribe_set.slonik
from config file /etc/slony1/slon_tools.conf with folow commands:

$ /opt/slon-2.2.4-pg-8.3/bin/slonik_init_cluster > init.slonik
$ /opt/slon-2.2.4-pg-8.3/bin/slonik_create_set set1 > create_set.slonik
$ /opt/slon-2.2.4-pg-8.3/bin/slonik_subscribe_set set1 2 > subscribe_set.slonik

(arguments: set1 -- set name, 2 -- slave node id, see slon_tools.conf)

It really usefull instead write *.slonik files by hand.
You can use /etc/slony1/slon_tools.conf-sample as base for you config.

Also you can use slon_start script, to start slon daemon,
which using its config file slon_tools.conf to get host,port,user,dbname, and write pid file.

$ /opt/slon-2.2.4-pg-8.3/slon_start 1

----- slon_tools.conf example -----

1 if ($ENV{"SLONYNODES"}) {
2 require $ENV{"SLONYNODES"};
3 } else {
4 $CLUSTER_NAME = 'replication';
5 $PIDFILE_DIR = '/var/run/slony1';
6 $LOGDIR = '/var/log/slony1';
7 $MASTERNODE = 1;
8 $DEBUGLEVEL = 2;
9 $SLEEP_TIME = 60;
10 $WATCHDOG_VERBOSE = 0;
11
12 # master
13 add_node(
14 node => 1,
15 host => '192.168.1.1',
16 dbname => 'test',
17 port => 5432,
18 user => 'postgres',
19 password => 'password',
20 );
21
22 # slave
23 add_node(
24 node => 2,
25 host => '192.168.1.2',
26 dbname => 'test',
27 port => 5432,
28 user => 'postgres',
29 password => 'password',
30 );
31
32 }
33
34 $SLONY_SETS = {
35 "set1" => {
36 "set_id" => 1,
37 "table_id" => 1,
38 "sequence_id" => 1,
39
40 # tables with primary key
41 "pkeyedtables" => [
42 'test',
43 ],
44
45 "sequences" => [
46 'test_seq',
47 ],
48
49 },
50 };
51
52 if ($ENV{"SLONYSET"}) {
53 require $ENV{"SLONYSET"};
54 }
55
56 # Please do not add or change anything below this point.
57 1;

Raghavendra said...

Still you are writing your own edits on generated scripts by perl-tools. Basically its upto you how it can be automated by custom one or depending on --perl-tools. I tried to give an idea how a simple table can be used with a scripts. :)

Dhavalkiri Patil said...

Hi Raghav,

I need to upgarde POstgres version 8.4 to 9.3 with DB size 2.2 TB. In DB we have data type bytea.
We have any idea,
1. how much time it will take to complete?.
2.If the replication is going on in that time can we do changes to bytea cloumn (on Master ) ? (Insert , update,delete)

Plz reply me ASAP.

Thanks,
Dhaval

Post a Comment