Monday, July 29, 2013

PostgreSQL Compressed Archive Logs in Windows

Many have blogged and offered glaringly on how to compress archive logs in PostgreSQL, I desire to share one of a short version from my end which's on Windows.

If archive storage is concern, then you can choose compressed archive logging feature in PostgreSQL.

"archive_command(string)" in $PGDATA/postgresql.conf, is like a shell command to execute what's passed in string section to copy the completed source file (WAL file segment in $PGDATA/pg_xlog) to destination(ARCHIVE LOCATION). "string" can be anything like shell script(batch in Windows) itself, OS compression utilites, and a special tool pg_compresslog. In Windows, cmd.exe will execute the command passed in archive_command "string".

Since we are applying on Windows platform, pre-requesites are:
  1. Archive directory should have full postgres user access. ("C:\Program Files\PostgreSQL\9.2\archives" in my case)
  2. Window version GZIP utility. Though there are numerious good windows variant compression utilities, I choosed gzip because its supported both on Linux & Windows.
  3. Gzip.exe should have access to Postgres User and also in PATH. ("C:\Program Files\GnuWin32\bin" in my case).
Assuming all pre-requisites are in place and next step should be editing the $PGDATA/postgresql.conf file and changing the archiving related parameters and restart the cluster:
wal_level=archive
archive_mode=on
archive_command = '"C:\\Program Files\\GnuWin32\\bin\\gzip.exe -1 " < "%p" > "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz"'

c:\Program Files\PostgreSQL\9.2\bin>pg_ctl.exe -D ..\data start      (You can also start from services.msc)
As per PG documentation, changes has been made and restarted the cluster, anticipating from hereon my archives will be compressed one. Lets look at the logs:
2013-07-26 16:07:22 IST LOG:  archive command failed with exit code 1
2013-07-26 16:07:22 IST DETAIL:  The failed archive command was: """C:\Program Files\GnuWin32\bin\gzip.exe" -1 < "pg_xlog\000000010000000000000002" > "C:\Program Files\PostgreSQL\9.2\archives\000000010000000000000002.gz"
'""C:\Program' is not recognized as an internal or external command,
operable program or batch file
Hmmm Ok, archiver process has failed with an interesting error '""C:\Program' is not recognized as an internal or external command ", which signify cmd.exe utility utilized to parse the string has failed to execute the string we passed. First, lets know how its parsing the string from command line and then fix the archive_command(string).
c:\Program Files\PostgreSQL\9.2\bin>cmd.exe /C "C:/Program Files/PostgreSQL/9.2/bin/psql.exe --version"
'C:/Program' is not recognized as an internal or external command,
operable program or batch file.

"/C" mean, Carry out the command specified by the string and then terminate.
Check the above simple command passed in string to find the "psql version" and the error it has thrown which's similar to the one I have in logs. Lets check what "cmd.exe /?" say about parsing the string.
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:

    1.  If all of the following conditions are met, then quote characters
        on the command line are preserved:

        - no /S switch
        - exactly two quote characters
        - no special characters between the two quote characters,
          where special is one of: &<>()@^|
        - there are one or more whitespace characters between the
          two quote characters
        - the string between the two quote characters is the name
          of an executable file.

    2.  Otherwise, old behavior is to see if the first character is
        a quote character and if so, strip the leading character and
        remove the last quote character on the command line, preserving
        any text after the last quote character.
As per cmd.exe help, its clear that we should not have special character (which I have in gzip.exe > , < , -1) between the two quoted characters, so to fix, you should close the entire string again in another set of double quotes. Lets retake our previous command with extra double quoted.

c:\Program Files\PostgreSQL\9.2\bin>cmd.exe /C ""C:/Program Files/PostgreSQL/9.2/bin/psql.exe" --version"

psql (PostgreSQL) 9.2.2

Fine, its fixed now. Let me put all in points where you can bypass errors while setting archive_command:
  1. archive_command(string) must be in single quotes
  2. Each part of the command should be double quoted like command,source path, and destination path.
  3. Command options SHOULD NOT be double quoted
  4. Maintain one extra double quote on entire string within start/end of single quote.
Now, let me correct my archive_command....
archive_command = '""C:\\Program Files\\GnuWin32\\bin\\gzip.exe" -1 < "%p" > "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz""'
Though RELOAD will effect the changes, but I recommend to take RESTART. Now you should have all .gz archive files:
c:\Program Files\PostgreSQL\9.2\archives>dir
 Volume in drive C has no label.
 Volume Serial Number is 4ABE-037A

 Directory of c:\Program Files\PostgreSQL\9.2\archives

07/27/2013  09:05 PM         3,613,153 00000001000000000000005A.gz
07/27/2013  09:05 PM         3,611,096 00000001000000000000005B.gz
07/27/2013  09:05 PM         3,612,856 00000001000000000000005C.gz
              89 File(s)    367,755,965 bytes
               2 Dir(s)  45,557,706,752 bytes free
Nice....
Similarly, you can furthermore use the GZIP.EXE -d for decompressing in "restore_command" while doing PITR on compressed archives.
restore_command='""C:\\Program Files\\GnuWin32\\bin\\gzip.exe" -d < "C:\\Program Files\\PostgreSQL\\9.2\\archives\\%f.gz" > "%p""'
Comments & Corrections are welcomed.

--Raghav

Monday, July 22, 2013

ERROR: could not access file "$libdir/plpython2" - ERROR: could not access file "$libdir/plpython3"

Above error described on PG mailing as it failes to CREATE LANGUAGE plpython2u/plpython3u on PG9.3Beta.
Error:
postgres=# create language plpython3u;
ERROR: could not access file "$libdir/plpython3": No such file or directory

postgres=# create language plpython2u;
ERROR: could not access file "$libdir/plpython2": No such file or directory
Before doing some study on above errors I read below PG documentation link on how PostgreSQL permits to create langage plpython and how they should be configured.

http://www.postgresql.org/docs/9.3/static/plpython-python23.html

Its clear from above link you need to compile the binary two time if you need both plpython2u & plpython3u. AFAIK, ActivePython 2.7.x for plpython2u and 3.2.x for plpython3u can be configured on PG 9.2.x without any difficulty, but I not ever gave a try on PG 9.3Beta2. So, considered to give a try and analyze the error about why and how it could be repaired, I first, begun source installation by setting base path with ActivePython 3.2. (ActivePython download link http://www.activestate.com/activepython/downloads)
[root@localhost postgresql-9.3beta1]# export PATH=/opt/ActivePython-3.2/bin:$PATH
[root@localhost postgresql-9.3beta1]# ./configure --prefix=/usr/local/pg93b --with-python
Compilation has failed and showed error in Config.log file as:
make[3]: Entering directory `/usr/local/src/postgresql-9.3beta1/src/pl/plpython'
*** Cannot build PL/Python because libpython is not a shared library.
*** You might have to rebuild your Python installation. Refer to
make[3]: Leaving directory `/usr/local/src/postgresql-9.3beta1/src/pl/plpython'
In PG documentation we have clear directions on the above error and why it happen, plpython will be a shared library on most of the platforms, but on some platforms we need to specifically force the compiler as python from shared library. For that either you can proceed to /src/pl/python/Makefile for changes or specifically state "shared_libpython=yes" while compiling.

Source compilation examines for two files when you use --with-python that are "python" & "python-config". Although I have ActivePython-3.2 in my base path, still compiler fails to find them "python" & "python-conifg"
[root@localhost postgresql-9.3beta1]# which python
/usr/local/bin/python
[root@localhost postgresql-9.3beta1]# which python-config
/usr/local/bin/python-config
Its because, in ActivePython 3.2 file names will be, "python" as "python3" and "python-config" as "python3-config" therefore compiler pointed to old one instead to new. At this point, Asif Naeem (Thank you for your insight) from our core Dev. Team notfied me to mock existing ActivePython-3.2 files as python & python-config. Its nearly like a hack from him, so I duplicated those files as:
cd /opt/ActivePython-3.2/bin
cp python3-config python-config 
cp python3 python
Ok, now I can see that in my base path.
[root@localhost postgresql-9.3beta1]# export PATH=/opt/ActivePython-3.2/bin:$PATH
[root@localhost postgresql-9.3beta1]# which python 
/opt/ActivePython-3.2/bin/python
[root@localhost postgresql-9.3beta1]# which python-config
/opt/ActivePython-3.2/bin/python-config
I recompiled PG source using --with-python after the alterations and also forcing the compiler to choose SHARED_LIBPYTHON using "shared_libpython".
./configure --prefix=/usr/local/pg93b3 --with-python
make shared_libpython=yes
make shared_libpython=yes install
These steps will effectively compile PG9.3Beta with ActivePython-3.2 libraries. Now lets create language plpython3u:
-bash-4.1$ psql -p 4444
psql (9.3beta1)
Type "help" for help.

postgres=# create language plpython3u;
The connection to the server was lost. Attempting reset: Failed.
!>
!>
Oops, this is strange, why it has crashed now.. In this kind of situation $PGDATA/pg_log are your friends to insight about the issue. Here's the database server log information about crash:
2013-07-13 22:08:37 IST-31208-postgres-postgres :LOG: statement: create language plpython3u;
Could not find platform independent libraries 
Could not find platform dependent libraries 
Consider setting $PYTHONHOME to [:]
Fatal Python error: Py_Initialize: Unable to get the locale encoding
Ok. I missed to set python paths, this is very important when you are working with python or perl languages on your database.
I have set PYTHONHOME, PYTHONPATH & LD_LIBRARY_PATH before starting the cluster.
export PYTHONHOME=/opt/ActivePython-3.2/
export PYTHONPATH=/opt/ActivePython-3.2/bin:$PATH
export LD_LIBRARY_PATH=/opt/ActivePython-3.2/lib:$LD_LIBRARY_PATH

/usr/local/pg93b3/bin/pg_ctl -D /usr/local/pg93b3/data/ start

-bash-4.1$ psql -p 4444
psql (9.3beta1)

Type "help" for help.

postgres=# create language plpython3u;
CREATE LANGUAGE
Nice...It has created plpython3u with ActivePython-3.2.

If you want plpython2u also on the same installation. Don't tweak as we did for ActivePython-3.2, just have a copy of ActivePython-2.7 and set it in the base path and recompile the source.
export PATH=/opt/ActivePython-2.7/bin:$PATH
./configure --prefix=/usr/local/pg93b2 --with-python
make shared_libpython=yes
make shared_libpython=yes install


export PYTHONHOME=/opt/ActivePython-2.7/
export PYTHONPATH=/opt/ActivePython-2.7/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/pg93b2/lib
export LD_LIBRARY_PATH=/opt/ActivePython-2.7/lib:$LD_LIBRARY_PATH

-bash-4.1$ ./psql -p 4444
psql (9.3beta2)
Type "help" for help.

postgres=#
postgres=# create language plpython2u;
CREATE LANGUAGE
Comments & Corrections are most welcomed.

--Raghav

Tuesday, July 9, 2013

Improvements in Slony-I 2.2.0beta

Thanks to Slony-I team for releasing Slony-I 2.2.0Beta with many enhancements on key areas. Few are mentioned here:

Event capturing protocol changed for performance (sl_log_1 / sl_log_2):

Earlier release, any DML(INSERT/UPDATE/DELETE) event data must store as SQL statement in two tables(sl_log_1/sl_log_2). Now, the protocol has completely modified to COPY format, as a result of this there'll be sensible performance improvement like lower processing overhead, lower memory consumption, less query processing work on subscriber database in slon process. Check below, on a simple INSERT statement how the event has been captured in previous and latest version.
Previous version 2.1.2:

==> slony1_log_2_00000000000000000005.sql <==

-- start of Slony-I data
------------------------------------------------------------------
insert into "public"."stest" ("id") values ('103');

Latest version 2.2.0Beta:

==> slony1_log_2_00000000000000000006.sql <==

------------------------------------------------------------------
COPY "_rep220"."sl_log_archive" ( log_origin, log_txid,log_tableid,log_actionseq,log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdnc
ols,log_cmdargs) FROM STDIN;
1       565688  1       2       public  stest   I       0       {id,1000}
\.
DDL handling: In old version, DDL's are treated as a unique slony event and stored in sl_event table with flag "DDL_SCRIPT" and there clearly no indication whether applied or not. Now, unique slony events are shifted from "sl_event" to "sl_log_script" and a flag to share with if it has successfully applied or not. Sl_log_script.log_cmdtype column indicates, "S"(upper) if EXECUTE SCRIPT issues and registered as event to utilize and "s" (lower) indicates that script execution completed. Check below, previous and latest version:
Previous version 2.1.2:

postgres=# select ev_origin,ev_type,ev_data1,ev_data2 from _newbuild.sl_event where ev_type ilike 'ddl_script';
 ev_origin |  ev_type   | ev_data1 |                ev_data2
-----------+------------+----------+-----------------------------------------
         1 | DDL_SCRIPT | 1        | begin;                                 +
           |            |          | alter table stest add column name text;+
           |            |          | end;
           
Latest version 2.2.0Beta:
           
postgres=# select * from _rep220.sl_log_script ;
 log_origin | log_txid | log_actionseq | log_cmdtype |                 log_cmdargs
------------+----------+---------------+-------------+---------------------------------------------
          1 |   681589 |             1 | S           | {"alter table stest add column name text;"}
          1 |   681589 |             2 | S           | {"                                         +
            |          |               |             | "}
          1 |   681589 |             3 | s           | {}

New 3 catalogs added to Slony-I schema:

sl_apply_sync : This table gives clear picture on how many events like DML's, DDL's applied so far including the event applying frequence.
postgres=# select * from _rep220.sl_apply_stats ;
-[ RECORD 1 ]--------+---------------------------------
as_origin            | 1
as_num_insert        | 21
as_num_update        | 0
as_num_delete        | 0
as_num_truncate      | 0
as_num_script        | 1
as_num_total         | 21
as_duration          | 00:00:11.84
as_apply_first       | 2013-06-16 22:43:18.866365+05:30
as_apply_last        | 2013-06-17 03:18:13.324941+05:30
as_cache_prepare     | 2
as_cache_hit         | 19
as_cache_evict       | 0
as_cache_prepare_max | 1
sl_log_script : Its another catalog to catch an unique event like DDL's issued with EXECUTE SCRIPT. Earlier release, these events were captured in sl_log_1/sl_log_2 without any status of execution of those events. This new catalog can facilitate user to grasp concerning the DDL event details.

sl_failover_targets: The view sl_failover_targets displays the valid failover targets for every origin node.
postgres=# select * from _rep220.sl_failover_targets ;
 set_id | set_origin | backup_id
--------+------------+-----------
      1 |          1 |         2
(1 row)
Many more changes in new release, you can refer to the release notes:

http://git.postgresql.org/gitweb/?p=slony1-engine.git;a=blob_plain;f=RELEASE;h=be03be66d8f39ee148b50dadf71ddbe20a1e5ad8;hb=e3e785c93d14b009abc8424fa7e53e8b75c0f098

Slony-I 2.2.0 beta 4 can be downloaded from:
http://www.slony.info/downloads/2.2/source/slony1-2.2.0.b4.tar.bz2
http://www.slony.info/downloads/2.2/source/slony1-2.2.0.b4-docs.tar.bz2

 --Raghav