mirror of https://github.com/Gnucash/gnucash
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
461 lines
20 KiB
461 lines
20 KiB
|
|
This directory contains code for SQL/Postgres support. The
|
|
SQL backend allows multiple users to view/edit the same data
|
|
at the same time, as well as allowing a single user to
|
|
keep data in an SQL database. This code is 'beta': it seems
|
|
to work, but really has not been heavily tested by lots
|
|
of users, and may still contain bugs. See also the
|
|
'to be done' list at the bottom.
|
|
|
|
|
|
Postgres Install Instructions
|
|
-----------------------------
|
|
1) Install PostgresSQL server, client and devel packages.
|
|
(Both the 6.x and the 7.x versions of Postgres should work).
|
|
|
|
2) if installed from redhat, then running /etc/rc.d/init.d/postgresql
|
|
will setup and initialize basic Postgres first-time setup & config.
|
|
3) as root, 'su - postgres' then run 'createuser' to add your user id
|
|
(don't set a password on your postgres db name, yet, GnuCash doesn't
|
|
have a GUI to ask for your password yet)
|
|
If you've forgotten what users are on the system, you can check
|
|
by starting the 'psql' command shell and typing the command.
|
|
'SELECT * FROM pg_shadow;'
|
|
Note this only works for the postgres user, and other users that
|
|
have createuser permissions.
|
|
|
|
4) (optional) enable TCPIP connections to remote hosts. To do this:
|
|
edit the file pg_hba.conf on the remote host to configure and allow
|
|
access from other hosts. See 'man pg_hba.conf' for details.
|
|
RedHat: /var/lib/pgsql/pg_hba.conf
|
|
Debian: /etc/postgresql/pg_hba.conf
|
|
Also create users. Be sure to set a password.
|
|
(there appears to be no way to pass an empty password to postgres)
|
|
Note also the user must have 'createuser' permissions in order to
|
|
lock tables (this is a bug in postgres 6.5 and maybe later ???)
|
|
|
|
5) Hints and Tips:
|
|
If you've forgotten what databases you've created in the past,
|
|
you can look the filesystem: 'ls -la /var/lib/postgres/data/base'
|
|
to view the existing databases. Alternately, if you connect as
|
|
user postgres, you can 'SELECT * FROM pg_database;'
|
|
|
|
|
|
Alternatively, you can install into a brand new database without
|
|
using root privileges. Perform the following:
|
|
|
|
1) Install postgresql server, client and devel packages.
|
|
2) initdb -D ~/gnucash
|
|
This creates a directory structure and supporting files
|
|
under ~/gnucash. The gnucash directory is automatically
|
|
created. The directories and files are owned by the user
|
|
running 'initdb' and have only user access; group and
|
|
other are cleared.
|
|
|
|
By default, the user running 'initdb' is setup as the
|
|
database superuser for the new database.
|
|
|
|
You don't have to use the name 'gnucash'; you can choose
|
|
whatever you want.
|
|
|
|
3) pg_ctl -D ~/gnucash -o "-p 6543" start
|
|
|
|
This starts a postmaster process for the new database
|
|
listening on port 6543. If your system doesn't already
|
|
have a postmaster processing running for the system
|
|
database, you can leave out '-o "-p 6543"' and it will
|
|
use the default port 5432. You can use any open port.
|
|
|
|
By default, the postmaster process will only accept
|
|
connections from processes on the local host.
|
|
|
|
|
|
GnuCash Build Instructions
|
|
--------------------------
|
|
Same as usual, but you must specify the flag '--enable-sql' in order
|
|
to build in Postgres support.
|
|
i.e.
|
|
./configure --enable-sql
|
|
and then 'make'.
|
|
|
|
|
|
How To Use This Thing
|
|
---------------------
|
|
a) Open your favorite datafile in the usual fashion.
|
|
b) Click on 'Save As'
|
|
c) enter the following URL instead of a filename in the file picker:
|
|
postgres://localhost/some_dbname_you_pick
|
|
|
|
The above steps will copy your data into that database. You can
|
|
then restart gnucash (or keep working) and type in the same URL
|
|
in the file open dialogs. Or try it on the command line:
|
|
|
|
/usr/local/bin/gnucash postgres://localhost/dbname_whatever
|
|
|
|
Note: GnuCash will automatically create the database if it does
|
|
not already exist. Do *not* try to create the database by hand,
|
|
or to specify a database that wasn't created by GnuCash.
|
|
|
|
Warning: the postgres backend is in active development,
|
|
and occasionally the database format changes in a way that is
|
|
not backwards compatible. If you have data that you want to
|
|
move from older to newer versions, you can do the following:
|
|
-- start up older version of gnucash, open your postgres://...
|
|
URL.
|
|
-- 'save as' to an ordinary gnucash file (its a flat XML file)
|
|
-- exit the program.
|
|
-- start the newer version of gnucash, open the XML file.
|
|
-- 'save as' a new postgres database. (be sure to pick a new name).
|
|
|
|
|
|
Remote Access
|
|
-------------
|
|
In principle, you can use URL's such as:
|
|
postgres://some.where.com/dbname
|
|
However, these URL's require that the host 'some.where.com' have
|
|
TCPIP access enabled (by following instructions (6) above).
|
|
This is true even if the hostname is your local machine. Thus,
|
|
unless you've setup Postgres TCPIP connections, you *MUST* use
|
|
URL's of the form postgres://localhost/dbname
|
|
|
|
You can specify usernames and passwords in the URL as follows:
|
|
|
|
postgres://localhost/dbname?user=whomever&password=asdf
|
|
|
|
Note that the username and password are the pg_shadow username and
|
|
password, *NOT* your unix login name and password.
|
|
|
|
(A GUI dialog to prompt for your password hasn't been implemented yet.)
|
|
|
|
Note that you (or your SQL database admin) will have to make sure
|
|
that access permissions on the various GnuCash tables have been set
|
|
appropriately. Typically, this will be by starting the 'psql'
|
|
shell, listing all relations with \z, and then issuing
|
|
a 'GRANT ALL on gncAccount TO someuser;', and so on, for each
|
|
relation. Failing to do this will result in lots of unhappiness.
|
|
The sysadmin will also need to make sure that TCP/IP connections
|
|
are properly enabled in the postgres server hba.conf file.
|
|
|
|
|
|
Access Modes
|
|
------------
|
|
There are four different ways or 'modes' in which to access your
|
|
database. These are
|
|
|
|
mode=single-file
|
|
mode=single-update
|
|
mode=multi-user-poll
|
|
mode=multi-user
|
|
|
|
The first two are single-user access modes. These are the
|
|
'safest' modes to use if you don't need multi-user capability.
|
|
The single-update mode is strongly preferred over single-file.
|
|
(The single-file mode is strongly deprecated, and can lead to
|
|
to massive data loss under certain circumstances: for example,
|
|
if your sql connection dies after all the old data has been
|
|
deleted, but before the new data has been written out. Bluntly:
|
|
DO NOT USE mode=single-file. You won't be sorry.).
|
|
|
|
The multi-user modes are intended for use when multiple users
|
|
need to view and make changes to the data. The default mode
|
|
is multi-user.
|
|
|
|
If you have many users (more than a half-dozen or so), you may
|
|
want to run in multi-user-poll mode, instead of the default
|
|
multi-user. This will stop the automatic updates of the
|
|
local instance of gnucash, and thus reduce the amount of
|
|
GUI redraws that happen automatically as other users
|
|
edit data. It should also reduce network traffic
|
|
slightly, and present a slightly lower load on the
|
|
sql server. Different users can mix-n-match the
|
|
two multi-user modes.
|
|
|
|
See the file 'design.txt' for an in-depth explanation of these
|
|
modes.
|
|
|
|
You can specify a particular access mode by specifying the URL
|
|
|
|
postgres://localhost/dbname?mode=whatever
|
|
|
|
You can alternate between multi-user and single-user modes for
|
|
the same database. Just be sure that all users have logged off,
|
|
otherwise gnucash won't let you log on in single-user mode.
|
|
Users are 'logged off' automatically when they exist gnucash.
|
|
In case they have crashed, (and thus appear to still be logged
|
|
in), you can log them off manually by issuing the following:
|
|
|
|
echo "UPDATE gncsession SET time_off='NOW' WHERE time_off = 'infinity';" | psql dbname
|
|
|
|
Hopefully, you do not need to be warned that this dangerous
|
|
if there really are other users logged in. Accessing the
|
|
database in single-user mode when there are other users logged
|
|
in will result in their work being clobbered.
|
|
|
|
Other Options
|
|
-------------
|
|
The 'options=' and 'tty=' postgres keywords are supported. See the
|
|
postgres documentation for more info.
|
|
|
|
|
|
Mangled Balances
|
|
----------------
|
|
If you've been working in multi-user mode for a while, and the
|
|
balances seem to be mangled, log off all the users, log on the the
|
|
same database in single-update mode, and log off again. This
|
|
will cause balance checkpoints to be recomputed, and should
|
|
probably fix any weird account balances you are seeing.
|
|
|
|
(Mangled account balances shouldn't happen, and if you have
|
|
a test case that can reliably recreate these, you should
|
|
submit a bug report).
|
|
|
|
(Note to developers: think of this as an fsck: balances
|
|
shouldn't get wacky, but if you've been running a long time
|
|
and something weird has happened, then a periodic cleanup
|
|
is good to have. Maybe this should be automated ...)
|
|
|
|
Performance
|
|
-----------
|
|
In 'single-user-update' mode, data loads from the sql database
|
|
should be 1.5x faster than comparable loads from the XML flat file,
|
|
at least for medium datasets (measured at 3.5 seconds on a 700MHz
|
|
Athalon a dataset with 3K transactions and 150 accounts, vs. 4.8
|
|
seconds loading from file; postgres version 7.1.2).
|
|
|
|
Hitting the 'save' button is a no-op and takes no CPU cycles.
|
|
(date is saved as its modified, so a global save is not needed).
|
|
|
|
'exit' will stall for a few seconds, while a 'vacuum analyze' is
|
|
issued. The 'vacuum' reclaims storage, and 'analyze' does some
|
|
performance tuning. Doing this regularly improves performance
|
|
about 20% in the cases I looked at.
|
|
|
|
---------
|
|
Accounts are restored roughly at the rate of 50-75 per second
|
|
(for above hardware/software config).
|
|
|
|
---------
|
|
Performance in multi-user mode is still a can of worms, and will
|
|
be a good bit slower for now. The working assumptions are that
|
|
there are millions of transactions, so you can't load all of them.
|
|
But to load only a subset is hard, in several ways. First, its
|
|
technically hard to figure out what subset to load. Then, once
|
|
this is done, it takes about 12 millisecs to load one transaction
|
|
when loading them one at a time (on an Athalon 700MHz). This is
|
|
about 75 transactions a second. Of this, 90% of the CPU cycles
|
|
are burned in the postgres server, so 'basic' tuning won't do
|
|
the trick ... some different algorithm is needed.
|
|
|
|
Handy for raw-file-loading performance measurement is the the
|
|
script 'scan-acct.pl' in src/optional/swig/examples'
|
|
|
|
Weird shit: loading the same data, but in slightly different
|
|
order, can make a *huge* difference in the speed of balance
|
|
subtotal calculations (used only in multi-user mode.) Merely
|
|
rearranging the order of the splits in a transaction can cause
|
|
a factor of 20 (twenty) difference in performance. This can turn
|
|
a 4.5 second load into a minute & a half load !!!! Yow!
|
|
Investigating ...
|
|
|
|
To Be Done
|
|
----------
|
|
Core bugs/features that still need work:
|
|
|
|
(none in this list are truly critical at this point, except possibly
|
|
the username/password dialog, and its effect with regards to initial db
|
|
access).
|
|
|
|
-- Wire in the GUI to ask user for username/password to log onto the
|
|
server. (GUI already implemented, just not yet used).
|
|
|
|
-- The correct URL format for username-password should be:
|
|
postgres://username:passwd@host:port/dbname?key=value&key2=val2
|
|
However, the username-password part of the URI is not correctly
|
|
parsed at the moment.
|
|
|
|
-- distinguish between 'save' and 'save-as' in gnc-book & backend.
|
|
If user hits 'save' in the single-user or multi-user mode, it
|
|
should be a no-op (since the saves have already occurred). Only
|
|
a 'save-as' requires a from-scratch sync.
|
|
?? Maybe this has already been fixed in the redesigned session
|
|
& backend ???
|
|
|
|
-- document how to use the version/extension tables to allow
|
|
seamless upwards compatibility in the face of database schema
|
|
changes.
|
|
|
|
-- allow user to enter URL in GUI dialog. User can currently type URL
|
|
into the file dialog window; it would be nice to have something
|
|
slightly nicer tan this.
|
|
|
|
-- error code should include strings passed back, to be shown in
|
|
GUI dialogs. This is because the backend needs to return things
|
|
like usernames, etc. in the dialogs, and the backend doesn't
|
|
have the interfaces for passing this kind of info.
|
|
(actually, the PERR/PINFO strings might do ??)
|
|
In some cases, we have PERR without setting a backend error...
|
|
grep for all PERR's that don't set backend error.
|
|
|
|
-- note that transaction commit errors may in fact be i/o errors.
|
|
If an i/o error occurred during commit, there would be some
|
|
aberrant rollback behavior. Ouch.
|
|
|
|
-- the transaction rollback code needs to be a GUI popup...
|
|
|
|
-- review (actually, design & architect) the communications error
|
|
handling policy. For example, CopyToEngine() will leave
|
|
the backend in a disabled state if a communication error occurs;
|
|
there will be other debilitating conditions if the backend disappears,
|
|
leaving the engine in a possibly confused state.
|
|
|
|
-- during open of register window, a query of splits is typically run
|
|
twice (once during open -- LedgerDisplayInternal(), and again
|
|
during LedgerDisplayRefresh()) this results in twice as much data
|
|
to the sql backend as needed. This is a performance issue,
|
|
should someday be fixed.
|
|
|
|
-- during transaction edit, query of splits is run twice --
|
|
once during edit recordCB()->xaccSRSaveRegEntry(), and again
|
|
recordCB()->xaccSRRedrawReg(). As above, this is performance
|
|
problem waiting to aggravate.
|
|
|
|
-- minor optimization: KVP's for null strings store null strings,
|
|
this is a waste of time. This occurs primarily for transaction
|
|
memos, I think.
|
|
|
|
-- minor optimization: keeping an iguid cache will avoid sql queries.
|
|
also caching the count for single-user mode would be good
|
|
|
|
To Be Done, Part II
|
|
-------------------
|
|
This list only affects the multi-user and advanced/optional features.
|
|
|
|
Most of the items on this list are 'good-to-fix' but are not
|
|
truly critical. They may result in aberrations, but general
|
|
operation is probably OK.
|
|
|
|
-- when an account is edited, price lookups happen *twice* !!
|
|
once when opened for edit, and again when committed !!
|
|
(the first one is due to gnc_account_tree_refresh ()
|
|
called by gnc_account_window_create() near dialog-account.c:1642)
|
|
|
|
-- nice-to-have performance improvement:
|
|
many queries are needlessly duplicated, even in multi-user mode.
|
|
They are duplicated just in case some other user has modified
|
|
the data. But if we last checked just a few seconds ago,
|
|
we don't really need to update, not yet. We can wait a few
|
|
seconds ... doing this uniformly can cut down on sql traffic.
|
|
(This is particularly egregious for price queries). Implement this
|
|
by putting a datestamp in along with the version number. Let
|
|
the backend check the datestamp, and if its aged less than
|
|
15 seconds, skip doing the sql query.)
|
|
|
|
DONE --- but we need to detect redundant price queries .. ughh
|
|
|
|
-- if always in multi-mode, then initial checkpoints not set up.
|
|
They're not needed until there are a lot of transactions in the
|
|
system. Need to recompute checkpoints on some periodic basis.
|
|
|
|
-- NOTIFY is not updating main window balances ...
|
|
|
|
-- implement price change events
|
|
|
|
-- implement a version table so that we can be upwardly compatible
|
|
with future database changes.
|
|
|
|
-- use FOREIGN KEY to ensure that table integrity is maintained.
|
|
|
|
-- use table constraint ON DELETE to make sure we have no
|
|
dangling splits when a transaction is deleted. As added
|
|
benefit, this might simplify some logic.
|
|
|
|
-- during sync, detect and report conflicting edits to accounts
|
|
and transactions. See the notes for pgendSync() for details
|
|
as to what this is about. For the first pass, this is not a
|
|
serious issue; its a 'nice to have' thing. (sync is called
|
|
when user hits the 'save' button, and should be disabled for
|
|
multi-user modes)
|
|
|
|
-- implement account rollback (i.e. if other user has modified the
|
|
account, we need to do something to merge their work into ours...)
|
|
ditto for prices ...
|
|
|
|
-- bug: if another user deletes an account, we need to look at the
|
|
audit trail to see if the thing has been deleted.
|
|
Otherwise, any edit of this account will incorrectly
|
|
add the deleted account back in. (Note that from the user
|
|
perspective, deleting accounts is a bad idea ...)
|
|
(this is handled with an event, but is not handled via poll)
|
|
|
|
-- fix caching in the face of lost contact to the backend. If the
|
|
backend can't contact its server, then we should just save up caches,
|
|
and then when contact with backend re-established, we should spit
|
|
them out. The pgendSync routine should more or less do the trick;
|
|
note, however, that the XML file format needs to save the version
|
|
number ...
|
|
|
|
-- Implement various advanced database features, such as checking the
|
|
user's permission to view/edit account by account ... (hmmm this
|
|
done by the dbadmin... using SQL commands... which means if user
|
|
tries to write to something they're not allowed to write to,
|
|
then they should be bounced back.) Does some user have the permission
|
|
to create new accounts ??
|
|
|
|
-- Review versioning verification in backend. The desired semantic for
|
|
updates should be like CVS: multiple nearly-simultaneous writers
|
|
are allowed; however, only one wins, and others are kicked back.
|
|
The losers know themselves because they are trying to update info
|
|
of the wrong version.
|
|
-- pgend_transaction_commit does it correctly; but the GUI doesn't
|
|
report a rollback. (need to get err message out of engine, into
|
|
GUI).
|
|
-- pgTransactionSync() is broken, but its not used anywhere.
|
|
-- pgend_account_commit checks version but doesn't rollback.
|
|
(nor does the GUI report a rollback.)
|
|
-- pgendSync does the right thing, except that it doesn't
|
|
detect any version conflicts, nor does it notify the user
|
|
of such conflicts. (actually, it also screws checkpoints).
|
|
|
|
I'm not sure how critical this all is; with a small number of users
|
|
it shouldn't be a problem. With a huge number of users, each editing
|
|
the same transaction (unlikely!?) then there is risk of clobbered
|
|
data, but so what? versioning is at least partly a people-management
|
|
problem. Anyway, what's there now should be pretty good & should
|
|
work for now. Except its mostly untested.
|
|
|
|
-- use version numbers for commodities. Right now, multi-user
|
|
updates of commodities are not detected (this seem OK for
|
|
now, since this is a rare occurrence, right ???)
|
|
|
|
-- multi-user 'save-as' has unexpected results:
|
|
As it currently works, save-as (sync) is a mass-copy of all data
|
|
out of the engine into the named storage location. If the indicated
|
|
storage location doesn't exist, its created, and all of the engine
|
|
data is copied into it. If the storage location (i.e. database)
|
|
does exist, and contains data, then the engine contents are merged
|
|
into it.
|
|
|
|
From the engine point of view, the above is the easiest thing to do.
|
|
However, in the multi-user mode, it may not be what the user is
|
|
expecting. In the multi-user mode, the engine would not contain
|
|
a full copy of the database data; instead, it contains only a subset.
|
|
A 'save as', if implemented as above, would save only that subset,
|
|
instead of all of the data. If the user thought 'save as'
|
|
means the same thing as 'copy', and hoped to copy all of the sql
|
|
data to a file, they would be sorely disappointed. I don't
|
|
currently have an opinion on the 'best' way of dealing with this
|
|
situation. A reasonable solution may well be 'don't do that'.
|
|
|
|
-- Implement PR_BALANCE query type in gncquery.c. This query is
|
|
supposed to return unbalanced transactions from the database.
|
|
I think it might be better to leave this unbalanced, and just
|
|
plain force everything in the database to be balanced. In this
|
|
case, we should then modify the code to check for unbalanced
|
|
transactions before committing them, or otherwise force the backend
|
|
to run in balanced mode. We could set 'force_double_entry=2'
|
|
but this is currently unimplemented in the engine. Alternately,
|
|
we could call 'Scrub' at appropriate times.
|
|
|
|
|
|
============================= END OF FILE ========================
|