Gadfly Server Operations
========================

:Version: $Revision: 1.2 $

To  permit  multiple  processes to access and modify a single database
instance,  and  to  reduce the overhead per process of connecting to a
Gadfly  database a Gadfly database may be run in server mode. A Gadfly
server  can  use  a  DBA (data base administrator) configured start-up
script  to  set  up  optimized  query  accesses  and  certain forms of
security.

For  example  to  startup  a  server  for  the test database "test" in
directory "_test_dir" (created by ./run_tests) use::

  gfserve 2222 test _test_dir admin

or  to start up the same server with some non-priviledged policies and
some  named  prepared queries (as initialized in test/gfstest.startup(...))
use::

  gfserve 2222 test _test_dir admin gfstest

In  both  cases  the  admin password for the server is "admin" and the
server  runs  on port 2222. See the doc string for gadfly.server for more
information on the command line arguments.

Only one process should directly access a gadfly database at once (not
mediated  by a server), so if a server is running, no other server for
that database should be started and no other process should connect in
"non-server" mode to that database.

Motivation
----------

There are several reasons to run a server: to allow multiple processes
to  access  the  same database; to allow password protected restricted
access to the database by non-priviledged agents; and to permit faster
access   to   the  database  by  providing  globally  shared  prepared
statements.  Using  a  server also eliminates the need to start up and
load the database many times -- and startup time could be considerable
if the database is large.

For  example  I  imagine  that  simple Gadfly servers may be of use to
implement  database  enabled  CGI  scripts,  whereas  the "non-server"
Gadfly will only run with CGI scripts that do not modify the database,
and  the startup time for Gadfly might make those scripts unacceptibly
slow  if  the  database  is  large. Furthermore, by using the security
features  a Gadfly server could be configured to allow restricted data
distribution  across  a  network without compromising the integrity of
the database.

Security
--------

The primary goal of Gadfly server security is to prevent accidental or
malicious destruction of a database.

Security  is  arbitrated by policies. Policies have passwords that are
never  transmitted  in clear text. However, a "captured" command could
potentially  be repeated by a hostile program even without knowing the
password.  It  is  not  a good idea to run admin or other unrestricted
commands  on  a  network  that  may  have hostile parties sniffing the
network.  As  with the rest of the system I provide no guarantees, but
for many purposes the level of security provided may be acceptible. To
be  specific  passwords  are used to generate md5 certificates for all
server accesses (please see gadfly.gfsocket for implementation details).

A  server  always has an "admin" policy that is permitted to shutdown,
restart,  or  force  a  checkpoint on the server. By default the admin
policy  also  has  the ability to run arbitrary SQL statements such as
"drop  table x". This ability can be disabled in a startup function if
needed::

  admin_policy.general_queries=0

Other  policies  can  be created that have very restricted access. For
example the following startup function initializes two policies beyond
the  admin policy that can only access certain tables in specific ways
(from test/gfstest.py)::

    def startup(admin_policy, connection, Server_instance):
        """example startup script.
    
           add a policies test and test1 passwords same
             test1 is allowed to query the likess table by name
             test is allowed to update likes where drinker='nan'
           also add prepared query dumpwork to admin_policy.
        """
        from gadfly.server import Policy
        admin_policy["dumpwork"] = "select * from work"
        test1 = Policy("test1", "test1", connection, queries=0)
        test = Policy("test", "test", connection, queries=0)
        test1["qlike"] = "select * from likes where drinker=?"
        test["updatenan"] = """
          update likes
          set beer=?, perday=?
          where drinker='nan'
        """
        test["getnan"] = """
          select * from likes where drinker='nan'
        """
        return {"test": test, "test1": test1}

Please  see  the  doc  string  for gadfly.server for more information on
creating startup functions.

A  policy  with  queries  disabled  (queries=0) can only execute named
queries. By using such policies a DBA can configure a server such that
client  programs can only read certain tables, can only update certain
rows of certain tables in certain ways, and so forth.

Even  policies  with  "unrestricted  access"  (queries=1)  can provide
performance  benefits  if they have associated named, prepared queries
(like  "dumpwork"  above).  At  the  moment  the SQL parser slows down
gadfly  a  bit,  and prepared queries will only be parsed once for all
clients.  After  the first access subsequent accesses may be noticably
faster  (10x  faster  in some cases), especially if the server has the
kjbuckets  builtin  C  module.  However, with queries=1 the policy can
execute any SQL statement.

NOTE: The server runs all accesses (once the complete message has been
read  from  the  network)  serially  --  there is no concurrent access
permitted  to a Gadfly instance at this time. For this reason a "large
query"  may  cause  the  server  to "freeze" and delay other accesses.
Incomplete  requests  due to network delays or other problems will not
freeze the server, however (sockets are polled using select.select).

NOTE:  All  server  accesses  run in "autocommit mode" at this time. A
successful  access  automatically  triggers  a database commit (but an
unsuccessful access will rollback).

As an optimization, however, checkpoints only occur occasionally, once
per a given number of accesses, configurable by setting::

  Server_instance.check_loop = 100

Start up
--------

Servers  can  be  started  from  the command line using the gadfly.server
script  interpretation  (as  shown above) or using gadfly.server.Server(...)
from  another  program.  See the doc strings and source for gadfly.server
and test/gfstest.py for more information.

Shut down
---------

NOTE: this is current inaccurate - gfserver needs to be extended to provide
this functionality

Servers  can  be  shut  down  from  the command line interpretation of
gfclient  or  from  another  program  using the gadfly.client.gfclient(...) class
shutdown()  method,  but  only  using  the admin policy with the admin
password. For example to shut down the server started above::

  gfclient shutdown 2222 admin

See  the doc strings and source for gadfly.server and test/gfstest.py for more
information.

Client Access
-------------

Client  access  to  a  gadfly  server  is similar to the normal Python
DB-SIG  DBAPI access to gadfly, except that it is sometimes faster and
can  potentially  be run from any machine reachable on the network (if
the client program knows the password).

To  access  a  gadfly  server  from  a  remote machine the only python
modules   required   (in  addition  to  the  standard  libraries)  are
gadfly.client and gadfly.gfsocket.

Initialize  a  connection  with  a given "POLICY" with "PASSWORD" to a
running server on "machine.domain.com" using port number 2222 with:
from gadfly.client import gfclient::

  conn = gfclient("POLICY", 2222, "PASSWORD", "machine.domain.com")

Note that policy names and passwords are case sensitive.

Queries and other statements are normally executed via cursors. Obtain
a cursor from a connection using::

  cursor = connection.cursor()

Execute a statement in a cursor using::

  cursor.execute(statement)

or to provide dynamic parameters::

  cursor.execute(statement, dynamic_parameters)

For example::

  cursor.execute("select * from work")
  ...
  cursor.execute("select * from work where name=?", ("carla",))

The  dynamic  parameters  work  the  same as described in the the main
gadfly  documentation  page.  In  particular  INSERT VALUES can insert
several rows at once by using a list of tuples for the rows.

If  there  is  any  problem (bad policy name, bad password, server not
running,  queries  not  allowed  for  this  policy)  the  execute will
generate an exception.

To   run   a   named/prepared   query  (initialized  at  startup)  use
execute_prepared,  which takes a prepared statement name rather than a
query string::

  cursor.execute_prepared("updatenan", ("rollingrock", 1))
  ...
  cursor.execute_prepared("getnan")

The  execute_prepared method works just like the execute method except
that  the  "name"  must  be  the  name  of  a query initialized by the
startup(...) function at server startup.

NOTE:  by default any execution that sends or recieves "too much data"
will  be  aborted. Edit gadfly.gfsocket (both on the client end and on the
server  end  if  different)  if  you wish to disable this sanity check
feature::

  LEN_LIMIT=10e8

As with other dbapi cursors the results of a query can be extracted as
a list of tuples using (after execute)::

  result_list = cursor.fetchall()

The  other  fetches (fetchone and fetchmany) have not been implemented
yet (partially since they don't make much sense in this context).

Both named and unnamed statements may be semicolon separated sequences
of  several  SQL  statements,  but  if  they  are  they will return no
results.

Interactive Client Access
-------------------------

The gfplus command provides interactive access to a Gadfly server.


Implementation Comments
-----------------------

For  your  information  the  server/client  interaction  is  much like
"finger"  or  "http"  --  each  client access is a separate TCP/Stream
connection  where  the  client  sends a request and the server sends a
response.  After  each  access  the  connection is closed and the next
access generates a new connection. I did it that way, because it was a
simple and robust strategy (witness the success of HTTP).

Please   note:   Although   I  have  attempted  to  provide  a  robust
implementation for this software I do not guarantee its correctness. I
hope  it  will  work  well  for  you  but  I  do  not assume any legal
responsibility  for  problems  anyone  may  have  during  use of these
programs.


