Setting up the database. â¢createdb ... Custom Types dict. Tuesday, March 13 ... Change/Add featu
Putting Python in PostgreSQL Frank Wiles
[email protected] | @fwiles | http://www.revsys.com
Tuesday, March 13, 12
Today we get to combine two of my favorite things....
Tuesday, March 13, 12
Why on earth would you want to do that? • Build in Python rather than banging your head against PL/PGSQL
• Utilize PyPi packages • Use a network connection (ZMQ anyone?) • Retrofit features on a large/complex/ proprietary system
Tuesday, March 13, 12
Getting Started
Tuesday, March 13, 12
Installing pl/python • Ubuntu - aptitude
install postgresql-plpython-
• Mac OS X Homebrew - PYTHON
= / path/to/python brew install postgresql
Tuesday, March 13, 12
Setting up the database •createdb •createlang plpythonu
•Check with:
SELECT * FROM pg_language
Tuesday, March 13, 12
Writing your first function CREATE OR REPLACE FUNCTION pymulti(a integer, b integer) returns integer AS $$ return a * b $$ LANGUAGE plpythonu;
Tuesday, March 13, 12
Writing your first function CREATE OR REPLACE FUNCTION pymulti(a integer, b integer) returns integer AS $$ return a * b $$ LANGUAGE plpythonu;
Tuesday, March 13, 12
Datatypes
Tuesday, March 13, 12
PostgreSQL
Python
small int and int
int
bigint
long
boolean
bool
All text types
str
SQL Array
list
Custom Types
dict
Debugging • Using print doesn’t work all that well. • You can log directly to PostgreSQL’s log file with:
plpy.notice(“”) plpy.debug(“”) plpy.error(“”) plpy.fatal(“”)
Tuesday, March 13, 12
Or log with Python... import logging logger = logging.BasicConfig( filename=’/tmp/plpy.log’ level=logging.INFO ) logging.info(“Hi there!”)
Tuesday, March 13, 12
DANGER WILL ROBINSON! • Kind of a pain to maintain and debug • Can easily confuse your DBA (if that isn’t you)
• Not exactly slow, but definitely not free • Use with caution: This should be in your back of tricks if you need it, but use sparingly
• Requires superuser privs and no virtualenvs Tuesday, March 13, 12
When should you use it? • Rolling up/aggregating data • Enforce constraints across teams/devs/ languages
• Protect data integrity from ad hoc queries • Change/Add features at the DB level when you can’t at the app level
Tuesday, March 13, 12
Triggers!
Tuesday, March 13, 12
Let’s say we are setup like this... CREATE TABLE trigger_test ( id serial, username varchar, is_active boolean default true, balance int4 ); INSERT INTO trigger_test (username, is_active, balance) values ('frankwiles', 't', 1000); INSERT INTO trigger_test (username, is_active, balance) values ('jacobkaplanmoss', 'f', 50); INSERT INTO trigger_test (username, is_active, balance) values (‘jefftriplett', 't', 10);
Tuesday, March 13, 12
Let’s ensure you can’t change an inactive user’s balance CREATE OR REPLACE FUNCTION check_active() returns trigger AS $$ class NotActive(Exception): pass if not TD["old"]["is_active"]: raise NotActive $$ LANGUAGE plpythonu; Tuesday, March 13, 12
If updating active and balance... CREATE OR REPLACE FUNCTION check_active() returns trigger AS $$ class NotActive(Exception): pass if not TD["old"]["is_active"] and not TD["new"]["is_active"]: raise NotActive $$ LANGUAGE plpythonu; Tuesday, March 13, 12
How to hook a trigger to a table CREATE TRIGGER double_check_active BEFORE UPDATE ON trigger_test FOR EACH ROW EXECUTE PROCEDURE check_active();
Tuesday, March 13, 12
What happens? pycon=# update trigger_test set balance = 60 where username=‘jacobkaplanmoss’; ERROR: NotActive: CONTEXT: Traceback (most recent call last): PL/Python function "check_active", line 5, in raise NotActive PL/Python function "check_active" pycon=#
Tuesday, March 13, 12
Info in TD TD[“event”] TD[“when”]
Tuesday, March 13, 12
INSERT, UPDATE, DELETE, TRUNCATE BEFORE, AFTER, or INSTEAD OF
TD[“level”]
ROW or STATEMENT
TD[“new”]/TD[“old”]
New and Old data
TD[“name”]
Name of trigger
TD[“table_name”]
Table trigger called on
TD[“table_schema”]
Schema table is in
TD[“args”]
Arguments to trigger function
CREATE TRIGGER options CREATE TRIGGER name {BEFORE|AFTER} { event[ OR ... ] } ON table [ FOR [EACH] {ROW |STATEMENT} ] [ WHEN ( condition ) ] EXECUTE PROCEDURE function_name ( arguments ) Tuesday, March 13, 12
Checking credit card numbers CREATE OR REPLACE FUNCTION checkcc(ccnum varchar) returns boolean AS $$ sum = 0 num_digits = len(ccnum) for count in range(0, num_digits): digit = int(card_number[count]) if not (( count & 1 ) ^ oddeven ): digit = digit * 2 if digit > 9: digit = digit - 9 sum = sum + digit if sum & 10 == 0: return True else: return False $$ LANGUAGE plpythonu; Tuesday, March 13, 12
Let’s do something more interesting... CREATE TABLE msgs ( id serial, to_user varchar, from_user varchar, read boolean default false, message text );
Tuesday, March 13, 12
For INSERTs... CREATE OR REPLACE FUNCTION handle_insert() RETURNS trigger AS $$ import redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() r.incr(unread_count_key) $$ LANGUAGE plpythonu;
Tuesday, March 13, 12
For UPDATEs... CREATE OR REPLACE FUNCTION handle_update() RETURNS trigger AS $$ import redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() if not TD["old"]["read"] and TD["new"]["read"]: r.decr(unread_count_key) else: r.incr(unread_count_key) $$ LANGUAGE plpythonu; Tuesday, March 13, 12
For DELETEs... CREATE OR REPLACE FUNCTION handle_delete() RETURNS trigger AS $$ import redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() r.decr(unread_count_key) $$ LANGUAGE plpythonu;
Tuesday, March 13, 12
Attach them to the table CREATE TRIGGER counter_insert AFTER INSERT ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_insert(); CREATE TRIGGER counter_update AFTER UPDATE ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_update(); CREATE TRIGGER counter_delete AFTER DELETE ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_delete(); Tuesday, March 13, 12
What other things can you do?
Tuesday, March 13, 12
Full database access! CREATE OR REPLACE FUNCTION db_count() RETURNS trigger AS $$ if TD[“new”][“active”] == False: plpy.execute( “INSERT INTO finished_tickets (ticket_id) VALUES (%d)” % TD[“new”] [“id”] $$ LANGUAGE plpythonu;
Tuesday, March 13, 12
Internal ideas... • Aggregate/rollup data • Push “noisy” calculations into the DB to avoid round trips
• Build and maintain materialized views • Regenerate expensive reports only when the underlying data changes enough to warrant it
Tuesday, March 13, 12
More external ideas... • Send emails based on data additions/changes • Expire/repopulate caches • Fire off celery tasks • Trigger backups based on % of data changed • Hit external APIs based on conditions • Lots of fun to be had with ZeroMQ... Tuesday, March 13, 12
Retrofitting an existing system • Send email alerts • Populate your system with data based on INSERTs in another
• Want an SMS when your boss adds a ticket to Trac, but don’t want to dive into the code?
Tuesday, March 13, 12
Questions?
[email protected] | @fwiles | http://www.revsys.com
Tuesday, March 13, 12