PostgreSQL Python

2 downloads 235 Views 378KB Size Report
Setting up the database. •createdb ... Custom Types dict. Tuesday, March 13 ... Change/Add featu
Putting Python in PostgreSQL Frank Wiles [email protected] | @fwiles | http://www.revsys.com

Tuesday, March 13, 12

Today we get to combine two of my favorite things....

Tuesday, March 13, 12

Why on earth would you want to do that? • Build in Python rather than banging your head against PL/PGSQL

• Utilize PyPi packages • Use a network connection (ZMQ anyone?) • Retrofit features on a large/complex/ proprietary system

Tuesday, March 13, 12

Getting Started

Tuesday, March 13, 12

Installing pl/python • Ubuntu - aptitude

install postgresql-plpython-

• Mac OS X Homebrew - PYTHON

= / path/to/python brew install postgresql

Tuesday, March 13, 12

Setting up the database •createdb •createlang plpythonu

•Check with:

SELECT * FROM pg_language

Tuesday, March 13, 12

Writing your first function CREATE OR REPLACE FUNCTION pymulti(a integer, b integer) returns integer AS $$ return a * b $$ LANGUAGE plpythonu;

Tuesday, March 13, 12

Writing your first function CREATE OR REPLACE FUNCTION pymulti(a integer, b integer) returns integer AS $$ return a * b $$ LANGUAGE plpythonu;

Tuesday, March 13, 12

Datatypes

Tuesday, March 13, 12

PostgreSQL

Python

small int and int

int

bigint

long

boolean

bool

All text types

str

SQL Array

list

Custom Types

dict

Debugging • Using print doesn’t work all that well. • You can log directly to PostgreSQL’s log file with:

plpy.notice(“”) plpy.debug(“”) plpy.error(“”) plpy.fatal(“”)

Tuesday, March 13, 12

Or log with Python... import logging logger = logging.BasicConfig( filename=’/tmp/plpy.log’ level=logging.INFO ) logging.info(“Hi there!”)

Tuesday, March 13, 12

DANGER WILL ROBINSON! • Kind of a pain to maintain and debug • Can easily confuse your DBA (if that isn’t you)

• Not exactly slow, but definitely not free • Use with caution: This should be in your back of tricks if you need it, but use sparingly

• Requires superuser privs and no virtualenvs Tuesday, March 13, 12

When should you use it? • Rolling up/aggregating data • Enforce constraints across teams/devs/ languages

• Protect data integrity from ad hoc queries • Change/Add features at the DB level when you can’t at the app level

Tuesday, March 13, 12

Triggers!

Tuesday, March 13, 12

Let’s say we are setup like this... CREATE TABLE trigger_test ( id serial, username varchar, is_active boolean default true, balance int4 ); INSERT INTO trigger_test (username, is_active, balance) values ('frankwiles', 't', 1000); INSERT INTO trigger_test (username, is_active, balance) values ('jacobkaplanmoss', 'f', 50); INSERT INTO trigger_test (username, is_active, balance) values (‘jefftriplett', 't', 10);

Tuesday, March 13, 12

Let’s ensure you can’t change an inactive user’s balance CREATE OR REPLACE FUNCTION check_active() returns trigger AS $$ class NotActive(Exception): pass if not TD["old"]["is_active"]: raise NotActive $$ LANGUAGE plpythonu; Tuesday, March 13, 12

If updating active and balance... CREATE OR REPLACE FUNCTION check_active() returns trigger AS $$ class NotActive(Exception): pass if not TD["old"]["is_active"] and not TD["new"]["is_active"]: raise NotActive $$ LANGUAGE plpythonu; Tuesday, March 13, 12

How to hook a trigger to a table CREATE TRIGGER double_check_active BEFORE UPDATE ON trigger_test FOR EACH ROW EXECUTE PROCEDURE check_active();

Tuesday, March 13, 12

What happens? pycon=# update trigger_test set balance = 60 where username=‘jacobkaplanmoss’; ERROR: NotActive: CONTEXT: Traceback (most recent call last): PL/Python function "check_active", line 5, in raise NotActive PL/Python function "check_active" pycon=#

Tuesday, March 13, 12

Info in TD TD[“event”] TD[“when”]

Tuesday, March 13, 12

INSERT, UPDATE, DELETE, TRUNCATE BEFORE, AFTER, or INSTEAD OF

TD[“level”]

ROW or STATEMENT

TD[“new”]/TD[“old”]

New and Old data

TD[“name”]

Name of trigger

TD[“table_name”]

Table trigger called on

TD[“table_schema”]

Schema table is in

TD[“args”]

Arguments to trigger function

CREATE TRIGGER options CREATE TRIGGER name {BEFORE|AFTER} { event[ OR ... ] } ON table [ FOR [EACH] {ROW |STATEMENT} ] [ WHEN ( condition ) ] EXECUTE PROCEDURE function_name ( arguments ) Tuesday, March 13, 12

Checking credit card numbers CREATE OR REPLACE FUNCTION checkcc(ccnum varchar) returns boolean AS $$ sum = 0 num_digits = len(ccnum) for count in range(0, num_digits): digit = int(card_number[count]) if not (( count & 1 ) ^ oddeven ): digit = digit * 2 if digit > 9: digit = digit - 9 sum = sum + digit if sum & 10 == 0: return True else: return False $$ LANGUAGE plpythonu; Tuesday, March 13, 12

Let’s do something more interesting... CREATE TABLE msgs ( id serial, to_user varchar, from_user varchar, read boolean default false, message text );

Tuesday, March 13, 12

For INSERTs... CREATE OR REPLACE FUNCTION handle_insert() RETURNS trigger AS $$ import redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() r.incr(unread_count_key) $$ LANGUAGE plpythonu;

Tuesday, March 13, 12

For UPDATEs... CREATE OR REPLACE FUNCTION handle_update() RETURNS trigger AS $$ import redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() if not TD["old"]["read"] and TD["new"]["read"]: r.decr(unread_count_key) else: r.incr(unread_count_key) $$ LANGUAGE plpythonu; Tuesday, March 13, 12

For DELETEs... CREATE OR REPLACE FUNCTION handle_delete() RETURNS trigger AS $$ import redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() r.decr(unread_count_key) $$ LANGUAGE plpythonu;

Tuesday, March 13, 12

Attach them to the table CREATE TRIGGER counter_insert AFTER INSERT ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_insert(); CREATE TRIGGER counter_update AFTER UPDATE ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_update(); CREATE TRIGGER counter_delete AFTER DELETE ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_delete(); Tuesday, March 13, 12

What other things can you do?

Tuesday, March 13, 12

Full database access! CREATE OR REPLACE FUNCTION db_count() RETURNS trigger AS $$ if TD[“new”][“active”] == False: plpy.execute( “INSERT INTO finished_tickets (ticket_id) VALUES (%d)” % TD[“new”] [“id”] $$ LANGUAGE plpythonu;

Tuesday, March 13, 12

Internal ideas... • Aggregate/rollup data • Push “noisy” calculations into the DB to avoid round trips

• Build and maintain materialized views • Regenerate expensive reports only when the underlying data changes enough to warrant it

Tuesday, March 13, 12

More external ideas... • Send emails based on data additions/changes • Expire/repopulate caches • Fire off celery tasks • Trigger backups based on % of data changed • Hit external APIs based on conditions • Lots of fun to be had with ZeroMQ... Tuesday, March 13, 12

Retrofitting an existing system • Send email alerts • Populate your system with data based on INSERTs in another

• Want an SMS when your boss adds a ticket to Trac, but don’t want to dive into the code?

Tuesday, March 13, 12

Questions?

[email protected] | @fwiles | http://www.revsys.com

Tuesday, March 13, 12