Program Kubes - UT Computer Science

3 downloads 294 Views 752KB Size Report
Convergence. • Unified by transformations. • mapping of programs to programs (or models to models). • MDE. – map
Program Kubes

Don Batory Department of Computer Sciences University of Texas at Austin

1

Introduction • Future of software development (SWD) lies in automation • automate rote, time-consuming, error-prone tasks • three th ttechnologies h l i th thatt automate t t such h ttasks k will ill converge

• Model Driven Engineering (MDE) • specify target program by a set of high high-level level models which are easy to understand, write, and maintain • program is synthesized by transforming models to executables

• Refactoring R f t i • reorganize programs/models using transformations to improve structure

• Software Product Lines (SPL) • create a family of related programs from a common set of assets • automatically synthesize an SPL member from a declarative specification

2

Convergence • Unified by transformations • • • •

mapping of programs to programs (or models to models) MDE – map models of one type to another Refactoring – restructure models SPL – elaborate models by adding more detail

• E Emerging i Science S i off Automated A t t d SWD from f experiences i off practitioners • compositional p – based on function ((transformation)) composition p • fundamental mathematical structures provide an informal language to express program/model designs • SPL modeled by categories (POPL 2007, ICSE 2007, GPCE 2008) • theoretical basis for future tools and models for synthesis

3

Why Transformations? •

Java and C# programs use methods to update and translate objects • “programming in the small”



In SWD, objects are programs and methods are transformations • “programming p g g in the large” g



Transformations provide a fundamental way to understand SWD • foundation for MDE MDE, refactorings refactorings, and software product lines



Thinking mathematically leads to unusual design techniques that take time to understand • this talk is about one particular example

4

Long, Long Ago… •

In 2001, we (Lopez-Herrejon, et al) discovered a multi-dimensional structure to express interacting dimensions of variability in SPLs



Called “Origami” • design of a program P was defined by a matrix of transformations • rows, columns l were ffeatures t • matrix was folded in precise ways (thereby composing transformations) until a single term was produced • this term is an expression p ((composition p of transformations)) that synthesizes y P

P = •

technique that others will use in the future

Essential concept in building ATS (250K+ LOC) 5

“This Works???” • Take a “physics” approach – small number of principles (in mathematical form) that hold this entire universe together • Origami is sophisticated technique – taken years to connect seeming unrelated topics that it integrates • • • •

data cubes tensors, categories expression problem feature interactions

database technology mathematics programming languages software design g

• This is a modeling talk aimed at practitioners • no special mathematical background • program synthesis th i and d variability i bilit expressed d by b modern d mathematics th ti • raise the level on how to think about SWD and variability

• Your tour g guide through g a strange g universe… 6

Background from Databases: Data Cubes 7

Data Cubes • Data Cubes (Gray 1997) are a multi-dimensional array visualization of relational tables for data warehouses

Dimension Di i Attributes

Measures M Attribute

Items

Time

Tires Bikes Tools

Jan

Items

Location

Time

QtySold

t l tools

usa

f b feb

4

F b Feb

tires

spain

jan

3

Mar

bikes

france

may

30

Apr









May

USA

Spain

France Greece

Location 8

Data Cubes • Data Cubes (Gray 1997) are a multi-dimensional array visualization of relational tables for data warehouses • tuples are cube entries

Dimension Di i Attributes

M Measure Attributes

Items

Location

Time

QtySold

t l tools

usa

f b feb

4

4

tires

spain

jan

3

3

bikes

france

may

30 30







Items

Time

Tires Bikes Tools

Jan F b Feb Mar Apr



May

USA

Spain

Greece France

Location 9

Cube Queries • Subcubes restrict dimensional values • count # of bikes, tools sold in Europe in January, March, April

It Items

Time

Tires Bikes Tools

Jan Feb eb Mar Apr May

USA

Spain

France Greece

Location

10

Data Cube Operations • Projection eliminates unnecessary elements • count # of bikes, tools sold in Europe in January, March, April

It Items

Time

Tires Bikes Tools

Jan Feb eb Mar Apr May

USA

Spain

France Greece

Location

11

Data Cube Operations • Aggregate or roll-up elements to produce a scalar • scalar totals the # of bikes, tools sold in Europe in January, …

Items Bikes Tools

Time Jan

421

Mar a Apr

Spain

France Greece

Location

Order in which dimensions are aggregated doesn’t matter 12

The Kube Data Type

13

Kubes: N-Dimensional Arrays • Borrow terminology of tensors • • • •

rank of kube is its dimensionality scalar l iis a kkube b off rank k0 vector is a kube of rank 1 matrix is a kube of rank 2

• Notation: # of indices = rank of kube S

– kube of rank 0

(scalar)

Vk

– kube of rank 1

(vector – linear array)

Mij

– kube k b off rank k2

( t i – 2D array)) (matrix

Cijk

– kube of rank 3

(cube – 3D array)

14

Kube Operations • Roll-up is contraction • summing across dimensions i, j for kube Cijk:

Vk =

Σij Cijk

• Interested (in this talk) on contracting to scalars

S=

Σijk Cijk

order in which dimensions are contracted does not matter

• Projection limits values of indices

S =

Σi∈I, j∈J, k∈K Cijk 15

Previous Example Items

Time

Tires Bikes Tools

Jan Feb Mar

Cilt =

Apr May

USA

Spain

France Greece

Location

S = Σi∈{{Bikes,Tires}} l∈ {Spain,France,Greece} t∈{Jan,Mar,Apr} Cilt 16

Software Synthesis • Kube expresses multidimensional models of variability • Data cubes aggregate numbers, program synthesis composes transformations • element is a program transformation • dimensions we will see are defined by feature models F

G

F1 F2 F3

G1 G2

P Program =

Σ

G3 G4 G5

H1

H2

H3

H4

H 17

Example: AHEAD Tool Suite • Language, tool extensible IDE • synthesize tools for dialects of Java • optional extensions to Java

St t State_machine hi SM { States g, h, i; T Transition iti e1 1 : g Æ h ...;

refines class CL { ... }

Transition e2 : h Æ i ...; refines interface IN { ... }

... }

state machine

refines 18

AHEAD Tool Suite • Language, tool variability expressed as a 2D kube • ASE 2002 and SIGSOFT/FSE 2003

Lang Java D Ds Sm

Java J Language Features

Refines Quote

Tool

Parse Harvest Reduce Doclet

Pretty

Tool Features 19

AHEAD Tool Suite • AHEAD Tool specified by 2 sets of features • Jedi – javadoc tool for a dialect of Java

Jedi =

Lang

Σ

Java D Ds Sm Refines Quote

Tool

Parse Harvest Reduce Doclet

Pretty

20

Synthesize Jedi • Project matrix

Jedi =

Lang

Σ

Java D Ds Sm Refines Quote

Tool

Parse Harvest Reduce Doclet

Pretty

21

Synthesize Jedi • Contract matrix • Defines composition of transformations to synthesize Jedi

Jedi =

Σ

Lang Jedi

Java S Sm Refines

Tool

Parse Harvest Doclet

Most AHEAD Tools ((250K+ LOC)) are built by y contracting g 2D, 3D kubes; see ASE 2002, FSE 2003 papers 22

Here’s What’s Odd… • Lots of other ways to contract matrix – Each contraction p produces a different expression p – Function composition (unlike integer addition) is not commutative

Jedi =

Σ

Lang Jedi

Java S Sm Refines

Tool

Parse Harvest Doclet

23

Back to Basics: Transformations in SWD 24

Basics • Fact: no program is created spontaneously • no MDE model, Word document, etc… • created by extending simpler programs • and simpler programs come from simpler programs, recursively

0 P1

P2

P3

direction of time 25

Incremental Development • Going forward in time explains how a program’s design was developed p incrementally, y, in logical g steps p • incremental development • hallmark of automated program synthesis

0 P1

P2

P3

direction of time 26

Timelines • Think of arrows as transformations • mappings from one program to another • path from 0 to a program is its timeline • when arrows are given semantic meaning and are reusable they are called features

0 P1

P2

P3

direction of time 27

Example: Expression Trees • Parse tree of an expression

+ 2

• print() operation on trees:

3

+ print

= “2 + 3” 2

• eval() l() operation ti on trees: t

3

+ eval

=5 2

3

28

Expression Tree Program •

Arrow either defines base program or adds operations

Exp

0

shell

Exp

Exp

print()

print()

Int

Plus

Times

eval()

eval

print Int print()

Plus print()

Int print() eval()

Times print()

Plus print() eval()

Times print() eval()

Methods Product Line 29

Properties of Arrows x

• Arrows are composable

z y

Exp

Exp p print()

Int

print Plus

Int print() Plus print()

Times

Times print()

30

Software Product Lines •

Family of related programs



Features are increments in program functionality (arrows) • SPL building blocks



C Conceptualize t li SPL as a di directed t d graph h • nodes are programs, arrows are features • paths from 0 are program timelines • superimpose the timelines of all programs in an SPL

0

shell

print

eval

Methods Product Line Graph 31

Typically • Product line graphs are trees • one timeline for every program in a product line

32

Categories • Add identity arrow to each node

0

shell

print

eval

Category of the Methods Product Line

• Resulting graph (& arrow composition rules) is a category • object is a domain, arrow maps domain to co-domain co domain

• Software product line = micro (trivial) category • each domain contains one element 33

Typical Implementations of SPL categories and arrows

34

Feature Models • Standard representation of SPLs (Kang 1990) • and-or tree with cross-tree constraints • defines legal combinations of features

Car feature diagram

Cruise

Engine

and

Transmission

alternative choose1

or: 1+

Gasoline

Electric

cross-tree constraints

Body

Manual

Automatic

Cruise Æ Automatic 35

Ordered Feature Models • Order in which features (arrows) are composed matters! • Ordered feature model is a “GenVoca” GenVoca grammar • tokens are features • productions define sentences • order in which tokens appear in sentences = order of composition

car

car : [cruise] engine+ trans body; engine : gasoline | electric;

Cruise

Engine

Transmission

Body

trans : manual | automatic; Gasoline

Electric

Manual

Automatic

## cruise Æ automatic;

GenVoca Grammar

Cruise Æ Automatic

Feature Model 36

Encode Graph of SPL • As an ordered feature model

shell

0

print

eval

Methods Category/Graph

Ops: [eval] [print] shell;

Ops

## eval

print

shell

eval Æ print; eval Æ print

GenVoca Grammar

Feature Diagram 37

Feature Implementations • Arrows can change at least the following: • add new classes, packages • add new members (fields, methods) to existing classes • wrap existing methods …

• Lots of prototype technologies to do this….

Exp

0

shell

Exp

Exp

print()

print() eval() l()

eval

print Int

Plus

Times

Int print()

Plus print()

Times print()

Int print() eval()

Plus print() eval()

Times print() eval()

Methods Product Line 38

SPL Implementation • Collection of arrows (features) whose compositions produce programs p g of an SPL is a GenVoca Model Fi = [ Fn … F3, F2, F1 ]

// vector

• Program P of an SPL is a particular composition of arrows P = F8 + F4 + F2 + F1

// P’s timeline

where + denotes feature (arrow) composition

39

Connection to Kubes • Given

P = F8 + F4 + F2 + F1

• Rewrite P as:

P = Σ i∈(8,4,2,1) Fi • Program synthesis is projection and contraction of 1D kube • GenVoca grammar defines legal kube projections

40

Recap So Far… • Conceptualize a SPL as a micro category • quirk: programs and objects are computed

• Implement by ordered pair:

SPL = ( Grammar, Kube ) • 1 dimension of variability Î Kubes are vectors

41

Another way to derive program ET precursor to 2nd dimension of variability 42

Data Type Extensibility • Variability in the Methods SPL were methods a class supported pp • set of Exp subclasses (Int, Plus, Times) were fixed

• Another A th variability i bilit iis d data t ttypes • fix the set of methods • vary the set of subclasses of Exp Exp print()

exp

0

eval()

Exp

Exp

Exp

print()

print()

print()

eval()

int

eval()

plus

Int print() eval() ()

eval()

times Int print() eval() ()

Plus print() eval() ()

Int print() eval() ()

Plus print() eval() ()

Times print() eval() ()

Types Product Line 43

Types Product Line

0

exp

int

plus

times

Types Category/Graph

Types: [times] [plus] [int] exp;

Types

## plus Æ int; times Æ plus;

GenVoca Grammar

times

plus

int

exp

plus Æ int times Æ plus

Feature Diagram

44

Expression Problem classic example of 2D variability 45

Expression Problem • Fundamental problem of program design and variability • • • •

1975 J. Reynolds 1990 W. Cook 1998 S. Krishnamurthi, M. Felleisen, D. Friedman 1998 P. Wadler

• how to add new methods and data types in a type safe manner • SPL of 30-40 line programs

• ASE 2002, SIGSOFT/FSE 2003 • R. Lopez-Herrejon, p j , D. Batory, y, J-P. Martin,, J. Liu,, J. Sarvela • how ideas scale to synthesize large programs • SPL of 30-40K line programs (1000×) • how we synthesize the AHEAD Tool Suite 46

0

Exp

Exp

Exp

Exp Int

Int

Plus

Int

Plus

Times

print p Exp print()

Int

Plus

Times

print()

print()

print()

eval Exp print() eval() Int print() eval()

Plus print() eval()

Times print() eval()

0

Exp

Exp

Exp

Exp Int

Int

Plus

Int

Plus

Times

Exp print() Exp print() p () Int print()

Plus print()

Times print()

exp

Exp print() eval()

int Int print() eval()

Exp

Exp

Exp

print() eval()

print() eval()

print() eval()

plus Int print() eval()

Plus print() eval()

times Int print() eval()

Plus print() eval()

Times print() eval()

0

Expression Product Line (EPL) Exp

Exp

Exp

Exp Int

Int

Exp print()

Plus

Int

Plus

Exp

Exp

print()

print()

Times

Exp print() p () Int

Int

Plus

Int

Plus

Times

print()

print()

print()

print()

print()

print()

Exp

Exp

Exp

Exp

print() eval()

print() eval()

print() eval()

print() eval()

0

Int print() eval()

Int print() eval()

Plus print() eval()

Int print() eval()

Plus print() eval()

Times print() eval()

Commuting Diagrams in Categories Exp

Exp

Exp

Exp Int

Int

Plus

Int

Plus

Exp

Exp

Exp

print()

print()

print()

Times

Exp print() p () Int print()

Int print()

Exp print() eval()

Int print() eval()

Plus print()

Int print()

Plus print()

Exp

Exp

Exp

print() eval()

print() eval()

print() eval()

Int print() eval()

Plus print() eval()

Int print() eval()

Plus print() eval()

Times print()

Times print() eval()

To Explain Origami, Few More Questions •

What is the EPL feature model? 0 Exp

Exp

Exp

Exp Int

Int

Plus

Int

Times

Exp print()

Exp print()

Exp print()

Plus

Exp print() Int print() ()

Int print() p ()

Exp print() eval()

• • •

Int print()

Int print() eval()

Plus print() eval()

Plus print()

Times print()

Exp print() eval()

Exp print() eval()

Exp print() eval() Int print() eval()

Plus print() ()

Int print() eval()

Plus print() eval()

Times print() eval()

What is the origin of the EPL graph? What arrows are stored? How is EPL related to kubes? 51

Observations

52

Observation • Identify any program in EPL by a pair of axis timelines

0 0

exp

int

plus

times

Types

0

Exp

Exp

shell

Exp

Exp Int

Int

print

Plus

Int

Plus

Exp

Exp

Exp

print()

print()

print()

Times

Exp print() Int print()

Int print()

eval Exp print() eval()

Plus print()

Exp

Exp

Exp

print() eval()

print() eval()

print() eval()

Int print() eval()

Int print() eval()

Methods

Int print()

Plus print()

Int print() eval()

Plus print() eval()

Plus print() eval()

Times print()

Times print() eval()

53

Observation • Identify any program in EPL by a pair of axis timelines

0 0

exp

int

plus

times

Types

0

Exp

Exp

shell

Exp

Exp Int

Int

print

Plus

Int

Plus

Exp

Exp

Exp

print()

print()

print()

Times

Exp print() Int print()

Int print()

eval Exp print() eval()

Methods

Int print() eval()

Plus print()

Int print()

Plus print()

Exp

Exp

Exp

print() eval()

print() eval()

print() eval()

Int print() eval()

Plus print() eval()

Int print() eval()

Plus print() eval()

Times print()

Times print() eval()

54

Boundary Cases • Postulate extra null programs and arrows between them

0 0

0

exp

int

plus

0exp

times

0int

0plus

0shell

print

Int

Int

Plus

Exp

Exp

Exp

print()

print()

print()

Times

print()

eval

Methods

Plus

Exp Int print()

Int print()

0eval

Exp

Exp Int

0print

0times

Exp

Exp

shell

Types

Exp print() eval()

Int print() eval()

Plus print()

Int print()

Plus print()

Exp

Exp

Exp

print() eval()

print() eval()

print() eval()

Int print() eval()

Plus print() eval()

Int print() eval()

Plus print() eval()

Times print()

Times print() eval()

55

Last Pieces to Puzzle • Look at two seemingly unrelated topics • cross products of structures (SPLs) • feature interactions

• Putting them together explains Origami

56

Cross Products

57

Cross Products in Mathematics •

A common way to scale 1D structures to higher dimensions is by taking the cross product of 1D structures • 2D structure • nD structure



= cross product of a pair of 1D structures = cross product of n 1D structures

SPL structure is an ordered pair: SPL = ( Grammar, Kube )



Look at cross product of Methods and Types product lines Methods × Types

=

( Gmethods, Vmethods ) × ( Gtypes, Vtypes )

=

( Gmethods × Gtypes, Vmethods # Vtypes ) 58

EPL Feature Model is… • Union of Methods and Types feature models with extra rule

Methods: [eval] [print] shell; %% eval Æ print

×

Types : [times] [plus] [int] exp; %% plus Æ int times Æ plus

extra grammar rule

EPL : Methods Types ;

Methods: [eval] [print] shell; Types : [times] [plus] [int] exp; %% eval Æ print plus Æ int ti times Æ plus l 59

EPL Graph • tensor product (×) of Methods and Types graphs • shape we want • remember: programs/objects are computed, so too are the arrows 0 0 shell

exp

int

plus

times

Types

0 = Types × Methods

print

eval E Exp eval() print()

Methods Int print() eval()

Plus Times print() eval() eval() print()

60

EPL Graph • tensor product (×) of Methods and Types graphs • shape we want • remember: programs/objects are computed, so too are the arrows 0 0

exp

int

times

plus

Types

0

shell

= Types × Methods

print

eval E Exp eval() print()

Methods Int print() eval()

Plus Times print() eval() eval() print()

61

Kube of EPL • Interaction cross product (#) of the Methods &Types kubes • another tensor product • GenVoca model of EPL is a matrix (2D kube) EPLmethods,types methods types

but what are these entries?

=

Vmethods # Vtypes

=

[ shell, print, eval ] # [ exp, int, plus, times ] (shell # exp)

(shell # int)

(shell # plus)

(shell # times)

(print # exp) (p p)

(print # int)) (p

(print # p (p plus))

(print # times)) (p

(eval # exp)

(eval # int)

(eval # plus)

(eval # times)

=

62

Products of Features and Feature Interactions 63

Products & Interactions •

Features change a design



Structural feature interactions are changes to a design that are added conditionally based on the features that are present



p ((Kyo y Kang) g) Classical example • flood control, fire control • must add extra arrow Æ to control their interaction, i.e., so that they work together correctly



flood fire fire

fire’ flood

flood # fire

flood’

flood × fire = flood#fire + flood + fire

Taking the “square” of features

64

From EPL: int × print Exp Exp

int Int

print

Exp p print() Exp print() Int print()

65

From EPL: int × print Exp Exp

int Int

print Exp

print

print()

Int

print # int

int

Exp p print()

Exp print() Int print()

66

From EPL: eval × times Exp

Exp

print()

print()

times Int

Plus

Int

Plus

Times

print()

print()

print()

print()

print()

Exp print() eval()

eval

Exp

Int

Plus

Times

print() eval()

print() eval()

print()

eval # times Exp

print() p ()

print() p ()

eval()

eval()

Int print()

Plus print()

Int print()

Plus print()

Times print()

eval()

eval()

eval()

eval()

eval()

67

GMethods × GTypes Types : [times] [plus] [int] exp; %% plus Æ int times Æ plus

0

Methods : [eval] [print] shell; Types : [times] [plus] [int] exp; %% eval Æ print plus Æ int times Æ plus Methods # Types EPL: Methods Types;

Methods: [eval] [print] shell; %% eval Æ print

Methods: [eval] [print] shell; Types: [times] [plus] [int] exp; %% eval l Æ print i plus Æ int times Æ plus

Taking the Square of Features • Fundamental relationship among features & interactions • The changes features make always commute • Interactions are added to coordinate features correctly

g f f

f’

g

f#g g’

69

Adding Interactions • Recall the EPL graph • Expose interaction arrows • EPL Matrix = VMethods # VTypes 0 ( h ll # exp)) (shell

( h ll # int) (shell i t)

( h ll # plus) (shell l )

( h ll # times) (shell ti )

(print # exp)

(print # int)

(print # plus)

(print # times)

(eval # exp)

(eval # int)

(eval # plus)

(eval # times)

70

Important Property • Given the EPL matrix and boundaries, anyy arrow and anyy program p g of EPL can be computed p • Proof sketch • given f, g, f#g • can complete square

0

g f f

f’

g

f#g g’

71

Another Important Property • Aggregating adjacent squares sums interactions

g

h+g

h

f

f f#g

f#h

f#h + f#g

72

Finally, Why Origami Works

Any program of EPL can be computed by projecting & contracting EPL matrix

73

Projection • Projection of matrix = projection of category, yields a commuting y g diagram g • any path from 0 to P will synthesize P

0

P

commuting category diagram

matrix 74

Contract by Columns • Contraction aggregates adjacent squares and sums interaction arrows

0 +

+

+

+

P

commuting diagram

vector matrix 75

Contract by Rows • Contraction aggregates adjacent squares and sums interaction arrows

0

+++ P

commuting diagram

vector scalar 76

Final Step/Square • Boundary programs are 0s • 0Æ0 arrows add NOTHING NOTHING, and so too their compositions • interaction arrow defines an expression that synthesizes P • this is the arrow that is computed by matrix contraction

0

0 +

0 0 P

commuting diagram

vector scalar 77

Aside: Commuting Paths • • • •

Start from desired program Aggregate by row moves vertically up Aggregation by columns moves vertically left Final square any path will do

0 + +

+ +

+

P

commuting diagram

matrix 78

Generality of Arguments • Nothing special about how we contracted the matrix • any contraction (i.e., path) would do

• Nothing special about program P we selected • any program in the SPL would do

• Nothing special about this matrix • any SPL matrix would do – ex. AHEAD 2D kube

• Nothing special about 2D kubes • same ideas apply to higher-dimensions • e.g. a square becomes a cube with a single interaction arrow

• Result that can be applied to SPL in general • Origami g is a fundamental structure of SPLs

79

So What?

and other final thoughts thoughts…

80

Perspective •

Programming in the Small – Java and C# objects, methods



P Programming i in i the th Large L – objects bj t are programs, methods th d are xforms f



Transformations provide a fundamental way to understand SWD • foundation f d ti ffor MDE, MDE refactorings, f t i and d software ft product d t lines li



Working toward a Science of Automated Design • start from experiences and abstract to a theory • belief: few principles hold this universe together • principles assume a mathematical form as in other sciences and engineering disciplines that manipulate structures • a promising alternative to the ad hoc design techniques in use today

81

Dimensions of Variability •

Preplanned variability is key to automated software design • much like design patterns helped novices design like experts • SPLs help novices create customized programs like experts



Common in SPLs to have orthogonal and interacting sets of features • EPL – method variability vs vs. type variability • AHEAD – tool variability vs. language variability



Origami O g expresses p multiple p dimensions of variability y • powerful and elegant, it scales



Expose p new and basic relationships p in mathematical form • • • •

projection and contraction of kubes – database technology cross product of features (feature interactions) cross product of SPLs use of n-D kubes to represent n-dimensions of variability 82

Final Comments • Clear that ideas are being reinvented in different contexts • not accidental – evidence we are working toward general paradigm • modern mathematics is a simple language to express these ideas • maybe others may be able to find deeper connections

• At the earliest stages • Advice: think in terms of arrows, think in terms of kubes • if you look for kubes, you’ll find them… • if you y don’t look,, you y won’t find them…

83