Flash hardware renderer - Arno Gourdol

Arno Gourdol Flash Camp Brazil April 8, 2011

Adobe Flash Runtime Team

I skate to where the puck is going to be, not where it has been. —Wayne Gretzky

By 2013 mobile users > desktop users

[Chart: mobile vs. desktop users, in millions (0 to 2,000), 2003 to 2014. Source: Gartner]

Flash Runtime Release Cadence

[Timeline: quarterly releases, 2011 Q2 through 2014 Q4]

ActionScript Video Graphics

Faster Garbage Collection

Incremental GC
Avoids GC clusters
Reducing allocation cost
GC hint API
Towards a generational collector without copy

Faster ActionScript

Advanced JITing
Type-based optimizations
Numeric optimizations
Nullability

New numeric type: “float”
IEEE 32-bit
Efficient Vector.
Also “float4”
Faster number crunching

Concurrency

Disclaimer: Future

var myWorker:Worker = new Worker("myworker.swf");
myWorker.addEventListener(WorkerExitEvent.WORKER_EXIT, function(e:WorkerExitEvent) {
    trace("worker finished")
});
myWorker.load({myMessage: "world"});

// In myworker.swf:
trace("Hello " + Worker.parameters.myMessage)

Disclaimer: Future

[Entrypoint]
class MyWorkerDef {
    trace("Hello " + Worker.parameters.myMessage);
    // Report back to the main SWF over the connection passed in as a parameter
    Worker.parameters.myConnection.call("talkback", null, "all fine on the worker front");
};

var myWorker:Worker = new Worker(MyWorkerDef);
var c:WorkerConnection = new WorkerConnection();
c.client = {
    talkback: function(message) { trace("worker said: " + message) }
};
myWorker.load({myMessage: "world", myConnection: c});

Concurrency

Multiple ActionScript workers
Shared-nothing isolation model
UI not blocked
Leverage multi-core CPUs

StageVideo

[Diagram: one frame (≈33ms), Traditional vs. StageVideo, with CPU and GPU lanes. Traditional steps: Net I/O, Decode H.264 Stream, Execute ActionScript, Render Stage, YUV to RGB, Composite, Blit. StageVideo steps: Net I/O, Execute ActionScript, Decode H.264 Stream, YUV to RGB, Render Stage, Blit.]

StageVideo

Reduced CPU usage
Improved battery life
Improved framerate
Works with existing video
Available today in FP 10.2
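As a rough illustration of "works with existing video", the sketch below attaches an ordinary NetStream to a StageVideo object when one is available and falls back to a classic Video object otherwise. It uses the StageVideo classes from Flash Player 10.2; the class name StageVideoSketch, the file name video.mp4 and the 640x360 size are placeholders chosen for this example, not taken from the talk.

```actionscript
package {
    import flash.display.Sprite;
    import flash.events.StageVideoAvailabilityEvent;
    import flash.geom.Rectangle;
    import flash.media.StageVideo;
    import flash.media.StageVideoAvailability;
    import flash.media.Video;
    import flash.net.NetConnection;
    import flash.net.NetStream;

    // Hypothetical document class; names and sizes are illustrative only.
    public class StageVideoSketch extends Sprite {
        private var stream:NetStream;
        private var fallbackVideo:Video;

        public function StageVideoSketch() {
            var connection:NetConnection = new NetConnection();
            connection.connect(null); // null = progressive download, no media server
            stream = new NetStream(connection);
            // Swallow metadata callbacks so the stream does not raise an error
            stream.client = { onMetaData: function(info:Object):void {} };

            // The event may fire more than once, e.g. when entering full screen
            stage.addEventListener(
                StageVideoAvailabilityEvent.STAGE_VIDEO_AVAILABILITY,
                onAvailability);

            stream.play("video.mp4"); // placeholder file name
        }

        private function onAvailability(event:StageVideoAvailabilityEvent):void {
            var available:Boolean =
                event.availability == StageVideoAvailability.AVAILABLE
                && stage.stageVideos.length > 0;

            if (available) {
                // Hardware path: the video plane sits behind the display list
                if (fallbackVideo && contains(fallbackVideo)) {
                    removeChild(fallbackVideo);
                }
                var stageVideo:StageVideo = stage.stageVideos[0];
                stageVideo.viewPort = new Rectangle(0, 0, 640, 360);
                stageVideo.attachNetStream(stream);
            } else {
                // Software path: classic Video object in the display list
                if (!fallbackVideo) {
                    fallbackVideo = new Video(640, 360);
                }
                fallbackVideo.attachNetStream(stream);
                addChild(fallbackVideo);
            }
        }
    }
}
```

The same NetStream serves both paths, so existing playback logic (seek, pause, buffering) does not need to change; only the attachment target differs.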

Threaded Video Pipeline

[Diagram: Traditional pipeline, two frames of ≈33ms each, with CPU and GPU lanes. Steps: Execute ActionScript, Render Stage, Render Video, Blit. When overloaded: Skip Stage Frame, Skip Video Frame.]

[Diagram: New Video Pipeline, Traditional video, one frame (≈33ms), with three CPU lanes and one GPU lane. Steps: Execute ActionScript, Render Stage, Net I/O, Decode Stream, YUV to RGB, Composite, Blit.]

[Diagram: New Video Pipeline, StageVideo, one frame (≈33ms), with CPU and GPU lanes. Steps: Execute ActionScript, Render Stage, Net I/O, Decode H.264 Stream, YUV to RGB, Blit.]

Threaded Video Pipeline

Smoother playback
Better hardware usage
Coming soon

Hardware renderer

[Diagram: display list for the "The Sun" example. Nodes: Stage, DisplayObject, Text: “The Sun”, Bitmap: “sun.png”, Shape: circle]

Flash software renderer

Retained renderer
Scan-line rasterization
Each pixel rendered once

Flash hardware renderer

Immediate renderer
Build scene graph
Bitmaps as surfaces
Readback is very expensive

Disclaimer: Future

[Diagram: one frame (≈33ms), Traditional vs. Future?, with CPU and GPU lanes. Traditional steps: Execute ActionScript, Rasterize, Rasterize, Blit. Future? steps: Execute ActionScript, Rasterize, Rasterize, Composite, Blit.]

[Chart: frames per second (0 to 60) for ActionScript and Flex content, AIR 2.6 vs. AIR 2.7. Bars: Android, iOS CPU, iOS GPU]

Hardware Renderer

Available today
AIR for iOS and Android
renderMode="GPU"
Hardware compositing
Stage3D
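For readers unfamiliar with where the render mode is set: in an AIR mobile project it normally lives in the application descriptor rather than in ActionScript. The fragment below is a minimal sketch assuming the AIR 2.6 descriptor namespace; the id, filename and SWF name are placeholders, and a real descriptor carries more settings than shown here.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Minimal sketch of an AIR application descriptor; values are placeholders -->
<application xmlns="http://ns.adobe.com/air/application/2.6">
    <id>com.example.HelloGpu</id>
    <versionNumber>1.0.0</versionNumber>
    <filename>HelloGpu</filename>
    <initialWindow>
        <content>HelloGpu.swf</content>
        <!-- Ask the runtime to use the hardware renderer instead of the CPU path -->
        <renderMode>gpu</renderMode>
    </initialWindow>
</application>
```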

Stage3D Molehill

/* Imports this snippet relies on (AGALMiniAssembler is Adobe's helper class, shipped separately from the runtime) */
import com.adobe.utils.AGALMiniAssembler;
import flash.display3D.Context3DProgramType;

/* Some constants for us */
private const FARPLANE:Number = 10000;
private const INVFARPLANE:Number = 1/FARPLANE;

/* Set up the fragment constant for a cyan fog color (fc0); context is a Context3D instance */
context.setProgramConstantsFromVector(Context3DProgramType.FRAGMENT, 0,
    Vector.<Number>([0, INVFARPLANE, INVFARPLANE, 1]));

/* Set up vertex shader which will pass colour and depth to the fragment shader */
var vertexShaderAssembler:AGALMiniAssembler = new AGALMiniAssembler();
vertexShaderAssembler.assemble(Context3DProgramType.VERTEX,
    "m44 vt0, va0, vc0 \n" +  // transform vertex x,y,z
    "mov op, vt0 \n" +        // output vertex x,y,z
    "mov v0, vt0.z \n" +      // move vertex z (depth) to fragment shader
    "mov v1, va1"             // move vertex r,g,b to fragment shader
);

/* Set up a fragment shader */
var fragmentShaderAssembler:AGALMiniAssembler = new AGALMiniAssembler();
fragmentShaderAssembler.assemble(Context3DProgramType.FRAGMENT,
    "mul ft0, fc0, v0 \n" +   // multiply fog r,g,b by vertex z (depth)
    "add ft1, ft0, v1 \n" +   // add final fog r,g,b to vertex r,g,b
    "mov oc, ft1"             // output color
);
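The shader assembly above presupposes a Context3D and stops before the shaders reach the GPU. The sketch below shows one way those surrounding steps might look, assuming the Stage3D API as it later shipped in Flash Player 11; the pre-release Labs build discussed in the talk may differ in detail. The class name FogSetup, the 640x480 back buffer and the simplified pass-through shaders are placeholders for this example.

```actionscript
package {
    import com.adobe.utils.AGALMiniAssembler;
    import flash.display.Sprite;
    import flash.display3D.Context3D;
    import flash.display3D.Context3DProgramType;
    import flash.display3D.Program3D;
    import flash.events.Event;

    // Hypothetical document class; assumes the Flash Player 11 Stage3D API.
    public class FogSetup extends Sprite {
        private var context:Context3D;
        private var program:Program3D;

        public function FogSetup() {
            // Context creation is asynchronous: listen first, then request
            stage.stage3Ds[0].addEventListener(Event.CONTEXT3D_CREATE, onContext);
            stage.stage3Ds[0].requestContext3D();
        }

        private function onContext(event:Event):void {
            context = stage.stage3Ds[0].context3D;
            // width, height, anti-alias level, depth and stencil buffer enabled
            context.configureBackBuffer(640, 480, 2, true);

            // Simplified shaders: transform the vertex, pass its colour through
            var vertexAssembler:AGALMiniAssembler = new AGALMiniAssembler();
            vertexAssembler.assemble(Context3DProgramType.VERTEX,
                "m44 op, va0, vc0 \n" +  // transform vertex position
                "mov v0, va1"            // pass vertex colour to fragment shader
            );
            var fragmentAssembler:AGALMiniAssembler = new AGALMiniAssembler();
            fragmentAssembler.assemble(Context3DProgramType.FRAGMENT,
                "mov oc, v0"             // output interpolated colour
            );

            // Upload the assembled bytecode and make the program current
            program = context.createProgram();
            program.upload(vertexAssembler.agalcode, fragmentAssembler.agalcode);
            context.setProgram(program);
        }
    }
}
```

Drawing still requires vertex and index buffers plus a clear/drawTriangles/present loop, which are omitted here to keep the sketch short.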

Frameworks

Alternativa3D
Away3D
Flare3D
Sophie3D
Unity
Yogurt3D
M2D

Away3D

Molehill Demo Reel

Stage3D Molehill

Very low-level API
Direct access to GPU
Amazing performance
Optimized for mobile
3D and 2D frameworks
Available on Labs today
labs.adobe.com

Find out more: Alex Karpovitch, Alternativa
Saturday, 9:15am–10:30, Room A

Faster garbage collector
Faster ActionScript
Concurrency
StageVideo
Threaded video pipeline
Hardware renderer
Stage3D Molehill