Flywheel: Google's Data Compression Proxy for the Mobile ... - Usenix [PDF]

Flywheel: Google's Data Compression Proxy for the Mobile Web

Victor Agababov, Michael Buettner, Victor Chudnovsky, Mark Cogan, Ben Greenstein, Shane McDaniel, Michael Piatek, Colin Scott*, Matt Welsh, Bolian Yin

*Currently a graduate student at UC Berkeley

Dominant Access Tech: Mobile

• Mobile devices are increasingly dominant
• Growth is greatest in emerging markets

Mobile Data is Expensive

Emerging markets have highest costs!

Source: blog.jana.com

Flywheel

• Flywheel: proxy service that optimizes HTTP response size
• Three years of deployment experience, part of Chrome for Android, iOS, Desktop
• Currently serving millions of users & billions of requests per day

What Flywheel Does

• Transcode to WebP
• Pick quality level
• Minify CSS, JS
• GZip text objects

Total bytes: 9764
Total bytes: 5565
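The text-side optimizations listed above can be illustrated with a minimal sketch. It assumes a toy regex-based CSS minifier and a made-up stylesheet (`minify_css`, `savings_percent`, and `SAMPLE_CSS` are hypothetical names, not part of Flywheel), and it only shows the general idea of minifying and then gzipping a text object.

```python
import gzip
import re


def minify_css(css: str) -> str:
    """Toy CSS minifier: drops comments and collapses whitespace.

    An illustrative stand-in for a real minifier; it ignores edge cases
    such as whitespace inside quoted strings.
    """
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # strip comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)           # trim around punctuation
    return css.strip()


def savings_percent(original: bytes, optimized: bytes) -> float:
    """Data reduction as a percentage of the original size."""
    return (1 - len(optimized) / len(original)) * 100


# Hypothetical stylesheet, repeated to mimic a larger file.
SAMPLE_CSS = """
/* hypothetical stylesheet */
body {
    margin: 0;
    padding: 0;
}
.header, .footer {
    color: #333333;
    background-color: #ffffff;
}
""" * 20

original = SAMPLE_CSS.encode("utf-8")
minified = minify_css(SAMPLE_CSS).encode("utf-8")
compressed = gzip.compress(minified)

print(f"original:  {len(original)} bytes")
print(f"minified:  {len(minified)} bytes")
print(f"gzipped:   {len(compressed)} bytes")
print(f"savings:   {savings_percent(original, compressed):.1f}%")
```

Minifying before compressing removes bytes that gzip would otherwise still have to encode; the exact savings depend entirely on the input.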

None of our optimizations are novel

What lessons did we learn from building and operating Flywheel?

Key lessons:
• Highly challenging to maintain good performance
• Tussles are pervasive, ongoing, & time-consuming

Outline

• Is a proxy really needed?
• What’s hard about engineering Flywheel?
• Does Flywheel meet our goals?
• What can be learned?

The web isn’t well optimized for mobile

• Case in point: 42% of HTML response bytes not compressed

Hard to keep up with best practices

• New optimizations: WebP, SDCH, HTTP/2, ... rolled out as often as every 6 weeks
• Heterogeneity of mobile devices increasing

Need: Optimizing service for the mobile web
Need: an optimizing compiler for the mobile web


Design Constraints

• Opt-in deployment model
• Transparent to users
• HTTP only (No HTTPS)

Challenge: trading off latency vs compression

• Indirection through Flywheel often increases RTT

[Diagram: paths to the HTTP origin, with RTT’ > RTT]

• Network latency is dominant performance factor (Page size is not a dominant factor!)

Flywheel Design

[Diagram: Proxy, Cache, Optimization services, Fetch router, and Fetch bots inside a Google datacenter; HTTP Origin outside it; a GET / request and 200 responses]

• Fetch router maintains connection affinity
• Optimizations: image transcoding, GZip, minification …
• Separate optimization services for isolation, provisioning

Selective Proxying

Indirection through Flywheel often increases RTT

[Diagram: GET / and GET /i.jpg requests, with the HTTP Origin]

• Fetch objects on critical path from origin
• Proxy objects that yield high data reduction
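The two selective-proxying rules can be sketched as a small routing function. This is an illustration only: the `Request` type, the `route` function, the `min_savings` threshold, and the idea that the client knows up front whether an object is on the critical path are all assumptions, and the per-type savings figures are simply borrowed from this talk's data reduction table as stand-in estimates.

```python
from dataclasses import dataclass

# Stand-in savings estimates per content type (fractions), borrowed from
# the talk's data reduction table. Using them to route is hypothetical.
EXPECTED_SAVINGS = {
    "image": 0.664,
    "css": 0.521,
    "javascript": 0.411,
    "html": 0.384,
}


@dataclass
class Request:
    url: str
    content_type: str
    on_critical_path: bool  # assumed to be known by the caller


def route(request: Request, min_savings: float = 0.5) -> str:
    """Return "origin" or "proxy" for a request.

    Rule 1: objects on the critical path go straight to the origin,
            avoiding the extra round-trip latency of the proxy.
    Rule 2: other objects are proxied only if the expected data
            reduction meets the (hypothetical) threshold.
    """
    if request.on_critical_path:
        return "origin"
    if EXPECTED_SAVINGS.get(request.content_type, 0.0) >= min_savings:
        return "proxy"
    return "origin"


print(route(Request("/", "html", on_critical_path=True)))            # origin
print(route(Request("/i.jpg", "image", on_critical_path=False)))     # proxy
```

The sketch mirrors the slide's example: the root document (GET /) is fetched from the origin, while an image (GET /i.jpg) is a good candidate for proxying.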

Challenge: Accommodating Tussles

• Need to be policy-neutral
• Mechanism: HTTP fallback (blockable canary requests)
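One way to read the canary mechanism: before using the proxy, the client sends a probe that a network operator is free to block, and a blocked probe makes the client fall back to plain HTTP without the proxy. The sketch below assumes that reading. `choose_transport` and the probe callables are hypothetical and do not reflect Chrome's actual implementation.

```python
from typing import Callable


def choose_transport(probe_canary: Callable[[], bool]) -> str:
    """Decide between the proxy and a direct connection.

    `probe_canary` is a caller-supplied function (hypothetical) that
    performs the canary request and returns True if it succeeded.
    A failed or blocked probe results in falling back to direct HTTP.
    """
    try:
        canary_ok = probe_canary()
    except OSError:
        # Network-level failure (e.g. connection reset by a middlebox)
        # is treated the same as an explicit block.
        canary_ok = False
    return "proxy" if canary_ok else "direct"


def blocked_probe() -> bool:
    """Simulates a network whose operator blocks the canary request."""
    raise ConnectionResetError("canary blocked")


print(choose_transport(lambda: True))   # proxy
print(choose_transport(blocked_probe))  # direct
```

Keeping the decision on the client side means an operator can opt a network out of proxying without any coordination with the proxy service.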


Evaluation

• This talk:
  • Primary goal: reduce web page size
  • Secondary goal: maintain good performance
• Paper:
  • Fault tolerance

Workload: Geographic Adoption

Country      Adoption
Worldwide    10.5%
Brazil       17%
Russia       16.5%
Indonesia    16.3%
Mexico       15.5%
USA          9.5%

Adoption highest in developing markets

Workload: Page Footprints

• Majority of pages are small
• 97% of bytes come from top 5% largest pages

Data Reduction

Savings = (1 - outgoing bytes / incoming bytes) * 100

Type          % of Bytes   Savings   Share of Benefit
Total         100%         58%       -
Images        74.12%       66.40%    85%
HTML          9.64%        38.43%    6%
JavaScript    9.10%        41.09%    6%
CSS           1.81%        52.10%    2%
Other         5.33%        9.23%     1%
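As a quick arithmetic check on the data reduction table, the sketch below weights each type's savings by its share of bytes. Treating "Share of Benefit" as a type's weighted savings divided by the total is an assumption, not a stated definition, but it reproduces the table's rounded figures.

```python
# (% of bytes, savings %) per content type, copied from the table.
ROWS = {
    "Images": (74.12, 66.40),
    "HTML": (9.64, 38.43),
    "JavaScript": (9.10, 41.09),
    "CSS": (1.81, 52.10),
    "Other": (5.33, 9.23),
}

# Percentage points of total traffic saved by each type.
contribution = {
    name: byte_share * savings / 100
    for name, (byte_share, savings) in ROWS.items()
}

# Overall savings is the sum of the per-type contributions.
total_savings = sum(contribution.values())

# Assumed definition: each type's fraction of the overall savings.
share_of_benefit = {
    name: 100 * points / total_savings
    for name, points in contribution.items()
}

print(f"Total savings: {total_savings:.0f}%")
for name, share in share_of_benefit.items():
    print(f"{name:<12}{share:.0f}%")
```

Images dominate because they combine the largest byte share with the highest savings rate.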

Images are bulk of bytes & savings

Reduction Across Users

• Median data reduction: 50%
• Overall data reduction: 27%

Performance: Methodology

• Goal: don’t degrade performance
• Compare:
  • Random sampling of Flywheel users
  • Holdback experimental group

Simple Model of Page Load Time

• Load time of subresource s_i = propagation delay + transmission delay + computation time
• Critical path = longest chain of dependent subresources
• Page load time = Σ s_i on critical path
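The model can be made concrete with a small sketch that finds the longest chain of dependent subresources and sums their load times. The page below, its load times, and the `critical_path` helper are hypothetical; each load time stands for the combined propagation, transmission, and computation time of one subresource.

```python
# Hypothetical page: name -> (load time in seconds, dependencies).
PAGE = {
    "index.html": (0.30, []),
    "app.js": (0.20, ["index.html"]),
    "style.css": (0.10, ["index.html"]),
    "data.json": (0.25, ["app.js"]),
    "logo.png": (0.40, ["style.css"]),
}


def critical_path(page):
    """Return (page load time, chain of subresources on the critical path).

    Assumes the dependency graph is acyclic. The finish time of a
    subresource is its own load time plus the latest finish time among
    its dependencies; results are memoized in `best`.
    """
    best = {}  # name -> (finish time, chain ending at name)

    def finish(name):
        if name not in best:
            cost, deps = page[name]
            prev_time, prev_chain = max(
                (finish(dep) for dep in deps), default=(0.0, [])
            )
            best[name] = (prev_time + cost, prev_chain + [name])
        return best[name]

    # The page is loaded when the last subresource finishes.
    return max(finish(name) for name in page)


load_time, chain = critical_path(PAGE)
print(f"page load time: {load_time:.2f} s")
print("critical path:  " + " -> ".join(chain))
```

In this example, speeding up `app.js` or `data.json` alone would not change the page load time, because neither lies on the longest chain.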

Page Load Time (seconds), by quantile of page loads

Quantile   Holdback   Flywheel
Median     2.08       2.21
70th       3.68       3.78
80th       5.38       5.37
90th       9.22       8.95
95th       14.61      13.89
99th       39.38      36.13

Flywheel improves performance only in the tail

Why is this?

• Recall our workload:
  • Long tail of large pages: transmission delay dominant factor
  • Most pages are small: propagation delay dominant factor
• Good way to understand propagation delay: time to first byte (TTFB)
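A back-of-the-envelope sketch shows why the same proxy can slow down small pages and speed up large ones. Every number here is hypothetical (round-trip times, bandwidth, number of round trips, and a 50% data reduction), and `load_time` is a deliberately crude model that ignores computation time.

```python
def load_time(page_bytes: float, rtt_s: float, round_trips: int = 4,
              bandwidth_bps: float = 2_000_000) -> float:
    """Crude load time: propagation delay plus transmission delay."""
    propagation = round_trips * rtt_s
    transmission = page_bytes * 8 / bandwidth_bps
    return propagation + transmission


# Hypothetical network: the indirect path through the proxy is longer.
RTT_DIRECT = 0.100   # seconds
RTT_PROXY = 0.150    # seconds
REDUCTION = 0.50     # assumed fraction of bytes removed by the proxy

for label, size in [("small page (10 KB)", 10_000),
                    ("large page (2 MB)", 2_000_000)]:
    direct = load_time(size, RTT_DIRECT)
    proxied = load_time(size * (1 - REDUCTION), RTT_PROXY)
    print(f"{label}: direct {direct:.2f} s, via proxy {proxied:.2f} s")
```

With these assumed values the small page is dominated by round trips, so the longer path hurts, while the large page is dominated by transmission, so halving its bytes more than pays for the extra latency.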

Time to First Byte (seconds), by quantile of requests

Quantile   Holdback   Flywheel
Median     0.19       0.30
70th       0.36       0.49
80th       0.55       0.69
90th       1.00       1.16
95th       1.69       1.90
99th       5.06       5.81

Most users’ direct path to the origin is shorter than the indirect path


Lessons Learned

Many performance optimizations we expected to have impact did not at Web scale

Disappointing Optimizations

Preconnect: open TCP connection with origin early

• Increases reused connections from 73% to 80%
• Yields less than 2% decrease in median PLT

Already a strong tendency for connection affinity

Disappointing Optimizations

Prefetch: request cacheable subresources early

• Increases cache hit ratio from 22% to 32%
• Yields less than 2% decrease in median PLT

Cacheable items often aren’t on the critical path

Lessons Learned

Improving PLT is highly challenging

• Compression doesn’t help (much)
• Difficult to target critical path

Lessons Learned

If you want widespread adoption, you must accommodate policy issues!

Lessons Learned

Many more measurement findings and lessons in the paper!

Conclusion

• Flywheel shows it’s possible to provide 58% average HTTP data reduction at web scale
  • Data reduction is the easy part
  • Maintaining performance and accommodating tussles are the hard part

lmgtfy.com/?q=Chrome+Data+Saver