In-Memory Computing Principles - December 2014 - GridGain

In-Memory Computing Principles and Technology Overview
MAC MOORE, Solutions Architect

www.gridgain.com
© 2014 GridGain Systems, Inc.

#gridgain

Agenda
• Overview
• Why In Memory & Use Cases
• Evolution of Architectures
• Concepts and Considerations
• In-Memory Data Fabric 6.5
  – Data Grid
  – Clustering and Compute
  – Streaming
  – Hadoop Acceleration
  – Highlights: Release 6.5

Why In-Memory Computing?
Cloud/SaaS apps, Mobile Computing back-ends, Internet of Things, Big Data analytics, Social Networks – all need to be done in-memory to reach Internet scale.

“RAM is the new disk, disk is the new tape.”

RAM is 3,000 times faster than spinning disks. By moving data from disk to RAM and employing modern in-memory data grid technology, things get fast. Really, really fast.

“In-memory computing will have a long term, disruptive impact by radically changing users’ expectations, application design principles, products’ architectures and vendors’ strategies.”

In-memory computing is the future of computing… it offers a massive potential not only in TCO reduction but across all four value dimensions: performance, process innovation, simplification and flexibility.

“Organizations that do not consider adopting in-memory application infrastructure technologies risk being out-innovated by competitors that are early mainstream users of these capabilities”

In-Memory Computing: Why Now?
In-memory will have an industry impact comparable to web and cloud. RAM is the new disk, and disk is the new tape.

• Data Growth: less than 2 zettabytes in 2011, 8 in 2015
• In-Memory Computing Market: $13.23B in 2018; 2013-2018 CAGR 43%
• DRAM Cost, $: cost drops 30% every 12 months
• BigData Technologies Planned: 34% will use in-memory technology
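The growth figures on this slide can be sanity-checked with compound-growth arithmetic. The sketch below is only an illustration in plain Python: the helper names are made up for this example, the inputs are the numbers quoted on the slide, and "less than 2 zettabytes" is treated as exactly 2.

```python
def cagr(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years) - 1.0


def compound(value, rate, years):
    """Value after applying a constant yearly `rate` for `years` years."""
    return value * (1.0 + rate) ** years


# Data growth: 2 ZB (2011) to 8 ZB (2015) is a 4x increase over 4 years.
data_growth = cagr(2.0, 8.0, 4)

# DRAM: a 30% drop every 12 months leaves 70% of the price each year.
dram_fraction_after_5_years = compound(1.0, -0.30, 5)

# Market: $13.23B in 2018 at a 43% CAGR implies this 2013 base (in $B).
implied_2013_market = 13.23 / (1.0 + 0.43) ** 5

print(f"data growth per year: {data_growth:.1%}")
print(f"DRAM price left after 5 years: {dram_fraction_after_5_years:.1%}")
print(f"implied 2013 market: ${implied_2013_market:.2f}B")
```

Under these inputs, data volume grows by roughly 41% a year, DRAM falls to about 17% of its starting price after five years, and the market figure implies a 2013 base of about $2.2B.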

Top 3 Reasons for In-Memory Computing
1. Performance
2. Scalability
3. Future-proofing

How In-Memory Computing Works: The Basic Idea
• Persistence
• Recovery
• Post-Processing
• Backup

In-Memory Technology: Use Cases
Data Velocity, Data Volume, Real-Time Performance

> Automated Trading Systems
  Real time analysis of trading positions & market risk. High volume transactions, ultra low latencies.
> Financial Services
  Fraud Detection, Risk Analysis, Insurance rating and modeling.
> Online & Mobile Advertising
  Real time decisions, geo-targeting & retail traffic information.
> Online Gaming
  Real-time back-ends for mobile and massively parallel games.
> Big Data Analytics
  Customer 360 view, real-time analysis of KPIs, up-to-the-second operational BI.
> Bioinformatics & Sciences
  High performance genome data matching, Environmental simulation.

THE EVOLUTION OF ARCHITECTURES

Traditional Architecture

[Diagram: App Server 1 through App Server n, each running a User App, send data requests to a traditional RDBMS server backed by disks; relational data is returned.]

• Processing happens on the app servers. Data is converted to objects (marshaling).
• Data is shipped across the network for every request.
• All data is on disk.

This is the central bottleneck. Beyond a certain point, scaling the RDBMS becomes complex, expensive and difficult to manage.

Horizontally Scale the RDBMS

[Diagram: App Server 1 through App Server n, each running a User App, send data requests to an RDBMS server scaled out across Server1 and Server2, each backed by disks; relational data is returned.]

• Processing happens on the app servers. Data is converted to objects (marshaling).
• Data is shipped across the network for every request.
• All data is still on disk.

Scaling improves, but little else changes. Also, these are very expensive!

IMDG: Distributed Caching

[Diagram: App Server 1 through App Server n, each running a User App, exchange object data with a distributed cache (Server1, Server2, with RAM and disks) that sits in front of a traditional RDBMS server.]

• Processing happens on the app servers.
• Data is shipped across the network for every request.

Scaling improves, as some (or all) data is now in RAM. But we still have to ship data to the app for every request.

GridGain: IMDG + Grid Compute

[Diagram: In-Memory Compute + Data Grid. Server1 through Server n each run a User App and a GridGain Node with RAM; tasks and queries are sent to the nodes.]

• All data is in RAM.
• Processing is local, in-memory, and in native object format.

By bringing computation to the In-Memory Data Grid, you can now achieve the fastest possible results.
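The difference between shipping data to the app and bringing computation to the data can be sketched in a few lines. This is a toy model in plain Python, not the GridGain API: the `Node` class, `build_grid`, and both sum functions are hypothetical names, and "network traffic" is simply counted as the number of values that leave a node.

```python
class Node:
    """One grid node holding its share of the data in a local dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def run(self, task):
        # The task executes next to the data; only its result leaves the node.
        return task(self.data)


def build_grid(node_count, records):
    """Spread records over the nodes by hashing the key."""
    nodes = [Node(f"node-{i}") for i in range(node_count)]
    for key, value in records.items():
        nodes[hash(key) % node_count].data[key] = value
    return nodes


def client_side_sum(nodes):
    """Caching style: fetch every value, then aggregate in the app."""
    fetched = [v for node in nodes for v in node.data.values()]
    return sum(fetched), len(fetched)      # (result, values moved)


def grid_side_sum(nodes):
    """Compute-grid style: send the task to each node, collect partial sums."""
    partials = [node.run(lambda data: sum(data.values())) for node in nodes]
    return sum(partials), len(partials)    # (result, values moved)


records = {i: i * 10 for i in range(1000)}
nodes = build_grid(4, records)
print(client_side_sum(nodes))
print(grid_side_sum(nodes))
```

Both calls return the same total, but the first moves one value per record while the second moves one partial result per node.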

In-Memory Data Fabrics (Grids): Considerations
> Performance
> Scaling
> Consistency
> Persistence
> Transactions
> Search/Query

Icons by: http://www.icons-land.com/ http://www.awicons.com/ http://www.aha-soft.com/

In-Memory Data Fabrics (Grids): Considerations
Performance
> Understand your criteria…
> Real-Time Performance
> High Throughput
> Low Latency
> Dataset sizes (growing)

In-Memory Data Fabrics (Grids): Considerations
Scaling
> Horizontal Scaling (“add a brick”)
> Dynamic Topology (no downtime)
> Vertical Scaling (large RAM allocation per-process)
> Good monitoring tools (utilization & mgmt.)

In-Memory Data Fabrics (Grids): Considerations
Consistency
> Weak (Eventual) Consistency vs Strong Consistency
> Per-store or per-cache configuration
> Understand impact on performance

In-Memory Data Fabrics (Grids): Considerations
Persistence
> Flexible Persistence Options
> Read-Through, Write-Through, Write-Behind
> Examples: Relational Database, Local Storage, Shared Storage
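The three persistence modes named on this slide differ in when the backing store is touched. The following is a minimal sketch in plain Python, assuming a simple key-value store; `BackingStore` and `Cache` are hypothetical classes written for this illustration and do not represent any product API.

```python
class BackingStore:
    """Stand-in for a relational database or file store (hypothetical)."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.reads = 0
        self.writes = 0

    def load(self, key):
        self.reads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.writes += 1
        self.rows[key] = value


class Cache:
    def __init__(self, store, write_behind=False):
        self.store = store
        self.write_behind = write_behind
        self.entries = {}
        self.dirty = {}          # pending updates, used only for write-behind

    def get(self, key):
        # Read-through: a miss is loaded from the store and kept in memory.
        if key not in self.entries:
            value = self.store.load(key)
            if value is None:
                return None
            self.entries[key] = value
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        if self.write_behind:
            self.dirty[key] = value        # write-behind: persist later, in a batch
        else:
            self.store.save(key, value)    # write-through: persist before returning

    def flush(self):
        # Repeated updates to one key collapse into a single store write.
        for key, value in self.dirty.items():
            self.store.save(key, value)
        self.dirty.clear()
```

Write-through keeps the store current at the cost of a store write on every update; write-behind trades a window of unpersisted data for fewer, batched writes. A real write-behind cache would flush on a timer or size threshold rather than on an explicit call.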

In-Memory Data Fabrics (Grids): Considerations
Transactions
> Are transactions supported?
> Local or Distributed
> Global or per-store/per-cache configuration
> Impact on performance

In-Memory Data Fabrics (Grids): Considerations
Search/Query
> Query Capability: SQL or Proprietary API
> Indexing support
> JDBC/ODBC Interface

INTRODUCTION TO GRIDGAIN DATA FABRIC

Customer Use Cases
Data Velocity, Data Volume, Real-Time Performance

> Automated Trading Systems
  Real time analysis of trading positions & market risk. High volume transactions, ultra low latencies.
> Financial Services
  Fraud Detection, Risk Analysis, Insurance rating and modeling.
> Online & Mobile Advertising
  Real time decisions, geo-targeting & retail traffic information.
> Online Gaming
  Real-time back-ends for mobile and massively parallel games.
> Big Data Analytics
  Customer 360 view, real-time analysis of KPIs, up-to-the-second operational BI.
> Bioinformatics & Sciences
  High performance genome data matching, Environmental simulation.

Gartner names GridGain a 2014 “Cool Vendor” for IMC

“This positions GridGain as one of the few IMC open-source technologies available and the only one… providing such a rich set of functionality.”

GridGain named a 2014 AlwaysOn Global 250 Top Private Company

Use Case: Largest bank in Eastern Europe (Russia), and the third largest in Europe
• Open tender won by GridGain
  – Goal: Real-time risk and leverage reporting on their global financial trading portfolio
  – Performed a detailed evaluation and software assurance test
  – Delivered best performance, scale and high availability
• 1 Billion Transactions per Second
• 10 Dell R610 servers, < $30K, 1 TB Memory

GridGain enters the Apache Software Foundation

GridGain In-Memory Data Fabric: Strategic Approach to IMC
• Supports Applications of various types and languages
• Open Source – Apache 2.0
• Simple Java/C#/C++ APIs
• 1 JAR Dependency
• High Performance & Scale
• Automatic Fault Tolerance
• Management/Monitoring
• Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace

In-Memory Data Grid
• Distributed In-Memory Key-Value Store
• Local, Replicated, Partitioned
• TBs of data, of any type
• On-Heap and Off-Heap Storage
• Backup Replicas / Automatic Failover
• Distributed ACID Transactions
• SQL queries and JDBC driver
• Collocation of Compute and Data
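How a partitioned store with backup replicas can decide where a key lives is sketched below. This is an assumption-laden illustration in plain Python, not GridGain's actual affinity function: it uses rendezvous hashing as a stand-in technique, and `PARTITIONS`, `partition_of`, `owners` and `locate` are names invented for the example.

```python
import zlib

PARTITIONS = 64  # illustrative partition count


def _h(text):
    # Stable hash, so every node computes the same answer for the same input.
    return zlib.crc32(text.encode("utf-8"))


def partition_of(key):
    return _h(str(key)) % PARTITIONS


def owners(partition, nodes, backups=1):
    """Rank nodes per partition; the first is primary, the rest are backups."""
    ranked = sorted(nodes, key=lambda n: (_h(f"{partition}:{n}"), n), reverse=True)
    return ranked[: backups + 1]


def locate(key, nodes, backups=1):
    return owners(partition_of(key), nodes, backups)


nodes = ["n1", "n2", "n3", "n4"]
primary, backup = locate("order:42", nodes)

# Failover: when the primary leaves, the former backup ranks first.
survivors = [n for n in nodes if n != primary]
new_primary = locate("order:42", survivors)[0]
print(primary, backup, new_primary)
```

With this ranking scheme, losing a node promotes the existing backup instead of reshuffling every key. Setting `backups` to `len(nodes) - 1` makes every node an owner, which behaves like a replicated cache.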

Clustering & Compute
• Direct API for MapReduce
• Zero Deployment
• Cron-like Task Scheduling
• State Checkpoints
• Early and Late Load Balancing
• Automatic Failover
• Full Cluster Management
• Pluggable SPI Design
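The MapReduce and automatic-failover ideas in this list can be illustrated with a small in-process model. The sketch is plain Python under stated assumptions: nodes are modelled as callables, a failed node raises `ConnectionError`, and every function name here is hypothetical rather than part of any grid API.

```python
def split(words, parts):
    """Split the task input into `parts` jobs."""
    return [words[i::parts] for i in range(parts)]


def count_chars(words):
    """The job body: total number of characters in a list of words."""
    return sum(len(w) for w in words)


def execute(job, candidates, func):
    """Run `func(job)` on the first candidate node that succeeds (failover)."""
    for node in candidates:
        try:
            return node(func, job)
        except ConnectionError:
            continue  # this node is gone, try the next one
    raise RuntimeError("no node could run the job")


def map_reduce(words, nodes, parts=4):
    partials = []
    for i, job in enumerate(split(words, parts)):
        shift = i % len(nodes)
        candidates = nodes[shift:] + nodes[:shift]   # crude round-robin balancing
        partials.append(execute(job, candidates, count_chars))
    return sum(partials)                             # reduce step


def healthy(func, job):
    return func(job)


def broken(func, job):
    raise ConnectionError("node left the grid")


total = map_reduce("in memory data fabric".split(), [broken, healthy, healthy])
print(total)
```

Jobs first routed to the broken node are retried on the next candidate, so the caller still receives a complete result.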

Client-Server vs Affinity Colocation

[Diagram: two panels, Client-Server and Affinity Colocation.]
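One way to picture affinity colocation: related entries are placed by a shared affinity key, so they land on the same node and can be processed together without a network hop. The sketch below is a plain Python illustration with invented names (`node_for`, `place_order`, `remote_lookups`) and a fixed four-node grid; it is not GridGain's API.

```python
import zlib

NODES = 4  # illustrative grid size


def node_for(affinity_key):
    # Placement depends only on the affinity key, not on the full cache key.
    return zlib.crc32(str(affinity_key).encode("utf-8")) % NODES


def place_customer(customer_id):
    return node_for(customer_id)


def place_order(order_id, customer_id):
    # The order is stored under order_id but placed by its customer's id.
    return node_for(customer_id)


def remote_lookups(orders, colocated):
    """Count orders that sit on a different node than their customer."""
    remote = 0
    for order_id, customer_id in orders:
        if colocated:
            order_node = place_order(order_id, customer_id)
        else:
            order_node = node_for(order_id)
        if order_node != place_customer(customer_id):
            remote += 1
    return remote


orders = [(f"o{i}", f"c{i % 10}") for i in range(200)]
print("without colocation:", remote_lookups(orders, colocated=False))
print("with colocation:", remote_lookups(orders, colocated=True))
```

With colocation, a job that joins a customer to its orders never needs to fetch an order from another node; without it, many of those lookups cross the network.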

Distributed Java Structures
• Distributed Map (cache)
• Distributed Set
• Distributed Queue
• CountDownLatch
• AtomicLong
• AtomicSequence
• AtomicReference
• Distributed ExecutorService

In-Memory Streaming and CEP
• Streaming Data Never Ends
• Branching Pipelines
• Pluggable Routing
• Sliding Windows for CEP/Continuous Query
• Real Time Analysis
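Because streaming data never ends, queries run over a bounded window of recent events rather than over the whole stream. The sketch below shows a time-based sliding window with a running aggregate in plain Python; the `SlidingWindow` class and its method names are assumptions made for this illustration.

```python
from collections import deque


class SlidingWindow:
    """Time-based sliding window that keeps a running sum of event values."""

    def __init__(self, span_seconds):
        self.span = span_seconds
        self.events = deque()   # (timestamp, value), oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        self.events.append((timestamp, value))
        self.total += value
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have fallen out of the window ending at `now`.
        while self.events and self.events[0][0] <= now - self.span:
            _, old = self.events.popleft()
            self.total -= old

    def average(self):
        return self.total / len(self.events) if self.events else 0.0


window = SlidingWindow(span_seconds=10)
window.add(0, 10)
window.add(5, 20)
window.add(12, 30)   # the event at t=0 is now older than 10 seconds
print(window.average())
```

The running total is updated on insert and eviction, so each query is answered without rescanning the window.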


Hadoop Accelerator
• Plug and Play installation
• 10x to 100x Acceleration
• In-Memory Native MapReduce
• In-Process Data Colocation
• GGFS In-Memory File System
• Pure In-Memory
• Read-Through from HDFS
• Write-Through to HDFS
• Sync and Async Persistence

In-Memory Accelerated Map Reduce
• In-Memory Native Performance
• Zero Code Change
• Use existing MR code
• Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation

Visor: Monitoring & Mgmt for DevOps

HIGHLIGHTS: RELEASE 6.5

Cross-Language Interoperability
• Portable Objects
  Language-neutral storage format. Allows you to write data from one language, and access or modify it from another.
• Performance Across Languages
  Optimized neutral format that provides great performance across all supported languages.
• Client Feature Parity
  Feature parity across supported language APIs.

[Diagram: Cross Language Data Interoperability. C#, Java and C++ clients around the GridGain Data Grid.]

Dynamic Schema Changes
• Dynamic Schema Changes
  Change data structures: add properties/fields dynamically at runtime when needed.
• Searchable / Indexable
  Index and search into arbitrary fields, without needing to create annotated classes.
• Version Independent
  Data storage format no longer tied to class versioning or product versioning.

[Diagram: Dynamic Schema Changes. C#, Java and C++ share a Language Independent Format.]
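The idea of adding fields at runtime and indexing arbitrary fields without predefined classes can be sketched with records stored as plain field maps. This is a toy model in Python, not the portable object format itself; `FieldStore` and its methods are hypothetical names, and field values are assumed to be hashable.

```python
from collections import defaultdict


class FieldStore:
    """Stores records as plain field maps and indexes fields on demand."""

    def __init__(self):
        self.records = {}    # key -> {field: value}
        self.indexes = {}    # field -> {value: set of keys}

    def put(self, key, **fields):
        old = self.records.get(key, {})
        # Remove the old record from every index before updating it.
        for field, index in self.indexes.items():
            if field in old:
                index[old[field]].discard(key)
        record = {**old, **fields}   # new fields can appear at any time
        self.records[key] = record
        for field, index in self.indexes.items():
            if field in record:
                index[record[field]].add(key)

    def create_index(self, field):
        index = defaultdict(set)
        for key, record in self.records.items():
            if field in record:
                index[record[field]].add(key)
        self.indexes[field] = index

    def find(self, field, value):
        if field in self.indexes:
            return set(self.indexes[field].get(value, set()))
        # No index on this field: fall back to a full scan.
        return {k for k, r in self.records.items() if r.get(field) == value}
```

A record can gain a `city` field long after it was first written, and an index on `city` can be created later without declaring a class up front.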

Grid Managed Services
• Automatic High-Availability
  Define services that should be instantiated by the grid.
• Configurable
  Support for Grid Singletons, Node Singletons, and more.
• Maintenance-free
  Deployment of services in the desired configuration is guaranteed by the GridGain Grid.

[Diagram: Grid Managed Services. A Node Singleton and a Grid Singleton deployed on the GridGain Data Grid.]
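A rough way to think about grid-managed deployment is as a reconciliation step: whenever the topology changes, the grid recomputes which nodes should host the service. The sketch below is an assumption, written in plain Python with an invented `reconcile` function and invented mode names; it does not describe how GridGain implements the guarantee.

```python
def reconcile(nodes, deployed, mode):
    """Return the set of nodes that should run the service.

    mode "grid_singleton": exactly one instance in the whole grid.
    mode "node_singleton": one instance on every node.
    """
    alive = sorted(nodes)
    if not alive:
        return set()
    if mode == "node_singleton":
        return set(alive)
    if mode == "grid_singleton":
        # Keep the current host if it is still alive, otherwise pick a new one.
        current = [n for n in alive if n in deployed]
        return {current[0] if current else alive[0]}
    raise ValueError(f"unknown mode: {mode}")


hosts = reconcile({"a", "b", "c"}, set(), "grid_singleton")   # first deployment
hosts = reconcile({"b", "c"}, hosts, "grid_singleton")        # host "a" left the grid
print(hosts)
```

Keeping the current host when it is still alive avoids needless restarts when unrelated nodes join or leave.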

Enterprise Edition: Exclusive Features
> Management & Monitoring
  GridGain Visor for GUI-based DevOps
> Local Restartable Store
  Fast recovery during planned outages or DR
> Rolling Production Updates
  Perform software upgrades without downtime
> Data Center Replication
  Multi-datacenter WAN support
> Network Segmentation Protection
  Configurable fault-tolerance for network interruptions
> Support & Maintenance
  Support incidents, ticket access, upgrades, patches
> Training & Consulting
  Technical training and customized consulting services
> Deploy with Confidence
  Indemnification for Enterprise Customers
> Security Features
  Client Authentication and related SPIs

Enterprise & Open Source Comparison Chart
GridGain Enterprise Subscriptions include the following during the subscription term:
> Right to Use the Enterprise Edition of the GridGain product.
> Bug fixes, patches, updates and upgrades to the latest version of the product.
> 9x5 or 24x7 Support for the product.
> Ability to procure Training and Consulting Services from GridGain.
> Confidence and protection, not provided under Open Source licensing, that only a commercial vendor can provide, such as Indemnification.

ANY QUESTIONS?
Thank you for joining us. Follow the conversation.

www.gridgain.com @gridgain #gridgain