Apache Ignite - GridGain

Apache Ignite: Real-Time Big Data with In-Memory Data Fabric

NIKITA IVANOV
Founder & CTO
@c64hacker

© 2014 GridGain Systems, Inc.

www.gridgain.com  

#gridgain  

Agenda
• History of GridGain/Apache Ignite
• Evolution of In-Memory Computing
• In-Memory Data Fabric
  – Distributed Cluster & Compute
  – Distributed Data Grid
  – Distributed Streaming & CEP
  – Plug-n-Play Hadoop Accelerator

What is In-Memory Computing
• High Performance & Low Latencies
• Faster than Disk and Flash
• Cost Effective
• Distributed or Not
• Caching, Streaming, Computations
• Data Querying – SQL or Unstructured
• Volatile and Persistent
• OLAP and OLTP Use Cases

Evolution of In-Memory Computing

[Figure. Market stages labeled: Caching; Distributed Caching; IMDBs; In-Memory Data Grids; BI accelerators; Streaming; Hadoop accelerators; Database IM options. Fabric components labeled: Streaming; Data Grid; Clustering & Compute Grid; Hadoop Acceleration.]

Existing Market is Fragmented

Company      Product                               Proprietary/Open Source  Characterization
-----------  ------------------------------------  -----------------------  -------------------------
Oracle       In-Memory Option for Oracle Database  Proprietary              Cost Option
Oracle       TimesTen                              Proprietary              Point Solution - IMDB
Oracle       Coherence                             Proprietary              Point Solution - IMDG
SAP          HANA                                  Proprietary              Point Solution - IMDB
Microsoft    SQL Server 2014                       Proprietary              Feature Upgrade
Databricks   Apache Spark                          Open Source              Point Solution - Hadoop
VoltDB       VoltDB                                Open Source              Point Solution - IMDB
Aerospike    Aerospike                             Open Source              Point Solution - NoSQL DB
IBM          DB2 with BLU Acceleration             Proprietary              Feature Upgrade
Software AG  Terracotta                            Open Source              Point Solution - IMDG
Hazelcast    Hazelcast                             Open Source              Point Solution - IMDG

In-Memory Data Fabric: Strategic Approach to IMC
• Supports all Apps
• Open Source – Apache 2.0
• Simple Java APIs
• 1 JAR Dependency
• High Performance & Scale
• Automatic Fault Tolerance
• Management/Monitoring
• Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace

[Diagram. Fabric components: Streaming; Data Grid; Clustering & Compute Grid; Hadoop Acceleration.]

Clustering & Compute
• Zero Deployment
• Pluggable SPI Design
• Full Cluster Management
• Direct API for MapReduce
• Direct API for Fork/Join
• Cron-like Task Scheduling
• State Checkpoints
• Early and Late Load Balancing
• Automatic Failover
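
The MapReduce and Fork/Join items can be pictured with a small JDK-only sketch. It is not the Ignite or GridGain API: the class name MapReduceSketch and the method countNonSpaceChars are hypothetical, and a local thread pool stands in for cluster nodes. The task is split into one job per word (map), the jobs run in parallel, and the partial results are summed (reduce).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical local stand-in for a compute grid: each pool thread plays the role of a node. */
class MapReduceSketch {

    /** Splits text into words (map), measures each word in parallel, then sums the lengths (reduce). */
    static int countNonSpaceChars(String text, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Map step: one job per word, submitted to the simulated nodes.
            List<CompletableFuture<Integer>> jobs = new ArrayList<>();
            for (String word : text.trim().split("\\s+")) {
                jobs.add(CompletableFuture.supplyAsync(word::length, pool));
            }
            // Reduce step: wait for every job and add up the partial results.
            int total = 0;
            for (CompletableFuture<Integer> job : jobs) {
                total += job.join();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countNonSpaceChars("Hello In-Memory World", 4));
    }
}
```

On a real cluster the jobs would also be serialized, load-balanced across nodes, and retried on failure, which this sketch leaves out.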

Automatic Cluster Discovery

Closure Execution

In-Memory Caching and Data Grid
• Distributed In-Memory Key-Value Store
• Replicated and Partitioned
• TBs of data, of any type
• On-Heap and Off-Heap Storage
• Backup Replicas / Automatic Failover
• Distributed ACID Transactions
• SQL queries and JDBC driver
• Collocation of Compute and Data
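
To make "Partitioned" and "Backup Replicas" concrete, here is a minimal sketch of how a key might be mapped to a primary node and its backups. It assumes plain modulo placement, which is a simplification chosen for illustration and not the affinity function the product uses; the class PartitionSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical key-to-node mapping for a partitioned cache with backup copies. */
class PartitionSketch {
    private final int partitions;
    private final int nodes;
    private final int backups;

    PartitionSketch(int partitions, int nodes, int backups) {
        if (partitions <= 0 || nodes <= 0 || backups < 0) {
            throw new IllegalArgumentException("partitions and nodes must be positive, backups non-negative");
        }
        this.partitions = partitions;
        this.nodes = nodes;
        // A key cannot have more backups than there are other nodes.
        this.backups = Math.min(backups, nodes - 1);
    }

    /** Maps a key to a partition; floorMod keeps the result non-negative for negative hash codes. */
    int partition(Object key) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    /** Returns node indexes holding the key: the primary first, then its backups. */
    List<Integer> owners(Object key) {
        List<Integer> result = new ArrayList<>();
        int primary = partition(key) % nodes;
        for (int i = 0; i <= backups; i++) {
            result.add((primary + i) % nodes);
        }
        return result;
    }

    public static void main(String[] args) {
        PartitionSketch grid = new PartitionSketch(1024, 4, 1);
        System.out.println(grid.owners(42)); // [2, 3]
        System.out.println(grid.owners(7));  // [3, 0]
    }
}
```

If the primary node for a key is lost, the next node in the returned list already holds a copy, which is the idea behind automatic failover. Sending a computation to owners(key).get(0) instead of fetching the value is the idea behind collocating compute and data.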

Cache Operations

Find a Bug?

Cache Transaction

Distributed Java Data Structures
• Distributed Map (cache)
• Distributed Set
• Distributed Queue
• CountDownLatch
• AtomicLong
• AtomicSequence
• AtomicReference
• Distributed ExecutorService

Client-Server vs. Affinity Colocation

[Diagram panels: Client-Server; Affinity Colocation]

In-Memory Streaming & CEP
• Streaming Data Never Ends
• Branching Pipelines
• CEP Sliding Windows
• Pluggable Routing
• Real Time Analysis
• At Least Once Guarantee
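
A sliding window is what lets an endless stream be analyzed in bounded memory. The sketch below shows a count-based window that keeps only the most recent events and maintains a running average. It uses the JDK only; the class SlidingWindowSketch is a hypothetical illustration, not the streaming API of the product.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical count-based sliding window over a stream of numeric events. */
class SlidingWindowSketch {
    private final Deque<Long> window = new ArrayDeque<>();
    private final int capacity;
    private long sum;

    SlidingWindowSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds an event, evicts the oldest one once the window is full, and returns the current average. */
    double add(long value) {
        window.addLast(value);
        sum += value;
        if (window.size() > capacity) {
            sum -= window.removeFirst();
        }
        return (double) sum / window.size();
    }

    public static void main(String[] args) {
        SlidingWindowSketch lastThree = new SlidingWindowSketch(3);
        for (long v : new long[] {10, 20, 30, 40}) {
            System.out.println(lastThree.add(v));
        }
    }
}
```

Keeping a running sum makes each update constant-time regardless of window size. A time-based window would evict by event timestamp instead of by count.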

Plug-n-Play Hadoop Accelerator
• Up to 100x Acceleration
• In-Memory Native MapReduce
  – In-Process Data Colocation
  – Eager Push Scheduling
• GGFS In-Memory File System
  – Pure In-Memory
  – Write-Through to HDFS
  – Read-Through from HDFS
• Sync and Async Persistence

In-Memory Native MapReduce
• In-Memory Native MapReduce
  – Zero Code Change
  – Use existing MR code
  – Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation
• Eager Push Scheduling

DevOps Management and Monitoring

THANK YOU

www.gridgain.com