Apache Ignite:
Real-Time Big Data with In-Memory Data Fabric
NIKITA IVANOV Founder & CTO @c64hacker
© 2014 GridGain Systems, Inc.
www.gridgain.com
#gridgain
Agenda
• History of GridGain/Apache Ignite
• Evolution of In-Memory Computing
• In-Memory Data Fabric
  – Distributed Cluster & Compute
  – Distributed Data Grid
  – Distributed Streaming & CEP
  – Plug-n-Play Hadoop Accelerator
What is In-Memory Computing
• High Performance & Low Latencies
• Faster than Disk and Flash
• Cost Effective
• Distributed or Not
• Caching, Streaming, Computations
• Data Querying – SQL or Unstructured
• Volatile and Persistent
• OLAP and OLTP Use Cases
Evolution of In-Memory Computing
[Figure: evolution of in-memory computing. Diagram labels: Caching, Distributed Caching, IMDBs, In-Memory Data Grids, BI accelerators, Streaming, Hadoop accelerators, Database IM options. Fabric components: Streaming, Data Grid, Clustering & Compute Grid, Hadoop Acceleration.]
Existing Market is Fragmented

Company      Product                               Proprietary/Open Source  Characterization
Oracle       In-Memory Option for Oracle Database  Proprietary              Cost Option
Oracle       TimesTen                              Proprietary              Point Solution - IMDB
Oracle       Coherence                             Proprietary              Point Solution - IMDG
SAP          HANA                                  Proprietary              Point Solution - IMDB
Microsoft    SQL Server 2014                       Proprietary              Feature Upgrade
Databricks   Apache Spark                          Open Source              Point Solution - Hadoop
VoltDB       VoltDB                                Open Source              Point Solution - IMDB
Aerospike    Aerospike                             Open Source              Point Solution - NoSQL DB
IBM          DB2 with BLU Acceleration             Proprietary              Feature Upgrade
Software AG  Terracotta                            Open Source              Point Solution - IMDG
Hazelcast    Hazelcast                             Open Source              Point Solution - IMDG
In-Memory Data Fabric: Strategic Approach to IMC
• Supports all Apps
• Open Source – Apache 2.0
• Simple Java APIs
• 1 JAR Dependency
• High Performance & Scale
• Automatic Fault Tolerance
• Management/Monitoring
• Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace
[Diagram labels: Streaming, Data Grid, Clustering & Compute Grid, Hadoop Acceleration]
Clustering & Compute
• Zero Deployment
• Pluggable SPI Design
• Full Cluster Management
• Direct API for MapReduce
• Direct API for Fork/Join
• Cron-like Task Scheduling
• State Checkpoints
• Early and Late Load Balancing
• Automatic Failover
Automatic Cluster Discovery
Closure Execution
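The idea of closure execution can be sketched with the JDK alone. The example below is a minimal illustration, not the Ignite API: a local thread pool stands in for cluster nodes, and the class name `ClosureExecutionSketch` and method `countNonSpaceChars` are hypothetical. It splits a phrase into words, submits one closure per word, and sums the partial results on the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Toy closure execution: a thread pool stands in for cluster nodes (assumption). */
public class ClosureExecutionSketch {

    /** Runs one closure per word on the pool and sums the word lengths. */
    public static int countNonSpaceChars(String phrase, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Integer>> closures = new ArrayList<>();
            for (String word : phrase.trim().split("\\s+")) {
                // Each closure captures its own word and computes a partial result.
                closures.add(() -> word.length());
            }
            int total = 0;
            // invokeAll blocks until every closure has completed.
            for (Future<Integer> partial : pool.invokeAll(closures)) {
                total += partial.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countNonSpaceChars("hello in memory world", 4));
    }
}
```

In a real cluster the closures would be serialized and shipped to remote nodes; the fan-out and reduce shape stays the same.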
In-Memory Caching and Data Grid
• Distributed In-Memory Key-Value Store
• Replicated and Partitioned
• TBs of data, of any type
• On-Heap and Off-Heap Storage
• Backup Replicas / Automatic Failover
• Distributed ACID Transactions
• SQL queries and JDBC driver
• Collocation of Compute and Data
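Partitioning with backup replicas can be illustrated with a small, self-contained sketch. This is not Ignite code: the class `PartitionedStoreSketch`, its hash-modulo placement, and the "next node in the ring" backup rule are all assumptions made for illustration. Each entry is written to a primary node and one backup, so a read still succeeds after the primary fails.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy partitioned key-value store with one backup copy per entry (not Ignite code). */
public class PartitionedStoreSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final boolean[] failed;

    public PartitionedStoreSketch(int nodeCount) {
        if (nodeCount < 2) throw new IllegalArgumentException("need at least 2 nodes");
        for (int i = 0; i < nodeCount; i++) nodes.add(new HashMap<>());
        failed = new boolean[nodeCount];
    }

    /** Primary owner: hash of the key modulo the node count. */
    public int primaryNode(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    /** Backup owner: the next node in the ring (an assumption of this sketch). */
    public int backupNode(String key) {
        return (primaryNode(key) + 1) % nodes.size();
    }

    /** Writes the entry to both the primary and the backup node. */
    public void put(String key, String value) {
        nodes.get(primaryNode(key)).put(key, value);
        nodes.get(backupNode(key)).put(key, value);
    }

    /** Reads from the primary, failing over to the backup if the primary is down. */
    public String get(String key) {
        int primary = primaryNode(key);
        int owner = failed[primary] ? backupNode(key) : primary;
        return failed[owner] ? null : nodes.get(owner).get(key);
    }

    /** Simulates a node crash: the node is marked failed and loses its memory. */
    public void failNode(int index) {
        failed[index] = true;
        nodes.get(index).clear();
    }

    public static void main(String[] args) {
        PartitionedStoreSketch store = new PartitionedStoreSketch(3);
        store.put("user:42", "Ada");
        store.failNode(store.primaryNode("user:42"));
        System.out.println(store.get("user:42")); // served from the backup copy
    }
}
```

A replicated cache would instead write every entry to every node; the partitioned layout trades that redundancy for capacity that grows with the cluster.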
Cache Operations
Find a Bug?
Cache Transaction
Distributed Java Data Structures
• Distributed Map (cache)
• Distributed Set
• Distributed Queue
• CountDownLatch
• AtomicLong
• AtomicSequence
• AtomicReference
• Distributed ExecutorService
Client-Server vs. Affinity Colocation
[Figure panels: Client-Server; Affinity Colocation]
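The difference between the two approaches can be sketched by counting how much data crosses a simulated network. Everything here is hypothetical and illustrative (the class `AffinitySketch`, the modulo placement, and the transfer counter are assumptions, not Ignite APIs). In client-server style the value is pulled to the caller; with affinity colocation the function runs where the data lives and only the result travels back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy comparison of client-server access and affinity colocation (not Ignite code). */
public class AffinitySketch {
    /** One map per simulated node, holding that node's partition. */
    private final List<Map<Integer, int[]>> nodes = new ArrayList<>();
    /** Number of array elements that crossed the simulated network. */
    private long elementsTransferred = 0;

    public AffinitySketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) nodes.add(new HashMap<>());
    }

    private Map<Integer, int[]> ownerOf(int key) {
        return nodes.get(Math.floorMod(key, nodes.size()));
    }

    public void put(int key, int[] value) {
        ownerOf(key).put(key, value);
    }

    /** Client-server style: copy the whole value to the caller, then compute. */
    public int clientServerSum(int key) {
        int[] copy = ownerOf(key).get(key).clone(); // the data moves to the client
        elementsTransferred += copy.length;
        return Arrays.stream(copy).sum();
    }

    /** Affinity colocation: apply the job on the owner's data; only the result returns. */
    public <R> R affinityCall(int key, Function<int[], R> job) {
        R result = job.apply(ownerOf(key).get(key)); // runs "on" the owning node
        elementsTransferred += 1;                     // one result value travels back
        return result;
    }

    public long elementsTransferred() {
        return elementsTransferred;
    }

    public static void main(String[] args) {
        AffinitySketch grid = new AffinitySketch(4);
        int[] data = new int[1000];
        Arrays.fill(data, 1);
        grid.put(7, data);
        System.out.println(grid.clientServerSum(7) + " after " + grid.elementsTransferred());
        System.out.println(grid.affinityCall(7, a -> Arrays.stream(a).sum())
                + " after " + grid.elementsTransferred());
    }
}
```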
In-Memory Streaming & CEP
• Streaming Data Never Ends
• Branching Pipelines
• CEP Sliding Windows
• Pluggable Routing
• Real Time Analysis
• At Least Once Guarantee
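Because a stream never ends, analysis is typically done over a bounded window of recent events. The following is a minimal sketch of a count-based sliding window that keeps a running average; the class `SlidingWindowSketch` is hypothetical and only shows the general technique, not the product's streaming API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Count-based sliding window that keeps the average of the last N events. */
public class SlidingWindowSketch {
    private final int capacity;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    public SlidingWindowSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Adds an event, evicts the oldest one if the window is full, returns the new average. */
    public double add(double event) {
        window.addLast(event);
        sum += event;
        if (window.size() > capacity) {
            sum -= window.removeFirst(); // the oldest event slides out of the window
        }
        return sum / window.size();
    }

    public int size() {
        return window.size();
    }

    public static void main(String[] args) {
        SlidingWindowSketch avg = new SlidingWindowSketch(3);
        for (double v : new double[] {1, 2, 3, 4}) {
            System.out.println(avg.add(v));
        }
    }
}
```

A time-based window would evict by event timestamp instead of by count; the eviction step is the only part that changes.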
Plug-n-Play Hadoop Accelerator
• Up to 100x Acceleration
• In-Memory Native MapReduce
  – In-Process Data Colocation
  – Eager Push Scheduling
• GGFS In-Memory File System
  – Pure In-Memory
  – Write-Through to HDFS
  – Read-Through from HDFS
• Sync and Async Persistence
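The write-through and read-through behaviour listed for GGFS can be illustrated with a short sketch. It is not GGFS code: a plain `Map` stands in for HDFS, persistence is synchronous only, and the class `ReadWriteThroughSketch` and its methods are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory layer with read-through and write-through to a backing store. */
public class ReadWriteThroughSketch {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> backingStore; // stand-in for HDFS (assumption)
    private int backingReads = 0;

    public ReadWriteThroughSketch(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    /** Write-through: update memory and the backing store in the same call. */
    public void write(String path, String data) {
        memory.put(path, data);
        backingStore.put(path, data);
    }

    /** Read-through: serve from memory, loading from the backing store on a miss. */
    public String read(String path) {
        String cached = memory.get(path);
        if (cached != null) return cached;
        backingReads++;
        String loaded = backingStore.get(path);
        if (loaded != null) memory.put(path, loaded); // keep it in memory for next time
        return loaded;
    }

    /** Number of reads that had to go to the backing store. */
    public int backingReads() {
        return backingReads;
    }

    public static void main(String[] args) {
        Map<String, String> disk = new HashMap<>();
        disk.put("/logs/a", "old data");
        ReadWriteThroughSketch fs = new ReadWriteThroughSketch(disk);
        fs.read("/logs/a");
        fs.read("/logs/a");
        fs.write("/logs/b", "new data");
        System.out.println(fs.backingReads() + " backing read(s), disk has /logs/b: "
                + disk.containsKey("/logs/b"));
    }
}
```

An asynchronous variant would queue the backing-store write instead of performing it inline, which lowers write latency at the cost of a window in which the two copies differ.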
In-Memory Native MapReduce
• In-Memory Native MapReduce
  – Zero Code Change
  – Use existing MR code
  – Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation
• Eager Push Scheduling
DevOps Management and Monitoring
THANK YOU www.gridgain.com