1 Fundamentals for Apache Kafka
Apache Kafka Architecture & Fundamentals Explained
Joe Desmond, Sr. Technical Trainer, Confluent
2 Session Schedule
- Session 1: Benefits of Stream Processing and Apache Kafka Use Cases
- Session 2: Apache Kafka Architecture & Fundamentals Explained
- Session 3: How Apache Kafka Works
- Session 4: Integrating Apache Kafka into your Environment
3 Learning Objectives
After this module you will be able to:
- Identify the key elements in a Kafka cluster
- Name the essential responsibilities of each key element
- Explain what a Topic is and describe its relation to Partitions and Segments
4 The World Produces Data
5 Producers
6 Kafka Brokers
7 Consumers
8 Architecture
9 Decoupling Producers and Consumers
- Producers and Consumers are decoupled
- Slow Consumers do not affect Producers
- Add Consumers without affecting Producers
- Failure of a Consumer does not affect the System
10 How Kafka Uses ZooKeeper
11 ZooKeeper Basics
- Open Source Apache Project
- Distributed Key Value Store
  - Maintains configuration information
  - Stores ACLs and Secrets
- Enables highly reliable distributed coordination
- Provides distributed synchronization
- Three or five servers form an ensemble
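A short arithmetic sketch of why ensembles of three or five servers are the usual choice. It assumes only that ZooKeeper needs a strict majority of the ensemble to be reachable in order to keep serving; the function names are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that may fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"ensemble={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under this assumption a fourth server tolerates no more failures than three do, and a sixth no more than five, so the extra even-numbered server adds cost without adding fault tolerance.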
12 Topics
- Topics: Streams of “related” Messages in Kafka
  - Is a Logical Representation
  - Categorizes Messages into Groups
- Developers define Topics
- Producer to Topic: N to N Relation
- Unlimited Number of Topics
13 Topics, Partitions, and Segments
14 Topics, Partitions, and Segments
15 The Log
16 Log Structured Data Flow
17 The Stream
18 Data Elements
19 Brokers Manage Partitions
- Messages of Topic spread across Partitions
- Partitions spread across Brokers
- Each Broker handles many Partitions
- Each Partition stored on Broker’s disk
- Partition: 1..n log files
- Each message in Log identified by Offset
- Configurable Retention Policy
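The relationship between a Partition, its log files (Segments), Offsets, and retention can be sketched in memory as below. This is a toy model, not Kafka's storage code: the class names, the message-count segment limit, and the segment-count retention rule are all assumptions made for illustration. Real brokers roll and expire segments based on configured size and time limits.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One log file of a partition (hypothetical in-memory stand-in)."""
    base_offset: int                      # offset of the first message in this segment
    messages: list = field(default_factory=list)


class PartitionLog:
    """Append-only log split into segments; each message gets a sequential offset."""

    def __init__(self, max_segment_messages: int = 3):
        self.max_segment_messages = max_segment_messages
        self.segments = [Segment(base_offset=0)]
        self.next_offset = 0

    def append(self, message) -> int:
        active = self.segments[-1]
        if len(active.messages) >= self.max_segment_messages:
            # Active segment is full: roll over to a new one.
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        active.messages.append(message)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset: int):
        for segment in self.segments:
            index = offset - segment.base_offset
            if 0 <= index < len(segment.messages):
                return segment.messages[index]
        raise KeyError(f"offset {offset} is not in the log")

    def apply_retention(self, max_segments: int) -> None:
        # Simplified policy: drop the oldest whole segments, never the active one.
        while len(self.segments) > max_segments and len(self.segments) > 1:
            self.segments.pop(0)


if __name__ == "__main__":
    log = PartitionLog(max_segment_messages=2)
    for i in range(5):
        log.append(f"m{i}")
    log.apply_retention(max_segments=2)
    print([s.base_offset for s in log.segments])
```

Note that offsets are never reused or renumbered: after retention removes the oldest segment, the surviving messages keep their original offsets.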
20 Broker Basics
- Producer sends Messages to Brokers
- Brokers receive and store Messages
- A Kafka Cluster can have many Brokers
- Each Broker manages multiple Partitions
21 Broker Replication
22 Producer Basics
- Producers write Data as Messages
- Can be written in any language
  - Native: Java, C/C++, Python, Go, .NET, JMS
  - More Languages by Community
  - REST Server for any unsupported Language
- Command Line Producer Tool
23 Load Balancing and Semantic Partitioning
- Producers use a Partitioning Strategy to assign each message to a Partition
- Two Purposes:
  - Load Balancing
  - Semantic Partitioning
- Partitioning Strategy specified by Producer
  - Default Strategy: hash(key) % number of partitions
  - No Key: Round-Robin
  - Custom Partitioner possible
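The default strategy on this slide can be sketched as follows. This is a minimal illustration, not the partitioner shipped with any Kafka client: `SketchPartitioner` is a hypothetical name, and CRC32 stands in for the hash function (the Java client hashes keys with murmur2).

```python
import itertools
import zlib


class SketchPartitioner:
    """Keyed messages: hash(key) % number of partitions. No key: round-robin."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key):
        if key is None:
            # Load balancing: spread keyless messages evenly over all partitions.
            return next(self._round_robin)
        # Semantic partitioning: the same key always maps to the same partition.
        # CRC32 is an assumption of this sketch, not Kafka's hash function.
        return zlib.crc32(key) % self.num_partitions


if __name__ == "__main__":
    partitioner = SketchPartitioner(num_partitions=3)
    print(partitioner.partition(b"user-42"))
    print([partitioner.partition(None) for _ in range(4)])
```

Because the result depends on the number of partitions, adding partitions to a topic changes where existing keys land, which is worth keeping in mind when per-key ordering matters.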
24 Consumer Basics
- Consumers pull messages from 1..n topics
- New inflowing messages are automatically retrieved
- Consumer offset
  - Keeps track of the last message read
  - Is stored in special topic
- CLI tools exist to read from cluster
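The pull model and the consumer offset can be sketched in a few lines. Everything here is a stand-in: a Python list plays the partition log (list index = offset), `OffsetStore` plays the special topic that holds offsets, and `poll` is a hypothetical helper rather than a real client call. The sketch stores the position of the next message to read, which is one past the last message read.

```python
class OffsetStore:
    """Stand-in for the special topic in which consumer offsets are stored."""

    def __init__(self):
        self._offsets = {}

    def commit(self, group: str, partition: int, offset: int) -> None:
        self._offsets[(group, partition)] = offset

    def fetch(self, group: str, partition: int) -> int:
        # A group that has never committed starts at the beginning in this sketch.
        return self._offsets.get((group, partition), 0)


def poll(log, store, group, partition=0, max_messages=2):
    """Pull up to max_messages starting at the group's stored offset, then commit."""
    start = store.fetch(group, partition)
    batch = log[start:start + max_messages]
    store.commit(group, partition, start + len(batch))
    return batch


if __name__ == "__main__":
    log = ["a", "b", "c"]
    store = OffsetStore()
    print(poll(log, store, "billing"))
    print(poll(log, store, "billing"))
    print(poll(log, store, "audit"))
```

Each group keeps its own offset, so a second group reads the same messages from the start without affecting the first, and a restarted consumer resumes from the stored position.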
25 Consumer Offset
26 Distributed Consumption
27 Scalable Data Pipeline
28 Q&A
Questions:
- Why do we need an odd number of ZooKeeper nodes?
- How many Kafka brokers can a cluster maximally have?
- How many Kafka brokers do you minimally need for high availability?
- What are the criteria for two or more consumers to form a consumer group?
29 Continue your Apache Kafka Education!
- Confluent Operations for Apache Kafka
- Confluent Developer Skills for Building Apache Kafka
- Confluent Stream Processing using Apache Kafka Streams and KSQL
- Confluent Advanced Skills for Optimizing Apache Kafka
For more details, see http://confluent.io/training
30 Certifications
- Confluent Certified Developer for Apache Kafka (aligns to Confluent Developer Skills for Building Apache Kafka course)
- Confluent Certified Administrator for Apache Kafka (aligns to Confluent Operations Skills for Apache Kafka)
What you Need to Know
- Qualifications: 6-to-9 months hands-on experience
- Duration: 90 mins
- Availability: Live, online 24/7
- Cost: 150
- Register online: www.confluent.io/certification
31 Stay in raining
32 Thank you for attending!
Thank you for attending the session!
Feedback to: training-admin@confluent.io
33 Copyright Confluent, Inc. 2014-2019. Privacy Policy. Terms & Conditions. Apache, Apache Kafka, Kafka and the Kafka logo are trademarks of the Apache Software Foundation.