Slides - Apache Kafka Architecture & Fundamentals Explained

Transcription

1. Fundamentals for Apache Kafka: Apache Kafka Architecture & Fundamentals Explained (Joe Desmond, Sr. Technical Trainer, Confluent)

2. Session Schedule
- Session 1: Benefits of Stream Processing and Apache Kafka Use Cases
- Session 2: Apache Kafka Architecture & Fundamentals Explained
- Session 3: How Apache Kafka Works
- Session 4: Integrating Apache Kafka into your Environment

3. Learning Objectives
After this module you will be able to:
- Identify the key elements in a Kafka cluster
- Name the essential responsibilities of each key element
- Explain what a Topic is and describe its relation to Partitions and Segments

4. The World Produces Data

5. Producers

6. Kafka Brokers

7. Consumers

8. Architecture

9. Decoupling Producers and Consumers
- Producers and Consumers are decoupled
- Slow Consumers do not affect Producers
- Add Consumers without affecting Producers
- Failure of a Consumer does not affect the System

10. How Kafka Uses ZooKeeper

11. ZooKeeper Basics
- Open Source Apache Project
- Distributed Key-Value Store
- Maintains configuration information
- Stores ACLs and Secrets
- Enables highly reliable distributed coordination
- Provides distributed synchronization
- Three or five servers form an ensemble

12. Topics
- Topics: streams of “related” Messages in Kafka
- A Topic is a logical representation
- Categorizes Messages into Groups
- Developers define Topics
- Producer-to-Topic relation: N to N
- Unlimited number of Topics
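To make this concrete, here is a minimal sketch (not part of the slides) that creates a Topic with the Kafka AdminClient; the topic name "page-views", the partition count, and the broker address are illustrative assumptions.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Broker address is an assumption; point it at your own cluster
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Hypothetical "page-views" topic: 3 partitions, replication factor 1
            NewTopic topic = new NewTopic("page-views", 3, (short) 1);
            admin.createTopics(List.of(topic)).all().get(); // block until the brokers confirm
        }
    }
}
```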

13. Topics, Partitions, and Segments

14. Topics, Partitions, and Segments

15. The Log

16. Log-Structured Data Flow

17. The Stream

18. Data Elements

19. Brokers Manage Partitions
- Messages of a Topic are spread across Partitions
- Partitions are spread across Brokers
- Each Broker handles many Partitions
- Each Partition is stored on the Broker’s disk
- Partition: 1..n log files
- Each message in the Log is identified by its Offset
- Configurable Retention Policy
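As an illustration of the configurable retention policy mentioned above, the sketch below sets a topic-level retention.ms with the AdminClient; the topic name and the 7-day value are assumptions, not values from the slides.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class SetRetentionExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // Keep messages of the hypothetical "page-views" topic for 7 days (in milliseconds)
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "page-views");
            AlterConfigOp setRetention = new AlterConfigOp(
                    new ConfigEntry("retention.ms", "604800000"),
                    AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(setRetention))).all().get();
        }
    }
}
```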

20. Broker Basics
- Producer sends Messages to Brokers
- Brokers receive and store Messages
- A Kafka Cluster can have many Brokers
- Each Broker manages multiple Partitions

21. Broker Replication

22. Producer Basics
- Producers write Data as Messages
- Can be written in any language
- Native: Java, C/C++, Python, Go, .NET, JMS
- More languages by the Community
- REST Server for any unsupported language
- Command Line Producer Tool
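Below is a minimal Java Producer sketch, assuming a broker at localhost:9092 and the hypothetical "page-views" topic; the key, value, and callback handling are illustrative.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key "user-42" and the value are made up for the example
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("page-views", "user-42", "{\"page\": \"/home\"}");
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace(); // delivery failed
                } else {
                    System.out.printf("Written to partition %d at offset %d%n",
                            metadata.partition(), metadata.offset());
                }
            });
        } // close() flushes any buffered messages
    }
}
```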

23. Load Balancing and Semantic Partitioning
- Producers use a Partitioning Strategy to assign each message to a Partition
- Two purposes: Load Balancing and Semantic Partitioning
- Partitioning Strategy specified by the Producer
- Default strategy: hash(key) % number of partitions
- No key: Round-Robin
- Custom Partitioner possible
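The custom-partitioner option can be sketched as follows; the class name VipPartitioner and the "vip-" key prefix are made up for illustration, while the fallback mirrors the hash(key) % number-of-partitions idea from the slide.

```java
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class VipPartitioner implements Partitioner {

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionCountForTopic(topic);
        if (keyBytes == null) {
            // No key: pick a partition at random (round-robin-like load balancing)
            return ThreadLocalRandom.current().nextInt(numPartitions);
        }
        if (key instanceof String && ((String) key).startsWith("vip-")) {
            return 0; // semantic partitioning: route "vip-" keys to partition 0
        }
        // Otherwise follow the same idea as the default: hash(key) % number of partitions
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}
```

A producer would opt into this by setting the partitioner.class property (ProducerConfig.PARTITIONER_CLASS_CONFIG) to the class name.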

24. Consumer Basics
- Consumers pull messages from 1..n Topics
- New incoming messages are automatically retrieved
- Consumer Offset: keeps track of the last message read and is stored in a special Topic
- CLI tools exist to read from the cluster
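Here is a minimal Java Consumer sketch of the pull model described above; the group id, topic name, and broker address are assumptions. With enable.auto.commit left at its default, the consumer offset is committed back to Kafka periodically, which is what the "stored in a special topic" point refers to.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "page-view-readers");       // hypothetical consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");       // start from the beginning if no offset exists

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("page-views"));
            while (true) { // poll loop: new messages are pulled as they arrive
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```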

25. Consumer Offset

26. Distributed Consumption

27. Scalable Data Pipeline

28. Q&A
Questions:
- Why do we need an odd number of ZooKeeper nodes?
- How many Kafka brokers can a cluster maximally have?
- How many Kafka brokers do you minimally need for high availability?
- What is the criterion for two or more consumers to form a consumer group?

29. Continue your Apache Kafka Education!
- Confluent Operations for Apache Kafka
- Confluent Developer Skills for Building Apache Kafka
- Confluent Stream Processing using Apache Kafka Streams and KSQL
- Confluent Advanced Skills for Optimizing Apache Kafka
For more details, see http://confluent.io/training

30. Certifications
- Confluent Certified Developer for Apache Kafka (aligns to the Confluent Developer Skills for Building Apache Kafka course)
- Confluent Certified Administrator for Apache Kafka (aligns to the Confluent Operations Skills for Apache Kafka course)
What you need to know:
- Qualifications: 6-to-9 months hands-on experience
- Duration: 90 mins
- Availability: Live, online 24/7
- Cost: 150
- Register online: www.confluent.io/certification

31. Stay in Training

32. Thank you for attending!
Thank you for attending the session! Feedback to: training-admin@confluent.io

33. Copyright Confluent, Inc. 2014-2019. Privacy Policy | Terms & Conditions. Apache, Apache Kafka, Kafka and the Kafka logo are trademarks of the Apache Software Foundation.
