Kafka 2.8.0 Cluster Deployment: A Hands-On Walkthrough
I. Deploy your own ZooKeeper cluster
1. Start the ZooKeeper cluster with our custom script
[root@elk101.oldboyedu.com ~]# manager-zk.sh start
Starting services
========== zk101.yinzhengjie.com zkServer.sh start ================
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 21811
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to false
ZooKeeper remote JMX Hostname set to 172.200.1.101
Using config: /oldboy/softwares/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
========== zk102.yinzhengjie.com zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /oldboy/softwares/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
========== zk103.yinzhengjie.com zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /oldboy/softwares/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@elk101.oldboyedu.com ~]#
2. Check the ZooKeeper cluster status
[root@elk101.oldboyedu.com ~]# manager-zk.sh status
Checking status
========== zk101.yinzhengjie.com zkServer.sh status ================
ZooKeeper JMX enabled by default
ZooKeeper remote JMX Port set to 21811
ZooKeeper remote JMX authenticate set to false
ZooKeeper remote JMX ssl set to false
ZooKeeper remote JMX log4j set to false
ZooKeeper remote JMX Hostname set to 172.200.1.101
Using config: /oldboy/softwares/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: 172.200.1.101. Client SSL: false.
Mode: observer
========== zk102.yinzhengjie.com zkServer.sh status ================
ZooKeeper JMX enabled by default
Using config: /oldboy/softwares/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: 172.200.1.102. Client SSL: false.
Mode: leader
========== zk103.yinzhengjie.com zkServer.sh status ================
ZooKeeper JMX enabled by default
Using config: /oldboy/softwares/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: 172.200.1.103. Client SSL: false.
Mode: follower
[root@elk101.oldboyedu.com ~]#
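A healthy ensemble should report exactly one leader. The following is a minimal sketch of that check, assuming the status output contains `Mode:` lines like the ones shown above; `zk_leader_count` is a hypothetical helper name, not part of ZooKeeper.

```shell
# Hypothetical helper: count the "Mode: leader" lines in zkServer.sh status
# output read from stdin. grep -c prints 0 and returns non-zero when nothing
# matches, so "|| true" keeps the function's exit status at 0.
zk_leader_count() {
    grep -c '^Mode: leader' || true
}

# Usage (manager-zk.sh is the custom wrapper script used above):
#   manager-zk.sh status 2>/dev/null | zk_leader_count
```

Any result other than 1 suggests the ensemble has not formed a quorum yet or that a node reports an unexpected state.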
II. Deploy a single-node Kafka environment
1. Download Kafka and extract it to the target directory
[root@elk101.oldboyedu.com ~]# ll
total 81832
-rw-r--r--. 1 root root 12387614 Apr 24 18:31 apache-zookeeper-3.7.0-bin.tar.gz
-rw-r--r-- 1 root root 71403603 May 5 16:48 kafka_2.13-2.8.0.tgz
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# tar zxf kafka_2.13-2.8.0.tgz -C /oldboy/softwares/
[root@elk101.oldboyedu.com ~]#
2. Create a symbolic link and configure environment variables
[root@elk101.oldboyedu.com ~]# cd /oldboy/softwares/
[root@elk101.oldboyedu.com /oldboy/softwares/]#
[root@elk101.oldboyedu.com /oldboy/softwares/]# ll
total 0
drwxr-xr-x. 7 root root 145 Apr 28 09:31 apache-zookeeper-3.7.0-bin
lrwxrwxrwx. 1 root root 36 Apr 22 20:38 jdk -> /oldboy/softwares/jdk1.8.0_201/
drwxr-xr-x. 7 10 143 245 Dec 16 2018 jdk1.8.0_201
drwxr-xr-x 7 root root 105 Apr 14 22:34 kafka_2.13-2.8.0
lrwxrwxrwx. 1 root root 50 Apr 22 21:00 zookeeper -> /oldboy/softwares/apache-zookeeper-3.7.0-bin/
[root@elk101.oldboyedu.com /oldboy/softwares/]#
[root@elk101.oldboyedu.com /oldboy/softwares/]# ln -sv kafka_2.13-2.8.0 kafka
"kafka" -> "kafka_2.13-2.8.0"
[root@elk101.oldboyedu.com /oldboy/softwares/]#
[root@elk101.oldboyedu.com /oldboy/softwares/]# ll
total 0
drwxr-xr-x. 7 root root 145 Apr 28 09:31 apache-zookeeper-3.7.0-bin
lrwxrwxrwx. 1 root root 36 Apr 22 20:38 jdk -> /oldboy/softwares/jdk1.8.0_201/
drwxr-xr-x. 7 10 143 245 Dec 16 2018 jdk1.8.0_201
lrwxrwxrwx 1 root root 16 May 5 16:50 kafka -> kafka_2.13-2.8.0
drwxr-xr-x 7 root root 105 Apr 14 22:34 kafka_2.13-2.8.0
lrwxrwxrwx. 1 root root 50 Apr 22 21:00 zookeeper -> /oldboy/softwares/apache-zookeeper-3.7.0-bin/
[root@elk101.oldboyedu.com /oldboy/softwares/]#
[root@elk101.oldboyedu.com /oldboy/softwares/]# cd
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# vim /etc/profile.d/kafka.sh
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# cat /etc/profile.d/kafka.sh
#!/bin/bash
KAFKA_HOME="/oldboy/softwares/kafka"
PATH=$PATH:$KAFKA_HOME/bin
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# source /etc/profile.d/kafka.sh
[root@elk101.oldboyedu.com ~]#
3. Overview of Kafka's configuration files
(1) List Kafka's configuration directory
[root@elk101.oldboyedu.com ~]# ll /oldboy/softwares/kafka/config/
total 72
-rw-r--r-- 1 root root 906 Apr 14 22:28 connect-console-sink.properties
-rw-r--r-- 1 root root 909 Apr 14 22:28 connect-console-source.properties
-rw-r--r-- 1 root root 5321 Apr 14 22:28 connect-distributed.properties
-rw-r--r-- 1 root root 883 Apr 14 22:28 connect-file-sink.properties
-rw-r--r-- 1 root root 881 Apr 14 22:28 connect-file-source.properties
-rw-r--r-- 1 root root 2247 Apr 14 22:28 connect-log4j.properties
-rw-r--r-- 1 root root 2540 Apr 14 22:28 connect-mirror-maker.properties
-rw-r--r-- 1 root root 2262 Apr 14 22:28 connect-standalone.properties
-rw-r--r-- 1 root root 1221 Apr 14 22:28 consumer.properties # Default consumer configuration, mainly for command-line users, since developers usually hard-code these settings in their applications.
drwxr-xr-x 2 root root 102 Apr 14 22:28 kraft
-rw-r--r-- 1 root root 4674 Apr 14 22:28 log4j.properties
-rw-r--r-- 1 root root 1925 Apr 14 22:28 producer.properties # Default producer configuration, mainly for command-line users, since developers usually hard-code these settings in their applications.
-rw-r--r-- 1 root root 6849 Apr 14 22:28 server.properties # Main configuration file for the Kafka broker.
-rw-r--r-- 1 root root 1032 Apr 14 22:28 tools-log4j.properties
-rw-r--r-- 1 root root 1169 Apr 14 22:28 trogdor.conf
-rw-r--r-- 1 root root 1205 Apr 14 22:28 zookeeper.properties # Kafka ships with a bundled ZooKeeper; we do not use it and rely on the ZooKeeper cluster we built ourselves instead.
[root@elk101.oldboyedu.com ~]#
(2) View the broker's default configuration
[root@elk101.oldboyedu.com ~]# egrep -v "^#|^$" /oldboy/softwares/kafka/config/server.properties
broker.id=0
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
[root@elk101.oldboyedu.com ~]#
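Broker configuration reference:
http://kafka.apache.org/documentation/#brokerconfigs
Tips:
(1) With Kafka 2.8.0 you mainly need to set the following three parameters; the defaults for everything else suit most scenarios:
broker.id
log.dirs
zookeeper.connect
(2) For Kafka 0.11.0 and earlier releases, also pay attention to the following parameter:
delete.topic.enable: defaults to false in those versions, and setting it to true is recommended. From Kafka 1.0 onward the default is true.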
4. Edit the configuration file of the "kafka101" instance
[root@elk101.oldboyedu.com ~]# vim /oldboy/softwares/kafka/config/server.properties
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# grep '^[a-zA-Z]' /oldboy/softwares/kafka/config/server.properties
broker.id=101
...
log.dirs=/oldboy/data/kafka
...
zookeeper.connect=elk101.oldboyedu.com:2181,elk102.oldboyedu.com:2181,elk103.oldboyedu.com:2181/oldboyedu_kafka280
...
[root@elk101.oldboyedu.com ~]#
Tip:
Only the three parameters above need to be changed.
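If you prefer not to edit the file by hand, the three changes can be scripted. This is a rough sketch, assuming GNU sed and that each key already exists uncommented in server.properties (as it does in the default file shown earlier); `set_property` is a hypothetical helper name.

```shell
# Hypothetical helper: overwrite the value of an existing key in a
# properties file, editing the file in place (GNU sed -i assumed).
# Values containing "|" or "&" would need escaping; the ones used here do not.
set_property() {
    file=$1
    key=$2
    value=$3
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
}

# Usage with the values from this step:
#   conf=/oldboy/softwares/kafka/config/server.properties
#   set_property "$conf" broker.id 101
#   set_property "$conf" log.dirs /oldboy/data/kafka
#   set_property "$conf" zookeeper.connect "elk101.oldboyedu.com:2181,elk102.oldboyedu.com:2181,elk103.oldboyedu.com:2181/oldboyedu_kafka280"
```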
5. Change the broker's heap size
[root@elk101.oldboyedu.com ~]# vim /oldboy/softwares/kafka/bin/kafka-server-start.sh
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# grep KAFKA_HEAP_OPTS /oldboy/softwares/kafka/bin/kafka-server-start.sh
...
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
# export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
export KAFKA_HEAP_OPTS="-Xmx256M -Xms256M"
...
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# jps
19955 jar
36118 Kafka
36809 Jps
21902 QuorumPeerMain
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# jmap -heap 36118 # The output (a screenshot in the original post) confirms that Kafka's heap size was changed.
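Because kafka-server-start.sh only applies its default when KAFKA_HEAP_OPTS is empty (the `if` test shown above), the heap can also be set from the environment without touching the script. The sketch below mirrors that check; `effective_heap_opts` is a hypothetical helper for illustration only.

```shell
# Mirrors the test in kafka-server-start.sh: the built-in default is used
# only when KAFKA_HEAP_OPTS is empty. "-Xmx1G -Xms1G" is the stock default
# visible in the commented-out line above.
effective_heap_opts() {
    if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
        echo "-Xmx1G -Xms1G"
    else
        echo "$KAFKA_HEAP_OPTS"
    fi
}

# Usage: set the heap for one invocation instead of editing the script.
#   KAFKA_HEAP_OPTS="-Xmx256M -Xms256M" kafka-server-start.sh -daemon "$KAFKA_HOME/config/server.properties"
```

Setting the variable this way survives a Kafka upgrade, whereas an edited start script is replaced along with the rest of the installation.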
6. Start Kafka
[root@elk101.oldboyedu.com ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# ss -ntl | grep 9092
LISTEN 0 50 [::]:9092 [::]:*
[root@elk101.oldboyedu.com ~]#
7. View Kafka's data in ZooKeeper
Once started, Kafka creates many znodes in the ZooKeeper cluster (shown in a screenshot in the original post).
III. Deploy a Kafka cluster
1. Use the sync script to push the Kafka installation to the other cluster nodes
[root@elk101.oldboyedu.com ~]# data_rsync.sh /oldboy/softwares/kafka
=========== elk102.oldboyedu.com : /oldboy/softwares/kafka ===========
Command succeeded
=========== elk103.oldboyedu.com : /oldboy/softwares/kafka ===========
Command succeeded
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# data_rsync.sh /oldboy/softwares/kafka_2.13-2.8.0/
=========== elk102.oldboyedu.com : /oldboy/softwares/kafka_2.13-2.8.0/ ===========
Command succeeded
=========== elk103.oldboyedu.com : /oldboy/softwares/kafka_2.13-2.8.0/ ===========
Command succeeded
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# data_rsync.sh /etc/profile.d/kafka.sh
=========== elk102.oldboyedu.com : /etc/profile.d/kafka.sh ===========
Command succeeded
=========== elk103.oldboyedu.com : /etc/profile.d/kafka.sh ===========
Command succeeded
[root@elk101.oldboyedu.com ~]#
2. Edit the configuration file on each node
[root@elk102.oldboyedu.com ~]# grep ^broker /oldboy/softwares/kafka/config/server.properties
broker.id=102
[root@elk102.oldboyedu.com ~]#
[root@elk103.oldboyedu.com ~]# grep ^broker $KAFKA_HOME/config/server.properties
broker.id=103
[root@elk103.oldboyedu.com ~]#
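Since each broker.id here matches the number in the hostname, the value can be derived rather than typed on every node. A small sketch, assuming hostnames follow the elkNNN pattern used in this walkthrough; `broker_id_from_host` is a hypothetical helper name.

```shell
# Hypothetical helper: derive broker.id from the digits at the end of the
# short hostname, e.g. elk102.oldboyedu.com -> 102.
broker_id_from_host() {
    short=${1%%.*}                                  # strip the domain part
    printf '%s\n' "$short" | sed 's/^[^0-9]*//'     # drop the non-digit prefix
}

# Usage on each node (GNU sed -i assumed):
#   id=$(broker_id_from_host "$(hostname)")
#   sed -i "s/^broker\.id=.*/broker.id=${id}/" /oldboy/softwares/kafka/config/server.properties
```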
3. Write a Kafka cluster management script
(1) Install Ansible
[root@elk101.oldboyedu.com ~]# yum -y install ansible
(2) Write the management script
[root@elk101.oldboyedu.com ~]# vim /oldboy/softwares/kafka/bin/manager-kafka.sh
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# cat /oldboy/softwares/kafka/bin/manager-kafka.sh
#!/bin/bash
# Make sure exactly one argument was passed
if [ $# -ne 1 ]; then
    echo "Invalid arguments. Usage: $0 {start|stop|status}"
    exit 1
fi
# Command requested by the user
cmd=$1
for (( i=101 ; i<=103 ; i++ )) ; do
    tput setaf 2
    echo "****** elk${i}.oldboyedu.com ---> [`basename $0`: $cmd ] ******"
    tput setaf 9
    case $cmd in
        start)
            ssh elk${i}.oldboyedu.com "source /etc/profile.d/kafka.sh; kafka-server-start.sh -daemon /oldboy/softwares/kafka/config/server.properties"
            echo elk${i}.oldboyedu.com "service started"
            ;;
        stop)
            ssh elk${i}.oldboyedu.com "source /etc/profile.d/kafka.sh; kafka-server-stop.sh"
            echo elk${i}.oldboyedu.com "service stopped"
            ;;
        status)
            # Ansible queries every host in the "kafka" inventory group in one call, so exit after the first pass
            ansible kafka -m shell -a 'jps'
            exit
            ;;
        *)
            echo "Invalid arguments. Usage: $0 {start|stop|status}"
            exit 1
            ;;
    esac
done
[root@elk101.oldboyedu.com ~]#
(3) Make the script executable
[root@elk101.oldboyedu.com ~]# ll /oldboy/softwares/kafka/bin/manager-kafka.sh
-rw-r--r-- 1 root root 1318 May 7 10:51 /oldboy/softwares/kafka/bin/manager-kafka.sh
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# chmod +x /oldboy/softwares/kafka/bin/manager-kafka.sh
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# ll /oldboy/softwares/kafka/bin/manager-kafka.sh
-rwxr-xr-x 1 root root 1318 May 7 10:51 /oldboy/softwares/kafka/bin/manager-kafka.sh
[root@elk101.oldboyedu.com ~]#
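The status branch of the script runs `ansible kafka ...`, which only works if the Ansible inventory defines a group named kafka. The post does not show that inventory, so the following is a sketch under that assumption; `render_kafka_inventory` is a hypothetical helper and the host names are the ones used in the script.

```shell
# Hypothetical helper: print an INI-style Ansible inventory that places the
# given hosts in a group called "kafka".
render_kafka_inventory() {
    echo "[kafka]"
    for host in "$@"; do
        echo "$host"
    done
}

# Usage (/etc/ansible/hosts is Ansible's default inventory location; adjust if yours differs):
#   render_kafka_inventory elk101.oldboyedu.com elk102.oldboyedu.com elk103.oldboyedu.com >> /etc/ansible/hosts
#   ansible kafka -m shell -a 'jps'
```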
4. Start the Kafka cluster
(1) Start the Kafka cluster with the management script
[root@elk101.oldboyedu.com ~]# manager-kafka.sh status
****** elk101.oldboyedu.com ---> [manager-kafka.sh: status ] ******
elk101.oldboyedu.com | CHANGED | rc=0 >>
2992 QuorumPeerMain
6369 Jps
3609 ZooKeeperMain
kafka103.yinzhengjie.com | CHANGED | rc=0 >>
1451 QuorumPeerMain
4062 Jps
kafka102.yinzhengjie.com | CHANGED | rc=0 >>
3074 QuorumPeerMain
5150 Jps
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# manager-kafka.sh start
****** elk101.oldboyedu.com ---> [manager-kafka.sh: start ] ******
elk101.oldboyedu.com service started
****** kafka102.yinzhengjie.com ---> [manager-kafka.sh: start ] ******
kafka102.yinzhengjie.com service started
****** kafka103.yinzhengjie.com ---> [manager-kafka.sh: start ] ******
kafka103.yinzhengjie.com service started
[root@elk101.oldboyedu.com ~]#
[root@elk101.oldboyedu.com ~]# manager-kafka.sh status
****** elk101.oldboyedu.com ---> [manager-kafka.sh: status ] ******
elk101.oldboyedu.com | CHANGED | rc=0 >>
2992 QuorumPeerMain
3609 ZooKeeperMain
6745 Kafka
6923 Jps
kafka103.yinzhengjie.com | CHANGED | rc=0 >>
4550 Jps
4426 Kafka
1451 QuorumPeerMain
kafka102.yinzhengjie.com | CHANGED | rc=0 >>
3074 QuorumPeerMain
5638 Jps
5514 Kafka
[root@elk101.oldboyedu.com ~]#
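Seeing a Kafka process on every node shows the JVMs are up, but not that the brokers work together. One way to check is a throwaway topic replicated across all three brokers. This is only a sketch: `bootstrap_servers` is a hypothetical helper, the topic name smoke-test is made up, and port 9092 is assumed from the earlier `ss` output.

```shell
# Hypothetical helper: join host names into a comma-separated
# bootstrap server list that uses the same port for every host.
bootstrap_servers() {
    port=$1
    shift
    list=""
    for host in "$@"; do
        list="${list:+$list,}${host}:${port}"
    done
    printf '%s\n' "$list"
}

# Usage:
#   bs=$(bootstrap_servers 9092 elk101.oldboyedu.com elk102.oldboyedu.com elk103.oldboyedu.com)
#   kafka-topics.sh --bootstrap-server "$bs" --create --topic smoke-test --partitions 3 --replication-factor 3
#   kafka-topics.sh --bootstrap-server "$bs" --describe --topic smoke-test
```

If the create command succeeds with a replication factor of 3, all three brokers have registered with the cluster.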
5. Use zkCli.sh to view the data Kafka stores in ZooKeeper
[root@elk101.oldboyedu.com ~]# zkCli.sh -server 172.200.1.101:2181
...
[zk: 172.200.1.101:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: 172.200.1.101:2181(CONNECTED) 1]
[zk: 172.200.1.101:2181(CONNECTED) 1] ls / # Once the Kafka cluster has started, a new kafka znode appears under the ZooKeeper root.
[kafka, zookeeper]
[zk: 172.200.1.101:2181(CONNECTED) 2]
[zk: 172.200.1.101:2181(CONNECTED) 2] ls /kafka # After the Kafka cluster starts, the following znodes appear under /kafka.
[admin, brokers, cluster, config, consumers, controller, controller_epoch, feature, isr_change_notification, latest_producer_id_block, log_dir_event_notification]
[zk: 172.200.1.101:2181(CONNECTED) 3]
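What each znode is for:
admin:
Stores cluster administration data, such as topics marked for deletion and partition reassignments.
brokers:
Stores broker information.
cluster:
Cluster information; it holds the Kafka cluster's unique ID (the cluster/id znode).
config:
Configuration information.
consumers:
In Kafka versions before 0.9 this stored consumer-related information. In Kafka 2.8.0 the znode is empty, and it may be removed in a future release.
controller:
Stores the current controller. Brokers race to create this znode, and whichever broker gets there first becomes the controller, responsible for talking to ZooKeeper and updating Kafka's state. The other brokers watch this znode and race for it again as soon as it is removed.
controller_epoch:
Stores the number of controller elections that have taken place in the cluster.
feature:
Kafka feature information.
isr_change_notification:
ISR change notifications.
latest_producer_id_block:
The most recently allocated block of producer IDs.
log_dir_event_notification:
Log directory event notifications.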
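To see which brokers are currently registered, list the children of brokers/ids under the Kafka root. zkCli.sh prints children as a bracketed list such as `[101, 102, 103]`; the sketch below turns that into one ID per line. `parse_znode_list` is a hypothetical helper, and the /kafka root is taken from the session above.

```shell
# Hypothetical helper: convert a zkCli.sh child list like "[101, 102, 103]"
# (read from stdin) into one entry per line.
parse_znode_list() {
    sed -e 's/^\[//' -e 's/\]$//' -e 's/ //g' | tr ',' '\n'
}

# Usage (zkCli.sh also prints log lines, so keep only the last line;
# the exact surrounding output can vary between ZooKeeper versions):
#   zkCli.sh -server 172.200.1.101:2181 ls /kafka/brokers/ids 2>/dev/null | tail -n 1 | parse_znode_list
```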
IV. Common errors and how to fix them
1.java.net.UnknownHostException: elk101: elk101: Name or service not known
Solution:
Set the address the socket server listens on, for example "listeners=PLAINTEXT://10.0.0.101:9092"
2.kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
Solution:
Check that the ZooKeeper service is working properly.
3.Failed to acquire lock on file .lock in /oldboy/data/logs. A Kafka instance in another process or thread is using this directory.
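Solution:
Check that no other Kafka process is already using this directory and that the Kafka configuration file (in particular log.dirs) is specified correctly.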
4.The Cluster ID -x0A3q14StiIUMPDZIzTew doesn't match stored clusterId Some(t5qwKMFVQQ6G6W17_oAlXw) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
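Solution:
Option 1:
Delete the old stored data, that is, the directory that "log.dirs" points to.
Option 2:
Restore the previous ZooKeeper connection setting (zookeeper.connect).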
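Before deleting anything for the cluster ID mismatch, it can help to see which ID the broker has stored on disk. Kafka records it as a `cluster.id=` line in meta.properties inside each log directory. A minimal sketch; `stored_cluster_id` is a hypothetical helper name.

```shell
# Hypothetical helper: print the cluster.id recorded in <log dir>/meta.properties.
stored_cluster_id() {
    sed -n 's/^cluster\.id=//p' "$1/meta.properties"
}

# Usage with the log.dirs value from this walkthrough:
#   stored_cluster_id /oldboy/data/kafka
# The ID held in ZooKeeper lives in the cluster/id znode under the Kafka root, e.g.:
#   zkCli.sh -server 172.200.1.101:2181 get /kafka/cluster/id
```

If the two IDs differ, the broker is pointing at a different ZooKeeper path than the one its data was created with, which is exactly what the error message reports.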