Compare commits

...

156 Commits

Author SHA1 Message Date
EricZeng
ec7f801929 Merge pull request #180 from didi/dev_2.3.0
Dev 2.3.0
2021-02-09 22:06:51 +08:00
zengqiao
0f8aca382e bump version to 2.3.0 2021-02-09 21:47:56 +08:00
zengqiao
0270f77eaa add upgrade doc 2021-02-09 21:46:55 +08:00
EricZeng
dcba71ada4 Merge pull request #179 from lucasun/dev_2.3.0_fe
迭代V2.5, 修复broker监控问题,增加JMX认证支持等...
2021-02-09 18:48:42 +08:00
孙超
6080f76a9c 迭代V2.5, 修复broker监控问题,增加JMX认证支持等... 2021-02-09 15:26:47 +08:00
zengqiao
5efd424172 version 2.3.0 2021-02-09 11:20:56 +08:00
EricZeng
2672502c07 Merge pull request #174 from 17hao/issue-153-authority
Tracking delete account
2021-02-07 16:10:56 +08:00
EricZeng
83440cc3d9 Merge pull request #173 from 17hao/issue-153
Tracking changes applied to app
2021-02-07 16:10:01 +08:00
17hao
8e5f93be1c Tracking delete account 2021-02-07 15:54:41 +08:00
17hao
c1afc07955 Tracking changes applied to app 2021-02-07 15:16:26 +08:00
EricZeng
4a83e14878 Merge pull request #172 from 17hao/issue-153
Tracking changes applied to Kafka cluster
2021-02-07 14:38:38 +08:00
17hao
832320abc6 Improve code's cohesion && save jmx properties 2021-02-07 14:20:57 +08:00
17hao
70c237da72 Tracking changes applied to Kafka cluster 2021-02-07 13:23:22 +08:00
EricZeng
edfcc5c023 Merge pull request #169 from 17hao/issue-153
Record topic operation
2021-02-06 22:30:32 +08:00
17hao
0668debec6 Update pom.xml 2021-02-06 18:46:47 +08:00
17hao
02d6463faa Using JsonUtils instead of fastjson 2021-02-06 18:43:36 +08:00
17hao
1fdb85234c Record editing topic 2021-02-05 12:18:50 +08:00
EricZeng
44b7dd1808 Merge pull request #167 from ZHAOYINRUI/master
更新readme、集群接入手册
2021-02-04 19:21:59 +08:00
ZHAOYINRUI
e983ee3101 Update README.md 2021-02-04 19:10:11 +08:00
ZHAOYINRUI
75e7e81c05 Add files via upload 2021-02-04 19:09:02 +08:00
ZHAOYINRUI
31ce3b9c08 Update add_cluster.md 2021-02-04 19:08:28 +08:00
EricZeng
ed93c50fef modify without logical cluster desc 2021-02-04 16:54:42 +08:00
EricZeng
4845660eb5 Merge pull request #163 from 17hao/issue-160
Issue#160: Remove __consumer_offsets from topic list
2021-02-04 16:42:41 +08:00
17hao
c7919210a2 Fix topic list filter condition 2021-02-04 16:32:31 +08:00
17hao
9491418f3b Update if statements 2021-02-04 12:33:32 +08:00
17hao
e8de403286 Hide __transaction_state in topic list && fix logic error 2021-02-04 12:14:44 +08:00
17hao
dfb625377b Using existing topic name constant 2021-02-03 22:30:38 +08:00
EricZeng
2c0f2a8be6 Merge pull request #166 from ZHAOYINRUI/master
更新集群接入文章、资源申请文章
2021-02-03 21:02:38 +08:00
ZHAOYINRUI
787d3cb3e9 Update resource_apply.md 2021-02-03 20:52:44 +08:00
ZHAOYINRUI
96ca17d26c Add files via upload 2021-02-03 19:43:03 +08:00
ZHAOYINRUI
3dd0f7f2c3 Update add_cluster.md 2021-02-03 19:41:33 +08:00
ZHAOYINRUI
10ba0cf976 Update resource_apply.md 2021-02-03 18:18:02 +08:00
ZHAOYINRUI
276c15cc23 Delete docs/user_guide/resource_apply directory 2021-02-03 18:08:15 +08:00
ZHAOYINRUI
2584b848ad Update resource_apply.md 2021-02-03 18:07:34 +08:00
ZHAOYINRUI
6471efed5f Add files via upload 2021-02-03 18:04:40 +08:00
ZHAOYINRUI
5b7d7ad65d Create resource_apply.md 2021-02-03 18:01:42 +08:00
17hao
712851a8a5 Add braces 2021-02-03 16:06:16 +08:00
17hao
63d291cb47 Remove __consumer_offsets from topic list 2021-02-03 15:50:33 +08:00
EricZeng
f825c92111 Merge pull request #159 from didi/dev_2.2.1
storage support s3
2021-02-03 13:49:52 +08:00
EricZeng
419eb2ea41 Merge pull request #158 from didi/dev
change dockerfile and helm location
2021-02-03 10:09:43 +08:00
zengqiao
89b58dd64e storage support s3 2021-02-02 16:42:20 +08:00
zengqiao
6bc5f81440 change dockerfile and helm location 2021-02-02 15:58:46 +08:00
EricZeng
424f4b7b5e Merge pull request #157 from didi/master
merge master
2021-02-02 15:33:51 +08:00
mrazkong
9271a1caac Merge pull request #118 from yangvipguang/helm-dev
增加Dockerfile 和 简单Helm
2021-02-01 10:50:05 +08:00
杨光
0ee4df03f9 Update Dockerfile 2021-01-31 15:34:15 +08:00
杨光
8ac713ce32 Update Dockerfile 2021-01-31 15:30:18 +08:00
杨光
76b2489fe9 Delete docker-depends/agent/config directory 2021-01-31 15:29:50 +08:00
杨光
6786095154 Delete sources.list 2021-01-31 15:29:31 +08:00
杨光
2c5793ef37 Delete settings 2021-01-31 15:29:19 +08:00
杨光
d483f25b96 Add files via upload
add  jmx prometheus
2021-01-31 15:28:59 +08:00
EricZeng
7118368979 Merge pull request #136 from ZHAOYINRUI/patch-10
Create resource_apply.md
2021-01-29 10:54:11 +08:00
EricZeng
59256c2e80 modify jdbc url
modify jdbc url, add useSSL=false
2021-01-29 10:53:35 +08:00
EricZeng
1fb8a0db1e Merge pull request #146 from ZHAOYINRUI/patch-12
Update README.md
2021-01-29 10:03:48 +08:00
ZHAOYINRUI
07d0c8e8fa Update README.md 2021-01-28 22:02:49 +08:00
EricZeng
98452ead17 Merge pull request #145 from didi/dev
faq add how to resolve topic biz data not exist error desc
2021-01-28 16:20:42 +08:00
zengqiao
d8c9f40377 faq add how to resolve topic biz data not exist error desc 2021-01-28 15:50:31 +08:00
EricZeng
8148d5eec6 Merge pull request #144 from didi/dev
optimize result code
2021-01-28 14:11:00 +08:00
zengqiao
4c429ad604 optimize result code 2021-01-28 12:06:06 +08:00
EricZeng
a9c52de8d5 Merge pull request #143 from ZhaoXinlong/patch-1
correcting some typo
2021-01-27 20:28:32 +08:00
ZhaoXinlong
f648aa1f91 correcting some typo
修改文字错误
2021-01-27 16:31:42 +08:00
EricZeng
eaba388bdd Merge pull request #142 from didi/dev
add connect jmx failed desc
2021-01-27 16:06:39 +08:00
zengqiao
73e6afcbc6 add connect jmx failed desc 2021-01-27 16:01:18 +08:00
EricZeng
8c3b72adf2 Merge pull request #139 from didi/dev
optimize message when login failed
2021-01-26 19:57:50 +08:00
zengqiao
ae18ff4262 optimize login failed message 2021-01-26 16:15:08 +08:00
ZHAOYINRUI
1adc8af543 Create resource_apply.md 2021-01-25 19:27:31 +08:00
EricZeng
7413df6f1e Merge pull request #131 from ZHAOYINRUI/patch-9
Update alarm_rules.md
2021-01-25 18:36:06 +08:00
EricZeng
bda8559190 Merge pull request #135 from didi/master
merge master
2021-01-25 18:34:56 +08:00
EricZeng
b74612fa41 Merge pull request #134 from didi/dev_2.2.0
merge dev 2.2.0
2021-01-25 17:30:26 +08:00
EricZeng
22e0c20dcd Merge pull request #133 from lucasun/dev_2.2.0_fe
fix txt
2021-01-25 17:21:42 +08:00
孙超
08f92e1100 fix txt 2021-01-25 17:02:07 +08:00
zengqiao
bb12ece46e modify zk example 2021-01-25 17:01:54 +08:00
EricZeng
0065438305 Merge pull request #132 from lucasun/dev_2.2.0_fe
add fe page
2021-01-25 16:47:02 +08:00
孙超
7f115c1b3e add fe page 2021-01-25 15:34:07 +08:00
ZHAOYINRUI
4e0114ab0d Update alarm_rules.md 2021-01-25 13:24:01 +08:00
EricZeng
0ef64fa4bd Merge pull request #126 from ZHAOYINRUI/patch-8
Create alarm_rules.md
2021-01-25 11:09:21 +08:00
ZHAOYINRUI
84dbc17c22 Update alarm_rules.md 2021-01-25 11:04:30 +08:00
EricZeng
16e16e356d Merge pull request #130 from xuehaipeng/patch-1
Update faq.md
2021-01-25 10:35:12 +08:00
xuehaipeng
978ee885c4 Update faq.md 2021-01-24 20:06:29 +08:00
zengqiao
850d43df63 add v2.2.0 feature & fix 2021-01-23 13:19:29 +08:00
zengqiao
fc109fd1b1 bump version to 2.2.0 2021-01-23 12:41:38 +08:00
EricZeng
2829947b93 Merge pull request #129 from didi/master
merge master
2021-01-23 11:09:52 +08:00
EricZeng
0c2af89a1c Merge pull request #125 from ZHAOYINRUI/patch-7
create kafka_metrics_desc.md
2021-01-23 11:03:14 +08:00
EricZeng
14c2dc9624 update kafka_metrics.md 2021-01-23 11:01:44 +08:00
EricZeng
4f35d710a6 Update and rename metric.md to kafka_metrics_desc.md 2021-01-23 10:58:11 +08:00
EricZeng
fdb5e018e5 Merge pull request #122 from ZHAOYINRUI/patch-4
Update README.md
2021-01-23 10:51:26 +08:00
EricZeng
6001fde25c Update dynamic_config_manager.md 2021-01-23 10:21:47 +08:00
EricZeng
ae63c0adaf Merge pull request #128 from didi/dev
add sync topic to db doc
2021-01-23 10:20:27 +08:00
zengqiao
ad1539c8f6 add sync topic to db doc 2021-01-23 10:17:59 +08:00
EricZeng
634a0c8cd0 Update faq.md 2021-01-22 20:42:13 +08:00
ZHAOYINRUI
773f9a0c63 Create alarm_rules.md 2021-01-22 18:16:51 +08:00
ZHAOYINRUI
e4e320e9e3 Create metric.md 2021-01-22 18:06:35 +08:00
ZHAOYINRUI
3b4b400e6b Update README.md 2021-01-22 15:56:53 +08:00
杨光
a950be2d95 change password 2021-01-20 17:57:14 +08:00
杨光
ba6f5ab984 add helm and dockerfile 2021-01-20 17:49:56 +08:00
mike.zhangliang
f3a5e3f5ed Update README.md 2021-01-18 19:06:43 +08:00
mike.zhangliang
e685e621f3 Update README.md 2021-01-18 19:05:44 +08:00
EricZeng
2cd2be9b67 Merge pull request #112 from didi/dev
监控告警系统对接说明文档
2021-01-17 18:21:16 +08:00
zengqiao
e73d9e8a03 add monitor_system_integrate_with_self file 2021-01-17 18:18:07 +08:00
zengqiao
476f74a604 rename file 2021-01-17 16:49:02 +08:00
EricZeng
ab0d1d99e6 Merge pull request #111 from didi/dev
Dev
2021-01-17 16:11:08 +08:00
zengqiao
d5680ffd5d 增加Topic同步任务&Bug修复 2021-01-16 16:26:38 +08:00
EricZeng
3c091a88d4 Merge pull request #110 from didi/master
合并master分支上的改动
2021-01-16 13:37:31 +08:00
EricZeng
49b70b33de Merge pull request #108 from didi/dev
增加application.yml文件说明 & 修改版本
2021-01-16 13:34:07 +08:00
zengqiao
c5ff2716fb 优化build.sh & yaml 2021-01-16 12:39:56 +08:00
ZQKC
400fdf0896 修复图片地址错误问题
修复图片地址错误问题
2021-01-16 12:04:20 +08:00
ZQKC
cbb8c7323c Merge pull request #109 from ZHAOYINRUI/master
架构图更新、钉钉群ID更新
2021-01-16 09:33:19 +08:00
ZHAOYINRUI
60e79f8f77 Update README.md 2021-01-16 00:25:06 +08:00
ZHAOYINRUI
0e829d739a Add files via upload 2021-01-16 00:22:31 +08:00
ZQKC
62abb274e0 增加application.yml文件说明
增加application.yml文件说明
2021-01-15 19:14:48 +08:00
ZQKC
e4028785de Update README.md
change km address
2021-01-09 15:30:30 +08:00
mrazkong
2bb44bcb76 Update Intergration_n9e_monitor.md 2021-01-07 17:09:15 +08:00
mike.zhangliang
684599f81b Update README.md 2021-01-07 15:44:17 +08:00
mike.zhangliang
b56d28f5df Update README.md 2021-01-07 15:43:07 +08:00
ZHAOYINRUI
02b9ac04c8 Update user_guide_cn.md 2020-12-30 22:44:23 +08:00
zengqiao
2fc283990a bump version to 2.1.0 2020-12-19 01:53:46 +08:00
ZQKC
abb652ebd5 Merge pull request #104 from didi/dev
v2.1版本合并
2020-12-19 01:14:26 +08:00
zengqiao
55786cb7f7 修改node版本要求 2020-12-19 00:45:58 +08:00
zengqiao
447a575f4f v2.1 fe 2020-12-19 00:40:52 +08:00
zengqiao
49280a8617 v2.1版本更新 2020-12-19 00:27:16 +08:00
ZQKC
ff78a9cc35 Merge pull request #101 from didi/dev
use mysql 8
2020-12-11 11:49:06 +08:00
zengqiao
3fea5c9c8c use mysql 8 2020-12-11 10:48:03 +08:00
ZQKC
aea63cad52 Merge pull request #94 from didi/dev
增加FAQ
2020-11-22 21:49:48 +08:00
zengqiao
800abe9920 增加FAQ 2020-11-22 21:43:52 +08:00
ZQKC
dd6069e41a Merge pull request #93 from didi/dev
夜莺Mon集成配置说明
2020-11-22 20:09:34 +08:00
zengqiao
90d31aeff0 夜莺Mon集成配置说明 2020-11-22 20:07:14 +08:00
ZQKC
4d9a327b1f Merge pull request #92 from didi/dev
FIX N9e Mon
2020-11-22 18:15:49 +08:00
zengqiao
06a97ef076 FIX N9e Mon 2020-11-22 18:13:36 +08:00
ZQKC
76c2477387 Merge pull request #91 from didi/dev
修复上报夜莺功能
2020-11-22 17:00:39 +08:00
zengqiao
bc4dac9cad 删除无效代码 2020-11-22 16:58:43 +08:00
zengqiao
36e3d6c18a 修复上报夜莺功能 2020-11-22 16:56:22 +08:00
ZQKC
edfd84a8e3 Merge pull request #88 from didi/dev
增加build.sh
2020-11-15 17:02:26 +08:00
zengqiao
fb20cf6069 增加build.sh 2020-11-15 16:58:28 +08:00
ZQKC
abbe47f6b9 Merge pull request #87 from didi/dev
初始化SQL优化&KCM修复&连接信息修复
2020-11-15 16:55:42 +08:00
zengqiao
f84d250134 kcm修复&连接信息接口修复 2020-11-15 16:50:59 +08:00
zengqiao
3ffb4b8990 初始化SQL优化 2020-11-15 16:31:10 +08:00
ZQKC
f70cfabede Merge pull request #84 from didi/dev
fix 前端资源加载问题
2020-11-14 16:56:16 +08:00
potaaato
3a81783d77 Merge pull request #83 from Candieslove/master
fix: remove track.js && add font.css
2020-11-13 14:04:41 +08:00
eilenexuzhe
237a4a90ff fix: remove track.js && add font.css 2020-11-13 11:58:46 +08:00
ZQKC
99c7dfc98d Merge pull request #81 from didi/dev
修复Topic详情中服务地址不展示问题
2020-11-08 20:13:03 +08:00
zengqiao
48aba34370 修复Topic详情中服务地址不展示问题 2020-11-08 20:07:45 +08:00
ZQKC
29cca36f2c Merge pull request #80 from didi/dev
增加上报监控指标开关
2020-11-08 17:14:50 +08:00
zengqiao
0f5819f5c2 增加上报监控指标开关 2020-11-08 17:13:04 +08:00
ZQKC
373772de2d Merge pull request #79 from didi/dev
文案优化|服务发现接口修复
2020-11-08 16:11:10 +08:00
zengqiao
7f5bbe8b5f 优化 2020-11-08 16:00:15 +08:00
zengqiao
daee57167b 服务发现接口修复 2020-11-08 15:59:50 +08:00
zengqiao
03467196b9 POM文件优化 2020-11-08 15:59:27 +08:00
zengqiao
d3f3531cdb 文案优化 2020-11-08 15:43:42 +08:00
ZQKC
883b694592 Merge pull request #78 from didi/dev
文档更新
2020-11-07 22:21:52 +08:00
zengqiao
6c89d66af9 文档更新 2020-11-07 22:09:22 +08:00
ZQKC
fb0a76b418 Merge pull request #77 from didi/master
master合并到dev
2020-11-07 22:02:24 +08:00
ZQKC
64f77fca5b Merge pull request #71 from didi/dev_2.x
开放接口
2020-10-26 22:53:53 +08:00
zengqiao
b1fca2c5be 删除无效代码 2020-10-26 11:23:28 +08:00
zengqiao
108d705f09 删除无效代码 2020-10-26 11:20:34 +08:00
zengqiao
a77242e66c 开放接口&近期BUG修复 2020-10-26 11:17:45 +08:00
ZQKC
8b153113ff Merge pull request #70 from didi/master
merge master
2020-10-26 10:45:56 +08:00
zengqiao
6d0ec37135 增加格式PDF文档防止图裂 2020-10-22 09:32:58 +08:00
530 changed files with 13692 additions and 3173 deletions

.gitignore (vendored, 15 changed lines)

@@ -56,6 +56,7 @@ fabric.properties
*.jar *.jar
*.war *.war
*.ear *.ear
*.tar.gz
# virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml # virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml
hs_err_pid* hs_err_pid*
@@ -99,14 +100,14 @@ target/
*/velocity.log* */velocity.log*
*/*.log */*.log
*/*.log.* */*.log.*
web/node_modules/ node_modules/
web/node_modules/* node_modules/*
workspace.xml workspace.xml
/output/* /output/*
.gitversion .gitversion
*/node_modules/* node_modules/*
web/src/main/resources/templates/* out/*
*/out/* dist/
*/dist/* dist/*
kafka-manager-web/src/main/resources/templates/
.DS_Store .DS_Store
kafka-manager-web/src/main/resources/templates/*


@@ -5,60 +5,89 @@
**一站式`Apache Kafka`集群指标监控与运维管控平台** **一站式`Apache Kafka`集群指标监控与运维管控平台**
---
## 主要功能特性
### 集群监控维度 阅读本README文档您可以了解到滴滴Logi-KafkaManager的用户群体、产品定位等信息并通过体验地址快速体验Kafka集群指标监控与运维管控的全流程。<br>若滴滴Logi-KafkaManager已在贵司的生产环境进行使用并想要获得官方更好地支持和指导可以通过[`OCE认证`](http://obsuite.didiyun.com/open/openAuth),加入官方交流平台。
- 多版本集群管控,支持从`0.10.2``2.x`版本;
- 集群Topic、Broker等多维度历史与实时关键指标查看
### 集群管控维度 ## 1 产品简介
滴滴Logi-KafkaManager脱胎于滴滴内部多年的Kafka运营实践经验是面向Kafka用户、Kafka运维人员打造的共享多租户Kafka云平台。专注于Kafka运维管控、监控告警、资源治理等核心场景经历过大规模集群、海量大数据的考验。内部满意度高达90%的同时,还与多家知名企业达成商业化合作。
- 集群运维包括逻辑Region方式管理集群 ### 1.1 快速体验地址
- Broker运维包括优先副本选举 - 体验地址 http://117.51.146.109:8080 账号密码 admin/admin
- Topic运维包括创建、查询、扩容、修改属性、数据采样及迁移等
- 消费组运维,包括指定时间或指定偏移两种方式进行重置消费偏移 ### 1.2 体验地图
相比较于同类产品的用户视角单一大多为管理员视角滴滴Logi-KafkaManager建立了基于分角色、多场景视角的体验地图。分别是**用户体验地图、运维体验地图、运营体验地图**
#### 1.2.1 用户体验地图
- 平台租户申请&nbsp;&nbsp;申请应用App作为Kafka中的用户名并用 AppID+password作为身份验证
- 集群资源申请&nbsp;&nbsp;:按需申请、按需使用。可使用平台提供的共享集群,也可为应用申请独立的集群
- Topic&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;可根据应用App创建Topic或者申请其他topic的读写权限
- Topic&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Topic数据采样、调整配额、申请分区等操作
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;基于Topic生产消费各环节耗时统计监控不同分位数性能指标
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:支持将消费偏移重置至指定时间或指定位置
#### 1.2.2 运维体验地图
- 多版本集群管控&nbsp;&nbsp;:支持从`0.10.2``2.x`版本
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;集群Topic、Broker等多维度历史与实时关键指标查看建立健康分体系
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;划分部分Broker作为Region使用Region定义资源划分单位并按照业务、保障能力区分逻辑集群
- Broker&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:包括优先副本选举等操作
- Topic&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:包括创建、查询、扩容、修改属性、迁移、下线等
### 用户使用维度 #### 1.2.3 运营体验地图
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;沉淀资源治理方法。针对Topic分区热点、分区不足等高频常见问题沉淀资源治理方法实现资源治理专家化
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;工单体系。Topic创建、调整配额、申请分区等操作由专业运维人员审批规范资源使用保障平台平稳运行
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;成本控制。Topic资源、集群资源按需申请、按需使用。根据流量核算费用帮助企业建设大数据成本核算体系
- Kafka用户、Kafka研发、Kafka运维 视角区分 ### 1.3 核心优势
- Kafka用户、Kafka研发、Kafka运维 权限区分 - &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:监控多项核心指标,统计不同分位数据,提供种类丰富的指标监控报表,帮助用户、运维人员快速高效定位问题
- 便&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;按照Region定义集群资源划分单位将逻辑集群根据保障等级划分。在方便资源隔离、提高扩展能力的同时实现对服务端的强管控
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;基于滴滴内部多年运营实践沉淀资源治理方法建立健康分体系。针对Topic分区热点、分区不足等高频常见问题实现资源治理专家化
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:与滴滴夜莺监控告警系统打通,集成监控告警、集群部署、集群升级等能力。形成运维生态,凝练专家服务,使运维更高效
### 1.4 滴滴Logi-KafkaManager架构图
![kafka-manager-arch](https://img-ys011.didistatic.com/static/dicloudpub/do1_xgDHNDLj2ChKxctSuf72)
## kafka-manager架构图 ## 2 相关文档
![kafka-manager-arch](./docs/assets/images/common/arch.png) ### 2.1 产品文档
- [滴滴Logi-KafkaManager 安装手册](docs/install_guide/install_guide_cn.md)
- [滴滴Logi-KafkaManager 接入集群](docs/user_guide/add_cluster/add_cluster.md)
- [滴滴Logi-KafkaManager 用户使用手册](docs/user_guide/user_guide_cn.md)
- [滴滴Logi-KafkaManager FAQ](docs/user_guide/faq.md)
### 2.2 社区文章
- [滴滴云官网产品介绍](https://www.didiyun.com/production/logi-KafkaManager.html)
- [7年沉淀之作--滴滴Logi日志服务套件](https://mp.weixin.qq.com/s/-KQp-Qo3WKEOc9wIR2iFnw)
- [滴滴Logi-KafkaManager 一站式Kafka监控与管控平台](https://mp.weixin.qq.com/s/9qSZIkqCnU6u9nLMvOOjIQ)
- [滴滴Logi-KafkaManager 开源之路](https://xie.infoq.cn/article/0223091a99e697412073c0d64)
- [滴滴Logi-KafkaManager 系列视频教程](https://mp.weixin.qq.com/s/9X7gH0tptHPtfjPPSdGO8g)
- [kafka实践十五滴滴开源Kafka管控平台 Logi-KafkaManager研究--A叶子叶来](https://blog.csdn.net/yezonggang/article/details/113106244)
## 相关文档 ## 3 滴滴Logi开源用户钉钉交流群
- [kafka-manager安装手册](./docs/install_cn_guide.md)
- [kafka-manager接入集群](./docs/manual_kafka_op/add_cluster.md)
- [kafka-manager使用手册-待更新](./docs/user_cn_guide.md)
## 钉钉交流群
![dingding_group](./docs/assets/images/common/dingding_group.jpg) ![dingding_group](./docs/assets/images/common/dingding_group.jpg)
钉钉群ID32821440
## 4 OCE认证
OCE是一个认证机制和交流平台为滴滴Logi-KafkaManager生产用户量身打造我们会为OCE企业提供更好的技术支持比如专属的技术沙龙、企业一对一的交流机会、专属的答疑群等如果贵司Logi-KafkaManager上了生产[快来加入吧](http://obsuite.didiyun.com/open/openAuth)
## 项目成员 ## 5 项目成员
### 内部核心人员 ### 5.1 内部核心人员
`iceyuhui``liuyaguang``limengmonty``zhangliangmike``nullhuangyiming``zengqiao``eilenexuzhe``huangjiaweihjw` `iceyuhui``liuyaguang``limengmonty``zhangliangmike``nullhuangyiming``zengqiao``eilenexuzhe``huangjiaweihjw``zhaoyinrui``marzkonglingxu``joysunchao`
### 外部贡献者 ### 5.2 外部贡献者
`fangjunyu``zhoutaiyang` `fangjunyu``zhoutaiyang`
## 协议 ## 6 协议
`kafka-manager`基于`Apache-2.0`协议进行分发和使用,更多信息参见[协议文件](./LICENSE) `kafka-manager`基于`Apache-2.0`协议进行分发和使用,更多信息参见[协议文件](./LICENSE)

build.sh (new file, 72 lines)

@@ -0,0 +1,72 @@
#!/bin/bash
workspace=$(cd $(dirname $0) && pwd -P)
cd $workspace
## constant
OUTPUT_DIR=./output
KM_VERSION=2.3.0
APP_NAME=kafka-manager
APP_DIR=${APP_NAME}-${KM_VERSION}
MYSQL_TABLE_SQL_FILE=./docs/install_guide/create_mysql_table.sql
CONFIG_FILE=./kafka-manager-web/src/main/resources/application.yml
## function
function build() {
# 编译命令
mvn -U clean package -Dmaven.test.skip=true
local sc=$?
if [ $sc -ne 0 ];then
## 编译失败, 退出码为 非0
echo "$APP_NAME build error"
exit $sc
else
echo "$APP_NAME build ok"
fi
}
function make_output() {
# 新建output目录
rm -rf ${OUTPUT_DIR} &>/dev/null
mkdir -p ${OUTPUT_DIR}/${APP_DIR} &>/dev/null
# 填充output目录, output内的内容
(
cp -rf ${MYSQL_TABLE_SQL_FILE} ${OUTPUT_DIR}/${APP_DIR} && # 拷贝 sql 初始化脚本 至output目录
cp -rf ${CONFIG_FILE} ${OUTPUT_DIR}/${APP_DIR} && # 拷贝 application.yml 至output目录
# 拷贝程序包到output路径
cp kafka-manager-web/target/kafka-manager-web-${KM_VERSION}-SNAPSHOT.jar ${OUTPUT_DIR}/${APP_DIR}/${APP_NAME}.jar
echo -e "make output ok."
) || { echo -e "make output error"; exit 2; } # 填充output目录失败后, 退出码为 非0
}
function make_package() {
# 压缩output目录
(
cd ${OUTPUT_DIR} && tar cvzf ${APP_DIR}.tar.gz ${APP_DIR}
echo -e "make package ok."
) || { echo -e "make package error"; exit 2; } # 压缩output目录失败后, 退出码为 非0
}
##########################################
## main
## 其中,
## 1.进行编译
## 2.生成部署包output
## 3.生成tar.gz压缩包
##########################################
# 1.进行编译
build
# 2.生成部署包output
make_output
# 3.生成tar.gz压缩包
make_package
# 编译成功
echo -e "build done"
exit 0
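
A usage sketch for the script above, assuming the repository root as the working directory and the `KM_VERSION` defined at the top:

```bash
# Build and package (requires Maven 3.x and JDK 8+)
sh build.sh

# Expected layout, based on the OUTPUT_DIR/APP_DIR variables above (KM_VERSION=2.3.0)
ls output/
#   kafka-manager-2.3.0/          application.yml, create_mysql_table.sql, kafka-manager.jar
#   kafka-manager-2.3.0.tar.gz    archive produced by make_package
```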


@@ -0,0 +1,44 @@
FROM openjdk:8-jdk-alpine3.9
LABEL author="yangvipguang"
ENV VERSION 2.1.0
ENV JAR_PATH kafka-manager-web/target
COPY $JAR_PATH/kafka-manager-web-$VERSION-SNAPSHOT.jar /tmp/app.jar
COPY $JAR_PATH/application.yml /km/
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g' /etc/apk/repositories
RUN apk add --no-cache --virtual .build-deps \
font-adobe-100dpi \
ttf-dejavu \
fontconfig \
curl \
apr \
apr-util \
apr-dev \
tomcat-native \
&& apk del .build-deps
ENV AGENT_HOME /opt/agent/
WORKDIR /tmp
COPY docker-depends/config.yaml $AGENT_HOME
COPY docker-depends/jmx_prometheus_javaagent-0.14.0.jar $AGENT_HOME
ENV JAVA_AGENT="-javaagent:$AGENT_HOME/jmx_prometheus_javaagent-0.14.0.jar=9999:$AGENT_HOME/config.yaml"
ENV JAVA_HEAP_OPTS="-Xms1024M -Xmx1024M -Xmn100M "
ENV JAVA_OPTS="-verbose:gc \
-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintHeapAtGC -Xloggc:/tmp/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
-XX:MaxMetaspaceSize=256M -XX:+DisableExplicitGC -XX:+UseStringDeduplication \
-XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:-UseContainerSupport"
#-Xlog:gc -Xlog:gc* -Xlog:gc+heap=trace -Xlog:safepoint
EXPOSE 8080 9999
ENTRYPOINT ["sh","-c","java -jar $JAVA_HEAP_OPTS $JAVA_OPTS /tmp/app.jar --spring.config.location=/km/application.yml"]
## 默认不带Prometheus JMX监控需要可以自行取消以下注释并注释上面一行默认Entrypoint 命令。
## ENTRYPOINT ["sh","-c","java -jar $JAVA_AGENT $JAVA_HEAP_OPTS $JAVA_OPTS /tmp/app.jar --spring.config.location=/km/application.yml"]
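
A minimal build-and-run sketch for this image; the image tag and the local config path are illustrative assumptions, while the exposed ports and the `/km/application.yml` location come from the Dockerfile above:

```bash
# Build after `mvn package` has produced kafka-manager-web/target/kafka-manager-web-2.1.0-SNAPSHOT.jar
docker build -t logi-kafka-manager:2.1.0 .

# Run, mounting a local application.yml over the one copied into /km/ (paths are assumptions)
docker run -d -p 8080:8080 -p 9999:9999 \
  -v "$(pwd)/application.yml:/km/application.yml" \
  logi-kafka-manager:2.1.0
```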


@@ -0,0 +1,5 @@
---
startDelaySeconds: 0
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false


@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

container/helm/Chart.yaml (new file, 24 lines)

@@ -0,0 +1,24 @@
apiVersion: v2
name: didi-km
description: A Helm chart for Kubernetes
# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.16.0"


@@ -0,0 +1,22 @@
1. Get the application URL by running these commands:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
{{- range .paths }}
http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ .path }}
{{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "didi-km.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "didi-km.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "didi-km.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.service.port }}
{{- else if contains "ClusterIP" .Values.service.type }}
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "didi-km.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}


@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "didi-km.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "didi-km.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "didi-km.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "didi-km.labels" -}}
helm.sh/chart: {{ include "didi-km.chart" . }}
{{ include "didi-km.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "didi-km.selectorLabels" -}}
app.kubernetes.io/name: {{ include "didi-km.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "didi-km.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "didi-km.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}


@@ -0,0 +1,88 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: km-cm
data:
application.yml: |
server:
port: 8080
tomcat:
accept-count: 1000
max-connections: 10000
max-threads: 800
min-spare-threads: 100
spring:
application:
name: kafkamanager
datasource:
kafka-manager:
jdbc-url: jdbc:mysql://xxxxx:3306/kafka-manager?characterEncoding=UTF-8&serverTimezone=GMT%2B8&useSSL=false
username: admin
password: admin
driver-class-name: com.mysql.jdbc.Driver
main:
allow-bean-definition-overriding: true
profiles:
active: dev
servlet:
multipart:
max-file-size: 100MB
max-request-size: 100MB
logging:
config: classpath:logback-spring.xml
custom:
idc: cn
jmx:
max-conn: 20
store-metrics-task:
community:
broker-metrics-enabled: true
topic-metrics-enabled: true
didi:
app-topic-metrics-enabled: false
topic-request-time-metrics-enabled: false
topic-throttled-metrics: false
save-days: 7
# 任务相关的开关
task:
op:
sync-topic-enabled: false # 未落盘的Topic定期同步到DB中
account:
ldap:
kcm:
enabled: false
storage:
base-url: http://127.0.0.1
n9e:
base-url: http://127.0.0.1:8004
user-token: 12345678
timeout: 300
account: root
script-file: kcm_script.sh
monitor:
enabled: false
n9e:
nid: 2
user-token: 1234567890
mon:
base-url: http://127.0.0.1:8032
sink:
base-url: http://127.0.0.1:8006
rdb:
base-url: http://127.0.0.1:80
notify:
kafka:
cluster-id: 95
topic-name: didi-kafka-notify
order:
detail-url: http://127.0.0.1


@@ -0,0 +1,56 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "didi-km.fullname" . }}
labels:
{{- include "didi-km.labels" . | nindent 4 }}
spec:
{{- if not .Values.autoscaling.enabled }}
replicas: {{ .Values.replicaCount }}
{{- end }}
selector:
matchLabels:
{{- include "didi-km.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "didi-km.selectorLabels" . | nindent 8 }}
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "didi-km.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Chart.Name }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: http
containerPort: 8080
protocol: TCP
- name: jmx-metrics
containerPort: 9999
protocol: TCP
resources:
{{- toYaml .Values.resources | nindent 12 }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}


@@ -0,0 +1,28 @@
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: {{ include "didi-km.fullname" . }}
labels:
{{- include "didi-km.labels" . | nindent 4 }}
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: {{ include "didi-km.fullname" . }}
minReplicas: {{ .Values.autoscaling.minReplicas }}
maxReplicas: {{ .Values.autoscaling.maxReplicas }}
metrics:
{{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
- type: Resource
resource:
name: cpu
targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
{{- end }}
{{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
- type: Resource
resource:
name: memory
targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
{{- end }}
{{- end }}


@@ -0,0 +1,41 @@
{{- if .Values.ingress.enabled -}}
{{- $fullName := include "didi-km.fullname" . -}}
{{- $svcPort := .Values.service.port -}}
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}
labels:
{{- include "didi-km.labels" . | nindent 4 }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
{{- if .Values.ingress.tls }}
tls:
{{- range .Values.ingress.tls }}
- hosts:
{{- range .hosts }}
- {{ . | quote }}
{{- end }}
secretName: {{ .secretName }}
{{- end }}
{{- end }}
rules:
{{- range .Values.ingress.hosts }}
- host: {{ .host | quote }}
http:
paths:
{{- range .paths }}
- path: {{ .path }}
backend:
serviceName: {{ $fullName }}
servicePort: {{ $svcPort }}
{{- end }}
{{- end }}
{{- end }}


@@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: {{ include "didi-km.fullname" . }}
labels:
{{- include "didi-km.labels" . | nindent 4 }}
spec:
type: {{ .Values.service.type }}
ports:
- port: {{ .Values.service.port }}
targetPort: http
protocol: TCP
name: http
selector:
{{- include "didi-km.selectorLabels" . | nindent 4 }}


@@ -0,0 +1,12 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "didi-km.serviceAccountName" . }}
labels:
{{- include "didi-km.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}


@@ -0,0 +1,15 @@
apiVersion: v1
kind: Pod
metadata:
name: "{{ include "didi-km.fullname" . }}-test-connection"
labels:
{{- include "didi-km.labels" . | nindent 4 }}
annotations:
"helm.sh/hook": test
spec:
containers:
- name: wget
image: busybox
command: ['wget']
args: ['{{ include "didi-km.fullname" . }}:{{ .Values.service.port }}']
restartPolicy: Never


@@ -0,0 +1,79 @@
# Default values for didi-km.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
replicaCount: 1
image:
repository: docker.io/yangvipguang/km
pullPolicy: IfNotPresent
# Overrides the image tag whose default is the chart appVersion.
tag: "v18"
imagePullSecrets: []
nameOverride: ""
fullnameOverride: "km"
serviceAccount:
# Specifies whether a service account should be created
create: true
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""
podAnnotations: {}
podSecurityContext: {}
# fsGroup: 2000
securityContext: {}
# capabilities:
# drop:
# - ALL
# readOnlyRootFilesystem: true
# runAsNonRoot: true
# runAsUser: 1000
service:
type: ClusterIP
port: 8080
ingress:
enabled: false
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
hosts:
- host: chart-example.local
paths: []
tls: []
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
resources:
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
limits:
cpu: 50m
memory: 2048Mi
requests:
cpu: 10m
memory: 200Mi
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 100
targetCPUUtilizationPercentage: 80
# targetMemoryUtilizationPercentage: 80
nodeSelector: {}
tolerations: []
affinity: {}
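
A hedged install sketch for this chart; the chart path follows the container/helm/Chart.yaml location shown above, while the release name and namespace are illustrative:

```bash
# Install with the defaults from values.yaml
helm install didi-km ./container/helm --namespace kafka-manager --create-namespace

# Override selected values at install/upgrade time if needed
helm upgrade --install didi-km ./container/helm \
  --set image.tag=v18 --set service.type=NodePort
```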

(Six binary image files added; previews not shown.)

@@ -0,0 +1,101 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
## JMX-连接失败问题解决
集群正常接入Logi-KafkaManager之后即可以看到集群的Broker列表此时如果查看不了Topic的实时流量或者是Broker的实时流量信息时那么大概率就是JMX连接的问题了。
下面我们按照步骤来一步一步的检查。
### 1、问题&说明
**类型一JMX配置未开启**
未开启时,直接到`2、解决方法`查看如何开启即可。
![check_jmx_opened](./assets/connect_jmx_failed/check_jmx_opened.jpg)
**类型二:配置错误**
`JMX`端口已经开启的情况下,有的时候开启的配置不正确,此时也会导致出现连接失败的问题。这里大概列举几种原因:
- `JMX`配置错误:见`2、解决方法`
- 存在防火墙或者网络限制:网络通的另外一台机器`telnet`试一下看是否可以连接上。
- 需要进行用户名及密码的认证:见`3、解决方法 —— 认证的JMX`
错误日志例子:
```
# 错误一: 错误提示的是真实的IP这样的话基本就是JMX配置的有问题了。
2021-01-27 10:06:20.730 ERROR 50901 --- [ics-Thread-1-62] c.x.k.m.c.utils.jmx.JmxConnectorWrap : JMX connect exception, host:192.168.0.1 port:9999.
java.rmi.ConnectException: Connection refused to host: 192.168.0.1; nested exception is:
# 错误二错误提示的是127.0.0.1这个IP这个是机器的hostname配置的可能有问题。
2021-01-27 10:06:20.730 ERROR 50901 --- [ics-Thread-1-62] c.x.k.m.c.utils.jmx.JmxConnectorWrap : JMX connect exception, host:127.0.0.1 port:9999.
java.rmi.ConnectException: Connection refused to host: 127.0.0.1;; nested exception is:
```
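
A quick connectivity check for the symptoms above (a sketch; replace the host and port with the broker address and JMX port from the error log):

```bash
# From another machine, verify the JMX port is reachable at all
telnet 192.168.0.1 9999
# or, if telnet is unavailable:
nc -vz 192.168.0.1 9999

# On the broker host, check which address the hostname resolves to;
# a 127.0.0.1 result matches the second error shown above
hostname -i
```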
### 2、解决方法
这里仅介绍一下比较通用的解决方式,如若有更好的方式,欢迎大家指导告知一下。
修改`kafka-server-start.sh`文件:
```
# 在这个下面增加JMX端口的配置
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
export JMX_PORT=9999 # 增加这个配置, 这里的数值并不一定是要9999
fi
```
&nbsp;
修改`kafka-run-class.sh`文件
```
# JMX settings
if [ -z "$KAFKA_JMX_OPTS" ]; then
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=${当前机器的IP}"
fi
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
fi
```
### 3、解决方法 —— 认证的JMX
如果您是直接看的这个部分,建议先看一下上一节:`2、解决方法`以确保`JMX`的配置没有问题了。
在JMX的配置等都没有问题的情况下如果是因为认证的原因导致连接不了的此时可以使用下面介绍的方法进行解决。
**当前这块后端刚刚开发完成,可能还不够完善,有问题随时沟通。**
`Logi-KafkaManager 2.2.0+`之后的版本后端已经支持`JMX`认证方式的连接,但是还没有界面,此时我们可以往`cluster`表的`jmx_properties`字段写入`JMX`的认证信息。
这个数据是`json`格式的字符串,例子如下所示:
```json
{
"maxConn": 10, # KM对单台Broker的最大JMX连接数
"username": "xxxxx", # 用户名
"password": "xxxx", # 密码
"openSSL": true, # 开启SSL, true表示开启ssl, false表示关闭
}
```
&nbsp;
SQL的例子
```sql
UPDATE cluster SET jmx_properties='{ "maxConn": 10, "username": "xxxxx", "password": "xxxx", "openSSL": false }' where id={xxx};
```


@@ -0,0 +1,65 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 动态配置管理
## 1、Topic定时同步任务
### 1.1、配置的用途
`Logi-KafkaManager`在设计上,所有的资源都是挂在应用(app)下面。 如果接入的Kafka集群已经存在Topic了那么会导致这些Topic不属于任何的应用从而导致很多管理上的不便。
因此需要有一个方式将这些无主的Topic挂到某个应用下面。
这里提供了一个配置会定时自动将集群无主的Topic挂到某个应用下面下面。
### 1.2、相关实现
就是一个定时任务,该任务会定期做同步的工作。具体代码的位置在`com.xiaojukeji.kafka.manager.task.dispatch.op`包下面的`SyncTopic2DB`类。
### 1.3、配置说明
**步骤一:开启该功能**
在application.yml文件中增加如下配置已经有该配置的话直接把false修改为true即可
```yml
# 任务相关的开关
task:
op:
sync-topic-enabled: true # 无主的Topic定期同步到DB中
```
**步骤二:配置管理中指定挂在那个应用下面**
配置的位置:
![sync_topic_to_db](./assets/dynamic_config_manager/sync_topic_to_db.jpg)
配置键:`SYNC_TOPIC_2_DB_CONFIG_KEY`
配置值(JSON数组)
- clusterId需要进行定时同步的集群ID
- defaultAppId该集群无主的Topic将挂在哪个应用下面
- addAuthority是否需要加上权限, 默认是false。因为考虑到这个挂载只是临时的我们不希望用户使用这个App同时后续可能移交给真正的所属的应用因此默认是不加上权限。
**注意这里的集群ID或者是应用ID不存在的话会导致配置不生效。该任务对已经在DB中的Topic不会进行修改**
```json
[
{
"clusterId": 1234567,
"defaultAppId": "ANONYMOUS",
"addAuthority": false
},
{
"clusterId": 7654321,
"defaultAppId": "ANONYMOUS",
"addAuthority": false
}
]
```


@@ -0,0 +1,42 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 监控系统集成——夜莺
- `Kafka-Manager`通过将 监控的数据 以及 监控的规则 都提交给夜莺,然后依赖夜莺的监控系统从而实现监控告警功能。
- 监控数据上报 & 告警规则的创建等能力已经具备。但类似查看告警历史,告警触发时的监控数据等正在集成中(暂时可以到夜莺系统进行查看),欢迎有兴趣的同学进行共建 或 贡献代码。
## 1、配置说明
```yml
# 配置文件中关于监控部分的配置
monitor:
enabled: false
n9e:
nid: 2
user-token: 123456
# 夜莺 mon监控服务 地址
mon:
base-url: http://127.0.0.1:8006
# 夜莺 transfer上传服务 地址
sink:
base-url: http://127.0.0.1:8008
# 夜莺 rdb资源服务 地址
rdb:
base-url: http://127.0.0.1:80
# enabled: 表示是否开启监控告警的功能, true: 开启, false: 不开启
# n9e.nid: 夜莺的节点ID
# n9e.user-token: 用户的密钥,在夜莺的个人设置中
# n9e.mon.base-url: 监控地址
# n9e.sink.base-url: 数据上报地址
# n9e.rdb.base-url: 用户资源中心地址
```


@@ -0,0 +1,54 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 监控系统集成
- 监控系统默认与 [夜莺] (https://github.com/didi/nightingale) 进行集成;
- 对接自有的监控系统需要进行简单的二次开发,即实现部分监控告警模块的相关接口即可;
- 集成会有两块内容,一个是指标数据上报的集成,还有一个是监控告警规则的集成;
## 1、指标数据上报集成
仅完成这一步的集成之后,即可将监控数据上报到监控系统中,此时已能够在自己的监控系统进行监控告警规则的配置了。
**步骤一:实现指标上报的接口**
- 按照自己内部监控系统的数据格式要求,将数据进行组装成符合自己内部监控系统要求的数据进行上报,具体的可以参考夜莺集成的实现代码。
- 至于会上报哪些指标,可以查看有哪些地方调用了该接口。
![sink_metrics](./assets/monitor_system_integrate_with_self/sink_metrics.jpg)
**步骤二:相关配置修改**
![change_config](./assets/monitor_system_integrate_with_self/change_config.jpg)
**步骤三:开启上报任务**
![open_sink_schedule](./assets/monitor_system_integrate_with_self/open_sink_schedule.jpg)
## 2、监控告警规则集成
完成**1、指标数据上报集成**之后,即可在自己的监控系统进行监控告警规则的配置了。完成该步骤的集成之后,可以在`Logi-KafkaManager`中进行监控告警规则的增删改查等等。
大体上和**1、指标数据上报集成**一致,
**步骤一:实现相关接口**
![integrate_ms](./assets/monitor_system_integrate_with_self/integrate_ms.jpg)
实现完成步骤一之后,接下来的步骤和**1、指标数据上报集成**中的步骤二、步骤三一致,都需要进行相关配置的修改即可。
## 3、总结
简单介绍了一下监控告警的集成,嫌麻烦的同学可以仅做 **1、指标数据上报集成** 这一节的内容即可满足一定场景下的需求。
**集成过程中有任何觉得文档没有说清楚的地方或者建议欢迎入群交流也欢迎贡献代码觉得好也辛苦给个star。**


@@ -0,0 +1,27 @@
---
![kafka-manager-logo](../../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 升级至`2.2.0`版本
`2.2.0`版本在`cluster`表及`logical_cluster`各增加了一个字段因此需要执行下面的sql进行字段的增加。
```sql
# cluster表中增加jmx_properties字段, 这个字段会用于存储jmx相关的认证以及配置信息
ALTER TABLE `cluster` ADD COLUMN `jmx_properties` TEXT NULL COMMENT 'JMX配置' AFTER `security_properties`;
# logical_cluster中增加identification字段, 同时数据和原先name数据相同, 最后增加一个唯一键.
# 此后, name字段还是表示集群名称, identification字段表示的是集群标识, 只能是字母数字及下划线组成,
# 数据上报到监控系统时, 集群这个标识采用的字段就是identification字段, 之前使用的是name字段.
ALTER TABLE `logical_cluster` ADD COLUMN `identification` VARCHAR(192) NOT NULL DEFAULT '' COMMENT '逻辑集群标识' AFTER `name`;
UPDATE `logical_cluster` SET `identification`=`name` WHERE id>=0;
ALTER TABLE `logical_cluster` ADD INDEX `uniq_identification` (`identification` ASC);
```
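
A hedged way to apply these statements, assuming they are saved locally as upgrade_2.2.0.sql (a hypothetical file name) and the database name matches the default `logi_kafka_manager` from the install guide:

```bash
# Apply the 2.2.0 schema changes (credentials and host are placeholders)
mysql -uXXXX -pXXXX -h XXX.XXX.XXX.XXX -PXXXX logi_kafka_manager < ./upgrade_2.2.0.sql
```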


@@ -0,0 +1,17 @@
---
![kafka-manager-logo](../../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 升级至`2.3.0`版本
`2.3.0`版本在`gateway_config`表增加了一个描述说明的字段因此需要执行下面的sql进行字段的增加。
```sql
ALTER TABLE `gateway_config`
ADD COLUMN `description` TEXT NULL COMMENT '描述信息' AFTER `version`;
```


@@ -0,0 +1,41 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 使用`MySQL 8`
感谢 [herry-hu](https://github.com/herry-hu) 提供的方案。
当前因为无法同时兼容`MySQL 8``MySQL 5.7`,因此代码中默认的版本还是`MySQL 5.7`
当前如需使用`MySQL 8`,则需按照下述流程进行简单修改代码。
- Step1. 修改application.yml中的MySQL驱动类
```shell
# 将driver-class-name后面的驱动类修改为:
# driver-class-name: com.mysql.jdbc.Driver
driver-class-name: com.mysql.cj.jdbc.Driver
```
- Step2. 修改MySQL依赖包
```shell
# 将根目录下面的pom.xml文件依赖的`MySQL`依赖包版本调整为
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
# <version>5.1.41</version>
<version>8.0.20</version>
</dependency>
```


@@ -1,58 +0,0 @@
---
![kafka-manager-logo](./assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 安装手册
## 环境依赖
- `Maven 3.2+`(后端打包依赖)
- `node 10+`(前端打包依赖)
- `Java 8+`(运行环境需要)
- `MySQL`(数据存储)
---
## 环境初始化
执行[create_mysql_table.sql](./create_mysql_table.sql)中的SQL命令从而创建所需的MySQL库及表默认创建的库名是`kafka_manager`
```
# 示例:
mysql -uXXXX -pXXX -h XXX.XXX.XXX.XXX -PXXXX < ./create_mysql_table.sql
```
---
## 打包
```bash
# 一次性打包
cd ..
mvn install
```
---
## 启动
```
# application.yml 是配置文件
cp web/src/main/resources/application.yml web/target/
cd web/target/
nohup java -jar kafka-manager-web-2.0.0-SNAPSHOT.jar --spring.config.location=./application.yml > /dev/null 2>&1 &
```
## 使用
本地启动的话,访问`http://localhost:8080`,输入帐号及密码进行登录。更多参考:[kafka-manager使用手册](./user_cn_guide.md)


@@ -0,0 +1,104 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 配置说明
```yaml
server:
port: 8080 # 服务端口
tomcat:
accept-count: 1000
max-connections: 10000
max-threads: 800
min-spare-threads: 100
spring:
application:
name: kafkamanager
datasource:
kafka-manager: # 数据库连接配置
jdbc-url: jdbc:mysql://127.0.0.1:3306/kafka_manager?characterEncoding=UTF-8&serverTimezone=GMT%2B8 #数据库的地址
username: admin # 用户名
password: admin # 密码
driver-class-name: com.mysql.jdbc.Driver
main:
allow-bean-definition-overriding: true
profiles:
active: dev # 启用的配置
servlet:
multipart:
max-file-size: 100MB
max-request-size: 100MB
logging:
config: classpath:logback-spring.xml
custom:
idc: cn # 部署的数据中心, 忽略该配置, 后续会进行删除
jmx:
max-conn: 10 # 和单台 broker 的最大JMX连接数
store-metrics-task:
community:
broker-metrics-enabled: true # 社区部分broker metrics信息收集开关, 关闭之后metrics信息将不会进行收集及写DB
topic-metrics-enabled: true # 社区部分topic的metrics信息收集开关, 关闭之后metrics信息将不会进行收集及写DB
didi:
app-topic-metrics-enabled: false # 滴滴埋入的指标, 社区AK不存在该指标因此默认关闭
topic-request-time-metrics-enabled: false # 滴滴埋入的指标, 社区AK不存在该指标因此默认关闭
topic-throttled-metrics: false # 滴滴埋入的指标, 社区AK不存在该指标因此默认关闭
save-days: 7 #指标在DB中保持的天数-1表示永久保存7表示保存近7天的数据
# 任务相关的开关
task:
op:
sync-topic-enabled: false # 未落盘的Topic定期同步到DB中
account: # ldap相关的配置, 社区版本暂时支持不够完善,可以先忽略,欢迎贡献代码对这块做优化
ldap:
kcm: # 集群升级部署相关的功能需要配合夜莺及S3进行使用这块我们后续专门补充一个文档细化一下牵扯到kcm_script.sh脚本的修改
enabled: false # 默认关闭
storage:
base-url: http://127.0.0.1 # 存储地址
n9e:
base-url: http://127.0.0.1:8004 # 夜莺任务中心的地址
user-token: 12345678 # 夜莺用户的token
timeout: 300 # 集群任务的超时时间,单位秒
account: root # 集群任务使用的账号
script-file: kcm_script.sh # 集群任务的脚本
monitor: # 监控告警相关的功能,需要配合夜莺进行使用
enabled: false # 默认关闭true就是开启
n9e:
nid: 2
user-token: 1234567890
mon:
# 夜莺 mon监控服务 地址
base-url: http://127.0.0.1:8032
sink:
# 夜莺 transfer上传服务 地址
base-url: http://127.0.0.1:8006
rdb:
# 夜莺 rdb资源服务 地址
base-url: http://127.0.0.1:80
# enabled: 表示是否开启监控告警的功能, true: 开启, false: 不开启
# n9e.nid: 夜莺的节点ID
# n9e.user-token: 用户的密钥,在夜莺的个人设置中
# n9e.mon.base-url: 监控地址
# n9e.sink.base-url: 数据上报地址
# n9e.rdb.base-url: 用户资源中心地址
notify: # 通知的功能
kafka: # 默认通知发送到kafka的指定Topic中
cluster-id: 95 # Topic的集群ID
topic-name: didi-kafka-notify # Topic名称
order: # 部署的KM的地址
detail-url: http://127.0.0.1
```


@@ -1,3 +1,8 @@
-- create database
CREATE DATABASE logi_kafka_manager;
USE logi_kafka_manager;
-- --
-- Table structure for table `account` -- Table structure for table `account`
-- --
@@ -36,7 +41,6 @@ CREATE TABLE `app` (
UNIQUE KEY `uniq_name` (`name`), UNIQUE KEY `uniq_name` (`name`),
UNIQUE KEY `uniq_app_id` (`app_id`) UNIQUE KEY `uniq_app_id` (`app_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='应用信息'; ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='应用信息';
INSERT INTO app(app_id, name, password, type, applicant, principals, description) VALUES ('km-admin-tmp', 'km-admin-tmp', '123456', 0, 'admin', 'admin', '临时应用');
-- --
@@ -105,7 +109,8 @@ CREATE TABLE `cluster` (
`zookeeper` varchar(512) NOT NULL DEFAULT '' COMMENT 'zk地址', `zookeeper` varchar(512) NOT NULL DEFAULT '' COMMENT 'zk地址',
`bootstrap_servers` varchar(512) NOT NULL DEFAULT '' COMMENT 'server地址', `bootstrap_servers` varchar(512) NOT NULL DEFAULT '' COMMENT 'server地址',
`kafka_version` varchar(32) NOT NULL DEFAULT '' COMMENT 'kafka版本', `kafka_version` varchar(32) NOT NULL DEFAULT '' COMMENT 'kafka版本',
`security_properties` text COMMENT '安全认证参数', `security_properties` text COMMENT 'Kafka安全认证参数',
`jmx_properties` text COMMENT 'JMX配置',
`status` tinyint(4) NOT NULL DEFAULT '1' COMMENT ' 监控标记, 0表示未监控, 1表示监控中', `status` tinyint(4) NOT NULL DEFAULT '1' COMMENT ' 监控标记, 0表示未监控, 1表示监控中',
`gmt_create` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间', `gmt_create` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`gmt_modify` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间', `gmt_modify` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
@@ -198,12 +203,18 @@ CREATE TABLE `gateway_config` (
`type` varchar(128) NOT NULL DEFAULT '' COMMENT '配置类型', `type` varchar(128) NOT NULL DEFAULT '' COMMENT '配置类型',
`name` varchar(128) NOT NULL DEFAULT '' COMMENT '配置名称', `name` varchar(128) NOT NULL DEFAULT '' COMMENT '配置名称',
`value` text COMMENT '配置值', `value` text COMMENT '配置值',
`version` bigint(20) unsigned NOT NULL DEFAULT '0' COMMENT '版本信息', `version` bigint(20) unsigned NOT NULL DEFAULT '1' COMMENT '版本信息',
`description` text COMMENT '描述信息',
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间', `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`modify_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间', `modify_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
PRIMARY KEY (`id`), PRIMARY KEY (`id`),
UNIQUE KEY `uniq_type_name` (`type`,`name`) UNIQUE KEY `uniq_type_name` (`type`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='gateway配置'; ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='gateway配置';
INSERT INTO gateway_config(type, name, value, `version`) values('SERVICE_DISCOVERY_QUEUE_SIZE', 'SERVICE_DISCOVERY_QUEUE_SIZE', 100000000, 1);
INSERT INTO gateway_config(type, name, value, `version`) values('SERVICE_DISCOVERY_APPID_RATE', 'SERVICE_DISCOVERY_APPID_RATE', 100000000, 1);
INSERT INTO gateway_config(type, name, value, `version`) values('SERVICE_DISCOVERY_IP_RATE', 'SERVICE_DISCOVERY_IP_RATE', 100000000, 1);
INSERT INTO gateway_config(type, name, value, `version`) values('SERVICE_DISCOVERY_SP_RATE', 'app_01234567', 100000000, 1);
INSERT INTO gateway_config(type, name, value, `version`) values('SERVICE_DISCOVERY_SP_RATE', '192.168.0.1', 100000000, 1);
-- --
-- Table structure for table `heartbeat` -- Table structure for table `heartbeat`
@@ -290,15 +301,18 @@ CREATE TABLE `kafka_user` (
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间', `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
PRIMARY KEY (`id`) PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='kafka用户表'; ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='kafka用户表';
INSERT INTO app(app_id, name, password, type, applicant, principals, description) VALUES ('dkm_admin', 'KM管理员', 'km_kMl4N8as1Kp0CCY', 1, 'admin', 'admin', 'KM管理员应用-谨慎对外提供');
INSERT INTO kafka_user(app_id, password, user_type, operation) VALUES ('dkm_admin', 'km_kMl4N8as1Kp0CCY', 1, 0);
-- --
-- Table structure for table `logical_cluster` -- Table structure for table `logical_cluster`
-- --
-- DROP TABLE IF EXISTS `logical_cluster`;
CREATE TABLE `logical_cluster` ( CREATE TABLE `logical_cluster` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'id', `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
`name` varchar(192) NOT NULL DEFAULT '' COMMENT '逻辑集群名称', `name` varchar(192) NOT NULL DEFAULT '' COMMENT '逻辑集群名称',
`identification` varchar(192) NOT NULL DEFAULT '' COMMENT '逻辑集群标识',
`mode` int(16) NOT NULL DEFAULT '0' COMMENT '逻辑集群类型, 0:共享集群, 1:独享集群, 2:独立集群', `mode` int(16) NOT NULL DEFAULT '0' COMMENT '逻辑集群类型, 0:共享集群, 1:独享集群, 2:独立集群',
`app_id` varchar(64) NOT NULL DEFAULT '' COMMENT '所属应用', `app_id` varchar(64) NOT NULL DEFAULT '' COMMENT '所属应用',
`cluster_id` bigint(20) NOT NULL DEFAULT '-1' COMMENT '集群id', `cluster_id` bigint(20) NOT NULL DEFAULT '-1' COMMENT '集群id',
@@ -307,8 +321,10 @@ CREATE TABLE `logical_cluster` (
`gmt_create` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间', `gmt_create` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`gmt_modify` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间', `gmt_modify` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
PRIMARY KEY (`id`), PRIMARY KEY (`id`),
UNIQUE KEY `uniq_name` (`name`) UNIQUE KEY `uniq_name` (`name`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='逻辑集群信息表'; UNIQUE KEY `uniq_identification` (`identification`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8 COMMENT='逻辑集群信息表';
-- --
-- Table structure for table `monitor_rule` -- Table structure for table `monitor_rule`


@@ -0,0 +1,64 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 安装手册
## 1、环境依赖
如果是以Release包进行安装的则仅安装`Java``MySQL`即可。如果是要先进行源码包进行打包,然后再使用,则需要安装`Maven``Node`环境。
- `Java 8+`(运行环境需要)
- `MySQL 5.7`(数据存储)
- `Maven 3.5+`(后端打包依赖)
- `Node 10+`(前端打包依赖)
---
## 2、获取安装包
**1、Release直接下载**
这里如果觉得麻烦然后也不想进行二次开发则可以直接下载Release包下载地址[Github Release包下载地址](https://github.com/didi/Logi-KafkaManager/releases)
如果觉得Github的下载地址太慢了也可以进入`Logi-KafkaManager`的用户群获取群地址在README中。
**2、源代码进行打包**
下载好代码之后,进入`Logi-KafkaManager`的主目录,执行`sh build.sh`命令即可,执行完成之后会在`output/kafka-manager-xxx`目录下面生成一个jar包。
对于`windows`环境的用户,估计执行不了`sh build.sh`命令,因此可以直接执行`mvn install`,然后在`kafka-manager-web/target`目录下生成一个kafka-manager-web-xxx.jar的包。
获取到jar包之后我们继续下面的步骤。
---
## 3、MySQL-DB初始化
执行[create_mysql_table.sql](create_mysql_table.sql)中的SQL命令从而创建所需的MySQL库及表默认创建的库名是`logi_kafka_manager`
```
# 示例:
mysql -uXXXX -pXXX -h XXX.XXX.XXX.XXX -PXXXX < ./create_mysql_table.sql
```
---
## 4、启动
```
# application.yml 是配置文件最简单的是仅修改MySQL相关的配置即可启动
nohup java -jar kafka-manager.jar --spring.config.location=./application.yml > /dev/null 2>&1 &
```
### 5、使用
本地启动的话,访问`http://localhost:8080`,输入帐号及密码(默认`admin/admin`)进行登录。更多参考:[kafka-manager 用户使用手册](../user_guide/user_guide_cn.md)
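
A quick post-start check (a sketch; adjust the host and port if `server.port` in application.yml was changed):

```bash
# Expect HTTP 200 once the service is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080
```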


@@ -1,39 +0,0 @@
---
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 集群接入
集群的接入总共需要三个步骤,分别是:
1. 接入物理集群
2. 创建Region
3. 创建逻辑集群
备注接入集群需要2、3两步是因为普通用户的视角下看到的都是逻辑集群如果没有2、3两步那么普通用户看不到任何信息。
## 1、接入物理集群
![op_add_cluster](./imgs/op_add_cluster.jpg)
如上图所示,填写集群信息,然后点击确定,即可完成集群的接入。因为考虑到分布式部署,添加集群之后,需要稍等**`1分钟`**才可以在界面上看到集群的详细信息。
## 2、创建Region
![op_add_region](./imgs/op_add_region.jpg)
如上图所示填写Region信息然后点击确定即可完成Region的创建。
备注Region即为Broker的集合可以按照业务需要将Broker归类从而创建相应的Region。
## 3、创建逻辑集群
![op_add_logical_cluster](./imgs/op_add_logical_cluster.jpg)
如上图所示,填写逻辑集群信息,然后点击确定,即可完成逻辑集群的创建。


@@ -1,166 +0,0 @@
---
![kafka-manger-logo](./assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# kafka-manager 使用手册
管控平台主要有两种用户视角,分别为:
- 普通用户站在使用Kafka的角度使用kafka-manager
- 管理员站在使用与管理Kafka的角度在使用kafka-manager
下面我们将从这两个用户的维度说明平台的功能及使用。
---
## 1. 普通用户篇
### 1.1 帐号获取及登录
- 询问管理员让其提供普通用户的帐号;
- 输入帐号及密码登录kafka-manager
---
### 1.2 Topic申请
- 步骤一:点击"Topic申请"按钮申请Topic
- 步骤二:填写申请信息;
- 步骤三:等待运维人员或管理员审批;
**Topic申请完成**
![my_order_list](./assets/images/kafka_manager_cn_guide/my_order_list.jpg)
---
### 1.3 Topic信息查看
普通用户可查看的信息包括:
- 集群Topic列表及我收藏的Topic列表
- Topic基本信息(Topic创建及修改时间、Topic数据保存时间、Topic负责人等)
- Topic分区信息
- Topic消费组信息及消费组消费详情
- Topic实时&历史流量信息;
- Topic数据采样
**Topic详情信息界面**
![normal_topic_detail](./assets/images/kafka_manager_cn_guide/normal_topic_detail.jpg)
---
### 1.4 Topic运维
普通用户可进行的Topic运维的操作包括
- 申请Topic扩容
- 重置消费偏移;
**Topic重置消费偏移界面**
![normal_reset_consume_offset](./assets/images/kafka_manager_cn_guide/normal_reset_consume_offset.jpg)
---
### 1.5 告警配置
kafka-manager告警配置中仅支持LagBytesIn/BytesOut这三类告警同时告警被触发后告警消息会被发往指定的Topic(具体哪一个请联系管理员获取)。需要用户主动消费该告警Topic的数据或者统一由管理员将该数据接入外部通知系统比如接入短信通知或电话通知等。
**告警规则配置界面:**
![normal_create_alarm_rule](./assets/images/kafka_manager_cn_guide/normal_create_alarm_rule.jpg)
---
### 1.6 密码修改
**密码修改界面:**
![normal_modify_password](./assets/images/kafka_manager_cn_guide/normal_modify_password.jpg)
---
## 2. 管理员篇
### 2.1 帐号获取及登录
- 默认的管理员帐号密码为`admin/admin`(详见数据库account表)
---
### 2.2 添加集群
登录之后就需要将我们搭建的Kafka集群添加到kafka-manager中。
**添加Kafka集群界面**
![admin_add_cluster](./assets/images/kafka_manager_cn_guide/admin_add_cluster.jpg)
---
### 2.3 监控指标
#### 2.3.1 集群维度指标
- 集群的基本信息;
- 集群历史及实时流量信息;
- 集群Topic信息
- 集群Broker信息
- 集群ConsumerGroup信息
- 集群Region信息
- 集群当前Controller及变更历史
**集群维度监控指标界面:**
![admin_cluster_details](./assets/images/kafka_manager_cn_guide/admin_cluster_details.jpg)
---
#### 2.3.2 Broker维度指标
- Broker基本信息
- Broker历史与实时流量信息
- Broker内Topic信息
- Broker内分区信息
- Broker关键指标(日志刷盘时间等)
- Topic分析(Topic流量占比等)
**`Broker`维度监控指标界面:**
![admin_cluster_broker_detail](./assets/images/kafka_manager_cn_guide/admin_cluster_broker_detail.jpg)
---
#### 2.3.3 Topic维度指标
- 在普通用户的基础上增加展示Topic的Broker信息
图略
---
#### 2.3.4 其他维度指标
- 消费组消费哪些具体的Topic
图略
---
### 2.4 集群运维管控
- Topic申请及扩容工单审批
- Topic创建、删除、扩容及属性修改
- Broker维度优先副本选举
- 分区粒度迁移;
- 逻辑Region管理
**资源审批界面:**
![admin_order](./assets/images/kafka_manager_cn_guide/admin_order.jpg)
---
### 2.5 用户管理
- 对用户进行增删改查;
**用户管理界面:**
![admin_manager_account](./assets/images/kafka_manager_cn_guide/admin_manager_account.jpg)


@@ -0,0 +1,49 @@
---
![kafka-manager-logo](../../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
# 集群接入
## 主要概念讲解
面对大规模集群、业务场景复杂的情况引入Region、逻辑集群的概念
- Region划分部分Broker作为一个 Region用Region定义资源划分的单位提高扩展性和隔离性。如果部分Topic异常也不会影响大面积的Broker
- 逻辑集群逻辑集群由部分Region组成便于对大规模集群按照业务划分、保障能力进行管理
![op_cluster_arch](assets/op_cluster_arch.png)
集群的接入总共需要三个步骤,分别是:
1. 接入物理集群:填写机器地址、安全协议等配置信息,接入真实的物理集群
2. 创建Region将部分Broker划分为一个Region
3. 创建逻辑集群逻辑集群由部分Region组成可根据业务划分、保障等级来创建相应的逻辑集群
![op_cluster_flow](assets/op_cluster_flow.png)
**备注接入集群需要2、3两步是因为普通用户的视角下看到的都是逻辑集群如果没有2、3两步那么普通用户看不到任何信息。**
## 1、接入物理集群
![op_add_cluster](assets/op_add_cluster.jpg)
如上图所示,填写集群信息,然后点击确定,即可完成集群的接入。因为考虑到分布式部署,添加集群之后,需要稍等**`1分钟`**才可以在界面上看到集群的详细信息。
## 2、创建Region
![op_add_region](assets/op_add_region.jpg)
如上图所示填写Region信息然后点击确定即可完成Region的创建。
备注Region即为Broker的集合可以按照业务需要将Broker归类从而创建相应的Region。
## 3、创建逻辑集群
![op_add_logical_cluster](assets/op_add_logical_cluster.jpg)
如上图所示,填写逻辑集群信息,然后点击确定,即可完成逻辑集群的创建。

(Several binary image files moved or added; previews not shown.)

@@ -0,0 +1,25 @@
![kafka-manager-logo](../assets/images/common/logo_name.png)
**一站式`Apache Kafka`集群指标监控与运维管控平台**
---
## 报警策略-报警函数介绍
| 类别 | 函数 | 含义 |函数文案 |备注 |
| --- | --- | --- | --- | --- |
| 发生次数 |alln | 最近$n个周期内全发生 | 连续发生(all) | |
| 发生次数 | happen, n, m | 最近$n个周期内发生m次 | 出现(happen) | null点也计算在n内 |
| 数学统计 | sum, n | 最近$n个周期取值 的 和 | 求和(sum) | sum_over_time |
| 数学统计 | avg, n | 最近$n个周期取值 的 平均值 | 平均值(avg) | avg_over_time |
| 数学统计 | min, n | 最近$n个周期取值 的 最小值 | 最小值(min) | min_over_time |
| 数学统计 | max, n | 最近$n个周期取值 的 最大值 | 最大值(max | max_over_time |
| 变化率 | pdiff, n | 最近$n个点的变化率, 有一个满足 则触发 | 突增突降率(pdiff) | 假设, 最近3个周期的值分别为 v, v2, v3v为最新值那么计算公式为 any( (v-v2)/v2, (v-v3)/v3 )**区分正负** |
| 变化量 | diff, n | 最近$n个点的变化量, 有一个满足 则触发 | 突增突降值(diff) | 假设, 最近3个周期的值分别为 v, v2, v3v为最新值那么计算公式为 any( (v-v2), (v-v3) )**区分正负** |
| 变化量 | ndiff | 最近n个周期发生m次 v(t) - v(t-1) $OP threshold其中 v(t) 为最新值 | 连续变化(区分正负) - ndiff | |
| 数据中断 | nodata, t | 最近 $t 秒内 无数据上报 | 数据上报中断(nodata) | |
| 同环比 | c_avg_rate_abs, n | 最近$n个周期的取值相比 1天或7天前取值 的变化率 的绝对值 | 同比变化率(c_avg_rate_abs) | 假设最近的n个值为 v1, v2, v3历史取到的对应n'个值为 v1', v2'那么计算公式为abs((avg(v1,v2,v3) / avg(v1',v2') -1)* 100%) |
| 同环比 | c_avg_rate, n | 最近$n个周期的取值相比 1天或7天前取值 的变化率(**区分正负**) | 同比变化率(c_avg_rate) | 假设最近的n个值为 v1, v2, v3历史取到的对应n'个值为 v1', v2'那么计算公式为(avg(v1,v2,v3) / avg(v1',v2') -1)* 100% |

(Dozens of binary image assets added; previews not shown.)
Some files were not shown because too many files have changed in this diff.