EasyOps Installation

1 Docker Installation

  1. Configure the yum repo. Replace the URL with the corresponding package-server URL and operating system version:

    • CentOS 7 —>centos7
    • RedHat 7 —>redhat7
    • RedHat8 —>redhat8
    • KylinV10 —>kylinv10

      export PACKAGE_URL="http://repo.bdms.service.163.org/EasyData-V7.0"
      export OS="centos7"
      curl ${PACKAGE_URL}/common/easyops/init-scripts/add-yum-repo.sh |bash -x -s ${PACKAGE_URL}/$OS/os/x86_64/
  2. Install and configure Docker

     curl ${PACKAGE_URL}/common/docker/install-docker.sh |bash -x -s ${PACKAGE_URL}
  3. Change the Docker installation directory

     systemctl stop docker
     mv /var/lib/docker /mnt/data01/
     ln -s /mnt/data01/docker /var/lib/docker
     systemctl start docker

    Verify that the setup works:

     su - easyops
     docker ps
     docker-compose -v

    Deployment process - Figure 1
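
    As a further check, a minimal sketch (the repo id created by add-yum-repo.sh and the exact paths depend on your environment) to confirm that the yum repo from step 1 is visible and that Docker is using the relocated data directory:

     # run as root on the EasyOps host
     yum repolist | grep -i easy            # the repo added by add-yum-repo.sh should appear (its name may differ)
     docker info -f '{{.DockerRootDir}}'    # should print /var/lib/docker, which is now a symlink to /mnt/data01/docker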

2 EasyOps Installation

Perform the following steps as the easyops user.

  1. Download and extract the installation package

     cd /home/easyops/easyops
     wget <package repository URL>/common/easyops/easyops-2.0.1.1.tar.gz
     tar zxvf easyops-2.0.1.1.tar.gz
  2. Generate the encrypted database password

     cd /home/easyops/easyops/easyops-2.0.1.1/
     tar -zxvf openlogic-openjdk-8u352-b08-linux-x64.tar.gz
     openlogic-openjdk-8u352-b08-linux-x64/bin/java -jar genePassWord.jar <DB password>

    This produces the encrypted password, e.g. ENC(NlqvsvxSSHUixVrh1Odi3/T3J68M+QUDomRpxUcW9McqoH7B5khgZzwzfKUYEoYv); fill it into application.yml.

  3. Edit the easyops-manager component's application.yml and fill in the address, port, username and password of the external DB

     cd /home/easyops/easyops/easyops-2.0.1.1/easyops-manager
     vi application.yml
         datasource:
           url: jdbc:mysql://x.x.x.x:3306/easyops?useSSL=false&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useLegacyDatetimeCode=false&serverTimezone=Asia/Shanghai
           username: xx
           password: <encrypted password>
  4. Start the services and check that all containers are in the Up state

     cd /home/easyops/easyops/easyops-2.0.1.1
     ./start_services.sh
     docker ps

    Note: if the external DB is under-resourced, component data may still be loading asynchronously when you open the web UI after the services start. Wait for the load to finish; components may remain greyed out even after the license has been imported. The sketch below shows one way to watch the containers and logs while waiting.
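
    A minimal way to watch progress (the container name easyops-manager follows the default used elsewhere in this guide; adjust if yours differs):

     docker ps --format 'table {{.Names}}\t{{.Status}}'   # every container should report "Up"
     docker logs -f --tail 100 easyops-manager            # follow the manager log until the asynchronous load finishes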

3 Playbook Import

wget ${package server address}/common/easyops/upgrade_playbook_700.sh
bash upgrade_playbook_700.sh ${EasyOps deployment directory}/services-spec/tools ${package server address}

# Example usage
bash upgrade_playbook_700.sh /home/easyops/easyops/easyops-current/services-spec/tools/ http://repo.bdms.service.163.org/EasyData-V7.0/

4 License Application and Activation

  1. License application

    Deployment process - Figure 2. The application requires the ESN code, which can be obtained with the following commands:

     # Log in to the server hosting EasyOps and run as root:
     dmidecode -s system-serial-number

     # If the host does not have this command, get the ESN from the container instead:
     docker exec easyops-manager sh -c "dmidecode -s system-serial-number"
  2. License activation

    Log in to http://<easyops-ip>:8000 with the easyops / Easyops@1024 account, then click "许可证" (License) in the upper-right corner to open the activation page.

    Deployment process - Figure 3

    You can either enter a License Key or upload a license file. Deployment process - Figure 4

5 Importing the External DB

Import the external DB via the 万象 service: fill in the instance name (recommended: default_ntesmysqlpass for 万象, default_rds for RDS), the cluster, and the database version (any version can be selected).

Deployment process - Figure 5

Click Next and fill in the VIP (the customer's database address), username, password, port, the management address managerUrl (optional), and disable_ddl.

Deployment process - Figure 6

mysql_user and mysql_user_password are the username and password of the external DB provided by the customer. If the newly imported database is a 万象 database, managerUrl can be set to the 万象 manager_url; if it is an Alibaba Cloud RDS instance, managerUrl can be set to the RDS management URL. Fill in disable_ddl according to the two cases below:

  • If the customer provides an ordinary user, set disable_ddl to true.
  • If the customer provides root or another high-privilege user (with permission to create databases and users), set disable_ddl to false; a sketch for checking the grants follows this list. Then click Next, check "跳过此步骤" (skip this step) to skip the topology check, and click Next again to complete the import.
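
One hedged way to decide the disable_ddl value is to inspect the grants of the account the customer provided (assuming a mysql client is available; host, port and user below are placeholders):

    mysql -h <mysql_host> -P <mysql_port> -u <mysql_user> -p -e "SHOW GRANTS FOR CURRENT_USER();"
    # If the grants include CREATE and CREATE USER (or ALL PRIVILEGES), treat the account as high-privilege: disable_ddl=false
    # Otherwise treat it as an ordinary account: disable_ddl=true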

Deployment process - Figure 7

6 EasyOps n9e Alerting Module Deployment (Optional)

6.1 Confirm service connectivity

This module can be deployed on a node other than the EasyOps node. Choose the deployment node according to on-site requirements and confirm connectivity (example checks follow the list below):

  • If mail alerting is used, confirm connectivity to the mail server
  • If WeCom (企业微信) robot alerting is used, confirm connectivity to https://qyapi.weixin.qq.com
  • If DingTalk robot alerting is used, confirm connectivity to https://oapi.dingtalk.com
  • If Feishu robot alerting is used, confirm connectivity to https://open.feishu.cn
  • For a standalone deployment, confirm connectivity from the n9e node to MySQL and Prometheus on the EasyOps node
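
A minimal connectivity check, assuming curl and nc are available on the candidate n9e node (the URLs are the ones listed above; the MySQL/Prometheus ports are typical defaults and may differ in your environment):

    curl -sI --connect-timeout 5 https://qyapi.weixin.qq.com | head -1   # WeCom robot
    curl -sI --connect-timeout 5 https://oapi.dingtalk.com   | head -1   # DingTalk robot
    curl -sI --connect-timeout 5 https://open.feishu.cn      | head -1   # Feishu robot
    nc -zv <easyops-ip> 3306   # MySQL on the EasyOps node (standalone deployment only)
    nc -zv <easyops-ip> 9090   # Prometheus on the EasyOps node (standalone deployment only)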

6.2 Extract the n9e installation package

# Download the n9e installation package to /home/easyops/easyops/
# Extract it; n9e-5.3.4.2.tar.gz sits in the same directory as the easyops package
cd /home/easyops/easyops/
wget ${package server}/easyops/n9e-5.3.4.2.tar.gz
tar -xzvf n9e-5.3.4.2.tar.gz
# Enter the directory
cd n9e-5.3.4.2/

6.3 Import SQL

sh n9e-scripts/init_n9e_sql.sh

6.4 Configure the mail service

If mail alerting is not used, this configuration can be skipped; for other channels such as Feishu or WeCom, refer to the Easyops(192) alerting usage guide (告警使用说明).
Mail settings are configured in the n9eetc.temp/script/notify.py file. Before configuring, make sure the mail account is valid (you can verify it with a mail client first).

Deployment process - Figure 8

6.5 Start the n9e module

Run run.sh as the easyops user to start the module. Output like the following indicates it is running normally:

[easyops@build0 n9e-5.3.4.2]$ ./run.sh
Loaded image: ulric2019/nightingale:5.3.4.2
Loaded image: redis:6.2
redis
nserver
nwebapi
Creating redis ... done
Creating nwebapi ... done
Creating nserver ... done
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
nwebapi is ok
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
nserver is ok

If alerting is not needed, the module can simply be stopped with ./stop.sh. Use docker ps to check whether the containers are running and docker logs to view container logs (see the sketch below).
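
For example (container names taken from the run.sh output above):

    docker ps | grep -E 'nserver|nwebapi|redis'   # the three n9e containers should be Up
    docker logs --tail 100 nserver                # inspect recent nserver logs
    ./stop.sh                                     # stop the module if alerting is not needed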

6.6 Update the EasyOps configuration for accessing n9e

Q: Why is this configuration needed? A: The EasyOps web UI adds an entry point to the alerting web UI, so the alerting web entry address must be configured.

cd /home/easyops/easyops/n9e-5.3.4.2
sh n9e-scripts/change_n9e_ip.sh <business IP of the EasyOps node>

6.7 Log in and verify the alerting module

Log in via the web page at http://${easyops-ip}:8000/n9e/login, using the default root / Easyops@1024 account: Deployment process - Figure 9

Or navigate to it from the EasyOps UI:

Deployment process - Figure 10

The page after a successful deployment:

Deployment process - Figure 11

At this point n9e is installed and running. There are no alert rules by default; import the required rules as described below (see also the alert onboarding guide 《告警接入使用说明》), or create your own.

6.8 Import alert rules

Alert management -> Alert rules -> More actions -> Import alert rules

The JSON file is located in the package directory at common/easyops/alert-7.0.json.

6.9 Connect 万象 to alerting

Enter the easyops-manager container:

docker exec -it easyops-manager sh

Register the metrics with Prometheus by running:

curl  --header 'Content-Type: application/json;charset=UTF-8'  -X POST 'localhost:8097/api/v1/add_ntesmysqlpass_monitor' -d '{"managerUrl":["172.30.4.135:8080","172.30.4.136:8080","172.30.4.137:8080"],"mysqlUrl":["172.30.4.135:9401","172.30.4.136:9401"]}'
  • The managerUrl list contains the monitoring addresses of the 万象 managers: manager host IP + manager port. The port defaults to 8080; the IPs are all the ssh_host IPs from the 万象 configuration file.
  • The mysqlUrl list contains the monitoring addresses of the MySQL primaries, written as IP + monitoring port. The IPs are the first two IPs in the ssh_host list of the 万象 configuration file. The monitoring port is computed as mysql port - 3306 + 9401; for example, a MySQL port of 4306 gives 4306 - 3306 + 9401 = 10401 (see the sketch below).
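
For reference, the monitoring-port arithmetic as a small shell sketch (4306 is just the example value from above):

    mysql_port=4306
    monitor_port=$(( mysql_port - 3306 + 9401 ))
    echo ${monitor_port}    # prints 10401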

Service Installation

1 Global Configuration

- VOLUME_MOUNT_PREFIX: data-disk directory prefix, e.g. /mnt/data
- BASE_DIR: service installation path; standardized as /usr/easyops
- SSH_USER: SSH login username on the target hosts; default easyops
- JAVA_HOME: JDK installation path; do not change; default /usr/easyops/jdk8
- BASE_LOG_DIR: log storage path; standardized as /mnt/data01/logs
- SSH_PASSWORD: SSH login password of the target hosts; leave empty if key-based authentication is configured
- UNINSTALL_ACTION: default uninstall behavior; do not change
- PACKAGE_BASE_URL: package download path; standardized as http://<package server>/bdms/centos7
- PYTHON_HOME: default python2 installation path; do not change; default /usr/lib64/python2
- SSH_PRIVATE_KEY: private key generated for the easyops user
- SSH_PORT: default SSH login port
- ANSIBLE_TRANSPORT: connection method used by ansible; default agent
- EASYOPS_REPO_BASEURL: dependency package download path; standardized as http://<package server>/bdms/centos7/os/x86_64/
- NODE_EXPORTER_PORT: port of the host metrics exporter (node_exporter); do not change; default 9100
- ENABLE_AUTO_START: automatically restart a service when its check fails; do not change; default false
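
For illustration only, a filled-in set of values might look like the following (the package-server host is a placeholder; these are entered on the EasyOps global configuration page, not as shell variables):

    VOLUME_MOUNT_PREFIX  = /mnt/data
    BASE_DIR             = /usr/easyops
    SSH_USER             = easyops
    SSH_PORT             = 22
    JAVA_HOME            = /usr/easyops/jdk8
    BASE_LOG_DIR         = /mnt/data01/logs
    PACKAGE_BASE_URL     = http://<package server>/bdms/centos7
    EASYOPS_REPO_BASEURL = http://<package server>/bdms/centos7/os/x86_64/
    PYTHON_HOME          = /usr/lib64/python2
    ANSIBLE_TRANSPORT    = agent
    NODE_EXPORTER_PORT   = 9100
    ENABLE_AUTO_START    = false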

2. Add Hosts

Main page -> Hosts -> Add hosts. Fill in as follows (an illustrative host list follows this list):
- Host list: one "hostname rack" entry per host
- Fill in the cluster host names (they must be fully-qualified domain names, i.e. hostname + domain, e.g. host1.xxx.xxx)
- Fill in the rack information according to the actual racks of the datanodes; for example, if everything is in one rack, rack01 can be used
- Host login information (takes precedence): if configured here, the global settings do not apply; if left empty, the global settings apply
-     SSH username: the default SSH login username on the target host (use easyops where possible)
-     SSH port: the default SSH login port
- Authentication type (choose one):
-     Private Key: the private key generated for the easyops user
-     Password: the SSH login password of the target host
- Wait for the add-host operation to complete
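
An illustrative host list (hostnames and racks below are hypothetical; use your real FQDNs and rack layout):

    host1.bdms.com rack01
    host2.bdms.com rack01
    host3.bdms.com rack02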

3. Install Services

Go to the EasyOps main page -> Overview, click "安装技术栈" (Install stack), and use the YAML quick configuration (jump to step 5) to deploy the whole cluster quickly: edit the YAML content and select the DEFAULT cluster.

Deployment process - Figure 12

Either the standalone or the HA version of nginx can be used; the sample uses one of them, adjust it as needed. If knox and the nginx/nginx_ha server are installed on the same node, be sure to modify the knox configuration as prompted in the YAML. Domain-name access is optional and must be confirmed in advance. In the YAML, "*" stands for all hosts registered with EasyOps; if the registered hosts include sandboxes or multiple clusters, replace "*" with an explicit list of the hosts to install on. The standard 8-node deployment YAML is as follows:

- service: kerberos
  componentList:
    - component: master
      hostList: ["xinode2.local"]
    - component: slave
      hostList: ["xinode3.local"]
    - component: client
      hostList: ["*"]
  configList:
    global:
      local_realm: "BDMS.COM"
    krb:
      domain_realm_maps: "{\"BDMS.COM\":[\"bdms.com\",\".bdms.com\"]}"

- service: ldap
  componentList:
    - component: server
      hostList: ["node1.local", "node2.local"]
    - component: client
      hostList: ["*"]
  configList:
    slapd-init.ldif:
      ldap_domain: "bdms.com"

# MySQL currently supports only CentOS 7 and RedHat 7, and can only be used for POC projects
- service: mysql
  componentList:
    - component: server
      hostList: ["node1.local"]

- service: easy_ranger
  componentList:
    - component: admin
      hostList: ["xinode2.local","xinode3.local"]
  configList:
    env:
      "JAVA_MEM_OPTS": "-XX:MetaspaceSize=100m -XX:MaxMetaspaceSize=200m -Xmx3g -Xms3g"

- service: zookeeper
  componentList:
    - component: server
      hostList: ["node{3-5}.local"]
    - component: client
      # install the client on all hosts
      hostList: ["*"]
  configList:
    env:
      JAVA_OPTS: "-server -Xmx3g -Xms3g -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=512M"  

- service: kafka
  componentList:
    - component: manager
      hostList: ["node4.local"]
    - component: broker
      # must be installed on at least 2 nodes; 3 recommended
      hostList: ["node{4-5}.local"]
  configList:
    env:
      heap_args: "-Xmx3g -Xms3g"
    manager:
      manager_heap_args: "-J-Xms1g -J-Xmx1g"  

- service: elasticsearch
  componentList:
    - component: master
      hostList: ["node{3-5}.local"]
    - component: data
      hostList: ["node{3-5}.local"]
  configList:
    data_jvm:
      jvm_heap_size: "4g"
    master_jvm:
      jvm_heap_size: "1g"

- service: neo4j
  componentList:
    - component: server
      hostList: ["xinode2.local", "xinode3.local", "xinode4.local"]

# -----------------Choose either nginx or nginx_ha; production environments must use nginx_ha-----------------
- service: nginx
  componentList:
    - component: server
      hostList: ["node2.local"]

- service: nginx_ha
  componentList:
    - component: server
      hostList: ["node3.local","node4.local"]
  configList:
    keepalived_common:
      default_virtual_ip_address: "<VIP address>"

- service: hdfs
  componentList:
    # zkfc must be on the same hosts as namenode
    - component: zkfc
      hostList: ["node1.local", "node2.local"]
    - component: namenode
      hostList: ["node1.local", "node2.local"]
    - component: journalnode
      hostList: ["node{1-3}.local"]
    - component: datanode
      hostList: ["node[1-5].local"]
    - component: client
      # install the client on all hosts
      hostList: ["*"]

- service: yarn
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: nodemanager
      hostList: ["node[1-4].local"]
    - component: resourcemanager
      hostList: ["node3.local", "node4.local"]
    - component: historyserver
      hostList: ["node2.local"]
  configList:
    mapred_env:
      HADOOP_JOB_HISTORYSERVER_HEAPSIZE: 900
    yarn_env:
      "YARN_RESOURCEMANAGER_HEAPSIZE": "8192"
      "YARN_NODEMANAGER_HEAPSIZE": "4096"

- service: knox
  componentList:
    - component: server
      hostList: ["node3.local","node4.local"]
  configList:
    global:
      # in single-node nginx mode, access is via the nginx IP address, not the hostname
      domain: "<nginx external domain or IP>:8889"

- service: hive
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: hiveserver
      hostList: ["node{1-3}.local"]
    - component: metastore
      hostList: ["node{4-5}.local"]
  configList:
    metastore:
      "hive_metastore_jvm_opts": "-Xmx12g -Xms12g -XX:PermSize=512m -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M -XX:+HeapDumpOnOutOfMemoryError"  
    hiveserver:
      "hive_hiveserver_jvm_opts": "-Xmx18g -Xms18g -Xmn6000m -XX:MaxNewSize=6000m -XX:PermSize=2G -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"

- service: impala
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: catalogd
      hostList: ["node5.local"]
    - component: statestored
      hostList: ["node3.local"]
    - component: impalad
      hostList: ["node4.local"]

- service: spark2
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: jobhistoryserver
      hostList: ["node4.local"]

- service: kyuubi
  componentList:
    - component: service
      hostList: ["xinode2.local", "xinode3.local"]

- service: hbase
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: master
      # master must be deployed on hdfs client nodes
      hostList: ["node1.local", "node2.local"]
    - component: regionserver
      # must be deployed on datanode nodes
      hostList: ["node{3-4}.local"]
  configList:
    env: 
      "HBASE_MASTER_HEAPSIZE": "2g"

- service: redis_sentinel
  componentList:
    - component: server
      hostList: ["xinode4.local"]
    - component: slave
      hostList: ["xinode2.local","xinode3.local"]
    - component: sentinel
      hostList: ["xinode2.local","xinode3.local","xinode4.local"]

- service: hadoop_meta
  componentList:
    - component: service
      hostList: ["xinode1.local","xinode2.local"]
    # deploy on the YARN ResourceManager nodes
    - component: scheduler
      hostList: ["xinode3.local", "xinode4.local"]
    # must be deployed on the kerberos master node with hdfs-client installed; single-node deployment only
    - component: kdc
      hostList: ["xinode2.local"]

# currently supports single-node deployment only
- service: meta_service
  componentList:
    - component: service
      hostList: ["xinode1.local"]
  configList:
    env:
      "JAVA_OPTS": "-Xmx512M -Xms512M -server"

- service: easyeagle
  componentList:
    # backend supports single-node installation only
  - component: backend
    hostList: ["xinode3.local"]
  - component: parseservice
    # pick any node in the default_yarn instance that has the yarn client component; parseservice supports single-node installation only; if there is no default_yarn, consult the deployment team
    hostList: ["xinode3.local"]
  - component: collector
    # pick all nodes in the default_yarn instance that have the yarn nodemanager component; if there is no default_yarn, consult the deployment team
    hostList: ["xinode5.local","xinode6.local", "xinode7.local", "xinode8.local"]

- service: bdms_meta
  componentList:
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-Xms512M -Xmx512M -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"  

- service: mammut
  componentList:
    - component: executor
      hostList: ["xinode1.local", "xinode2.local"]
    - component: webserver
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    executor: 
      "JAVA_OPTS": "-Xmx1g -Xms1g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"  
    webserver:
      "JAVA_OPTS": "-Xmx1g -Xms1g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"  

- service: azkaban
  componentList:
    - component: exec
      hostList: ["xinode3.local", "xinode4.local"]
      volumeList:
        - <data-disk mount path; a dedicated disk is recommended>
    - component: fc
      hostList: ["xinode3.local", "xinode4.local"]
    - component: web
      hostList: ["xinode3.local", "xinode4.local"]
    - component: lib
      hostList: ["xinode3.local", "xinode4.local"]
  configList:
    exec: 
      "AZKABAN_OPTS": "-Xmx3G -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"    
    web:
      "AZKABAN_OPTS": "-Xmx3G -Xms3G -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"

- service: easy_account
  version: 3.7.5.4
  componentList:
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_alert
  componentList:
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      JAVA_OPTS: "-server -Xmx1g -Xms1g -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"  

- service: easy_console
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_webmaster
  componentList:
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    nginx:
      # single-node nginx is accessed via the nginx IP; nginx_ha defaults to the VIP address; a domain name can be specified
      nginx_server_name: "http://<IP or domain>:11062"

- service: easy_aac
  componentList:
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_ddl
  componentList:
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_access
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: client
      # deploy on the azkaban executor nodes
      hostList: ["xinode3.local", "xinode4.local"]
  configList:
    env:
      JAVA_OPTS: "-Xmx8G -Xms8G -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M -XX:ParallelGCThreads=35 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly  -XX:+ParallelRefProcEnabled -XX:+PrintHeapAtGC -XX:NewSize=4g -XX:MaxNewSize=4g" 

- service: easy_metahub
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx4g -Xms4g -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"

- service: easy_transfer
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: client
      # deploy on the azkaban executor nodes
      hostList: ["xinode3.local", "xinode4.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx1g -Xms1g -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -XX:+PrintTenuringDistribution -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"   

- service: easy_dmap
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_coop
  version: 1.2.2.5.1
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-Xms512m -Xmx512m -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"  

- service: easy_flow
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: engine
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_static
  componentList:
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_submit
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_udf
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_metaweb
  componentList:
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]

- service: easy_openapi
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-Xmx1g -Xms1g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M" 

- service: easy_taskops
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx4g -Xms4g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"

- service: easy_qa
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx512m -Xms512m -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"  

- service: easy_design
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx1g -Xms1g -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M" 

- service: easy_index
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-Xms1g -Xmx1g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M" 

- service: easy_tag
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-Xms1g -Xmx1g -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"

- service: easy_dqc
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: client
      # deploy on the azkaban executor nodes
      hostList: ["xinode3.local", "xinode4.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xms1g -Xmx1g -XX:MaxPermSize=128m -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"

- service: easy_test
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: client
      # deploy on the azkaban executor nodes
      hostList: ["xinode3.local", "xinode4.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx512m -Xms512m -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -XX:+PrintTenuringDistribution -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M" 

- service: easy_standard
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]


# -----------------Data Service-----------------
- service: kong
  componentList:
    - component: cassandra
      hostList: ["xinode2.local", "xinode3.local"]
    - component: kong
      hostList: ["xinode3.local", "xinode4.local"]
    - component: konga
      # single-node deployment only
      hostList: ["xinode3.local"]

- service: easy_dataservice
  componentList:
    - component: backend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: frontend
      hostList: ["xinode1.local", "xinode2.local"]
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]
    - component: monitor
      # this component supports single-node deployment only
      hostList: ["xinode1.local"]
    - component: orcserver
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      "JAVA_OPTS": "-server -Xmx1g -Xms1g -XX:SurvivorRatio=8 -Xss256k -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:MaxTenuringThreshold=15 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:-DisableExplicitGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:-UseBiasedLocking -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOGS_DIR/ -Xloggc:$LOGS_DIR/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=3M"
# -----------------Data Service-----------------


# -----------------Data Assets-----------------
- service: meta_worker
  componentList:
    - component: api_server
      hostList: ["xinode1.local"]
    - component: meta_server
      hostList: ["xinode1.local"]
  configList:
    env:
      "META_WORKER_HEAPSIZE": "-Xmx4g -Xms4g -Xmn2g"  

- service: smilodon_fsimage_audit
  componentList:
    - component: fsimage_oiv
      hostList: ["xinode1.local", "xinode2.local"]
    - component: upload_audit
      # deploy on the HDFS NameNode nodes
      hostList: ["xinode3.local", "xinode4.local"]
    - component: upload_fsimage
      # deploy on the HDFS NameNode nodes
      hostList: ["xinode3.local", "xinode4.local"]
# -----------------Data Assets-----------------


# -----------------Realtime Components-----------------
- service: ds_agent
  componentList:
    - component: agent
      # deploy on the YARN NodeManager nodes
      hostList: ["xinode5.local", "xinode6.local", "xinode7.local", "xinode8.local"]
  configList:
    ds_agent:
      "agent_jvm_conf": "-Xms256m -Xmx256m"  

- service: grafana
  componentList:
    - component: server
      hostList: ["xinode1.local"]

- service: ntsdb
  componentList:
    - component: master
      hostList: ["xinode2.local", "xinode3.local", "xinode4.local"]
    - component: shardserver
      hostList: ["xinode2.local", "xinode3.local", "xinode4.local"]

- service: logstash
  componentList:
    - component: server
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    env:
      Xms: "1g"
      Xmx: "1g"  

- service: realtime_debugger
  componentList:
    - component: plugin_server
      hostList: ["xinode1.local", "xinode2.local"]
  dependenceList:
    - ne-flink-1.10.0
    - ne-flink-1.12.4
    - ne-flink-1.13.3
    - ne-flink-1.14.0
    - plugin_cdc_ne-flink-1.13.3
    - plugin_ne-flink-1.10.0
    - plugin_ne-flink-1.12.4
    - plugin_ne-flink-1.14.0
    - ndi_ne-flink-1.13.3
  configList:
    plugin_server:
      "java_opts": "-Xms1g -Xmx1g" 

- service: realtime_monitor
  componentList:
    - component: monitor
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    monitor:
      "java_opts": "-Xms2g -Xmx2g" 

- service: realtime_ops
  componentList:
    - component: ops
      hostList: ["xinode1.local", "xinode2.local"]
    - component: web
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    ops:
      "java_opts": "-Xms1g -Xmx1g" 

- service: realtime_portal
  componentList:
    - component: portal
      hostList: ["xinode1.local", "xinode2.local"]
  configList:
    portal:
      "java_opts": "-Xms1g -Xmx1g"

- service: realtime_submitter
  componentList:
    - component: submitter
      hostList: ["xinode1.local", "xinode2.local"]
  dependenceList:
    - ne-flink-1.10.0
    - ne-flink-1.12.4
    - ne-flink-1.13.3
    - ne-flink-1.14.0
    - plugin_cdc_ne-flink-1.13.3
    - plugin_ne-flink-1.10.0
    - plugin_ne-flink-1.12.4
    - plugin_ne-flink-1.14.0
    - ndi_ne-flink-1.13.3
  configList:
    submitter:
      "java_opts": "-Xms2g -Xmx2g"

- id: ne-flink-1.10.0
  name: ne-flink-1.10.0
  service: flink
  version: ne-flink-1.10.0-1.1.8-scala-2.12
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: ne-flink-1.12.4
  name: ne-flink-1.12.4
  service: flink
  version: ne-flink-1.12.4-1.1.7.1-scala-2.12
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: ne-flink-1.13.3
  name: ne-flink-1.13.3
  service: flink
  version: ne-flink-1.13.3-1.0.1-scala-2.11
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: ne-flink-1.14.0
  name: ne-flink-1.14.0
  service: flink
  version: ne-flink-1.14.0-1.0.4-scala-2.12
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: plugin_cdc_ne-flink-1.13.3
  name: plugin_cdc_ne-flink-1.13.3
  service: flink_plugin
  version: cdc_ne-flink-1.13.3-1.0.1_scala2.11-release-3.9.4-2.1.7
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: plugin_ne-flink-1.10.0
  name: plugin_ne-flink-1.10.0
  service: flink_plugin
  version: ne-flink-1.10.0-1.1.8_scala2.12_hive2.1.1-release-3.9.4-1.4.5
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: plugin_ne-flink-1.12.4
  name: plugin_ne-flink-1.12.4
  service: flink_plugin
  version: ne-flink-1.12.4-1.1.7.1_scala2.12_hive2.1.1-release-3.9.4-1.4.5
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: plugin_ne-flink-1.14.0
  name: plugin_ne-flink-1.14.0
  service: flink_plugin
  version: ne-flink-1.14.0-1.0.4_scala2.12_hive2.1.1-release-3.9.4-1.4.5.2
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- id: ndi_ne-flink-1.13.3
  name: ndi_ne-flink-1.13.3
  service: flink_plugin
  version: ndi_ne-flink-1.13.3-1.0.1_scala2.11-release-3.0.0-1.0.0
  componentList:
    - component: client
      hostList: ["xinode1.local","xinode2.local"]

- service: sloth
  componentList:
    - component: server
      hostList: ["xinode1.local","xinode2.local"]
    - component: develop_web
      hostList: ["xinode1.local","xinode2.local"]
  dependenceList:
    - ne-flink-1.10.0
    - ne-flink-1.12.4
    - ne-flink-1.13.3
    - ne-flink-1.14.0     
    - plugin_cdc_ne-flink-1.13.3
    - plugin_ne-flink-1.10.0
    - plugin_ne-flink-1.12.4
    - plugin_ne-flink-1.14.0
    - ndi_ne-flink-1.13.3
  configList:
    server:
      "java_opts": "-Xms2g -Xmx2g"

The 4-node deployment YAML is as follows:

---
- service: nginx_ha
  componentList:
    - component: server
      hostList: ["fnode1.local","fnode2.local"]
  configList:
    keepalived_common:
      default_virtual_ip_address: "172.30.1.69"

- service: kerberos
  componentList:
    - component: master
      hostList: ["fnode1.local"]
    - component: slave
      hostList: ["fnode2.local"]
    - component: client
      # install the client on all hosts
      hostList: ["*"]

- service: ldap
  componentList:
    - component: server
      hostList: ["fnode1.local", "fnode2.local"]
    - component: client
      hostList: ["*"]

- service: easy_ranger
  componentList:
    - component: admin
      hostList: ["fnode1.local","fnode2.local"]

- service: zookeeper
  componentList:
    - component: server
      hostList: ["fnode1.local","fnode2.local", "fnode3.local"]
    - component: client
      # install the client on all hosts
      hostList: ["*"]

- service: kafka
  componentList:
    - component: manager
      hostList: ["fnode1.local"]
    - component: broker
      hostList: ["fnode1.local", "fnode2.local", "fnode3.local"]

- service: elasticsearch
  componentList:
    - component: master
      hostList: ["fnode1.local", "fnode2.local", "fnode3.local"]
    - component: data
      hostList: ["fnode1.local", "fnode2.local", "fnode3.local"]
  configList:
    data_jvm:
      jvm_heap_size: "2g"
    master_jvm:
      jvm_heap_size: "2g"

- service: ntsdb
  componentList:
    - component: master
      hostList: ["fnode1.local", "fnode2.local", "fnode3.local"]
    - component: shardserver
      hostList: ["fnode1.local", "fnode2.local", "fnode3.local"]

- service: neo4j
  componentList:
    - component: server
      hostList: ["fnode1.local", "fnode2.local", "fnode3.local"]

- service: grafana
  componentList:
    - component: server
      hostList: ["fnode2.local"]

- service: redis_sentinel
  componentList:
    - component: server
      hostList: ["fnode1.local"]
    - component: slave
      hostList: ["fnode2.local","fnode3.local"]
    - component: sentinel
      hostList: ["fnode1.local","fnode2.local","fnode3.local"]

- service: ds_agent
  componentList:
    # deploy on all nodemanager nodes
    - component: agent
      hostList: ["fnode2.local","fnode3.local", "fnode4.local"]

- service: logstash
  componentList:
    - component: server
      hostList: ["fnode3.local"]

- service: hdfs
  componentList:
    # zkfc must be on the same hosts as namenode
    - component: zkfc
      hostList: ["fnode1.local", "fnode2.local"]
    - component: namenode
      hostList: ["fnode1.local", "fnode2.local"]
    - component: journalnode
      hostList: ["fnode2.local","fnode3.local", "fnode4.local"]
    - component: datanode
      hostList: ["fnode2.local","fnode3.local", "fnode4.local"]
    - component: client
      # install the client on all hosts
      hostList: ["*"]

- service: yarn
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: nodemanager
      hostList: ["fnode2.local","fnode3.local", "fnode4.local"]
    - component: resourcemanager
      hostList: ["fnode1.local", "fnode2.local"]
    - component: historyserver
      hostList: ["fnode2.local"]
  configList:
    yarn_site:
      "yarn.scheduler.minimum-allocation-mb": "512"
      "yarn.scheduler.maximum-allocation-mb": "14096"
      "yarn.nodemanager.resource.memory-mb": "44096"
    mapred_site:
      "mapreduce.map.memory.mb": "512"
      "mapreduce.map.java.opts": "-Xmx819m"
      "mapreduce.reduce.memory.mb": "1024"
      "mapreduce.reduce.java.opts": "-Xmx1638m"
    mapred_env:
      HADOOP_JOB_HISTORYSERVER_HEAPSIZE: 900

- service: hive
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: hiveserver
      hostList: ["fnode2.local","fnode3.local"]
    - component: metastore
      hostList: ["fnode1.local","fnode4.local"]

# must be deployed on nodes with the hdfs client and hive client installed
- service: impala
  componentList:
    - component: client
      # install the client on all hosts
      hostList: ["*"]
    - component: catalogd
      hostList: ["fnode2.local"]
    - component: impalad
      hostList: ["fnode3.local"]
    - component: statestored
      hostList: ["fnode2.local"]
  configList:
    statestored:
      "mem_limit": "2g"
    impalad:
      "mem_limit": "2g"
    catalogd:
      "mem_limit": "2g"

- service: spark2
  componentList:
    - component: client
      # install the client on all hosts
      # must be deployed on yarn client nodes
      hostList: ["*"]
    - component: jobhistoryserver
      hostList: ["fnode3.local"]
  configList:
    conf_spark_defaults:
      "spark.driver.memory": "2g"
      "spark.executor.memory": "2g"

- service: kyuubi
  componentList:
    - component: service
      hostList: ["fnode1.local"]
  configList:
    common:
      "spark.driver.memory": "4g"
      "spark.executor.memory": "4g"
      "spark.yarn.am.memory": "2g"

- service: hbase
  componentList:
    - component: client
      # client must be deployed on hdfs client nodes
      # install the client on all hosts
      hostList: ["*"]
    - component: master
      # master must be deployed on hdfs client nodes
      hostList: ["fnode2.local", "fnode3.local"]
    - component: regionserver
      # must be deployed on datanode nodes
      hostList: ["fnode2.local", "fnode3.local"]

- service: knox
  componentList:
    - component: server
      hostList: ["fnode1.local", "fnode2.local"]
  configList:
    global:
    # the port can be changed as needed to avoid a listen-port conflict when nginx and knox are deployed on the same host
    # in single-node nginx mode, access is via the nginx IP address, not the hostname
      domain: "<nginx external domain or IP>:8889"

- service: meta_worker
  componentList:
    - component: api_server
      hostList: ["fnode3.local"]
    - component: meta_server
      hostList: ["fnode3.local"]

- service: easyeagle
  componentList:
  # pick any node; install only one backend
  - component: backend
    hostList: ["fnode1.local"]
  # pick any node in the default_yarn instance that has the yarn client component; install only one parseservice; if there is no default_yarn, consult the deployment team
  - component: parseservice
    hostList: ["fnode1.local"]
  # pick all nodes in the default_yarn instance that have the yarn nodemanager component; if there is no default_yarn, consult the deployment team
  - component: collector
    hostList: ["fnode2.local","fnode3.local", "fnode4.local"]

- service: smilodon_fsimage_audit
  componentList:
    - component: fsimage_oiv
      hostList: ["fnode1.local"]
    - component: upload_audit
      hostList: ["fnode1.local", "fnode2.local"]
    - component: upload_fsimage
      hostList: ["fnode1.local", "fnode2.local"]

- service: hadoop_meta
  componentList:
    - component: service
      # must be deployed on a node with hdfs-client installed
      hostList: ["fnode4.local"]
    - component: scheduler
      # deploy on the YARN ResourceManager nodes
      hostList: ["fnode1.local", "fnode2.local"]
    - component: kdc
      # must be deployed on the kerberos master node with hdfs-client installed
      hostList: ["fnode1.local"]

# change meta.hadoop.clusters in the service configuration group to match fs.defaultFS defined in HDFS
- service: meta_service
  componentList:
    - component: service
      # must be deployed on a node with hdfs-client installed
      hostList: ["fnode4.local"]

- service: bdms_meta
  componentList:
    - component: server
      hostList: ["fnode4.local"]

# must be deployed on a node with hdfs-client installed
- service: mammut
  componentList:
    - component: executor
      hostList: ["fnode4.local"]
    - component: webserver
      hostList: ["fnode4.local"]

# the installation hosts need the hdfs client, hive client and spark client
- service: azkaban
  componentList:
    - component: exec
      hostList: ["fnode2.local", "fnode3.local"]
    - component: fc
      hostList: ["fnode2.local", "fnode3.local"]
    - component: web
      hostList: ["fnode2.local", "fnode3.local"]
    - component: lib
      hostList: ["fnode2.local", "fnode3.local"]

- service: easy_account
  version: 3.7.5.4
  componentList:
    - component: server
      hostList: ["fnode4.local"]

- service: prometheus
  componentList:
    - component: server
      hostList: ["fnode3.local"]

- service: easy_alert
  componentList:
    - component: server
      hostList: ["fnode3.local"]

- service: easy_access
  componentList:
    - component: backend
      hostList: ["fnode3.local"]
    - component: frontend
      hostList: ["fnode3.local"]
    - component: client
      hostList: ["fnode2.local", "fnode3.local"]

#  1) hbase configuration: log in to a host with the hbase client installed (hbase 1.2.6)
#  2) elasticsearch: confirm that the indexes used by metahub exist, for example
- service: easy_metahub
  componentList:
    - component: backend
      hostList: ["fnode4.local"]

- service: easy_transfer
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]
    - component: client
      # install the client on all hosts
      hostList: ["fnode2.local", "fnode3.local"]

- service: easy_dmap
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]

- service: easy_design
  componentList:
    - component: backend
      hostList: ["fnode3.local"]
    - component: frontend
      hostList: ["fnode3.local"]

- service: easy_coop
  version: 1.2.2.5.1
  componentList:
    - component: backend
      hostList: ["fnode3.local"]
    - component: frontend
      hostList: ["fnode3.local"]

- service: easy_index
  componentList:
    - component: backend
      hostList: ["fnode3.local"]
    - component: frontend
      hostList: ["fnode3.local"]

- service: easy_tag
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]

- service: easy_taskops
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]

- service: easy_dqc
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]
    - component: client
      # install the client on all hosts
      hostList: ["fnode2.local", "fnode3.local"]

- service: easy_test
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]
    - component: client
      hostList: ["fnode2.local", "fnode3.local"]

- service: easy_qa
  componentList:
    - component: backend
      hostList: ["fnode3.local"]
    - component: frontend
      hostList: ["fnode3.local"]

- service: easy_aac
  componentList:
    - component: server
      hostList: ["fnode3.local"]

- service: easy_ddl
  componentList:
    - component: server
      hostList: ["fnode3.local"]

- service: easy_dataservice
  componentList:
    - component: backend
      hostList: ["fnode3.local"]
    - component: frontend
      hostList: ["fnode3.local"]
    - component: server
      hostList: ["fnode3.local"]
    - component: monitor
      hostList: ["fnode3.local"]
    - component: orcserver
      hostList: ["fnode3.local"]
- service: kong
  componentList:
    - component: cassandra
      hostList: ["fnode1.local","fnode2.local"]
    - component: kong
      hostList: ["fnode1.local","fnode2.local"]
    - component: konga
      hostList: ["fnode1.local"]

- service: easy_console
  componentList:
    - component: backend
      hostList: ["fnode2.local"]
    - component: frontend
      hostList: ["fnode2.local"]

- service: easy_webmaster
  componentList:
    - component: frontend
      hostList: ["fnode2.local"]

- service: easy_standard
  componentList:
    - component: backend
      hostList: ["fnode2.local"]
    - component: frontend
      hostList: ["fnode2.local"]

- service: easy_metaweb
  componentList:
    - component: frontend
      hostList: ["fnode1.local"]

- service: easy_openapi
  componentList:
    - component: backend
      hostList: ["fnode1.local"]

- service: easy_flow
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]
    - component: engine
      hostList: ["fnode4.local"]

- service: easy_static
  componentList:
    - component: frontend
      hostList: ["fnode4.local"]

- service: easy_submit
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]

- service: easy_udf
  componentList:
    - component: backend
      hostList: ["fnode4.local"]
    - component: frontend
      hostList: ["fnode4.local"]

- id: ne-flink-1.10.0
  name: ne-flink-1.10.0
  service: flink
  version: ne-flink-1.10.0-1.1.8-scala-2.12
  componentList:
    - component: client
      hostList: ["fnode2.local"]
- id: ne-flink-1.12.4
  name: ne-flink-1.12.4
  service: flink
  version: ne-flink-1.12.4-1.1.7.1-scala-2.12
  componentList:
    - component: client
      hostList: ["fnode2.local"]
- id: ne-flink-1.13.3
  name: ne-flink-1.13.3
  service: flink
  version: ne-flink-1.13.3-1.0.1-scala-2.11
  componentList:
    - component: client
      hostList: ["fnode2.local"]
- id: ne-flink-1.14.0
  name: ne-flink-1.14.0
  service: flink
  version: ne-flink-1.14.0-1.0.4-scala-2.12
  componentList:
    - component: client
      hostList: ["fnode2.local"]

- id: plugin_cdc_ne-flink-1.13.3
  name: plugin_cdc_ne-flink-1.13.3
  service: flink_plugin
  version: cdc_ne-flink-1.13.3-1.0.1_scala2.11-release-3.9.4-2.1.7
  componentList:
    - component: client
      hostList: ["fnode2.local"]

- id: plugin_ne-flink-1.10.0
  name: plugin_ne-flink-1.10.0
  service: flink_plugin
  version: ne-flink-1.10.0-1.1.8_scala2.12_hive2.1.1-release-3.9.4-1.4.5
  componentList:
    - component: client
      hostList: ["fnode2.local"]

- id: plugin_ne-flink-1.12.4
  name: plugin_ne-flink-1.12.4
  service: flink_plugin
  version: ne-flink-1.12.4-1.1.7.1_scala2.12_hive2.1.1-release-3.9.4-1.4.5
  componentList:
    - component: client
      hostList: ["fnode2.local"]

- id: plugin_ne-flink-1.14.0
  name: plugin_ne-flink-1.14.0
  service: flink_plugin
  version: ne-flink-1.14.0-1.0.4_scala2.12_hive2.1.1-release-3.9.4-1.4.5.2
  componentList:
    - component: client
      hostList: ["fnode2.local"]

- id: ndi_ne-flink-1.13.3
  name: ndi_ne-flink-1.13.3
  service: flink_plugin
  version: ndi_ne-flink-1.13.3-1.0.1_scala2.11-release-3.0.0-1.0.0
  componentList:
    - component: client
      hostList: ["fnode2.local"]

- service: sloth
  componentList:
    - component: server
      hostList: ["fnode2.local"]
    - component: develop_web
      hostList: ["fnode2.local"]
  dependenceList:
    - ne-flink-1.10.0
    - ne-flink-1.12.4
    - ne-flink-1.13.3
    - ne-flink-1.14.0     
    - plugin_cdc_ne-flink-1.13.3
    - plugin_ne-flink-1.10.0
    - plugin_ne-flink-1.12.4
    - plugin_ne-flink-1.14.0
    - ndi_ne-flink-1.13.3

- service: realtime_submitter
  componentList:
    - component: submitter
      hostList: ["fnode2.local"]
  dependenceList:
    - ne-flink-1.10.0
    - ne-flink-1.12.4
    - ne-flink-1.13.3
    - ne-flink-1.14.0     
    - plugin_cdc_ne-flink-1.13.3
    - plugin_ne-flink-1.10.0
    - plugin_ne-flink-1.12.4
    - plugin_ne-flink-1.14.0
    - ndi_ne-flink-1.13.3

- service: realtime_debugger
  componentList:
    - component: plugin_server
      hostList: ["fnode2.local"]
  dependenceList:
    - ne-flink-1.10.0
    - ne-flink-1.12.4
    - ne-flink-1.13.3
    - ne-flink-1.14.0     
    - plugin_cdc_ne-flink-1.13.3
    - plugin_ne-flink-1.10.0
    - plugin_ne-flink-1.12.4
    - plugin_ne-flink-1.14.0
    - ndi_ne-flink-1.13.3

- service: realtime_ops
  componentList:
    - component: ops
      hostList: ["fnode2.local"]
    - component: web
      hostList: ["fnode2.local"]

- service: realtime_monitor
  componentList:
    - component: monitor
      hostList: ["fnode2.local"]


- service: realtime_portal
  componentList:
    - component: portal
      hostList: ["fnode2.local"]

After modifying each of the services above, remember to save the new configuration group, then click 「开始安装」 (Start installation) and wait. Because many components are installed, this step takes a long time, roughly 2 hours.

Common Issues

1) A "user not found" error occurs when running tasks. Solution: this is almost always an LDAP client synchronization problem; try restarting the nslcd service on the node where the user cannot be found (see the sketch below).
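
A minimal sketch of that restart (run as root on the affected node; the username is a placeholder):

    systemctl restart nslcd
    id <username>    # the LDAP user should now resolve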

2) Kong installation fails at the "为新增consumer增加api keys" (add API keys for new consumers) step. Solution: delete the installation directories of the kong components on the backend, then retry from the UI.

3) real_time installation fails. Deployment process - Figure 13

Solution: delete the records related to grafana_datasource_id and access_key from the system_config table of the monitor database, then retry the installation (a hedged SQL sketch follows). Deployment process - Figure 14
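
A hedged sketch of the cleanup (the monitor DB connection details and the exact column holding the key names depend on the environment; inspect the table before deleting):

    mysql -h <db_host> -u <db_user> -p -e "SELECT * FROM monitor.system_config;"
    # locate the rows for grafana_datasource_id and access_key, then delete them, e.g.:
    # mysql -h <db_host> -u <db_user> -p -e "DELETE FROM monitor.system_config WHERE <key_column> IN ('grafana_datasource_id','access_key');"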

4) A scale-out operation reports failure, but the node was still added. Solution: delete the newly added node (or stop it first and then force-delete it) and run the scale-out again.

5) sloth installation fails with a timeout. Deployment process - Figure 15

Solution: retry the installation.

6) easy_metahub or easy_transfer installation fails with an md5 mismatch. Solution: retry the installation.

Post-deployment Operations

1. easy_access preset masking data

Deployment process - Figure 16. On the audit page, confirm that 「预置脱敏数据」 (preset masking data) executed successfully: Deployment process - Figure 17

2 Install Data Assets (Optional)

Note: this cannot be installed together with mammut in one click and must be installed separately. Confirm that the following components are already installed:

  • smilodon_fsimage_audit
  • meta_worker
  • easy_dqc
  • easyeagle

Confirm that the offline development project mammut_service has been created; create the project (without enabling a test cluster).

TODO: the project group mammut_group needs to be created.

Log in to the Mammut (猛犸) platform with the platform administrator account (admin.mammut@163.com / Ab123456)
Business line: default business line
Project group: mammut_group
Owner: admin.mammut@163.com
Cluster: easyops-cluster
Project name: mammut_service
Storage quota: 1T
Hive database: mammut_service
Queue name: mammut_service
Resource configuration: plan 1: 100/300/20
Application reason:
    Description: dedicated project for data assets

Confirm via the queue permission management page that admin.mammut@163.com has permission on the mammut_service queue.
Log in to the Mammut platform as admin.mammut@163.com, enter the mammut_service project, and add the queue permission:
Project center -> Queue permissions -> select the mammut queue -> Add authorization -> select the admin.mammut@163.com user -> Save
  1. Provide a valid user account for receiving failure alerts from the data-asset scheduling tasks; the account must have administrator permission on the mammut_service project.
  2. Install the easy_dasset service. To keep the logs easy to inspect, it is recommended to select only one backend. During installation, set the task.creator configuration item to the account from the step above (default admin.mammut@163.com). If this account is not configured, the alert recipients of the scheduling tasks must be changed manually after the asset tasks are created.
  3. If the project name is not mammut_service, change the path-related parameters in the hdfs_env configuration group of the SmilodonFsimageAudit service to the actual project name.
  4. If the installation fails, the data-warehouse initialization log can be found in ${current_dir}/logs/init_dw_tasks.log on the backend component.
  5. In internal acceptance environments, the data-warehouse initialization step of the asset installation takes a long time and usually causes the installation to time out. The fix is to confirm that the two tasks easycost-createtable-easyops and easycost-createtable-v1_4-easyops under the mammut_service project have been run immediately and completed successfully, then retry the installation.
  6. Scale out the backend component.
  7. Update the dependency management of easy_webmaster to add easy_dasset, then run easy_webmaster's custom operation to push the nginx configuration.