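Tags: Cloudera-Manager CDH Hadoop deployment cluster
Summary: Managing and deploying a Hadoop cluster requires tooling, and Cloudera Manager is one such tool. This article records in detail the steps for deploying a CDH cluster online.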
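The emergence of big-data technology led by Apache Hadoop has given small and medium-sized companies the tools to store and process big data as well.
Two Hadoop distributions are currently in wide use: Apache and Cloudera.
Apache Hadoop: has many maintainers and a fast release cadence, but is comparatively less stable.
Cloudera Hadoop (CDH): Cloudera's distribution, built on top of Apache Hadoop. It improves component compatibility and interfaces, simplifies installation and configuration, and adds Cloudera-specific compatibility features.
Big data platform CDH cluster: cdh-5.70-rpm_install, detailed procedure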
Part 1 Install the CDH server
1.1 Prepare the installation resources
CentOS Linux release 7.1.1503 (Core) cm-5.7.0
cloudera-manager-installer.bin
adduser deploy
While installing CentOS 7.1, configure the network with a static IP:
vim /etc/sysconfig/network-scripts/ifcfg-eth0
Set the boot protocol to static and specify the IP address:
DEVICE="eth0"
BOOTPROTO="static"
IPADDR=192.168.1.110
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
DNS1=8.8.8.8
DNS2=8.8.4.4
GATEWAY=192.168.1.1
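As a rough sketch, the same file can be generated from a few variables so that every node gets a consistent layout. The helper name render_ifcfg and its three parameters are made up for this illustration; the fields and DNS values are simply the ones shown above.

```shell
# render_ifcfg DEVICE IPADDR GATEWAY (hypothetical helper, not part of CentOS)
# Prints a static-IP ifcfg file to stdout using the fields from this article.
render_ifcfg() {
    device=$1
    ipaddr=$2
    gateway=$3
    cat <<EOF
DEVICE="$device"
BOOTPROTO="static"
IPADDR=$ipaddr
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
DNS1=8.8.8.8
DNS2=8.8.4.4
GATEWAY=$gateway
EOF
}

# Preview only. To apply, redirect the output as root to
# /etc/sysconfig/network-scripts/ifcfg-eth0 and restart the network service.
render_ifcfg eth0 192.168.1.110 192.168.1.1
```

Printing to stdout first makes it easy to review the result before overwriting the real interface file.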
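1.2 Network configuration (all nodes)
Change the hostname to cdh-server7.
On Red Hat-family systems, edit /etc/sysconfig/network and change the HOSTNAME line to HOSTNAME=NEWNAME, where NEWNAME is the hostname you want. On Debian-family distributions the hostname is configured in /etc/hostname. CentOS 7 also stores the hostname in /etc/hostname, and it can be set with hostnamectl set-hostname NEWNAME.
Map IP addresses to hostnames:
[root@cdh-server7 ~]# vi /etc/hosts
# Map IP addresses to hostnames:
192.168.181.190 node190
192.168.181.198 node198
192.168.181.196 node196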
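Since the same mappings have to be added on every node, a small idempotent helper can save some manual editing. This is only a sketch: add_host_entry is a hypothetical function name, and it takes the file path as a parameter so it can be tried on a scratch copy before touching /etc/hosts.

```shell
# add_host_entry FILE IP NAME (hypothetical helper)
# Appends "IP NAME" to FILE unless NAME already appears there as a whole word.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    # -F: treat NAME literally; -w: whole-word match, so node19 != node190.
    if grep -qwF -- "$name" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (run as root on each node):
#   add_host_entry /etc/hosts 192.168.181.190 node190
#   add_host_entry /etc/hosts 192.168.181.198 node198
#   add_host_entry /etc/hosts 192.168.181.196 node196
```

Running it twice leaves a single entry per name, which matters when the setup steps are repeated after a failed attempt.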
Restart the network service for the changes to take effect:
[root@cdh-server7 ~]# service network restart
Disable SELinux
Check the SELinux status:
[root@cdh-server7 ~]# getenforce
If SELinux is not disabled, edit /etc/selinux/config (vi /etc/selinux/config) and set SELINUX=disabled. The change takes effect after a reboot, which can wait until the remaining settings are done.
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#     targeted - Only targeted network daemons are protected.
#     strict - Full SELinux protection.
SELINUXTYPE=targeted
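The edit above can also be scripted, which helps when the same change is needed on many nodes. The following is a minimal sketch; set_selinux_disabled is a hypothetical helper name, and the file path is a parameter so the function can be tested on a copy first.

```shell
# set_selinux_disabled FILE (hypothetical helper)
# Rewrites the uncommented SELINUX= line in FILE to SELINUX=disabled.
set_selinux_disabled() {
    file=$1
    tmp=$(mktemp) || return 1
    # The pattern requires "=" right after SELINUX, so SELINUXTYPE= and the
    # commented explanation lines are left untouched.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (run as root):
#   set_selinux_disabled /etc/selinux/config
```

Writing the result back with cat rather than mv keeps the original file's ownership and permissions. The running system stays in its current mode until the reboot; setenforce 0 switches it to permissive in the meantime.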
[root@cdh-server7 ~]# ping www.baidu.com
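Once the steps above are complete, reboot the host:
reboot
After the reboot, check the items above again to make sure the environment is configured correctly.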
1.3 Remove OpenJDK (all nodes)
Note: if OpenJDK is not installed, there is nothing to remove. A default CentOS 7 install does not include it.
[root@cdh-server7 deploy]# rpm -qa | grep java
[root@cdh-server7 deploy]# rpm -qa | grep jdk
# If any java or jdk packages are listed, erase them, for example:
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
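Because the exact package names differ between machines, it can be convenient to filter the rpm -qa output instead of typing each name. The sketch below assumes the packages follow the java-VERSION-openjdk / java-VERSION-gcj naming seen in the example above; filter_java_packages is a hypothetical helper name.

```shell
# filter_java_packages (hypothetical helper)
# Reads package names on stdin and prints only the OpenJDK / GCJ ones.
filter_java_packages() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '^java-[0-9.]+-(openjdk|gcj)' || true
}

# Dry run, lists what would be removed:
#   rpm -qa | filter_java_packages
# Removal (as root), same rpm flags as in the article:
#   rpm -qa | filter_java_packages | xargs -r rpm -e --nodeps
```

Doing the dry run first is worthwhile, since --nodeps removes packages without checking what depends on them.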
1.4 Remove the default MySQL (MariaDB) packages on CentOS 7
[root@cdh-server7 deploy]# rpm -qa | grep mariadb
[root@cdh-server7 deploy]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64
1.5 Install Cloudera Manager
Download the resource files
Copy cloudera-manager.repo into the /etc/yum.repos.d/ directory on every node:
[root@node196 ]# cd /home/deploy/cdh
[root@node196 cdh]# wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo
[root@cdh-server7 cdh]# mv cloudera-manager.repo /etc/yum.repos.d/
Verify that the repo file is picked up:
[root@cdh-server7 cdh]# yum list | grep cloudera
cloudera-manager-agent.x86_64          5.7.0-1.cm560.p0.54.el7   cloudera-manager
cloudera-manager-daemons.x86_64        5.7.0-1.cm560.p0.54.el7   cloudera-manager
cloudera-manager-server.x86_64         5.7.0-1.cm560.p0.54.el7   cloudera-manager
cloudera-manager-server-db-2.x86_64    5.7.0-1.cm560.p0.54.el7   cloudera-manager
enterprise-debuginfo.x86_64            5.7.0-1.cm560.p0.54.el7   cloudera-manager
oracle-j2sdk1.7.x86_64                 1.7.0+update67-1          cloudera-manager
If the version listed is not the one you intend to install, clear the yum cache and try again:
yum clean all
yum list | grep cloudera
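To compare the offered version at a glance, the yum list output can be reduced to the distinct versions of the cloudera-manager-* packages. This is a small sketch that assumes the three-column layout shown above (name, version, repo); cm_repo_versions is a hypothetical helper name.

```shell
# cm_repo_versions (hypothetical helper)
# Reads `yum list` output on stdin and prints the distinct versions offered
# for packages whose name starts with cloudera-manager-.
cm_repo_versions() {
    awk '$1 ~ /^cloudera-manager-/ { print $2 }' | sort -u
}

# Example on a real host:
#   yum list | cm_repo_versions
```

If more than one line comes back, the repository is offering mixed versions, which is usually a sign that a stale cache or a second repo file is involved.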
Upload the following RPM packages to /home/deploy/cdh/cloudera-rpms on cdh-server7 (any directory will do):
cd /home/deploy/cdh/cloudera-rpms
cloudera-manager-agent-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-daemons-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-server-5.7.0-1.cm560.p0.54.el7.x86_64.rpm        ## not needed on agents
cloudera-manager-server-db-2-5.7.0-1.cm560.p0.54.el7.x86_64.rpm   ## not needed on agents
enterprise-debuginfo-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
Note: the related RPM packages can be downloaded from [link missing].
Switch to the RPM directory and install everything:
[root@cdh-server7 cdh]# cd /home/deploy/cdh/cloudera-rpms/
[root@cdh-server7 cloudera-rpms]# yum -y install *.rpm
1.6 Copy the parcel files to the target directory
Download the parcel files from http://archive.cloudera.com/cdh5/parcels/5.7.0/
Copy the three downloaded parcel files into /opt/cloudera/parcel-repo (create the directory if it does not exist). Note that the .sha1 file is copied under the name .sha.
[root@cdh-server7 cdh]# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel
[root@cdh-server7 cdh]# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha
[root@cdh-server7 cdh]# cp manifest.json /opt/cloudera/parcel-repo/manifest.json
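A parcel is a large download, so it may be worth checking it against its checksum file before starting Cloudera Manager. The sketch below assumes that the .sha file holds the hex SHA-1 digest of the parcel as its first field and that sha1sum (GNU coreutils) is available; verify_parcel is a hypothetical helper name.

```shell
# verify_parcel PARCEL SHAFILE (hypothetical helper)
# Succeeds when the SHA-1 of PARCEL equals the first field stored in SHAFILE.
verify_parcel() {
    parcel=$1
    shafile=$2
    expected=$(awk 'NR == 1 { print $1 }' "$shafile")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    # An empty expected value (missing or blank file) counts as a failure.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example:
#   cd /opt/cloudera/parcel-repo
#   verify_parcel CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel \
#                 CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha && echo "parcel OK"
```

A mismatch usually means a truncated download, in which case fetching the parcel again is the simplest fix.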
1.7 Configure the Java environment variables
Set JAVA_HOME:
[root@cdh-server7 cdh]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
[root@cdh-server7 cdh]# source /etc/profile
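Stop the firewall:
[root@cdh-server7 deploy]# systemctl stop firewalld.service    # CentOS 7: stop the firewall
Note that stop only lasts until the next boot. Since the host is rebooted next, also run systemctl disable firewalld.service so the firewall stays off.
Once the steps above are complete, reboot the host:
reboot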
1.8 Install CM (master node only)
Run the following two steps on the master node only:
- Enter the directory and make the .bin file executable:
[root@cdh-server7 cdh]# chmod a+x ./cloudera-manager-installer.bin
- Install CM (this step may not be necessary):
[root@cdh-server7 cdh]# ./cloudera-manager-installer.bin
Start the server:
[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-server-db start
[root@cdh-server7 init.d]# ./cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
[root@cdh-server7 init.d]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
Note:
After a machine reboot, the default startup order causes errors. Start cloudera-scm-server-db first, then cloudera-scm-server.
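1.9 Verify in a browser (master node)
Once CM is installed, open the address below in a browser and log in with admin as both the username and the password to reach the web console.
http://192.168.181.190:7180/
If the page does not open, wait about 2 minutes and try again; the service takes some time to start.
Choose the edition to deploy. The free edition is sufficient here.
If you are unsure how to configure things, refer to "The most reliable installation guide".
When installing services, choose the default embedded database.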
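Part 2 Install the agent
These steps are similar to the server installation, though I cannot guarantee they are exactly right.
The agent can be installed on separate machines. The master node may also run an agent or act purely as the master, whichever you prefer.
Configure and start the agent (all nodes).
Keep the agent installation as close as possible to the server installation steps to avoid problems after startup.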
2.1 Network configuration
Map IP addresses to hostnames:
[root@cdh-agent1 ~]# vi /etc/hosts
# Map IP addresses to hostnames:
192.168.181.190 cdh-server7(node190)
192.168.181.198 cdh-agent1(node198)
192.168.181.196 cdh-agent2(node196)
Restart the network service for the changes to take effect:
[root@cdh-server7 ~]# service network restart
Disable SELinux
Check the SELinux status:
[root@cdh-server7 ~]# getenforce
If SELinux is not disabled, edit /etc/selinux/config (vi /etc/selinux/config) and set SELINUX=disabled. The change takes effect after a reboot, which can wait until the remaining settings are done.
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#     targeted - Only targeted network daemons are protected.
#     strict - Full SELinux protection.
SELINUXTYPE=targeted
[root@cdh-server7 ~]# ping www.baidu.com
2.2 Remove OpenJDK (all nodes)
Note: if OpenJDK is not installed, there is nothing to remove. A default CentOS 7 install does not include it.
[root@cdh-server7 deploy]# rpm -qa | grep java
[root@cdh-server7 deploy]# rpm -qa | grep jdk
# If any java or jdk packages are listed, erase them, for example:
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
2.3 Remove the default MySQL (MariaDB) packages on CentOS 7
[root@cdh-server7 deploy]# rpm -qa | grep mariadb
[root@cdh-server7 deploy]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64
2.4 cloudera-manager.repo
Upload cloudera-manager.repo to cdh-agent1:
[root@cdh-agent1 cdh]# cp cloudera-manager.repo /etc/yum.repos.d/
Disable transparent_hugepage:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
To keep the setting across reboots, edit /etc/rc.local (vi /etc/rc.local) and append the same two statements to the end of the file:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
chmod +x /etc/rc.local
Adjust swappiness. The echo applies the value immediately; the line in /etc/sysctl.conf makes it permanent:
echo 10 > /proc/sys/vm/swappiness
# vi /etc/sysctl.conf
vm.swappiness = 10
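Editing /etc/sysctl.conf by hand on every node is easy to get wrong, for example by leaving two conflicting vm.swappiness lines behind. The sketch below replaces any existing setting for a key and appends the new one; set_sysctl_conf is a hypothetical helper name, and the file path is a parameter so it can be tried on a copy.

```shell
# set_sysctl_conf FILE KEY VALUE (hypothetical helper)
# Makes "KEY = VALUE" the only uncommented setting for KEY in FILE.
set_sysctl_conf() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every line whose left-hand side (text before "=") is not KEY.
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)
        split(line, parts, /[ \t]*=/)
        if (parts[1] != k) print
    }' "$file" > "$tmp" 2>/dev/null
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (run as root), then reload with sysctl -p:
#   set_sysctl_conf /etc/sysctl.conf vm.swappiness 10
```

Comparing the key as an exact string avoids the usual regex pitfall where the dot in vm.swappiness matches any character.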
2.5 ~/cdh/cloudera-rpms
Upload the following RPM packages to /home/deploy/cdh/cloudera-rpms on cdh-agent1, then install them:
cloudera-manager-agent-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-daemons-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
enterprise-debuginfo-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[root@cdh-agent1 init.d]# cd /home/deploy/cdh/cloudera-rpms/
[root@cdh-agent1 init.d]# yum -y install *.rpm
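Set JAVA_HOME:
[root@cdh-server7 cdh]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
[root@cdh-server7 cdh]# source /etc/profile
Stop the firewall:
[root@cdh-server7 deploy]# systemctl stop firewalld.service    # CentOS 7: stop the firewall
Note that stop only lasts until the next boot. Since the host is rebooted next, also run systemctl disable firewalld.service so the firewall stays off.
Once the steps above are complete, reboot the host:
reboot
Edit the agent configuration so that it points at the CM server:
[root@cdh-agent1 init.d]# vi /etc/cloudera-scm-agent/config.ini
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Hostname of the CM server.
#server_host=localhost
server_host=cdh-server7(node190)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++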
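With several agents, the server_host change can be applied with sed instead of opening the file on each machine. This is a sketch under two assumptions: the file already contains an uncommented server_host= line, and the server is reachable under the plain hostname cdh-server7 (without the parenthesised alias). set_server_host is a hypothetical helper name.

```shell
# set_server_host FILE HOST (hypothetical helper)
# Rewrites the uncommented server_host= line in FILE to point at HOST.
set_server_host() {
    file=$1
    host=$2
    tmp=$(mktemp) || return 1
    # Lines starting with "#server_host" are comments and are not touched.
    sed "s/^server_host=.*/server_host=${host}/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (run as root on each agent):
#   set_server_host /etc/cloudera-scm-agent/config.ini cdh-server7
```

The host value is substituted into the sed expression as-is, so it should be a plain hostname or IP address without slashes or ampersands.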
Start the agent and follow its log:
[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-agent start
Starting cloudera-scm-agent: [ OK ]
[root@cdh-server deploy]# tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log
Note:
If installing the YARN NodeManager fails, delete the /yarn and /var/lib/hadoop-yarn directories and add the role again.
The most reliable CDH installation guide:
Part 3 Restarting our cluster
3.1 Make sure firewalld is stopped
systemctl start firewalld.service      # start firewalld
systemctl stop firewalld.service       # stop firewalld
systemctl disable firewalld.service    # do not start firewalld at boot
Note: make sure firewalld is stopped before proceeding.
[root@node19x flag]$ vim /etc/rc.local    (/etc/rc.local seems to run relative to the /etc/init.d directory)
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.

touch /var/lock/subsys/local
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
service ntpd start
service elasticsearch start
3.2 Start the server and CM
Only on the server node:
[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-server-db start
[root@cdh-server7 init.d]# ./cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
[root@cdh-server7 init.d]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
Wait until the log shows that port 7180 has started, then open http://node190:7180/cmf/home
Note:
After a machine reboot, the default startup order causes errors. Start cloudera-scm-server-db first, then cloudera-scm-server.
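Instead of watching tail -f by eye, a script can poll the server log until a startup message appears. The sketch below is generic: wait_for_log is a hypothetical helper, and the pattern 'Started Jetty server' used in the example is an assumption about what the Cloudera Manager log prints once the web UI is up, so check your own log and adjust it.

```shell
# wait_for_log FILE PATTERN TIMEOUT_SECONDS (hypothetical helper)
# Polls FILE once per second until a line matching PATTERN appears.
# Returns non-zero when the timeout elapses without a match.
wait_for_log() {
    file=$1
    pattern=$2
    timeout=$3
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if grep -q -- "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check so a match written during the final sleep still counts.
    grep -q -- "$pattern" "$file" 2>/dev/null
}

# Example, following the start order from the note above (pattern is assumed):
#   cd /etc/init.d/
#   ./cloudera-scm-server-db start
#   ./cloudera-scm-server start
#   wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log \
#                'Started Jetty server' 300 && echo "CM web UI should be up"
```

Keep in mind that an old log file may already contain the pattern from a previous run, so the check is only meaningful if the log was rotated or the pattern is specific to the current start.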
The agents normally start automatically. If one did not, start it on the corresponding node:
[root@node190 init.d]# ./cloudera-scm-agent start
cloudera-scm-agent is already running
node190: ./cloudera-scm-agent start
node19x: ./cloudera-scm-agent start
node19x: ./cloudera-scm-agent start
...
3.3 Start the services from the CM page
Restart Service Monitor from the CM page.
Restart Host Monitor from the CM page.
Start each service from the CM page (e.g. ZK, Flume, YARN, HDFS, Hive, Sqoop, Spark, etc.).
3.4 Start ES on every node
[deploy@node190 init.d]# ll
total 44
-rwxr-xr-x  1 root root  8671 Apr  2 04:52 cloudera-scm-agent
lrwxrwxrwx. 1 root root    58 Apr 18 16:55 elasticsearch -> /home/deploy/elasticsearch-1.7.1/bin/service/elasticsearch
-rw-r--r--. 1 root root 13948 Sep 16  2015 functions
-rwxr-xr-x. 1 root root  2989 Sep 16  2015 netconsole
-rwxr-xr-x. 1 root root  6630 Sep 16  2015 network
-rw-r--r--. 1 root root  1160 Apr  1 00:45 README
Run as the deploy user:
[deploy@node190 init.d]# ./elasticsearch start
[deploy@node19x init.d]# ./elasticsearch start
[deploy@node19x init.d]# ./elasticsearch start
...
http://node190:9200/_plugin/bigdesk/#cluster
Wait for data synchronization to finish, which is usually quick. The Status changes from red to green.
http://node190:9200/_plugin/head/
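The same status can be read from the command line through the Elasticsearch _cluster/health endpoint, which returns a JSON document containing a "status" field. The sketch below only extracts that field with sed; es_status_from_json is a hypothetical helper name, and the node190 host in the usage comment is taken from the URLs above.

```shell
# es_status_from_json (hypothetical helper)
# Reads a _cluster/health JSON document on stdin and prints the value of
# its "status" field (green, yellow or red).
es_status_from_json() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example against a live node:
#   curl -s http://node190:9200/_cluster/health | es_status_from_json
```

A sed extraction keeps the example dependency-free; on hosts where a JSON tool is installed, parsing the response properly would be more robust.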
3.5 Start Kibana
Run as the deploy user:
[deploy@node196 ~]# cd /home/deploy/kibana-4.1.1-linux-x64
./bin/kibana > kibana.log 2>&1 &