Online Installation of a CDH Cluster for a Big Data Platform
Published: 2019-06-18


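Tags: Cloudera-Manager CDH Hadoop deployment cluster

Summary: Managing and deploying a Hadoop cluster requires tooling, and Cloudera Manager is one such tool. This article records in detail the steps for deploying a CDH cluster online.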

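Big data technologies led by Apache Hadoop have given small and medium-sized companies the tools to store and process big data.

Two Hadoop distributions are currently in wide use: Apache and Cloudera.

Apache Hadoop: has many maintainers and is updated frequently, but is comparatively less stable.

Cloudera Hadoop (CDH): Cloudera's distribution, built on top of Apache Hadoop. It improves component compatibility and interfaces, simplifies installation and configuration, and adds Cloudera-specific compatibility features.

Detailed procedure for the big data platform CDH cluster: cdh-5.7.0-rpm_install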

Part 1 Install the CDH server

1.1 Prepare installation resources

  1. CentOS Linux release 7.1.1503 (Core), cm-5.7.0

  2. cloudera-manager-installer.bin

  3. adduser deploy

While installing CentOS 7.1, configure the network with a static IP:

vim /etc/sysconfig/network-scripts/ifcfg-eth0

Set BOOTPROTO to static and specify the IP address:

DEVICE="eth0"
BOOTPROTO="static"
IPADDR=192.168.1.110
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
DNS1=8.8.8.8
DNS2=8.8.4.4
GATEWAY=192.168.1.1

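1.2 Network configuration (all nodes)

Change the hostname to cdh-server7.

  On older Red Hat releases (RHEL/CentOS 6 and earlier), edit /etc/sysconfig/network and change the HOSTNAME line to HOSTNAME=NEWNAME, where NEWNAME is the hostname you want. On CentOS 7, which is used here, the hostname is stored in /etc/hostname and can be set with hostnamectl set-hostname NEWNAME. Debian-based distributions also keep the hostname in /etc/hostname.

Map IP addresses to hostnames:

[root@cdh-server7 ~]# vi /etc/hosts
# map IP addresses to hostnames:
192.168.181.190 node190
192.168.181.198 node198
192.168.181.196 node196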
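The same mappings can be added without opening an editor. The following is a minimal sketch, assuming a helper named ensure_host_entry (a hypothetical name, not part of any tool). It appends an entry only when the hostname is not already mapped, so it can be re-run on every node. The demo writes to a scratch file; on a real node you would pass /etc/hosts and run it as root.

```shell
#!/bin/sh
# ensure_host_entry FILE IP NAME  (hypothetical helper for illustration)
# Appends "IP NAME" to FILE unless NAME already appears as a hostname field.
ensure_host_entry() {
    hosts_file="$1"; ip="$2"; name="$3"
    [ -f "$hosts_file" ] || : > "$hosts_file"
    # Strip trailing comments, then look for NAME in fields 2..NF.
    if awk -v n="$name" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"
    then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Demo against a scratch file; use /etc/hosts (as root) on a real node.
demo_hosts="$(mktemp)"
ensure_host_entry "$demo_hosts" 192.168.181.190 node190
ensure_host_entry "$demo_hosts" 192.168.181.198 node198
ensure_host_entry "$demo_hosts" 192.168.181.196 node196
ensure_host_entry "$demo_hosts" 192.168.181.190 node190   # no-op on re-run
cat "$demo_hosts"
```

Because the check looks at hostnames rather than whole lines, running the script twice leaves the file with exactly three entries.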

Restart the network service to apply the changes:

[root@cdh-server7 ~]# service network restart

Disable SELinux

Check the SELinux status:

[root@cdh-server7 ~]# getenforce

If SELinux is not disabled, edit /etc/selinux/config and set SELINUX=disabled. The change takes effect after a reboot, which can wait until the remaining settings are done:

[root@cdh-server7 ~]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#       enforcing - SELinux security policy is enforced.
#       permissive - SELinux prints warnings instead of enforcing.
#       disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#       targeted - Only targeted network daemons are protected.
#       strict - Full SELinux protection.
SELINUXTYPE=targeted

Check that the node can reach the external network:

[root@cdh-server7 ~]# ping www.baidu.com
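The SELinux edit can also be scripted when there are many nodes. This is a rough sketch, assuming a helper named disable_selinux_in (a hypothetical name). It rewrites only the active SELINUX= line and keeps a .bak copy of the original. The demo runs on a scratch file; on a real node the argument would be /etc/selinux/config.

```shell
#!/bin/sh
# disable_selinux_in FILE  (hypothetical helper for illustration)
# Rewrites only the active SELINUX= line; comment lines and SELINUXTYPE=
# are left untouched. A backup of the original is kept as FILE.bak.
disable_selinux_in() {
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Demo against a scratch copy; use /etc/selinux/config (as root) on a real node.
demo_cfg="$(mktemp)"
printf '%s\n' '# SELINUX= can take one of these three values:' \
    'SELINUX=enforcing' 'SELINUXTYPE=targeted' > "$demo_cfg"
disable_selinux_in "$demo_cfg"
cat "$demo_cfg"
```

The file change still needs a reboot to apply. In the meantime, setenforce 0 switches the running system to permissive mode.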

Once the steps above are complete, reboot the host:

reboot

After the reboot, check the points above again to make sure the environment is configured correctly.

1.3 Remove OpenJDK (all nodes)

Note: if OpenJDK is not installed there is nothing to remove; a default CentOS 7 install usually does not include it.

[root@cdh-server7 deploy]# rpm -qa | grep java
[root@cdh-server7 deploy]# rpm -qa | grep jdk
# If any java or jdk packages are listed, erase them, for example:
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
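Rather than copying package names by hand, the query and the removal can be chained. The sketch below assumes a helper named filter_bundled_java (a hypothetical name) and assumes the bundled JDK packages follow the java-VERSION-openjdk or java-VERSION-gcj naming seen above. A plain grep java would also match unrelated packages that merely contain the word. The package list in the demo is illustrative.

```shell
#!/bin/sh
# filter_bundled_java  (hypothetical helper for illustration)
# Reads package names on stdin and prints only OpenJDK/GCJ packages,
# skipping unrelated names that merely contain "java".
filter_bundled_java() {
    grep -E '^java-[0-9.]+-(openjdk|gcj)' || true
}

# Demo with a fixed list; on a real node feed it from rpm instead:
#   rpm -qa | filter_bundled_java | xargs -r rpm -e --nodeps
printf '%s\n' \
    'java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64' \
    'tzdata-java-2015a-1.el7.noarch' \
    'java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64' | filter_bundled_java
```

The -r flag of GNU xargs skips the rpm call entirely when nothing matches, which covers the case where OpenJDK was never installed.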

1.4 Remove the default CentOS 7 MySQL (MariaDB)

[root@cdh-server7 deploy]# rpm -qa | grep mariadb
[root@cdh-server7 deploy]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64

1.5 Install Cloudera Manager

Download the resource files

Copy cloudera-manager.repo into /etc/yum.repos.d/ on every node:

[root@node196 ]# cd /home/deploy/cdh
[root@node196 cdh]# wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo
[root@cdh-server7 cdh]# mv cloudera-manager.repo /etc/yum.repos.d/

Verify that the repo file has taken effect:

[root@cdh-server7 cdh]# yum list | grep cloudera
cloudera-manager-agent.x86_64           5.7.0-1.cm560.p0.54.el7        cloudera-manager
cloudera-manager-daemons.x86_64         5.7.0-1.cm560.p0.54.el7        cloudera-manager
cloudera-manager-server.x86_64          5.7.0-1.cm560.p0.54.el7        cloudera-manager
cloudera-manager-server-db-2.x86_64     5.7.0-1.cm560.p0.54.el7        cloudera-manager
enterprise-debuginfo.x86_64             5.7.0-1.cm560.p0.54.el7        cloudera-manager
oracle-j2sdk1.7.x86_64                  1.7.0+update67-1               cloudera-manager

If the version listed is not the one you want to install, run the following commands and check again:

yum clean all
yum list | grep cloudera

Upload the following RPM packages to /home/deploy/cdh/cloudera-rpms on cdh-server7 (any directory will do):

cd /home/deploy/cdh/cloudera-rpms
cloudera-manager-agent-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-daemons-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-server-5.7.0-1.cm560.p0.54.el7.x86_64.rpm   ## not needed on agents
cloudera-manager-server-db-2-5.7.0-1.cm560.p0.54.el7.x86_64.rpm  ## not needed on agents
enterprise-debuginfo-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm

Note: the RPM packages can be downloaded from [link missing].

Change to the RPM directory and install the packages:

[root@cdh-server7 cdh]# cd /home/deploy/cdh/cloudera-rpms/
[root@cdh-server7 cloudera-rpms]# yum -y install *.rpm

1.6 Copy the parcel files to the target directory

Download the parcel files from http://archive.cloudera.com/cdh5/parcels/5.7.0/

Copy the three downloaded parcel files into /opt/cloudera/parcel-repo (create the directory if it does not exist). Note that the .sha1 file is renamed to .sha in the copy:

[root@cdh-server7 cdh]# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel
[root@cdh-server7 cdh]# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha
[root@cdh-server7 cdh]# cp manifest.json /opt/cloudera/parcel-repo/manifest.json
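A truncated parcel download is easy to miss, so it may be worth comparing the parcel's checksum with the .sha file before starting the server. The following is a minimal sketch, assuming a helper named verify_parcel (a hypothetical name) and assuming the .sha file holds the hex SHA-1 digest as its first field.

```shell
#!/bin/sh
# verify_parcel PARCEL SHA_FILE  (hypothetical helper for illustration)
# Succeeds when the SHA-1 of PARCEL equals the first field of SHA_FILE.
verify_parcel() {
    expected="$(awk '{ print $1; exit }' "$2")"
    actual="$(sha1sum "$1" | awk '{ print $1 }')"
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example call using the paths from this section (run on the CM server):
#   cd /opt/cloudera/parcel-repo
#   verify_parcel CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel \
#                 CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha && echo "parcel OK"
```

The function only reports success or failure through its exit status, so it can be used directly in an if statement or chained with &&.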

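1.7 Configure the Java environment variables

Set JAVA_HOME:

[root@cdh-server7 cdh]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
[root@cdh-server7 cdh]# source /etc/profile

Stop the firewall:

[root@cdh-server7 deploy]# systemctl stop firewalld.service  # CentOS 7: stop the firewall

systemctl stop lasts only until the next boot, so also run systemctl disable firewalld.service (listed in section 3.1) if the firewall should stay off after the reboot below.

Once the steps above are complete, reboot the host:

reboot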

1.8 Install CM (master node only)

Run the following two steps on the master node only:

  • Change to the directory and make the .bin file executable

    [root@cdh-server7 cdh]# chmod a+x ./cloudera-manager-installer.bin
  • Install CM (this step may not be necessary)

    [root@cdh-server7 cdh]# ./cloudera-manager-installer.bin

Start the server:

[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-server-db start
[root@cdh-server7 init.d]# ./cloudera-scm-server start
Starting cloudera-scm-server:                              [  OK  ]
[root@cdh-server7 init.d]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

Note:

After a machine reboot, the default startup can fail.
Start the services in this order: cloudera-scm-server-db first, then cloudera-scm-server.

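1.9 Verify in a browser (master node)

Once CM is installed, open the following address in a browser and log in with admin as both the username and the password to reach the web management console:

http://192.168.181.190:7180/

If the page does not open, wait about two minutes and try again; the service needs some time to start.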
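Instead of refreshing the browser by hand, a small polling loop can report when the console starts answering. This is a sketch under stated assumptions: wait_until is a hypothetical helper, and the 24 tries at 5-second intervals are an arbitrary choice that roughly matches the two-minute wait mentioned above.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY_SECONDS COMMAND...  (hypothetical helper)
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between tries. Returns non-zero if every try fails.
wait_until() {
    attempts="$1"; delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example: poll the CM console for about two minutes (24 tries, 5 s apart).
#   wait_until 24 5 curl -s -o /dev/null http://192.168.181.190:7180/ \
#       && echo "Cloudera Manager is answering on port 7180"
```

curl exits non-zero while the port refuses connections, so the loop keeps retrying until the server accepts a request or the attempts run out.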

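Choose the edition to deploy; the free edition is sufficient here.

If you are unsure about the settings, refer to the "most reliable installation guide".

When installing services, choose the default embedded database.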

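Part 2 Install the agent

These steps are similar to Part 1, though I cannot guarantee they are exactly right.

The agent can be installed on separate machines or on the master node as well; the master node may also act as master only. The choice is yours.

Configure and start the agent (all nodes)

Keep the agent installation as close as possible to the server installation steps to avoid problems after startup.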

2.1 Network configuration

Map IP addresses to hostnames:

[root@cdh-agent1 ~]# vi /etc/hosts
# map IP addresses to hostnames:
192.168.181.190 cdh-server7   # node190
192.168.181.198 cdh-agent1    # node198
192.168.181.196 cdh-agent2    # node196

Restart the network service to apply the changes:

[root@cdh-server7 ~]# service network restart

Disable SELinux

Check the SELinux status:

[root@cdh-server7 ~]# getenforce

If SELinux is not disabled, edit /etc/selinux/config and set SELINUX=disabled. The change takes effect after a reboot, which can wait until the remaining settings are done:

[root@cdh-server7 ~]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#       enforcing - SELinux security policy is enforced.
#       permissive - SELinux prints warnings instead of enforcing.
#       disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#       targeted - Only targeted network daemons are protected.
#       strict - Full SELinux protection.
SELINUXTYPE=targeted

Check that the node can reach the external network:

[root@cdh-server7 ~]# ping www.baidu.com

2.2 Remove OpenJDK (all nodes)

Note: if OpenJDK is not installed there is nothing to remove; a default CentOS 7 install usually does not include it.

[root@cdh-server7 deploy]# rpm -qa | grep java
[root@cdh-server7 deploy]# rpm -qa | grep jdk
# If any java or jdk packages are listed, erase them, for example:
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64

2.3 Remove the default CentOS 7 MySQL (MariaDB)

[root@cdh-server7 deploy]# rpm -qa | grep mariadb
[root@cdh-server7 deploy]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64

2.4 cloudera-manager.repo

Upload cloudera-manager.repo to cdh-agent1:

[root@cdh-agent1 cdh]# cp cloudera-manager.repo /etc/yum.repos.d/

Disable transparent_hugepage:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

To keep the setting across reboots, edit /etc/rc.local (vi /etc/rc.local), append the following two lines at the end of the file, and make the file executable:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
chmod +x /etc/rc.local
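To confirm the setting took effect, read the control file back: the kernel lists every option and marks the active one in square brackets, for example "always madvise [never]". The sketch below assumes a helper named thp_mode (a hypothetical name) that prints just the bracketed value. The demo uses a scratch file that mimics the kernel's format.

```shell
#!/bin/sh
# thp_mode FILE  (hypothetical helper for illustration)
# Prints the active value, i.e. the word inside [brackets], from a
# transparent_hugepage control file such as "always madvise [never]".
thp_mode() {
    sed -n 's/.*\[\([a-z+_]*\)\].*/\1/p' "$1"
}

# Demo with a scratch file that mimics the kernel's format.
demo_thp="$(mktemp)"
echo 'always madvise [never]' > "$demo_thp"
thp_mode "$demo_thp"    # prints: never

# On a real node:
#   thp_mode /sys/kernel/mm/transparent_hugepage/enabled
#   thp_mode /sys/kernel/mm/transparent_hugepage/defrag
```

Running the check again after a reboot shows whether the /etc/rc.local lines were actually executed.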

Adjust swappiness

Apply the value immediately, then add it to /etc/sysctl.conf so it persists:

echo 10 > /proc/sys/vm/swappiness
# vi /etc/sysctl.conf
vm.swappiness = 10
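Editing /etc/sysctl.conf by hand on every node invites duplicate or conflicting lines. The following sketch assumes a helper named set_sysctl_value (a hypothetical name) that replaces an existing entry or appends a new one. The demo works on a scratch file; on a real node the file would be /etc/sysctl.conf.

```shell
#!/bin/sh
# set_sysctl_value FILE KEY VALUE  (hypothetical helper for illustration)
# Replaces an existing "KEY = ..." line in FILE, or appends one if absent.
# A backup is kept as FILE.bak when a line is replaced.
set_sysctl_value() {
    file="$1"; key="$2"; value="$3"
    # Escape dots so "vm.swappiness" is matched literally, not as a regex.
    key_re="$(printf '%s' "$key" | sed 's/\./\\./g')"
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$file"; then
        sed -i.bak -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Demo against a scratch file; use /etc/sysctl.conf (as root) on a real node.
demo_sysctl="$(mktemp)"
echo 'vm.swappiness = 60' > "$demo_sysctl"
set_sysctl_value "$demo_sysctl" vm.swappiness 10
cat "$demo_sysctl"    # prints: vm.swappiness = 10
```

After changing the real file, sysctl -p reloads /etc/sysctl.conf without a reboot.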

2.5 ~/cdh/cloudera-rpms

Upload the following RPM packages to /home/deploy/cdh/cloudera-rpms on cdh-agent1:

cloudera-manager-agent-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-daemons-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
enterprise-debuginfo-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm

[root@cdh-agent1 init.d]# cd /home/deploy/cdh/cloudera-rpms/
[root@cdh-agent1 cloudera-rpms]# yum -y install *.rpm

Set JAVA_HOME:

[root@cdh-server7 cdh]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
[root@cdh-server7 cdh]# source /etc/profile

Stop the firewall:

[root@cdh-server7 deploy]# systemctl stop firewalld.service  # CentOS 7: stop the firewall

As on the server, systemctl stop lasts only until the next boot; systemctl disable firewalld.service (listed in section 3.1) keeps the firewall off after the reboot.

Once the steps above are complete, reboot the host:

reboot

Set server_host in the agent configuration to the CM server's hostname, cdh-server7 (node190):

[root@cdh-agent1 init.d]# vi /etc/cloudera-scm-agent/config.ini
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Hostname of the CM server.
#server_host=localhost
server_host=cdh-server7
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Start the agent and follow its log:

[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-agent start
Starting cloudera-scm-agent:                               [  OK  ]
[root@cdh-server deploy]# tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log

Note:

If installing the YARN NodeManager fails, delete the /yarn and /var/lib/hadoop-yarn directories, then add the role again.

The most reliable CDH installation guide:

Part 3 Restarting our cluster

3.1 Make sure firewalld is stopped

systemctl start firewalld.service    # start the firewall
systemctl stop firewalld.service     # stop the firewall
systemctl disable firewalld.service  # do not start the firewall at boot

Note: make sure firewalld is stopped before going any further.

The boot script /etc/rc.local (which appears to correspond to the directory /etc/init.d) contains:

[root@node19x flag]$ vim /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.

touch /var/lock/subsys/local
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
service ntpd start
service elasticsearch start

3.2 Start the server and CM

Only on the server node:

[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-server-db start
[root@cdh-server7 init.d]# ./cloudera-scm-server start
Starting cloudera-scm-server:                              [  OK  ]
[root@cdh-server7 init.d]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
# Wait until the log shows that port 7180 has started, then visit: http://node190:7180/cmf/home

Note:

After a machine reboot, the default startup can fail.
Start the services in this order: cloudera-scm-server-db first, then cloudera-scm-server.

The agents usually start automatically; if one has not, start it on that node:

[root@node190 init.d]# ./cloudera-scm-agent start
cloudera-scm-agent is already running
node190: ./cloudera-scm-agent start
node19x: ./cloudera-scm-agent start
node19x: ./cloudera-scm-agent start
...

3.3 Start the services from the CM page

  1. Restart the service monitor from the CM page

  2. Restart the host monitor from the CM page

  3. Start each service from the CM page (for example ZK, Flume, YARN, HDFS, Hive, Sqoop, Spark, etc.)


3.4 Start ES on every node

[deploy@node190 init.d]# ll
total 44
-rwxr-xr-x  1 root root  8671 Apr  2 04:52 cloudera-scm-agent
lrwxrwxrwx. 1 root root    58 Apr 18 16:55 elasticsearch -> /home/deploy/elasticsearch-1.7.1/bin/service/elasticsearch
-rw-r--r--. 1 root root 13948 Sep 16  2015 functions
-rwxr-xr-x. 1 root root  2989 Sep 16  2015 netconsole
-rwxr-xr-x. 1 root root  6630 Sep 16  2015 network
-rw-r--r--. 1 root root  1160 Apr  1 00:45 README

As the deploy user, start Elasticsearch on each node:

[deploy@node190 init.d]# ./elasticsearch start
[deploy@node19x init.d]# ./elasticsearch start
[deploy@node19x init.d]# ./elasticsearch start
...

http://node190:9200/_plugin/bigdesk/#cluster

Wait for data synchronization to finish, which is usually quick, until the Status changes from red to green.

http://node190:9200/_plugin/head/
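The red-to-green transition can also be checked from the command line through Elasticsearch's _cluster/health endpoint, which returns a JSON document containing a status field. This is a minimal sketch, assuming a helper named cluster_status (a hypothetical name) that pulls that field out with sed. The sample response in the demo is trimmed and its values are illustrative.

```shell
#!/bin/sh
# cluster_status  (hypothetical helper for illustration)
# Reads a _cluster/health JSON document on stdin and prints its "status" value.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Demo with a trimmed sample response (field values are illustrative).
echo '{"cluster_name":"demo","status":"yellow","timed_out":false}' | cluster_status

# On a real node, ask Elasticsearch directly:
#   curl -s 'http://node190:9200/_cluster/health' | cluster_status
```

A sed pattern is enough for this one flat field; for anything more involved, a proper JSON parser would be the safer choice.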

3.5 Start Kibana

Run as the deploy user:

[deploy@node196 ~]# cd /home/deploy/kibana-4.1.1-linux-x64
[deploy@node196 kibana-4.1.1-linux-x64]# ./bin/kibana > kibana.log 2>&1 &

Reposted from: http://wejbx.baihongyu.com/
