Spark Learning Notes (6): Installing HBase
2022-03-28 15:10:33

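Following the tutorial I'm working through, the next step is installing and configuring HBase. (Shouldn't this series be called "Hadoop Installation Notes" by now? Odd. 🤔)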

As with the earlier installs, go to the directory where the HBase tarball was downloaded, open a terminal, and run:

sudo tar -zxf hbase-2.4.10-bin.tar.gz -C /usr/local
sudo mv /usr/local/hbase-2.4.10 /usr/local/hbase

Then update the environment variables: open the file with vim ~/.bashrc, add the lines below, and reload it with source ~/.bashrc:

#hbase
export PATH=$PATH:/usr/local/hbase/bin
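
If the line is added from a script instead of by hand in vim, it helps to avoid appending it twice. A minimal sketch of that idea; the helper name append_once is made up for this post and is not part of HBase or Hadoop:

```shell
# append_once FILE LINE: add LINE to FILE only if that exact line is not there yet.
# (append_once is a hypothetical helper, not an HBase or Hadoop tool.)
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x: match the whole line, -F: treat LINE as plain text, not a regex
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (run `source ~/.bashrc` afterwards to pick up the change):
#   append_once ~/.bashrc 'export PATH=$PATH:/usr/local/hbase/bin'
```

Running it a second time leaves ~/.bashrc unchanged, so PATH does not accumulate duplicate entries.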

Give the hadoop user ownership of the HBase directory (in the terminal):

cd /usr/local
sudo chown -R hadoop ./hbase

Now check that the install works. Go back to the home directory (cd ~) and run:

hbase version

It returns the following:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.4.10
Source code repository git://buildbox/home/apurtell/build/hbase revision=3e5359c73d1a96dd7d2ac5bc8f987e9a89ef90ea
Compiled by apurtell on Mon Feb 28 10:03:15 PST 2022
From source with checksum b38d895e719d82c3d3b2a886e3fca39f754f5ceef732e157604c905c7af3af2d4b9b861abfdd270e627a4fe1ffa8eba74a9b01095b856952d18023666cce66ad

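That looks a bit odd. The version information at the bottom comes back fine, but what are the lines above it? Scary. (Note: I fixed this later; see the end of this post.)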

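Next comes configuring HBase. HBase also supports several deployment modes; this time I again use pseudo-distributed mode. First edit hbase-env.sh: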

vim /usr/local/hbase/conf/hbase-env.sh

Set the following entries:

export JAVA_HOME=/usr/lib/jdk1.8.0_301
export HBASE_CLASSPATH=/usr/local/hadoop/conf
export HBASE_MANAGES_ZK=true

Then edit a second file:

vim /usr/local/hbase/conf/hbase-site.xml

Find this block:

<property>
<name>hbase.cluster.distributed</name>
<value>false</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>./tmp</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>

Change it to:

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>./tmp</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
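
One thing worth checking at this point: the host and port in hbase.rootdir should match fs.defaultFS in Hadoop's core-site.xml, otherwise HBase cannot reach HDFS. A rough sketch for reading a property back out of the file. The helper name get_prop is made up for this post, it assumes each name and value tag sits on a single line as in the block above, and the core-site.xml path in the usage comment is an assumption about where Hadoop keeps its config:

```shell
# get_prop FILE NAME: print the <value> that belongs to <name>NAME</name>.
# (get_prop is a hypothetical helper; it is a line-based sketch, not a real XML parser.)
get_prop() {
    awk -v want="$2" '
        /<name>/ {
            # remember the most recent property name
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        /<value>/ && name == want {
            # first value seen after the wanted name
            val = $0
            sub(/.*<value>/, "", val)
            sub(/<\/value>.*/, "", val)
            print val
            exit
        }
    ' "$1"
}

# Usage (second path is an assumption, adjust to your Hadoop install):
#   get_prop /usr/local/hbase/conf/hbase-site.xml hbase.rootdir
#   get_prop /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS
```

If the first command prints hdfs://localhost:9000/hbase, the second would be expected to print hdfs://localhost:9000.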

That completes the configuration. Now to verify it.

Open a terminal and run these in order:

ssh localhost
start-dfs.sh
jps

Seeing the following processes means Hadoop's HDFS daemons started successfully:

8069 Jps
7959 SecondaryNameNode
7580 NameNode
7724 DataNode

Start the HBase processes:

start-hbase.sh

Run jps again; the following processes mean HBase started successfully:

8592 HMaster
8770 HRegionServer
7959 SecondaryNameNode
8471 HQuorumPeer
9114 Jps
7580 NameNode
7724 DataNode
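
Comparing jps listings by eye gets error-prone once there are six or seven daemons. A small sketch that reads jps output and prints whichever expected daemons are absent; the function name missing_daemons is made up for this post, and the daemon names are the ones from the listing above:

```shell
# missing_daemons NAME...: read `jps` output on stdin, print each NAME that is not listed.
# (missing_daemons is a hypothetical helper, not part of the JDK or HBase.)
missing_daemons() {
    jps_out=$(cat)
    for name in "$@"; do
        # jps prints "<pid> <main class>", so compare the second field exactly
        printf '%s\n' "$jps_out" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
            printf '%s\n' "$name"
    done
}

# Usage:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode HMaster HRegionServer HQuorumPeer
```

Empty output means every expected daemon is running; any name printed is one to look for in the logs.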

Now the HBase shell can be started for actual work. In the terminal, run (exit leaves the shell):

hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.10, r3e5359c73d1a96dd7d2ac5bc8f987e9a89ef90ea, Mon Feb 28 10:03:15 PST 2022
Took 0.0012 seconds
hbase:001:0>

That block of warnings again? Yet the shell still starts fine. Strange. (Note: I fixed this later; see the end of this post.)

To shut HBase down, run:

stop-hbase.sh

But after a long wait HBase still had not stopped (I lost count of the dots):

stopping hbase...............................................................................................................................................................................

A search turned up this way to shut it down. First run:

hbase-daemon.sh stop master

Then run:

stop-hbase.sh

Impressive 🐂, it shut down at once. But is this really safe? (It did dig a hole for later; see below.)

Then stop the Hadoop processes and leave the local ssh session:

# stop the HDFS daemons
stop-dfs.sh
# leave the ssh session
exit

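So is the installation and configuration done? Not quite. Remember those two long blocks of warnings? The warning itself points to an explanation (the URL it prints):

Roughly: Hadoop and HBase each ship an SLF4J binding, so two bindings end up on the class path. The warning is printed, but SLF4J picks one of the two automatically and everything keeps working. Removing one of the bindings makes the message go away.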

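But I did not dare delete files at random, and the long block was annoying to look at every time. So I went back through HBase's environment file (vim /usr/local/hbase/conf/hbase-env.sh) and found that it seemed enough to stop HBase from pulling in Hadoop's libraries a second time, that is, to change this entry: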

# Tell HBase whether it should include Hadoop's lib when start up,
# the default value is false,means that includes Hadoop's lib.
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
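
To confirm the setting took effect without reading the output by eye, the "Found binding" lines can be counted. A small sketch; the function name count_slf4j_bindings is made up for this post, and it relies on SLF4J writing these messages to stderr, hence the 2>&1 in the usage line:

```shell
# count_slf4j_bindings: read command output on stdin, print how many
# "SLF4J: Found binding in" lines it contains.
# (count_slf4j_bindings is a hypothetical helper, not an HBase tool.)
count_slf4j_bindings() {
    # grep -c exits non-zero when the count is 0, so mask that exit status
    grep -c 'SLF4J: Found binding in' || true
}

# Usage:
#   hbase version 2>&1 | count_slf4j_bindings
```

Before the change the output shown earlier contains two such lines; after it, the count should drop to 0.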

Start HBase again to try it:

start-dfs.sh
start-hbase.sh

Then I saw this output:

127.0.0.1: running zookeeper, logging to /usr/local/hbase/bin/../logs/hbase-hadoop-zookeeper-ubuntu.out
running master, logging to /usr/local/hbase/bin/../logs/hbase-hadoop-master-ubuntu.out
: regionserver running as process 8770. Stop it first.

Hmm, what happened here?

Running jps showed that these were the processes currently running:

13427 Jps
13109 DataNode
13318 SecondaryNameNode
11831 HQuorumPeer
12927 NameNode

Really? So I ran stop-hbase.sh once more, which printed:

no hbase master found
127.0.0.1: running zookeeper, logging to /usr/local/hbase/bin/../logs/hbase-hadoop-zookeeper-ubuntu.out
127.0.0.1: stopping zookeeper.

Now jps looked normal, with only the Hadoop processes:

13109 DataNode
13318 SecondaryNameNode
12927 NameNode
13711 Jps

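At this point I noticed "no hbase master found" in the output. Hmm. After running start-hbase.sh again, the "Stop it first." message no longer appeared.

But jps showed that HMaster was indeed missing. Where is my master? Are you my Master?

So, mirroring the command used earlier to stop it, I ran: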

hbase-daemon.sh start master

Checking jps again, the processes were finally back to normal.

14851 Jps
13109 DataNode
13318 SecondaryNameNode
14633 HMaster
13997 HQuorumPeer
14238 HRegionServer
12927 NameNode

Starting hbase shell now no longer prints that long block of warnings (though a new line shows up instead).

2022-03-28 12:18:51,591 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.10, r3e5359c73d1a96dd7d2ac5bc8f987e9a89ef90ea, Mon Feb 28 10:03:15 PST 2022
Took 0.0013 seconds
hbase:001:0>

After leaving the shell, hbase version no longer prints those warnings either.

HBase 2.4.10
Source code repository git://buildbox/home/apurtell/build/hbase revision=3e5359c73d1a96dd7d2ac5bc8f987e9a89ef90ea
Compiled by apurtell on Mon Feb 28 10:03:15 PST 2022
From source with checksum b38d895e719d82c3d3b2a886e3fca39f754f5ceef732e157604c905c7af3af2d4b9b861abfdd270e627a4fe1ffa8eba74a9b01095b856952d18023666cce66ad

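After that, shut down HBase (quick this time) and Hadoop as usual. The installation and configuration are finally done 😋.

But that new WARN line still bothered me. Is there a way to get rid of it? (Solved; see below.)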

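Appendix: fixing the "WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable" message

This message means HBase could not load Hadoop's native library, possibly because the environment setting above tells HBase not to pull in Hadoop's libraries. So should it pull them in or not?

The official Apache HBase ™ Reference Guide says: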

Let’s presume your Hadoop shipped with a native library that suits the platform you are running HBase on. To check if the Hadoop native library is available to HBase, run the following tool (available in Hadoop 2.1 and greater):

$ ./bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibraryChecker
2014-08-26 13:15:38,717 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Native library checking:
hadoop: false
zlib: false
snappy: false
lz4: false
bzip2: false
2014-08-26 13:15:38,863 INFO [main] util.ExitUtil: Exiting with status 1

Above shows that the native hadoop library is not available in HBase context.

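In other words, you can first run this command to check whether Hadoop's native library is available; if every entry comes back false, the native library itself is the problem. My own check did not show that problem, so I am not recording a fix for it here.

(Note: while looking at other people's solutions I saw that hadoop checknative -a performs the same kind of check.)

So when the native library does exist, what is the fix?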

To fix the above, either copy the Hadoop native libraries local or symlink to them if the Hadoop and HBase stalls are adjacent in the filesystem. You could also point at their location by setting the LD_LIBRARY_PATH environment variable in your hbase-env.sh.

In other words, **add the LD_LIBRARY_PATH environment variable to hbase-env.sh, set to the path of Hadoop's native library** (here: export LD_LIBRARY_PATH=/usr/local/hadoop/lib/native).
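
Before pointing LD_LIBRARY_PATH at a directory, it is cheap to check that the native library is really there. A minimal sketch, assuming the library file is named libhadoop.so (possibly with a version suffix) as on a typical Linux Hadoop build; the function name has_native_hadoop is made up for this post:

```shell
# has_native_hadoop DIR: succeed if DIR contains a libhadoop.so* file.
# (has_native_hadoop is a hypothetical helper, not part of Hadoop or HBase.)
has_native_hadoop() {
    for f in "$1"/libhadoop.so*; do
        # if nothing matches, the glob stays literal and -e fails
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Path used in this post; adjust it if Hadoop is installed elsewhere.
NATIVE_DIR=/usr/local/hadoop/lib/native
if has_native_hadoop "$NATIVE_DIR"; then
    # prepend, keeping whatever LD_LIBRARY_PATH already held
    export LD_LIBRARY_PATH="$NATIVE_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
fi
```

With this guard in hbase-env.sh the variable is only set when the directory actually holds the library, so a wrong path fails quietly instead of hiding the real cause of the warning.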

Then restart HBase and run hbase shell again; the terminal shows:

~$ hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.10, r3e5359c73d1a96dd7d2ac5bc8f987e9a89ef90ea, Mon Feb 28 10:03:15 PST 2022
Took 0.0013 seconds
hbase:001:0>

The WARN message is gone. Problem solved 😋.