% Hadoop分布式文件系统操作
% 上海交通大学OmniLab
http://omnilab.sjtu.edu.cn
% 2013年11月21日 更新
操作Hadoop分布式文件系统HDFS的基本命令是hdfs
,命令行参数如下:
$ hdfs
Usage: hdfs [--config confdir] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop.
namenode -format format the DFS filesystem
secondarynamenode run the DFS secondary namenode
namenode run the DFS namenode
journalnode run the DFS journalnode
zkfc run the ZK Failover Controller daemon
datanode run a DFS datanode
dfsadmin run a DFS admin client
haadmin run a DFS HA admin client
fsck run a DFS filesystem checking utility
balancer run a cluster balancing utility
jmxget get JMX exported values from NameNode or DataNode.
oiv apply the offline fsimage viewer to an fsimage
oev apply the offline edits viewer to an edits file
fetchdt fetch a delegation token from the NameNode
getconf get config values from configuration
groups get the groups which users belong to
Use -help to see options
hdfs命令的子功能很多,如dfs
处理一般文件系统命令,fsck
检查文件系统完整性,dfsadmin
用于文件系统管理。
本教程着重向用户介绍HDFS的一般用法,以dfs
子命令为主要内容。
HDFS的管理在其他文档介绍。
$ hdfs dfs
Usage: hadoop fs [generic options]
[-cat [-ignoreCrc] <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal <localsrc> ... <dst>]
[-copyToLocal [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] <path> ...]
[-cp <src> ... <dst>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-get [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put <localsrc> ... <dst>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setrep [-R] [-w] <rep> <path/file> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[ezd] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
格式如下:
$ hdfs dfs -copyFromLocal <localsrc> ... <dst>
复制单个文件:
$ hdfs dfs -copyFromLocal hello.txt /user/jianwen/
$ hdfs dfs -ls /user/jianwen/
Found 1 items
-rw-r--r-- 3 jianwen hadoop 15 2013-11-21 11:21 /user/jianwen/hello.txt
复制文件夹:
$ hdfs dfs -copyFromLocal junkdir /user/jianwen/
hdfs dfs -ls /user/jianwen/junkdir
Found 1 items
-rw-r--r-- 3 jianwen hadoop 15 2013-11-21 11:23 /user/jianwen/junkdir/hello.txt
格式:
$ hdfs dfs -copyToLocal [-ignoreCrc] [-crc] <src> ... <localdst>
用法与copyFromLocal
。
hdfs dfs -cat /user/jianwen/hello.txt
alias hls='hdfs dfs -ls'
alias hcat='hdfs dfs -cat'
alias hrm='hdfs dfs -rm'
alias hmv='hdfs dfs -mv'
alias hrmdir='hdfs dfs -rmdir'
alias hcp='hdfs dfs -cp'
alias hcpr='hdfs dfs -copyFromLocal'
alias hcpl='hdfs dfs -copyToLocal'
alias hmkdir='hdfs dfs -mkdir'
alias hdu='hdfs dfs -du'
alias hdf='hdfs dfs -df'
- 重写HDFS命令介绍一部分,两个Hadoop版本的定义出现了冲突,还是应该以
hadoop fs
为基准。
- "Hadoop MapReduce Tutorial v1.2.1" http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
- "HDFS File System Shell Guide v1.2.1" http://hadoop.apache.org/docs/r1.2.1/file_system_shell.html