Hadoop Source Reading - HDFS - Day 2
Views: 6,900
Published: 2019-06-27


Yesterday we looked at AbstractFileSystem and saw that applications access files through the FileContext class. Today let's read the source of that class, starting with its very long class comment.

```java
/**
 * The FileContext class provides an interface to the application writer for
 * using the Hadoop file system.
 * It provides a set of methods for the usual operation: create, open,
 * list, etc.
 *
 * *** Path Names ***
 *
 * The Hadoop file system supports a URI name space and URI names.
 * It offers a forest of file systems that can be referenced using fully
 * qualified URIs.
 * Two common Hadoop file system implementations are
 *   - the local file system: file:///path
 *   - the hdfs file system:  hdfs://nnAddress:nnPort/path
 *
 * While URI names are very flexible, it requires knowing the name or address
 * of the server. For convenience one often wants to access the default system
 * in one's environment without knowing its name/address. This has an
 * additional benefit that it allows one to change one's default fs
 * (e.g. admin moves application from cluster1 to cluster2).
 *
 * To facilitate this, Hadoop supports a notion of a default file system.
 * The user can set his default file system, although this is
 * typically set up for you in your environment via your default config.
 * A default file system implies a default scheme and authority; slash-relative
 * names (such as /for/bar) are resolved relative to that default FS.
 * Similarly a user can also have working-directory-relative names (i.e. names
 * not starting with a slash). While the working directory is generally in the
 * same default FS, the wd can be in a different FS.
 *
 * Hence Hadoop path names can be one of:
 *   - fully qualified URI:  scheme://authority/path
 *   - slash relative names: /path relative to the default file system
 *   - wd-relative names:    path relative to the working dir
 *
 * Relative paths with scheme (scheme:foo/bar) are illegal.
 *
 * **** The Role of the FileContext and configuration defaults ****
 *
 * The FileContext provides file namespace context for resolving file names;
 * it also contains the umask for permissions. In that sense it is like the
 * per-process file-related state in Unix system.
 * These two properties
 *   - default file system (i.e. your slash)
 *   - umask
 * in general, are obtained from the default configuration file
 * in your environment (@see {@link Configuration}).
 *
 * No other configuration parameters are obtained from the default config as
 * far as the file context layer is concerned. All file system instances
 * (i.e. deployments of file systems) have default properties; we call these
 * server side (SS) defaults. Operations like create allow one to select many
 * properties: either pass them in as explicit parameters or use
 * the SS properties.
 *
 * The file system related SS defaults are
 *   - the home directory (default is "/user/userName")
 *   - the initial wd (only for local fs)
 *   - replication factor
 *   - block size
 *   - buffer size
 *   - encryptDataTransfer
 *   - checksum option (checksumType and bytesPerChecksum)
 *
 * *** Usage Model for the FileContext class ***
 *
 * Example 1: use the default config read from the $HADOOP_CONFIG/core.xml.
 *            Unspecified values come from core-defaults.xml in the release jar.
 *   - myFContext = FileContext.getFileContext(); // uses the default config,
 *                                                // which has your default FS
 *   - myFContext.create(path, ...);
 *   - myFContext.setWorkingDir(path);
 *   - myFContext.open(path, ...);
 *
 * Example 2: Get a FileContext with a specific URI as the default FS
 *   - myFContext = FileContext.getFileContext(URI);
 *   - myFContext.create(path, ...);
 *   - ...
 *
 * Example 3: FileContext with local file system as the default
 *   - myFContext = FileContext.getLocalFSFileContext();
 *   - myFContext.create(path, ...);
 *   - ...
 *
 * Example 4: Use a specific config, ignoring $HADOOP_CONFIG.
 *            Generally you should not need to use a config unless you are doing
 *   - configX = someConfigSomeOnePassedToYou;
 *   - myFContext = getFileContext(configX); // configX is not changed,
 *                                           // is passed down
 *   - myFContext.create(path, ...);
 *   - ...
 */
@InterfaceAudience.Public
@InterfaceStability.Evolving /* Evolving for a release, to be changed to Stable */
public class FileContext {
```

The FileContext class provides an interface for application writers to use the Hadoop file system. It offers the usual operations: create, open, list, and so on.

Two common Hadoop file system implementations are:

  1. the local file system: file:///path
  2. the hdfs file system: hdfs://nnAddress:nnPort/path

While URI names are very flexible, they require knowing the server's name or address. For convenience, you usually want to access the default file system in your environment without knowing its name/address; an additional benefit is that the default fs can be changed (e.g. an admin moves an application from cluster1 to cluster2).

To that end, Hadoop supports the notion of a default file system. A user can set their own default file system, though it is usually set up by the default configuration.

A default file system implies a default scheme and authority; slash-relative names (e.g. /for/bar) are resolved relative to that default FS.

Likewise, a user can use working-directory-relative names (names that do not start with a slash).

Hence a Hadoop path name can be one of:

  - fully qualified URI: scheme://authority/path
  - slash-relative name: /path, relative to the default file system
  - wd-relative name: path, relative to the working directory
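How the three path forms resolve can be mimicked with plain java.net.URI, outside Hadoop. This is an illustrative sketch only, not FileContext's actual resolution code; the default FS hdfs://nnAddress:8020 and working directory /user/alice are made-up assumptions:

```java
import java.net.URI;

public class PathKinds {
    // Hypothetical defaults, for illustration only
    static final URI DEFAULT_FS = URI.create("hdfs://nnAddress:8020/");
    static final URI WORKING_DIR = DEFAULT_FS.resolve("/user/alice/");

    static URI resolve(String name) {
        URI u = URI.create(name);
        if (u.isAbsolute()) {             // fully qualified: scheme://authority/path
            return u;
        }
        if (name.startsWith("/")) {       // slash-relative: resolved against the default FS
            return DEFAULT_FS.resolve(name);
        }
        return WORKING_DIR.resolve(name); // wd-relative: resolved against the working dir
    }

    public static void main(String[] args) {
        System.out.println(resolve("file:///tmp/x"));  // file:///tmp/x
        System.out.println(resolve("/for/bar"));       // hdfs://nnAddress:8020/for/bar
        System.out.println(resolve("data/part-0"));    // hdfs://nnAddress:8020/user/alice/data/part-0
    }
}
```

The real FileContext additionally rejects relative paths that carry a scheme (scheme:foo/bar), which this sketch does not check.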

```java
private FileContext(final AbstractFileSystem defFs,
    final FsPermission theUmask, final Configuration aConf) {
  defaultFS = defFs;
  umask = FsPermission.getUMask(aConf);
  conf = aConf;
  try {
    ugi = UserGroupInformation.getCurrentUser();
  } catch (IOException e) {
    LOG.error("Exception in getCurrentUser: ", e);
    throw new RuntimeException("Failed to get the current user " +
            "while creating a FileContext", e);
  }
  /*
   * Init the wd.
   * WorkingDir is implemented at the FileContext layer
   * NOT at the AbstractFileSystem layer.
   * If the DefaultFS, such as localFilesystem has a notion of
   * builtin WD, we use that as the initial WD.
   * Otherwise the WD is initialized to the home directory.
   */
  workingDir = defaultFS.getInitialWorkingDirectory();
  if (workingDir == null) {
    workingDir = defaultFS.getHomeDirectory();
  }
  resolveSymlinks = conf.getBoolean(
      CommonConfigurationKeys.FS_CLIENT_RESOLVE_REMOTE_SYMLINKS_KEY,
      CommonConfigurationKeys.FS_CLIENT_RESOLVE_REMOTE_SYMLINKS_DEFAULT);
  util = new Util(); // for the inner class
}
```

 

The FileContext constructor takes three parameters:

  1. defFs - the default FS for this FileContext
  2. theUmask - apparently never used; a historical leftover? The umask field is instead initialized with FsPermission.getUMask(aConf)
  3. aConf - the configuration

Now for the common methods mentioned above, starting with create. Hidden behind the fold is a long Javadoc comment:

```java
/**
 * Create or overwrite file on indicated path and returns an output stream for
 * writing into the file.
 *
 * @param f the file name to open
 * @param createFlag gives the semantics of create; see {@link CreateFlag}
 * @param opts file creation options; see {@link Options.CreateOpts}.
 *   - Progress - to report progress on the operation - default null
 *   - Permission - umask is applied against permission: default is
 *     FsPermissions:getDefault()
 *   - CreateParent - create missing parent path; default is to not
 *     create parents
 *   - The defaults for the following are SS defaults of the file
 *     server implementing the target path. Not all parameters make sense
 *     for all kinds of file system - eg. localFS ignores Blocksize,
 *     replication, checksum
 *       - BufferSize - buffersize used in FSDataOutputStream
 *       - Blocksize - block size for file blocks
 *       - ReplicationFactor - replication for blocks
 *       - ChecksumParam - Checksum parameters. server default is used
 *         if not specified.
 *
 * @return {@link FSDataOutputStream} for created file
 *
 * @throws AccessControlException If access is denied
 * @throws FileAlreadyExistsException If file f already exists
 * @throws FileNotFoundException If parent of f does not exist
 *           and createParent is false
 * @throws ParentNotDirectoryException If parent of f is not a directory.
 * @throws UnsupportedFileSystemException If file system for f is not supported
 * @throws IOException If an I/O error occurred
 *
 * Exceptions applicable to file systems accessed over RPC:
 * @throws RpcClientException If an exception occurred in the RPC client
 * @throws RpcServerException If an exception occurred in the RPC server
 * @throws UnexpectedServerException If server implementation throws
 *           undeclared exception to RPC server
 *
 * RuntimeExceptions:
 * @throws InvalidPathException If path f is not valid
 */
```

 

 

```java
public FSDataOutputStream create(final Path f,
    final EnumSet<CreateFlag> createFlag, Options.CreateOpts... opts)
    throws AccessControlException, FileAlreadyExistsException,
    FileNotFoundException, ParentNotDirectoryException,
    UnsupportedFileSystemException, IOException {
  Path absF = fixRelativePart(f);

  // If one of the options is a permission, extract it & apply umask
  // If not, add a default Perms and apply umask;
  // AbstractFileSystem#create

  CreateOpts.Perms permOpt = CreateOpts.getOpt(CreateOpts.Perms.class, opts);
  FsPermission permission = (permOpt != null) ? permOpt.getValue() :
      FILE_DEFAULT_PERM;
  permission = permission.applyUMask(umask);

  final CreateOpts[] updatedOpts =
      CreateOpts.setOpt(CreateOpts.perms(permission), opts);
  return new FSLinkResolver<FSDataOutputStream>() {
    @Override
    public FSDataOutputStream next(final AbstractFileSystem fs, final Path p)
        throws IOException {
      return fs.create(p, createFlag, updatedOpts);
    }
  }.resolve(this, absF);
}
```

 

The create method creates or overwrites a file at the given path and returns an output stream for writing into it.
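Before delegating to the underlying FS, create applies the process umask to the requested permission. The effect of applyUMask can be illustrated with a plain-int sketch (hypothetical standalone code, not FsPermission itself; the 0666 default file mode and 022 umask are assumed typical values):

```java
public class UmaskDemo {
    // applyUMask clears the umask bits from the mode: perm & ~umask
    static int applyUMask(int perm, int umask) {
        return perm & ~umask;
    }

    public static void main(String[] args) {
        int fileDefault = 0666; // assumed default file permission before umask
        int umask = 022;        // assumed value of fs.permissions.umask-mode
        // 0666 & ~022 = 0644: owner rw-, group r--, other r--
        System.out.printf("%o%n", applyUMask(fileDefault, umask)); // prints 644
    }
}
```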

The FSLinkResolver constructed in the return statement handles the case where the path contains symbolic links:

```java
/**
 * Generic helper function overridden on instantiation to perform a
 * specific operation on the given file system using the given path
 * which may result in an UnresolvedLinkException.
 * @param fs AbstractFileSystem to perform the operation on.
 * @param p Path given the file system.
 * @return Generic type determined by the specific implementation.
 * @throws UnresolvedLinkException If symbolic link path could
 *           not be resolved
 * @throws IOException an I/O error occurred
 */
abstract public T next(final AbstractFileSystem fs, final Path p)
    throws IOException, UnresolvedLinkException;

/**
 * Performs the operation specified by the next function, calling it
 * repeatedly until all symlinks in the given path are resolved.
 * @param fc FileContext used to access file systems.
 * @param path The path to resolve symlinks on.
 * @return Generic type determined by the implementation of next.
 * @throws IOException
 */
public T resolve(final FileContext fc, final Path path) throws IOException {
  int count = 0;
  T in = null;
  Path p = path;
  // NB: More than one AbstractFileSystem can match a scheme, eg
  // "file" resolves to LocalFs but could have come by RawLocalFs.
  AbstractFileSystem fs = fc.getFSofPath(p);

  // Loop until all symlinks are resolved or the limit is reached
  for (boolean isLink = true; isLink;) {
    try {
      in = next(fs, p);
      isLink = false;
    } catch (UnresolvedLinkException e) {
      if (!fc.resolveSymlinks) {
        throw new IOException("Path " + path + " contains a symlink"
            + " and symlink resolution is disabled ("
            + CommonConfigurationKeys.FS_CLIENT_RESOLVE_REMOTE_SYMLINKS_KEY + ").", e);
      }
      if (!FileSystem.areSymlinksEnabled()) {
        throw new IOException("Symlink resolution is disabled in"
            + " this version of Hadoop.");
      }
      if (count++ > FsConstants.MAX_PATH_LINKS) {
        throw new IOException("Possible cyclic loop while " +
                              "following symbolic link " + path);
      }
      // Resolve the first unresolved path component
      p = qualifySymlinkTarget(fs.getUri(), p, fs.getLinkTarget(p));
      fs = fc.getFSofPath(p);
    }
  }
  return in;
}
```

 

next is a generic helper function, overridden on instantiation, that performs a specific operation on the file system for the given path; it may throw an UnresolvedLinkException.

resolve performs the operation specified by next, calling it repeatedly until every symbolic link in the path has been resolved (or the link limit is hit).
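The retry pattern in resolve (attempt the operation, follow one link on failure, guard against cycles with a link counter) can be reproduced outside Hadoop. Below is a hypothetical self-contained sketch using a toy symlink table; the names LinkLoop, LINKS, and the "opened:" marker are invented for illustration:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class LinkLoop {
    // Guard against symlink cycles, analogous to FsConstants.MAX_PATH_LINKS
    static final int MAX_PATH_LINKS = 32;

    // Toy symlink table: path -> link target
    static final Map<String, String> LINKS = new HashMap<>();

    // Stand-in for next(): succeeds only when p is not a symlink,
    // otherwise fails like an UnresolvedLinkException would
    static String next(String p) throws IOException {
        if (LINKS.containsKey(p)) {
            throw new IOException("unresolved link: " + p);
        }
        return "opened:" + p;
    }

    // Stand-in for resolve(): retry next() until no link remains,
    // following one link per iteration, bounded by MAX_PATH_LINKS
    static String resolve(String path) throws IOException {
        int count = 0;
        String p = path;
        while (true) {
            try {
                return next(p); // no exception: all links resolved
            } catch (IOException e) {
                if (count++ > MAX_PATH_LINKS) {
                    throw new IOException(
                        "Possible cyclic loop while following symbolic link " + path);
                }
                p = LINKS.get(p); // follow one link, then retry
            }
        }
    }

    public static void main(String[] args) throws IOException {
        LINKS.put("/a", "/b");
        LINKS.put("/b", "/c");
        System.out.println(resolve("/a")); // prints opened:/c
    }
}
```

The real implementation does more per iteration: it re-derives the AbstractFileSystem for the new path (a link may cross file systems) and qualifies the link target against the current FS URI, which this sketch omits.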

Reposted from: https://www.cnblogs.com/nashiyue/p/5331225.html
