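Hadoop Analysis, Part 3: Functions and Roles of the Classes in org.apache.hadoop.hdfs.server.namenode
This walkthrough uses Hadoop 0.21 as its example.
NameNode.java: mainly maintains the file system namespace and the file metadata. The comment from the source code follows.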
/**********************************************************
 * NameNode serves as both directory namespace manager and
 * "inode table" for the Hadoop DFS. There is a single NameNode
 * running in any DFS deployment. (Well, except when there
 * is a second backup/failover NameNode.)
 *
 * The NameNode controls two critical tables:
 *   1) filename -> blocksequence (namespace)
 *   2) block -> machinelist ("inodes")
 *
 * The first table is stored on disk and is very precious.
 * The second table is rebuilt every time the NameNode comes
 * up.
 *
 * 'NameNode' refers to both this class as well as the 'NameNode server'.
 * The 'FSNamesystem' class actually performs most of the filesystem
 * management. The majority of the 'NameNode' class itself is concerned
 * with exposing the IPC interface and the http server to the outside world,
 * plus some configuration management.
 *
 * NameNode implements the ClientProtocol interface, which allows
 * clients to ask for DFS services. ClientProtocol is not
 * designed for direct use by authors of DFS client code. End-users
 * should instead use the org.apache.nutch.hadoop.fs.FileSystem class.
 *
 * NameNode also implements the DatanodeProtocol interface, used by
 * DataNode programs that actually store DFS data blocks. These
 * methods are invoked repeatedly and automatically by all the
 * DataNodes in a DFS deployment.
 *
 * NameNode also implements the NamenodeProtocol interface, used by
 * secondary namenodes or rebalancing processes to get partial namenode's
 * state, for example partial blocksMap etc.
 **********************************************************/
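The two tables described in the comment can be illustrated with a minimal sketch. This is not Hadoop code: the class `TwoTableSketch` and all of its methods are hypothetical names made up for illustration, and DataNodes are represented as plain strings. It only shows why the first table has to be persisted while the second can be rebuilt from DataNode block reports after a restart.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of the two NameNode tables; not the Hadoop implementation. */
class TwoTableSketch {
    // Table 1: filename -> ordered block IDs (the namespace; stored on disk in real HDFS).
    private final Map<String, List<Long>> namespace = new HashMap<>();
    // Table 2: block ID -> DataNodes holding a replica (memory only, rebuilt from reports).
    private final Map<Long, Set<String>> blockLocations = new HashMap<>();

    void addFile(String path, List<Long> blockIds) {
        namespace.put(path, new ArrayList<>(blockIds));
    }

    /** A DataNode reports which blocks it stores; this is how table 2 gets filled. */
    void processBlockReport(String dataNode, Collection<Long> reportedBlocks) {
        for (long id : reportedBlocks) {
            blockLocations.computeIfAbsent(id, k -> new TreeSet<>()).add(dataNode);
        }
    }

    /** Simulates a restart: table 1 survives (it is persisted), table 2 is lost. */
    void restart() {
        blockLocations.clear();
    }

    /** For each block of the file, in order, returns the DataNodes known to hold it. */
    List<Set<String>> locate(String path) {
        List<Set<String>> result = new ArrayList<>();
        for (long id : namespace.getOrDefault(path, Collections.emptyList())) {
            result.add(blockLocations.getOrDefault(id, Collections.emptySet()));
        }
        return result;
    }
}
```

After `restart()`, `locate` still returns one entry per block of the file, but each entry stays empty until the DataNodes report again.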
FSNamesystem.java:
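Mainly maintains several tables: the mapping from file names to block lists; the set of valid blocks; the mapping from blocks to node lists; the mapping from nodes to block lists; and an LRU cache of nodes whose heartbeats were recently updated.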
/***************************************************
 * FSNamesystem does the actual bookkeeping work for the
 * DataNode.
 *
 * It tracks several important tables.
 *
 * 1) valid fsname --> blocklist (kept on disk, logged)
 * 2) Set of all valid blocks (inverted #1)
 * 3) block --> machinelist (kept in memory, rebuilt dynamically from reports)
 * 4) machine --> blocklist (inverted #2)
 * 5) LRU cache of updated-heartbeat machines
 ***************************************************/
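Tables 3 to 5 can be sketched roughly as follows. This is an assumption-laden illustration, not the FSNamesystem code: the class `FsNamesystemSketch`, its method names, and the use of an access-ordered `LinkedHashMap` as the heartbeat LRU structure are all choices made for this example. It also assumes that the timestamps passed to `heartbeat` never decrease, so the least recently updated node is always the one with the oldest heartbeat.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of the block/node maps and heartbeat tracking. */
class FsNamesystemSketch {
    // Table 3: block -> nodes holding it.
    private final Map<Long, Set<String>> blockToNodes = new HashMap<>();
    // Table 4: node -> blocks it holds (the inverse of table 3).
    private final Map<String, Set<Long>> nodeToBlocks = new HashMap<>();
    // Table 5: node -> last heartbeat time; access order keeps the stalest node first.
    private final LinkedHashMap<String, Long> heartbeats =
            new LinkedHashMap<>(16, 0.75f, true);

    /** Records a heartbeat; re-inserting a key moves it to the "most recent" end. */
    void heartbeat(String node, long nowMillis) {
        heartbeats.put(node, nowMillis);
    }

    /** Updates both directions of the block/node mapping together. */
    void addReplica(String node, long blockId) {
        blockToNodes.computeIfAbsent(blockId, k -> new HashSet<>()).add(node);
        nodeToBlocks.computeIfAbsent(node, k -> new HashSet<>()).add(blockId);
    }

    /** Drops nodes silent for longer than timeoutMillis and returns their names. */
    List<String> expire(long nowMillis, long timeoutMillis) {
        List<String> dead = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = heartbeats.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() <= timeoutMillis) {
                break; // every later entry has a more recent heartbeat
            }
            dead.add(entry.getKey());
            it.remove();
        }
        for (String node : dead) {
            // Table 4 tells us exactly which entries of table 3 need fixing.
            for (long blockId : nodeToBlocks.getOrDefault(node, Collections.emptySet())) {
                Set<String> holders = blockToNodes.get(blockId);
                holders.remove(node);
                if (holders.isEmpty()) {
                    blockToNodes.remove(blockId);
                }
            }
            nodeToBlocks.remove(node);
        }
        return dead;
    }

    Set<String> holders(long blockId) {
        return blockToNodes.getOrDefault(blockId, Collections.emptySet());
    }
}
```

Keeping the inverse map means that removing a dead node only touches the blocks it actually held, rather than scanning every block in the system.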
INode.java:
HDFS abstracts both files and directories as INodes.
/**
 * We keep an in-memory representation of the file/block hierarchy.
 * This is a base INode class containing common fields for file and
 * directory inodes.
 */
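The idea of a base class that holds the fields common to file and directory inodes can be sketched as below. The class names `INodeSketch`, `INodeFileSketch`, and `INodeDirectorySketch`, along with their fields and methods, are hypothetical and chosen only for illustration; the real INode classes carry much more state (permissions, modification times, and so on).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical base class: fields shared by files and directories. */
abstract class INodeSketch {
    final String name;
    INodeDirectorySketch parent; // null for the root

    INodeSketch(String name) {
        this.name = name;
    }

    abstract boolean isDirectory();

    /** Builds the absolute path by walking up the parent links. */
    String fullPath() {
        if (parent == null) {
            return name.isEmpty() ? "/" : name;
        }
        String prefix = parent.fullPath();
        return prefix.endsWith("/") ? prefix + name : prefix + "/" + name;
    }
}

/** A file inode: adds the ordered list of block IDs. */
class INodeFileSketch extends INodeSketch {
    final List<Long> blockIds = new ArrayList<>();

    INodeFileSketch(String name) {
        super(name);
    }

    @Override
    boolean isDirectory() {
        return false;
    }
}

/** A directory inode: adds the children, keyed by name. */
class INodeDirectorySketch extends INodeSketch {
    private final Map<String, INodeSketch> children = new TreeMap<>();

    INodeDirectorySketch(String name) {
        super(name);
    }

    @Override
    boolean isDirectory() {
        return true;
    }

    void addChild(INodeSketch child) {
        child.parent = this;
        children.put(child.name, child);
    }

    /** Resolves a slash-separated path relative to this directory; null if absent. */
    INodeSketch resolve(String path) {
        INodeSketch current = this;
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty component produced by a leading slash
            }
            if (!current.isDirectory()) {
                return null; // cannot descend into a file
            }
            current = ((INodeDirectorySketch) current).children.get(part);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

With a root directory named `""`, a child directory `user`, and a file `a.txt` inside it, `root.resolve("/user/a.txt")` returns the file inode and its `fullPath()` yields `/user/a.txt`.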
FSImage.java:
INode information has to be persisted to disk; FSImage is responsible for this.
/** * FSImage handles checkpointing and logging of the