Hadoop NameNode

In this chapter

The NameNode is the centerpiece of a QueryIO cluster. This chapter explains the HDFS NameNode and its related functions.
The following aspects of the NameNode are covered:

What is NameNode?

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files in the file system as metadata and tracks where across the cluster the file data is kept. It does not store the data of these files itself. Client applications talk to the NameNode whenever they want to locate a file or to add, copy, move, or delete one. The NameNode responds to successful requests by returning a list of the relevant DataNode servers where the data lives.
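
The lookup described above can be pictured with a small toy model. This is only an illustrative sketch, not QueryIO or Hadoop code: the class name ToyNameNode, the block ids, and the host names dn1 to dn3 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Holds metadata only: which blocks make up a file and where each block lives."""

    # path -> ordered list of block ids (hypothetical ids, for illustration)
    file_blocks: dict = field(default_factory=dict)
    # block id -> list of DataNode host names holding a replica
    block_locations: dict = field(default_factory=dict)

    def add_file(self, path, blocks):
        """Register a file; `blocks` maps block id -> DataNode hosts."""
        self.file_blocks[path] = list(blocks)
        self.block_locations.update(blocks)

    def locate(self, path):
        """Return [(block_id, [datanodes])] for a file, or raise if it is unknown."""
        if path not in self.file_blocks:
            raise FileNotFoundError(path)
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


namenode = ToyNameNode()
namenode.add_file("/logs/app.log", {"blk_1": ["dn1", "dn2"], "blk_2": ["dn2", "dn3"]})
print(namenode.locate("/logs/app.log"))
```

The model stores no file contents at all, which mirrors the point that the NameNode keeps only metadata and hands clients a list of DataNodes to contact.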

An HDFS cluster has a single active NameNode, a master server that manages the file system namespace and regulates client access to files.

There are two types of NameNodes in a cluster: Active and Standby.

The Active NameNode is responsible for all client operations in the cluster, while the Standby NameNode simply acts as a slave. On its own, the Active NameNode is a single point of failure for the HDFS cluster: when it goes down, the file system goes offline. An optional Standby NameNode can be hosted on a separate machine, so that when the Active NameNode goes down, the Standby NameNode can take over as the Active NameNode and the file system stays available. This process is called failover.

To manage NameNodes on your cluster, go to HDFS > NameNode.


NameNode Summary

The following details about the NameNode are displayed:

NameNode Summary

Displays certain attributes of the NameNodes in the cluster in tabular form. The summary includes the following attributes:

Activity Summary

Displays details of the activities performed on the NameNode. The activities supported by QueryIO are 'Health Check' and 'Balancer'.

Add NameNode

Add Active NameNode

To add a new NameNode to your cluster, click the Add button. This opens a wizard for adding a new NameNode.


Add Standby NameNode

To add a new Standby NameNode to your cluster, you must have an NFS-mounted shared directory. Click here to see how to configure an NFS mount point.

Click the Add button on the NameNode page. This opens a wizard for adding a new NameNode.


Start/Stop NameNode

To start or stop a NameNode, select the check box next to the node and click the Start or Stop button, respectively.

Delete NameNode

A NameNode cannot be deleted while any DataNode remains in the cluster. Click here to read more.

Start/Stop Monitoring

JMX monitoring tracks the NameNode's status, CPU usage, RAM, network received (N/W Rcvd), network sent (N/W Sent), disk read, and disk write. To start or stop NameNode monitoring, select the check box next to the node and click the Start Monitoring or Stop Monitoring button, respectively.

Configure NameNode

Select the NameNode and click Configure. The NameNode-related properties are displayed. You can update the settings and click Save to store the changes, provided you have the privileges required to configure the NameNode.

You can also add custom configuration properties for the NameNode or delete any configuration property.


Run Health Check

Health Check verifies that the data stored on HDFS is safe and not corrupted. Running a health check scans all the files in the system and returns a status of "Completed" or "Failed". Every file is verified against its checksum; if an error is found in any file, the check returns the "Failed" status. The result is added to the NameNode's activity summary. To start a health check, select the NameNode and click Health Check.
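
The checksum comparison behind such a check can be sketched as follows. This is a simplified illustration, not the QueryIO implementation: the function name health_check is hypothetical, files are modelled as in-memory byte strings, and CRC32 from Python's zlib module stands in for whatever checksum the cluster actually uses.

```python
import zlib


def health_check(files, expected_checksums):
    """Compare each file's checksum with the recorded one.

    `files` maps path -> bytes, `expected_checksums` maps path -> CRC32 value.
    Returns ("Completed", []) if every file matches, else ("Failed", [bad paths]).
    """
    corrupted = [
        path
        for path, data in files.items()
        if zlib.crc32(data) != expected_checksums.get(path)
    ]
    return ("Failed" if corrupted else "Completed"), corrupted


# Record checksums while the data is known to be good, then corrupt one file.
files = {"/a": b"hello", "/b": b"world"}
recorded = {path: zlib.crc32(data) for path, data in files.items()}
files["/b"] = b"w0rld"

status, bad = health_check(files, recorded)
print(status, bad)  # Failed ['/b']
```

A single mismatching file is enough to turn the overall status into "Failed", matching the behaviour described above.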

Run Balancer

The balancer distributes data evenly among all the DataNodes. It scans the amount of data stored on each node and redistributes it so that the nodes hold roughly equal amounts. Suppose there are two DataNodes, one holding 15 GB of data and the other 5 GB. The balancer checks both nodes and spreads the 20 GB evenly between them, i.e. 10 GB on each node.
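
The arithmetic of this example can be written out as a short sketch. The function name even_targets and the node names dn1 and dn2 are hypothetical; the code only computes how much each node would need to send or receive to reach the average.

```python
def even_targets(used_gb):
    """Return the transfer each node needs to reach the cluster average.

    `used_gb` maps node name -> GB currently stored.
    A positive value means the node should receive data, negative means send.
    """
    target = sum(used_gb.values()) / len(used_gb)
    return {node: target - used for node, used in used_gb.items()}


moves = even_targets({"dn1": 15, "dn2": 5})
print(moves)  # {'dn1': -5.0, 'dn2': 5.0}
```

Here dn1 gives up 5 GB and dn2 receives 5 GB, leaving 10 GB on each. Note that the stock HDFS balancer compares utilization percentages against a configurable threshold rather than forcing exactly equal byte counts, so real results are approximate.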

FailOver

Failover is the switch to a redundant or standby NameNode upon the failure or abnormal termination of the previously active NameNode. If the active NameNode goes down, the failover feature automatically switches it to standby mode and promotes the standby NameNode to active mode, so the system does not fail. This action can be reversed once the failed NameNode has recovered.

Click FailOver to perform the failover process.
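
The role swap that failover performs can be modelled in a few lines. This is a conceptual sketch only: the class HAPair and the host names nn1 and nn2 are hypothetical, and a real failover also involves fencing and shared edit logs, which are not modelled here.

```python
class HAPair:
    """Tracks which of two NameNodes is currently active and which is standby."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def failover(self):
        """Swap roles: the standby becomes active, the old active becomes standby."""
        self.active, self.standby = self.standby, self.active
        return self.active


pair = HAPair(active="nn1", standby="nn2")
pair.failover()
print(pair.active, pair.standby)  # nn2 nn1

# Once nn1 has recovered, a second failover restores the original roles.
pair.failover()
print(pair.active, pair.standby)  # nn1 nn2
```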

High Availability

The High Availability (HA) feature of QueryIO makes sure that the QueryIO agent service on a host is always available. To make this possible, every two minutes (configurable) the host checks whether its QueryIO agent process is running and, if it is down, restarts the process locally.
HDFS has always had a well-known single point of failure that impacts its availability: the system relies on a single NameNode to coordinate access to the file system data, and if that NameNode is down, the whole cluster is unavailable. Hadoop introduced its own High Availability feature to address this problem.
Click here to read more.
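
The periodic agent self-check described in this section follows a common watchdog pattern, sketched below. This is not the QueryIO agent code: the function watchdog and its parameters are hypothetical, and the process check and restart are passed in as callables so that the loop itself stays generic.

```python
import time


def watchdog(is_agent_running, start_agent, interval_seconds=120,
             max_checks=None, sleep=time.sleep):
    """Check the agent at a fixed interval and restart it when it is down.

    `is_agent_running` and `start_agent` are caller-supplied callables.
    `max_checks` bounds the loop (None means run forever).
    Returns the number of restarts performed.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not is_agent_running():
            start_agent()
            restarts += 1
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
    return restarts


# Simulated agent that starts out down; sleeping is stubbed out for the demo.
state = {"up": False}
restarts = watchdog(
    is_agent_running=lambda: state["up"],
    start_agent=lambda: state.update(up=True),
    max_checks=3,
    sleep=lambda seconds: None,
)
print(restarts)  # 1
```

The default interval of 120 seconds corresponds to the two-minute check mentioned above; in the demo the agent is found down on the first check, restarted once, and found running on the next two.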

Safemode

During startup, the NameNode loads the file system state from the fsimage and the edits log file. It then waits for the DataNodes to report their blocks so that it does not prematurely start replicating blocks when enough replicas already exist in the cluster. During this time the NameNode stays in safemode. Safemode is essentially a read-only mode for the HDFS cluster, in which the NameNode does not allow any modifications to the file system or blocks. Normally the NameNode leaves safemode automatically once the DataNodes have reported that most file system blocks are available.
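
The exit condition can be illustrated with a small sketch. The function name in_safemode is hypothetical; the default of 0.999 mirrors Hadoop's dfs.namenode.safemode.threshold-pct setting, and whether a given QueryIO cluster keeps that default is an assumption. Other factors, such as the safemode extension period, are not modelled.

```python
def in_safemode(reported_blocks, total_blocks, threshold_pct=0.999):
    """Return True while too few blocks have been reported by the DataNodes.

    `threshold_pct` is the fraction of blocks that must be reported before
    the NameNode may leave safemode (assumed default: 0.999).
    """
    if total_blocks == 0:
        # An empty file system has nothing to wait for.
        return False
    return reported_blocks / total_blocks < threshold_pct


print(in_safemode(900, 1000))   # True: only 90% of blocks reported
print(in_safemode(1000, 1000))  # False: all blocks reported
```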

NameNode Details

The NameNode menu lists the NameNodes in the cluster. Click a NameNode's name in the menu to view all the details of that NameNode. The details are displayed as charts.

NameNode details contain:


Copyright © 2018 QueryIO Corporation. All Rights Reserved.

QueryIO, "Big Data Intelligence" and the QueryIO Logo are trademarks of QueryIO Corporation. Apache, Hadoop and HDFS are trademarks of The Apache Software Foundation.