FAQs

Q: If I add new DataNodes to the cluster, will HDFS move the blocks to the newly added nodes in order to balance disk space utilization between the nodes?
Q: On "Add host", QueryIO is unable to SSH to the host machine. What should I do?
Q: Does adding multiple nodes require different ports?
Q: Adding a DataNode shows the error: "NameNode HOST-IP mapping not listed in /etc/hosts file in DataNode system". What should I do?
Q: Why is no machine on the network able to access the QueryIO server UI?
Q: Why does QueryIO support one database instance per NameNode?
Q: Why does the NameNode status show "Started with outdated configuration"?
Q: Why am I getting a "Connection Refused Exception"?
Q: What is the purpose of the checkpoint NameNode?
Q: How do I browse back to root in "Data Browser"?
Q: Data Browser does not show any data. What may be the reason?
Q: To whom does the system send email when an alert is generated and email is configured?
Q: What should I do if I am not getting email notifications on violation of a rule?
Q: I am currently evaluating QueryIO and have installed a single-node cluster on my laptop, which gets its IP address dynamically. My cluster setup keeps failing as I switch between my home and office networks because the IP changes. I understand that a cluster should be set up on machines with static IP addresses, but I would like a workaround to get it working during the evaluation phase with a single-machine setup.

Q: If I add new DataNodes to the cluster, will HDFS move the blocks to the newly added nodes in order to balance disk space utilization between the nodes?

No, HDFS does not move existing blocks to new nodes automatically. Newly created files, however, will likely have some of their blocks placed on the new nodes. To redistribute existing blocks, run the Balancer on the NameNode.
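
On a stock Hadoop installation the Balancer is started with the `hdfs balancer` command. The sketch below wraps that call in a small shell function. It assumes the `hdfs` command is on the PATH of the NameNode host and that you run it as a user allowed to administer HDFS. The function name and the default threshold of 10 percent are illustrative choices, not QueryIO settings.

```shell
# Hypothetical helper: start the HDFS Balancer with a utilization threshold.
# The threshold is the allowed difference, in percent, between a DataNode's
# disk utilization and the cluster-wide average.
run_balancer() {
    threshold="${1:-10}"
    hdfs balancer -threshold "$threshold"
}

# Example (run on the NameNode host):
#   run_balancer 5
```

A lower threshold gives a more evenly balanced cluster but makes the Balancer run longer.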

Q: On "Add host", QueryIO is unable to SSH to the host machine. What should I do?

Please check that the SSH service is enabled on the remote machine and that the credentials provided are correct.
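
One way to test both points by hand is to attempt the same login from the QueryIO server's command line. This is a minimal sketch that assumes the OpenSSH `ssh` client is installed; the function name and the example user and address are placeholders.

```shell
# Hypothetical helper: verify that an SSH login to HOST as USER succeeds.
# ssh prompts for the password if key-based login is not set up, so this
# exercises the same credentials you would enter in "Add host".
check_ssh() {
    user="$1"
    host="$2"
    if ssh -o ConnectTimeout=5 "$user@$host" true; then
        echo "SSH login to $host as $user works"
    else
        echo "SSH login to $host as $user failed"
        return 1
    fi
}

# Example with placeholder values:
#   check_ssh admin 192.168.0.16
```

A timeout or "connection refused" here points to the SSH service or the network, while a "permission denied" message points to the credentials.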

Q: Does adding multiple nodes require different ports?

Yes. On a single host machine, all cluster components such as NameNode, DataNode, ResourceManager, NodeManager and host must use different ports. Adding more than one node on the same host requires changing the port settings.

Q: Adding a DataNode shows the error: "NameNode HOST-IP mapping not listed in /etc/hosts file in DataNode system". What should I do?

You need to add the IP-to-hostname mapping to your host's /etc/hosts file. Administrative privileges are required to edit /etc/hosts.

To get the system hostname: $ echo $HOSTNAME
To get the IP address: $ ifconfig
To edit the /etc/hosts file: $ sudo vi /etc/hosts
For example, if your host's IP address is 192.168.0.16 and its hostname is "server.local", append "192.168.0.16 server.local" to the /etc/hosts file.
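
If you prefer not to edit the file by hand, the append can be scripted. The following is a sketch only: the function name is made up for illustration, it must run with administrative privileges when it targets /etc/hosts, and it treats any non-comment line that lists the hostname as an existing mapping.

```shell
# Hypothetical helper: append "IP HOSTNAME" to a hosts file unless an entry
# for that hostname is already present. The third argument defaults to
# /etc/hosts and exists mainly so the function can be tried on a copy first.
add_host_mapping() {
    ip="$1"
    name="$2"
    hosts_file="${3:-/etc/hosts}"
    # Look for the hostname as a whole field after the address,
    # skipping comment lines.
    if awk -v n="$name" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$hosts_file"; then
        echo "mapping for $name already present"
    else
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
        echo "added $ip $name"
    fi
}

# Example, using the values from the text above (run as root):
#   add_host_mapping 192.168.0.16 server.local
```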

Q: Why is no machine on the network able to access the QueryIO server UI?

This can happen when the ports used by QueryIO are blocked by a firewall. All ports used by QueryIO must be open in the firewall.
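
To see whether a given port is reachable, you can probe it from another machine on the network. The sketch below assumes a netcat (`nc`) build that supports the `-z` and `-w` options; the function name, address and port number are placeholders, so substitute the ports configured in your own installation.

```shell
# Hypothetical helper: report whether a TCP port on a host accepts connections.
# Uses netcat in scan mode (-z) with a 3 second timeout (-w 3).
check_port() {
    host="$1"
    port="$2"
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
        echo "$host:$port reachable"
    else
        echo "$host:$port blocked or not listening"
        return 1
    fi
}

# Example with placeholder values:
#   check_port 192.168.0.16 8080
```

If the port is reachable from the server itself but not from other machines, a firewall between them is the likely cause.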

Q: Why does QueryIO support one database instance per NameNode?

In a typical Hadoop cluster, the total number of files grows to the order of millions over time. HDFS cluster storage scales horizontally, but the namespace managed by a single NameNode does not. To scale the name service horizontally, NameNode federation uses multiple independent namespaces. The NameNodes are federated, that is, they are independent and do not require coordination with each other. The DataNodes are used as common block storage by all the federated NameNodes, and each DataNode registers with all the NameNodes in the cluster.

To support NameNode federation, QueryIO allows one database instance to be configured per namespace. You can define a database configuration and link it to a namespace. All the metadata and tags associated with the data in that namespace are stored in the linked database. This feature is required only if you need the Analytics query feature.

Q: Why does the NameNode status show "Started with outdated configuration"?

This happens after you have changed configuration properties for the NameNode. You need to restart the NameNode by first stopping it and then starting it again. The same applies to all cluster components (NameNode, DataNode, ResourceManager, NodeManager).

Q: Why am I getting a "Connection Refused Exception"?

A common cause is that a Hadoop service is not running. Make sure all your cluster components are running. You can see the status of all components in the Dashboard view.
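
The same check can be made on a node's command line with the JDK's `jps` tool, which lists running Java processes. This sketch assumes that `jps` is on the PATH, that it is run as the user who started the daemons, and that the daemons appear under the standard Hadoop class names; the function name is illustrative.

```shell
# Hypothetical helper: print each expected Hadoop daemon that is not running
# on this host, based on the output of jps.
missing_daemons() {
    running="$(jps)"
    for daemon in "$@"; do
        # jps prints "<pid> <main class>"; match the class name exactly so
        # that, for example, SecondaryNameNode does not count as NameNode.
        if ! printf '%s\n' "$running" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Example for a single-node setup where all four daemons share one host:
#   missing_daemons NameNode DataNode ResourceManager NodeManager
```

Empty output means every daemon named on the command line was found.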

Q: What is the purpose of the checkpoint NameNode?

The only purpose of the checkpoint NameNode (the secondary NameNode) is to perform periodic checkpoints. It periodically downloads the current NameNode image and edits log files, merges them into a new image, and uploads the new image back to the (primary and only) NameNode.
So if the NameNode fails and you can restart it on the same physical node, there is no need to shut down the DataNodes; only the NameNode needs to be restarted. If you cannot use the old node anymore, you will need to copy the latest image somewhere else. The latest image can be found either on the node that used to be the primary before the failure, if it is available, or on the secondary NameNode. The latter will be the latest checkpoint without the subsequent edits logs, that is, the most recent namespace modifications may be missing there. You will also need to restart the whole cluster in this case.

Q: How do I browse back to root in "Data Browser"?

Look for the forward slash '/' at the top of the Data Browser. Clicking it takes you to the root location.

Q: Data Browser does not show any data. What may be the reason?

This problem might occur if any of your NameNodes or DataNodes is stopped. Please make sure your NameNode and DataNodes are running.

Q: To whom does the system send email when an alert is generated and email is configured?

Email is sent to the email IDs of all registered users. Please enter the email IDs carefully when creating user accounts.

Q: What should I do if I am not getting email notifications on violation of a rule?

You need to configure the notification settings to get email alerts.

Q: I am currently evaluating QueryIO and have installed a single-node cluster on my laptop, which gets its IP address dynamically. My cluster setup keeps failing as I switch between my home and office networks because the IP changes. I understand that a cluster should be set up on machines with static IP addresses, but I would like a workaround to get it working during the evaluation phase with a single-machine setup.

For a production setup you should give your machines static IPs. For a standalone setup where all servers are configured on the same host, follow these steps:

In a fresh install, replace the IP address in the database JDBC connection URLs with the machine's hostname under 'Data' -> 'Manage Databases'.
Add the new host by its hostname instead of its IP.
Add an entry in /etc/hosts to map the current IP to the machine's hostname.
Every time you change networks you will have to:
- Update the IP-to-hostname mapping in /etc/hosts.
- Restart the QueryIO UI server by executing <Installation Dir>/QueryIO/tomcat/bin/stop_queryio.sh and
  then <Installation Dir>/QueryIO/tomcat/bin/start_queryio.sh.
- Restart all nodes (NameNode, DataNode, CheckpointNode, ResourceManager, NodeManager).
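
The /etc/hosts update in the steps above can be scripted so that it is less error-prone when you switch networks often. This is a sketch under stated assumptions: the function name and the `QUERYIO_HOME` variable are hypothetical (set the variable to your own <Installation Dir>/QueryIO), and the function removes every non-comment line that lists the hostname, so do not use it for a name that shares a line with other aliases you want to keep.

```shell
# Hypothetical helper: point HOSTNAME at a new IP in a hosts file, replacing
# any existing entry for that hostname. The third argument defaults to
# /etc/hosts, which requires administrative privileges to modify.
update_host_mapping() {
    new_ip="$1"
    name="$2"
    hosts_file="${3:-/etc/hosts}"
    tmp_file="$(mktemp)"
    # Keep comment lines and every line that does not list the hostname,
    # then append the fresh mapping.
    awk -v n="$name" '
        /^[ \t]*#/ { print; next }
        { keep = 1; for (i = 2; i <= NF; i++) if ($i == n) keep = 0 }
        keep
    ' "$hosts_file" > "$tmp_file"
    printf '%s %s\n' "$new_ip" "$name" >> "$tmp_file"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp_file" > "$hosts_file"
    rm -f "$tmp_file"
}

# Example (as root) after joining a new network, with a placeholder IP:
#   update_host_mapping 10.0.0.7 "$(hostname)"
#   "$QUERYIO_HOME/tomcat/bin/stop_queryio.sh"
#   "$QUERYIO_HOME/tomcat/bin/start_queryio.sh"
```

The nodes still have to be restarted afterwards, as described in the last step.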

QueryIO, "Big Data Intelligence" and the QueryIO Logo are trademarks of QueryIO Corporation. Apache, Hadoop and HDFS are trademarks of The Apache Software Foundation.