Create New File

The Create New File operation creates a file in a directory. Either of the following interfaces can be used to add a file to a directory.

DFS Client API

The Apache Hadoop FileSystem classes and the java.io classes are used together to create a file in a directory.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSConfigKeys;

public class PutObject {
	/*
	 * This program reads a file from the local file system and saves it on HDFS.
	 */
	public static void main(String[] args) throws IOException{
		Configuration conf = new Configuration(true);	// Load the default resources; the HDFS properties are set below
		conf.set(DFSConfigKeys.FS_DEFAULT_NAME_KEY, "hdfs://192.168.0.1:9000");	// URL of your namenode
		conf.set(DFSConfigKeys.DFS_REPLICATION_KEY, "3");	// Replication count for the files you write
		
		OutputStream os = null;
		InputStream is = null;
		try{
			// First initialize the DFS FileSystem object
			FileSystem dfs = FileSystem.get(conf);	// Hadoop FileSystem object built from the QueryIO configuration above
			dfs.mkdirs(new Path("/queryio/demo/"));	// Create the directory "/queryio/demo" if it does not already exist
			
			os = dfs.create(new Path("/queryio/demo/file1.txt"));	// Returns an OutputStream for the file at the given path
			
			is = new FileInputStream(new File("/local/queryio.txt"));	// InputStream for a file on the local file system
			
			IOUtils.copy(is, os);	// Copy the bytes from the InputStream to the OutputStream; this writes the file to HDFS
		} finally {
			try{
				if(is!=null)
					is.close();	//close InputStream
			} catch(Exception e){
				e.printStackTrace();
			}
			try{
				if(os!=null)
					os.close();	//close OutputStream
			} catch(Exception e){
				e.printStackTrace();
			}
		}
	}
}	
	

WebHDFS API

The following curl commands are used to create a new file.

Step 1: Submit an HTTP PUT request without automatically following redirects and without sending the file data. The namenode redirects the request to the datanode where the file data is to be written.

curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?user.name=<username>&op=CREATE[&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>][&permission=<OCTAL>][&buffersize=<INT>]"

Step 2: Submit another HTTP PUT request, this time with the file data to be written, to the URL returned in the Location header of the Step 1 response.

curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?user.name=<username>&op=CREATE..."
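
The Step 1 URL, with its optional parameters, can also be assembled in code rather than typed by hand. The following is a minimal sketch that uses only the JDK. The class name WebHdfsCreateUrl and the method build are hypothetical names introduced for illustration, and the sketch assumes that the HDFS path is already URL-safe.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop or QueryIO): builds the Step 1 CREATE URL.
class WebHdfsCreateUrl {

    // hdfsPath must start with "/" and is assumed to be URL-safe already.
    // options holds the optional parameters, for example overwrite or replication.
    static String build(String host, int port, String hdfsPath, String user,
                        Map<String, String> options) {
        StringBuilder url = new StringBuilder();
        url.append("http://").append(host).append(':').append(port);
        url.append("/webhdfs/v1").append(hdfsPath);
        url.append("?user.name=").append(encode(user)).append("&op=CREATE");
        // Append each optional parameter as &name=value, in insertion order.
        for (Map.Entry<String, String> option : options.entrySet()) {
            url.append('&').append(encode(option.getKey()));
            url.append('=').append(encode(option.getValue()));
        }
        return url.toString();
    }

    private static String encode(String value) {
        // URLEncoder.encode(String, Charset) requires Java 10 or later.
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("overwrite", "false");
        System.out.println(
            build("192.168.0.1", 50070, "/queryio/demo/file1.txt", "admin", options));
    }
}
```

With the values used in main, the sketch should produce the same URL as the Step 1 sample request. Further optional parameters such as replication or permission can be added to the options map in the same way.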

Step 1: Sample Request:
curl -i -X PUT "http://192.168.0.1:50070/webhdfs/v1/queryio/demo/file1.txt?user.name=admin&op=CREATE&overwrite=false"
HTTP Request:
PUT /webhdfs/v1/queryio/demo/file1.txt?user.name=admin&op=CREATE&overwrite=false HTTP/1.1
User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
Host: 192.168.0.1:50070
Accept: */*
HTTP Response:
HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://server.local:50075/webhdfs/v1/queryio/demo/file1.txt?op=CREATE&user.name=admin&namenoderpcaddress=server.local:9000&overwrite=false
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26)

Step 2: Sample Request:
curl -i -X PUT -T /local/queryio.txt "http://server.local:50075/webhdfs/v1/queryio/demo/file1.txt?op=CREATE&user.name=admin&namenoderpcaddress=server.local:9000&overwrite=false"
HTTP Request:
PUT /webhdfs/v1/queryio/demo/file1.txt?op=CREATE&user.name=admin&namenoderpcaddress=server.local:9000&overwrite=false HTTP/1.1
User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
Host: server.local:50075
Accept: */*
Content-Length: 16011
Expect: 100-continue
HTTP Response:
HTTP/1.1 100 Continue

HTTP/1.1 201 Created
Location: webhdfs://0.0.0.0:50070/queryio/demo/file1.txt
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26)
	

The client receives a 201 Created response code, with the WebHDFS URI of the file in the Location header.
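
The same two-step exchange can be driven from Java instead of curl. The following is a minimal sketch that assumes only the JDK's HttpURLConnection. The class name WebHdfsCreate and its two methods are hypothetical names introduced for illustration, and the host, port, user and paths in main are the ones from the sample requests above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Hadoop or QueryIO): WebHDFS CREATE in two steps.
class WebHdfsCreate {

    // Step 1: PUT to the namenode with no body and without following redirects.
    // Returns the datanode URL taken from the Location header of the 307 response.
    static String requestDatanodeLocation(String namenodeUrl) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(namenodeUrl).toURL().openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (status != 307) {
                throw new IOException("Expected 307 from the namenode, got " + status);
            }
            String location = conn.getHeaderField("Location");
            if (location == null) {
                throw new IOException("Redirect response has no Location header");
            }
            return location;
        } finally {
            conn.disconnect();
        }
    }

    // Step 2: PUT the file bytes to the datanode URL; returns the HTTP status code.
    static int sendFile(String datanodeUrl, Path localFile) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(datanodeUrl).toURL().openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            // Stream the file with a known Content-Length instead of buffering it in memory.
            conn.setFixedLengthStreamingMode(Files.size(localFile));
            try (OutputStream out = conn.getOutputStream()) {
                Files.copy(localFile, out);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String namenodeUrl = "http://192.168.0.1:50070/webhdfs/v1/queryio/demo/file1.txt"
            + "?user.name=admin&op=CREATE&overwrite=false";
        String datanodeUrl = requestDatanodeLocation(namenodeUrl);
        int status = sendFile(datanodeUrl, Paths.get("/local/queryio.txt"));
        System.out.println(status == 201 ? "File created" : "Unexpected status: " + status);
    }
}
```

Redirect following is switched off in Step 1 so that the client can read the Location header itself and send the file data only to the datanode, which mirrors the two curl commands.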



Copyright 2017 QueryIO Corporation. All Rights Reserved.

QueryIO, "Big Data Intelligence" and the QueryIO Logo are trademarks of QueryIO Corporation. Apache, Hadoop and HDFS are trademarks of The Apache Software Foundation.