AdHoc Data Definition

In this chapter

This chapter explains ad hoc analysis of different file formats.

What is AdHoc Data Definition

QueryIO provides ad hoc analysis of different file formats, such as machine-generated logs, CSV/TSV, Apache server logs, IIS logs, XML, mbox, key-value pairs, JSON, and regex patterns.
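
To make the idea of parsing a log format into fields concrete, here is a minimal sketch in Python. It is not QueryIO's implementation: the pattern, the function name parse_log_line, and the sample line are assumptions chosen for illustration, based on the Apache Common Log Format.

```python
import re

# Regex for the Apache Common Log Format:
# host ident authuser [timestamp] "request" status bytes
COMMON_LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = COMMON_LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["status"] = int(row["status"])
    # Apache writes "-" when no response body was sent.
    row["size"] = 0 if row["size"] == "-" else int(row["size"])
    return row


# Illustrative sample line, not taken from a real server.
sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
row = parse_log_line(sample)
print(row)
```

Each named group becomes one column of a parsed row, which is roughly the shape of data that a result table would hold.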

The QueryIO AdHoc feature lets you execute MapReduce jobs for data processing through Query Designer and store the parsed data in a result table.
QueryIO supports ad hoc querying for various file types. You can customize the result table's fields, types, and so on using the AdHoc data definition feature.
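
As a rough illustration of what customizing field names and types means, the following Python sketch applies a hand-written schema to CSV text. The schema, the field names, and the function apply_schema are hypothetical and do not reflect how QueryIO stores or applies its definitions.

```python
import csv
import io

# Hypothetical schema: ordered (field name, type) pairs for the result table.
SCHEMA = [("id", int), ("name", str), ("price", float)]


def apply_schema(csv_text, schema):
    """Parse CSV text (no header row) and cast each column to its declared type."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) != len(schema):
            continue  # skip lines that do not have one value per field
        rows.append({name: cast(value) for (name, cast), value in zip(schema, record)})
    return rows


# Illustrative input data.
data = "1,widget,9.99\n2,gadget,19.5\n"
typed_rows = apply_schema(data, SCHEMA)
print(typed_rows)
```

Changing an entry in SCHEMA changes the name or type of the corresponding column without touching the parsing logic.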

The supported file types are:

Introduction to Apache Hive

Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time, this language also allows traditional MapReduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

The Apache Hive data warehouse software, built on top of Apache Hadoop, facilitates querying and managing large datasets residing in distributed storage.

Hive defines a simple SQL-like query language, called HiveQL (QL), that enables users familiar with SQL to query the data. At the same time, this language allows programmers who are familiar with the MapReduce framework to plug in their custom mappers and reducers to perform more sophisticated analysis that may not be supported by the built-in capabilities of the language. QL can also be extended with custom scalar functions (UDFs), aggregations (UDAFs), and table functions (UDTFs).
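
One way to plug in a custom mapper is a streaming script: Hive's TRANSFORM ... USING clause sends rows to an external program as tab-separated lines on standard input and reads result rows from its standard output. The Python sketch below shows what such a script might look like. The column layout (host, status), the derived status_class value, and the --stream flag are assumptions made for this example.

```python
import sys


def transform(lines):
    """Yield one tab-separated output line per tab-separated input row.

    Assumed input columns: (host, status). Output columns: (host, status_class).
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # ignore rows that do not have the expected shape
        host, status = fields
        # Map an HTTP status such as "404" to its class "4xx".
        status_class = status[0] + "xx" if status[:1].isdigit() else "unknown"
        yield f"{host}\t{status_class}"


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--stream":
        # Streaming mode (hypothetical flag): read rows from stdin, write to stdout.
        for out in transform(sys.stdin):
            print(out)
    else:
        # Demo mode with illustrative rows.
        for out in transform(["10.0.0.1\t404\n", "10.0.0.2\t200\n"]):
            print(out)
```

Keeping the row logic in a plain generator function makes it testable without a Hive cluster; only the small block at the bottom deals with standard input and output.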

AdHoc Data Definition Details

Add new AdHoc Data Definition

To define a new ad hoc content processor for a file type, click Add.

General settings

Configurations

Schema Definition

Edit / Delete AdHoc Data Definition

QueryIO lets you edit the general settings of an AdHoc data definition. Select the ad hoc job and click Edit.

To delete it, select the job and click Delete.



Copyright © 2018 QueryIO Corporation. All Rights Reserved.

QueryIO, "Big Data Intelligence" and the QueryIO Logo are trademarks of QueryIO Corporation. Apache, Hadoop and HDFS are trademarks of The Apache Software Foundation.