The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. Under the hood, Hive translates HiveQL queries into MapReduce jobs. One great advantage of Hive is schema on read: the schema is applied when the data is queried, instead of being enforced when the data is written, as in traditional relational databases.
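To make the schema-on-read idea concrete, here is a minimal Python sketch. It is an analogy, not Hive's implementation: the function `read_with_schema`, the sample lines, and the column names are all hypothetical. The raw text is stored untouched, and a schema is only projected onto it at read time; fields that are missing or do not parse become `None` rather than causing the data to be rejected up front.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical schema type: (column name, converter) pairs, for illustration only.
Schema = List[Tuple[str, Callable[[str], object]]]


def read_with_schema(raw_lines: Iterable[str], schema: Schema,
                     delimiter: str = "\t") -> List[Dict[str, Optional[object]]]:
    """Project a schema onto raw delimited text at read time."""
    rows = []
    for line in raw_lines:
        fields = line.rstrip("\n").split(delimiter)
        row: Dict[str, Optional[object]] = {}
        for i, (name, convert) in enumerate(schema):
            try:
                row[name] = convert(fields[i])
            except (IndexError, ValueError):
                # A missing or unparseable field becomes None;
                # the raw line itself is never rejected or modified.
                row[name] = None
        rows.append(row)
    return rows


# Raw data as it might sit in storage: nothing validated it when it was written.
RAW = ["1\talice\t34", "2\tbob\tunknown", "3\tcarol"]

# The schema is chosen by the reader and could differ from query to query.
SCHEMA: Schema = [("id", int), ("name", str), ("age", int)]

rows = read_with_schema(RAW, SCHEMA)
```

The same raw lines could be read again with a different schema (for example, treating `age` as a string) without rewriting the stored data, which is the practical benefit of applying the schema on read.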

At the same time, HiveQL allows traditional map/reduce programmers to plug in their custom mappers and reducers when this logic is inconvenient or inefficient to express in HiveQL itself.
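One way custom mappers are plugged in is through HiveQL's `TRANSFORM` clause, which streams rows to an external script as tab-separated lines and reads tab-separated lines back. The Python sketch below shows what such a script could look like. The file name `extract_domain.py`, the table `page_views`, and the columns `user_id`, `url`, and `domain` are assumptions made up for this example.

```python
import sys
from typing import Iterable, Optional, TextIO

# Hypothetical HiveQL that could invoke this script (all names are illustrative):
#
#   ADD FILE extract_domain.py;
#   SELECT TRANSFORM (user_id, url)
#          USING 'python3 extract_domain.py'
#          AS (user_id, domain)
#   FROM page_views;


def transform_line(line: str) -> Optional[str]:
    """Map one 'user_id<TAB>url' input line to 'user_id<TAB>domain'."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        return None  # skip malformed rows instead of failing the whole job
    user_id, url = fields[0], fields[1]
    # Drop an optional scheme such as "http://", then keep only the host part.
    host = url.split("://", 1)[-1].split("/", 1)[0].lower()
    return f"{user_id}\t{host}"


def run(lines: Iterable[str], out: TextIO) -> None:
    """Apply transform_line to every input line and write the results."""
    for line in lines:
        result = transform_line(line)
        if result is not None:
            out.write(result + "\n")
```

To use this as a streaming script, the file would end with a call to `run(sys.stdin, sys.stdout)`. Keeping the per-line logic in `transform_line` makes it easy to test outside a cluster before wiring it into a query.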

