Apache Hive CLI vs Beeline

Apache Hive development has shifted from the original Hive server (HiveServer1) to the new server (HiveServer2), so users and developers need to move to the new access tool. However, there is more to this process than simply switching the executable name from “hive” to “beeline”.

Originally, Apache Hive was a heavyweight command-line tool that accepted commands and ran them using MapReduce. Later, the tool was split into a client-server model, in which HiveServer1 is the server (responsible for compiling and monitoring MapReduce jobs) and Hive CLI is the command-line interface (which sends SQL to the server). More recently, the Hive community introduced HiveServer2, an enhanced Hive server designed for multi-client concurrency and improved authentication that also provides better support for clients connecting through JDBC and ODBC. HiveServer2, with Beeline as its command-line interface, is now the recommended solution; HiveServer1 and Hive CLI are deprecated, and Hive CLI will not even work with HiveServer2.

The primary difference between the Hive CLI and Beeline is how each client connects to Apache Hive:

  • The Hive CLI connects directly to HDFS and the Hive Metastore, and can be used only on a host with access to those services.
  • Beeline connects to HiveServer2 and requires access to only one .jar file: hive-jdbc-<version>-standalone.jar

Server Connection

Hive CLI connects to a remote HiveServer1 instance using the Thrift protocol. To connect to a server, you specify the hostname and optionally the port number of the remote server:

> hive -h <hostname> -p <port>

In contrast, Beeline connects to a remote HiveServer2 instance using JDBC. The connection parameter is therefore a JDBC URL, in the form common to JDBC-based clients:

> beeline -u <url> -n <username> -p <password>

Here are a few URL examples:

jdbc:hive2://ubuntu:11000/db2?hive.cli.conf.printheader=true;hive.exec.mode.local.auto.inputbytes.max=9999#stab=salesTable;icol=customerID
jdbc:hive2://?hive.cli.conf.printheader=true;hive.exec.mode.local.auto.inputbytes.max=9999#stab=salesTable;icol=customerID
jdbc:hive2://ubuntu:11000/db2;user=foo;password=bar
jdbc:hive2://server:10001/db;user=foo;password=bar?hive.server2.transport.mode=http;hive.server2.thrift.http.path=hs2
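The examples above share a general layout: session variables follow the database name after ";", Hive configuration properties follow "?", and Hive variables follow "#". As a rough sketch, a small shell helper can assemble such a URL from its parts. The function name build_hs2_url and its argument order are made up for illustration, and the script only prints the resulting beeline command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): assemble a HiveServer2 JDBC URL.
# Usage: build_hs2_url HOST PORT DB [SESSION_VARS] [HIVE_CONF] [HIVE_VARS]
build_hs2_url() {
    url="jdbc:hive2://$1:$2/$3"
    # Session variables follow the database name, introduced by ';'
    [ -n "${4:-}" ] && url="$url;$4"
    # Hive configuration properties are introduced by '?'
    [ -n "${5:-}" ] && url="$url?$5"
    # Hive variables are introduced by '#'
    [ -n "${6:-}" ] && url="$url#$6"
    printf '%s\n' "$url"
}

# Rebuild the third example URL from the list above.
hs2_url=$(build_hs2_url ubuntu 11000 db2 "user=foo;password=bar")

# Quote the URL on the command line: ';', '?' and '#' are special to the shell.
echo "beeline -u \"$hs2_url\""
```

Keeping the URL in one quoted variable avoids the most common mistake with these URLs, which is the shell treating ";" as a command separator.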

Apache Hive CLI vs Beeline: Query Execution

Executing queries in Beeline is very similar to executing them in Hive CLI. In Hive CLI:

> hive -e <query in quotes>
> hive -f <query file name>

In Beeline:

> beeline -e <query in quotes>
> beeline -f <query file name>

In either case, if neither -e nor -f is given, the client starts in interactive mode, in which you can enter and execute queries or commands line by line.
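To make the two non-interactive styles concrete, here is a minimal sketch that passes one statement inline with -e and then runs a file of statements with -f. The connection URL, the table name sales, and the BEELINE dry-run variable are assumptions for illustration; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Dry run by default: prints the commands. Set BEELINE=beeline to execute them.
BEELINE="${BEELINE:-echo beeline}"
# Assumed connection URL; replace with your own HiveServer2 address.
HS2_URL="jdbc:hive2://localhost:10000/default"

# Write a small query file (the table name "sales" is hypothetical).
query_file=$(mktemp)
trap 'rm -f "$query_file"' EXIT
cat > "$query_file" <<'EOF'
SHOW TABLES;
SELECT COUNT(*) FROM sales;
EOF

# One-off statement passed inline with -e.
$BEELINE -u "$HS2_URL" -e "SHOW TABLES;"

# Several statements read from a file with -f.
$BEELINE -u "$HS2_URL" -f "$query_file"
```

The -f form tends to be easier to maintain once a job has more than one statement, because the SQL needs no shell quoting.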

Apache Hive CLI vs Beeline: Variables

There are four namespaces for variables:

  • hiveconf for Hive configuration variables
  • system for system variables
  • env for environment variables
  • hivevar for Hive variables (HIVE-1096)

There are two ways to define a variable: as a command-line argument or using the set command in interactive mode.

Defining Hive variables on the command line in Hive CLI:

> hive -d key=value
> hive --define key=value
> hive --hivevar key=value

Defining Hive variables on the command line in Beeline:

> beeline --hivevar key=value
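A variable defined this way is referenced inside a query as ${hivevar:key}; in interactive mode the equivalent definition is "set hivevar:key=value;". The sketch below shows the command-line form end to end. The URL, the variable name tbl, and the table name sales are assumptions for illustration, and the script prints the beeline command by default rather than executing it.

```shell
#!/bin/sh
# Dry run by default: prints the command. Set BEELINE=beeline to execute it.
BEELINE="${BEELINE:-echo beeline}"
# Assumed connection URL; replace with your own HiveServer2 address.
HS2_URL="jdbc:hive2://localhost:10000/default"

query_file=$(mktemp)
trap 'rm -f "$query_file"' EXIT

# Quoted heredoc ('EOF') so the shell leaves ${hivevar:tbl} untouched;
# Hive performs the substitution when the query runs.
cat > "$query_file" <<'EOF'
SELECT * FROM ${hivevar:tbl} LIMIT 10;
EOF

# Define the variable on the command line and run the file.
$BEELINE -u "$HS2_URL" --hivevar tbl=sales -f "$query_file"
```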

Beeline Operating Modes and HiveServer2 Transport Modes

Beeline supports the following modes of operation:

  • Embedded: The Beeline client and the Hive installation both reside on the same host machine. No TCP connectivity is required.
  • Remote: Use remote mode to support multiple, concurrent clients executing queries against the same remote Hive installation. Remote mode supports authentication with LDAP and Kerberos, and encryption with SSL. TCP connectivity is required.
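In practice, the two modes differ in the connection URL: an embedded session uses a URL with no host or port (as in the second URL example earlier), while a remote session names the HiveServer2 host. A minimal sketch, in which the host name hs2.example.com, the port, and the user name hiveuser are placeholders and the commands are printed rather than executed by default:

```shell
#!/bin/sh
# Dry run by default: prints the commands. Set BEELINE=beeline to execute them.
BEELINE="${BEELINE:-echo beeline}"

# Embedded mode: no host or port in the URL.
$BEELINE -u "jdbc:hive2://"

# Remote mode: connect over the network to a (placeholder) HiveServer2 host.
$BEELINE -u "jdbc:hive2://hs2.example.com:10000/default" -n hiveuser
```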

Administrators may start HiveServer2 in one of the following transport modes:

  • TCP: HiveServer2 uses TCP transport for sending and receiving Thrift RPC messages.
  • HTTP: HiveServer2 uses HTTP transport for sending and receiving Thrift RPC messages.

While running in TCP transport mode, HiveServer2 supports the following authentication schemes:

 
