All Stories

Apache Hive Release 3.1.1

Apache Hive Release 3.1.1 is compatible with Hadoop 3.x.y; it fixes four bugs and adds one new feature. Apache Hive Release 3.1.1 Release Note: Following Bug Fixes...

Apache Hive Cli Vs Beeline

Apache Hive development has shifted from the original Hive server (HiveServer1) to the new server (HiveServer2), and hence users and developers need to move to the new access tool. However,...

Apache Hive Cheat Sheet

Apache Hive Cheat Sheet is a reference summary of functions and syntax for big data engineers and developers. It is divided into 5 parts. Apache Hive Cheat Sheet -...

Apache Hive Best Practice

As a big data engineer, you must know the Apache Hive best practices. As you know, Apache Hive is not an RDBMS, but it pretends to be one most of the time. It...

Apache Hive Analytical Functions

Apache Hive Analytical Functions, available since Hive 0.11.0, are a special group of functions that scan multiple input rows to compute each output value. Apache Hive Analytical Functions are usually...

Snowflake Architecture Cheat Sheet

Architecture: no software, no hardware, no maintenance. Snowflake is provided as Software-as-a-Service (SaaS) that runs completely on cloud infrastructure. Snowflake uses a central data repository for persisted data that is...

PySpark Tutorial

In this PySpark Tutorial, we will understand why PySpark is becoming popular among data engineers and data scientists. This PySpark Tutorial will also highlight the key limitation of PySpark over...

What Is Data Lineage

What is data lineage, and why is it important? Data lineage describes the origins of data and the transformations it goes through over time. Data lineage can also be expressed...

Data Lineage Vs Data Provenance

Data Lineage and Data Provenance are not the same thing. Many data engineers and architects use the terms interchangeably, but they are two different concepts, each with its own meaning.

Spark Dataframe Minus Minutes Operation In Scala

How to perform a minus (subtraction) operation on a date or timestamp type.

What Is Apache Nifi

Apache NiFi is a software project from the Apache Software Foundation designed to automate the flow of data between software systems (file systems, RDBMS, APIs, etc., in and out)...

Interacting With Windows Registry Using Chef

One of the most well-known differences between managing UNIX-like systems and Windows systems is the Windows Registry. Chef has resources for creating, modifying, and deleting Windows Registry keys. Beware that...

Installing Software Packages In Windows Using Chef

A large number of managed systems require configuration of software that is outside the scope of the built-in Windows roles and features. Chef has a very handy resource for installing...

Executing Windows Batch Script In Chef

Similar to Linux script resources for bash, ruby, and so on, Chef can execute arbitrarily-defined Windows batch scripts through the command interpreter. When these resources are used, Chef compiles the...

Chef Windows Installing Roles

While using Chef for Windows, there are multiple backends for the Windows feature resource: DISM and servermanagercmd. Each one has a specific Ruby class that will be used based on the...