What Is Information Schema In Snowflake
Summary
What the Information Schema is and how to query it in Snowflake
Introduction
The Snowflake Information Schema, often called the “Data Dictionary,” is a collection of system-defined views and table functions that contain rich metadata about the objects created in your account. It is based on the SQL-92 ANSI Information Schema but adds Snowflake-specific views and functions.
Every database in a Snowflake account has a schema called INFORMATION_SCHEMA that implements the Information Schema.
The power of the Snowflake Information Schema is not well known, and the value it brings to data developers and DataOps engineers is rarely discussed. (Watch the complete video.)
This episode is a comprehensive and practical guide with hands-on exercises on the Snowflake Information Schema. It explains all the views and table functions available under INFORMATION_SCHEMA and will help you answer the following questions:
- 🙋 Does Information Schema hold account level objects?
- 🙋 Can I get load history data from Information Schema?
- 🙋 Are Information Schema views secure views or standard views?
- 🙋 Does querying Information Schema cost a lot?
- 🙋 Can I build snowsight dashboard using Information Schema?
- 🙋 How can I list the table functions that exist inside the Information Schema?
- 🙋 What is the difference between Information Schema and account usage schema?
You can watch the complete hands-on video tutorial.
Information Schema & ANSI-92
Databases are referred to as “catalogues” by ANSI. To ensure standard compliance, the Snowflake Information Schema topics use “catalogue” instead of “database” where applicable. The terms are conceptually equivalent and interchangeable for all intents and purposes.
What is INFORMATION_SCHEMA?
Every database created in your account comes with a built-in, read-only schema called INFORMATION_SCHEMA. The following objects are included in the schema:
- Views for all of the database’s objects, as well as views for account-level objects (i.e. non-database objects such as roles, warehouses, and databases)
- Table functions for your account’s historical and usage data.
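A quick way to see what the schema contains is to point SHOW commands at it. A minimal sketch, assuming a placeholder database name demo_db; the table functions (query_history, copy_history, etc.) are built-in functions, so filtering SHOW FUNCTIONS by name is one way to surface them:
-- list the system-defined views inside the database's INFORMATION_SCHEMA
-- (demo_db is a placeholder database name)
show views in schema demo_db.information_schema;
-- built-in table functions such as query_history and copy_history
-- typically appear when filtering SHOW FUNCTIONS by name
show functions like '%history%';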
SQL Scripts
All the SQL scripts used in the above video are available in this article.
Sequences & File Formats
-- create sequences
create or replace sequence seq_01 start = 1 increment = 1 comment='This is a trial sequence';
create or replace sequence seq_02 start = 1 increment = 2 comment='This is a trial sequence';
create or replace sequence seq_03 start = 0 increment = -2 comment='This is a trial sequence with negative increment';
-- file formats
create or replace file format my_parquet_ff type = 'parquet';
create or replace file format my_csv_ff type = 'csv';
create or replace file format my_json_ff type = 'json';
create or replace file format my_avro_ff type = 'avro';
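With the objects created, the SEQUENCES and FILE_FORMATS views in INFORMATION_SCHEMA can confirm them. A minimal sketch, assuming the objects were created in the current database and schema (INCREMENT is quoted because it is a reserved keyword):
-- sequences created above, with start value and increment
select sequence_name, start_value, "INCREMENT", comment
from information_schema.sequences
where sequence_schema = current_schema();
-- file formats created above, with their type
select file_format_name, file_format_type, comment
from information_schema.file_formats
where file_format_schema = current_schema();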
Stages
-- stages
create or replace stage stg02_parquet file_format = my_parquet_ff;
create or replace stage stg02_csv file_format = my_csv_ff;
create or replace stage stg02_json file_format = my_json_ff;
create or replace stage stg02_avro file_format = my_avro_ff;
-- external stage
create stage s3_ext_stage
url = 's3://demo-s3-bucket/'
comment = 'this is customer parquet data';
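The STAGES view then shows internal and external stages side by side. A minimal sketch, again assuming the stages were created in the current schema:
-- stages created above, with URL (external stages only) and type
select stage_name, stage_url, stage_type, comment
from information_schema.stages
where stage_schema = current_schema();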
Transient Tables
create or replace transient table landing_item (
item_id varchar,
item_desc varchar,
start_date varchar,
end_date varchar,
price varchar,
item_class varchar,
item_category varchar
) comment ='this is item table with in landing schema';
create or replace transient table landing_customer (
customer_id varchar,
salutation varchar,
first_name varchar,
last_name varchar,
birth_day varchar,
birth_month varchar,
birth_year varchar,
birth_country varchar,
email_address varchar
) comment ='this is customer table with in landing schema';
create or replace transient table landing_sales_order (
order_date varchar,
order_time varchar,
item_id varchar,
item_desc varchar,
customer_id varchar,
salutation varchar,
first_name varchar,
last_name varchar,
store_id varchar,
store_name varchar,
order_quantity varchar,
sale_price varchar,
discount_amt varchar,
coupon_amt varchar,
net_paid varchar,
net_paid_tax varchar,
net_profit varchar
) comment ='this is order table with in landing schema';
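These landing tables are visible through the TABLES view, which also exposes row counts, bytes, and a transient flag. A minimal sketch, filtered on schema and table name as the conclusion below recommends:
-- landing tables with their type, transient flag, and row count
select table_name, table_type, is_transient, row_count, comment
from information_schema.tables
where table_schema = current_schema()
  and table_name like 'LANDING%';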
Snowpipe
-- create pipes
create or replace pipe order_pipe
auto_ingest = true
as
copy into landing_sales_order from @s3_orders
file_format = (type=csv COMPRESSION=none)
pattern='.*order.*[.]csv';
create or replace pipe item_pipe
auto_ingest = true
as
copy into landing_item from @s3_items
file_format = (type=csv COMPRESSION=none)
pattern='.*item.*[.]csv'
;
create or replace pipe customer_pipe
auto_ingest = true
as
copy into landing_customer from @s3_customers
file_format = (type=csv COMPRESSION=none)
pattern='.*customer.*[.]csv'
ON_ERROR = 'CONTINUE';
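The PIPES view shows each pipe's definition, and the COPY_HISTORY table function answers the load-history question from the list above. A minimal sketch, assuming the pipes load into the landing tables created earlier:
-- pipe definitions in the current schema
select pipe_name, definition, comment
from information_schema.pipes
where pipe_schema = current_schema();
-- files loaded into landing_sales_order over the last 24 hours
select file_name, row_count, status, last_load_time
from table(information_schema.copy_history(
    table_name => 'LANDING_SALES_ORDER',
    start_time => dateadd(hour, -24, current_timestamp())));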
External Table SQL
create or replace external TABLE customer_csv_et (
CUST_KEY varchar AS (value:c1::varchar),
NAME varchar AS (value:c2::varchar),
ADDRESS varchar AS (value:c3::varchar),
NATION_KEY varchar AS (value:c4::varchar),
PHONE varchar AS (value:c5::varchar),
ACCOUNT_BALANCE varchar AS (value:c6::varchar),
MARKET_SEGMENT varchar AS (value:c7::varchar),
COMMENT varchar AS (value:c8::varchar)
)
with location=@s3_customers_ext
auto_refresh = false
file_format = (format_name = csv_ff)
;
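External tables get their own EXTERNAL_TABLES view. A minimal check, assuming the table above was created in the current schema:
-- external tables registered in the current schema
select table_name, location, file_format_name, comment
from information_schema.external_tables
where table_schema = current_schema();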
Conclusion
The Information Schema views work best for queries that retrieve a small, targeted subset of objects from the dictionary. Whenever possible, filter on schema and object names to improve the efficiency of your queries, as in the example below.
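For instance, a column lookup that names both the schema and the table touches far less metadata than an unfiltered select over INFORMATION_SCHEMA.COLUMNS. A minimal sketch using the landing_customer table from above:
-- efficient: restrict by schema and table name before listing columns
select column_name, data_type, is_nullable
from information_schema.columns
where table_schema = current_schema()
  and table_name = 'LANDING_CUSTOMER'
order by ordinal_position;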