Iceberg Tables
2 minute read
Introduction
Iceberg tables uses Apache Iceberg open table format specification to provide an abstraction layer on data files stored in open formats. Iceberg tables for Snowflake offer schema evolution, partitioning, and snapshot isolation to manage the table data efficiently.
The Snowflake emulator supports Iceberg tables, allowing you to create and manage Iceberg tables locally. You can use Iceberg tables to query data in Snowflake tables using the Iceberg table format by using external volumes, with data stored in local/remote S3 buckets.
Getting started
This guide is designed for users new to Iceberg tables and assumes basic knowledge of SQL and Snowflake. Start your Snowflake emulator and connect to it using an SQL client in order to execute the queries further below.
In this guide, you will create an external volume, and an Iceberg table to store and query data using the Iceberg table format.
Create an S3 bucket
You can create a local S3 bucket using the mb
command with the awslocal
CLI.
$ awslocal s3 mb s3://test-bucket
Create an external volume
You can create an external volume using the CREATE OR REPLACE EXTERNAL VOLUME
statement. The external volume is used to define the location of the files that Iceberg will use to store the table data.
CREATE OR REPLACE EXTERNAL VOLUME test_volume
STORAGE_LOCATIONS = (
(
NAME = 'aws-s3-test'
STORAGE_PROVIDER = 'S3'
STORAGE_BASE_URL = 's3://test-bucket/'
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::000000000000:role/s3-role'
ENCRYPTION=(TYPE='AWS_SSE_S3')
)
)
Grant access to Snowflake role
You can grant access to the Snowflake role using the GRANT USAGE ON EXTERNAL VOLUME
statement.
GRANT USAGE ON EXTERNAL VOLUME test_volume TO ROLE PUBLIC
Create Iceberg table
You can create an Iceberg table using the CREATE ICEBERG TABLE
statement. The Iceberg table is used to define the schema and location of the table data.
CREATE ICEBERG TABLE test_table (c1 TEXT)
CATALOG='SNOWFLAKE', EXTERNAL_VOLUME='test_volume', BASE_LOCATION='test'
Insert and select data
You can insert and select data from the Iceberg table using the INSERT INTO
and SELECT
statements.
INSERT INTO test_table(c1) VALUES ('test'), ('foobar')
SELECT * FROM test_table
The output should be:
+------+
| C1 |
|------|
| test |
| foobar |
+------+
You can also list the content of the S3 bucket:
$ awslocal s3 ls --recursive s3://test-bucket/