Hands-On Introduction to Apache Iceberg — Data Lakehouse Engineering

Setting Up a Practice Environment

For this tutorial you need Docker installed, since we will use a Docker image I created for easy hands-on experimentation with Apache Iceberg, Apache Hudi, and Delta Lake.

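docker run -it --name format-playground alexmerced/table-format-playground

Inside the container, the following commands are available:

  • iceberg-init - to open Spark with Apache Iceberg configured
  • hudi-init - to open Spark Shell with Apache Hudi configured
  • delta-init - to open Spark Shell with Delta Lake configured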

Getting Hands On with Apache Iceberg

  • Start the Docker container: docker run -it --name format-playground alexmerced/table-format-playground
  • Open Spark with Iceberg: iceberg-init

Creating a Table in Iceberg

Keep in mind that we are working with a data lakehouse, not a traditional database. The files we create and read are the kind that would live in your data lake storage (AWS, Azure, or Google Cloud). It may still feel like working with a traditional database, and that is the beauty of table formats like Iceberg: they let us work with files in the data lake the same way we work with data in a database or data warehouse.

CREATE TABLE iceberg.cool_people (name string) using ICEBERG;
  • The using ICEBERG clause tells Spark to create the table with Iceberg instead of its default, Hive.
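
Most tables have more than one column. As a minimal sketch, assuming the same iceberg catalog that the playground configures, a table with typed columns and a partition transform might look like the following. The table name concerts and its columns are hypothetical and are not used elsewhere in this tutorial.

```sql
-- Hypothetical table for illustration only.
-- days(played_at) is an Iceberg partition transform: rows are grouped by day
-- without adding a separate date column to the table.
CREATE TABLE iceberg.concerts (
  id BIGINT,
  artist STRING,
  played_at TIMESTAMP
)
USING iceberg
PARTITIONED BY (days(played_at));
```

Queries that filter on played_at can then benefit from the partitioning without referencing any partition column directly.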

Adding Some Records

Run the following:

INSERT INTO iceberg.cool_people VALUES ("Claudio Sanchez"), ("Freddie Mercury"), ("Cedric Bixler");

Querying the Records

Run the following:

SELECT * FROM iceberg.cool_people;
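
Since the table is really a set of files plus metadata, Iceberg also lets you query that metadata from the same session. The sketch below assumes the playground's catalog resolves Iceberg metadata tables with the usual table.metadata_table naming; depending on how the catalog is configured, the exact identifier may differ.

```sql
-- One row per commit to the table; the INSERT above should show up as an append.
SELECT snapshot_id, committed_at, operation
FROM iceberg.cool_people.snapshots;

-- The data files currently backing the table.
SELECT file_path, record_count
FROM iceberg.cool_people.files;
```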

Ending the Session

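  • To quit SparkSQL: exit;
  • To exit the Docker container: exit
  • To reattach to the container later: docker attach format-playground (if the container has stopped, run docker start format-playground first)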


Now you know how to quickly set up an environment for experimenting with Apache Iceberg. Check out the Iceberg docs for many of its great features, such as Time Travel, Hidden Partitioning, Partition Evolution, Schema Evolution, ACID transactions, and more.
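
As one small taste of those features, here is a hedged sketch of schema evolution on the table from this tutorial. The band column and the extra row are hypothetical additions for illustration, and the statements assume the Spark session opened by iceberg-init.

```sql
-- Add a column; this is a metadata change, so existing data files are not rewritten.
ALTER TABLE iceberg.cool_people ADD COLUMNS (band string);

-- New rows can populate the new column.
INSERT INTO iceberg.cool_people VALUES ("Geddy Lee", "Rush");

-- Rows written before the change return NULL for band.
SELECT * FROM iceberg.cool_people;
```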



Alex Merced Coder



Alex Merced is a Developer Advocate for Dremio and host of the Web Dev 101 and Datanation Podcasts.