Topic Overview

Databricks Notebooks are the primary interactive development environment within the Databricks platform. They support Python, SQL, Scala, and R. Each notebook has a default language, and you can switch languages cell by cell with magic commands such as %python, %sql, %scala, and %r. Notebooks are where most data engineers write, test, and iterate on their code before productionizing it.
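
To make the per-cell language switch concrete, here is a minimal sketch that reads a notebook exported in Databricks source format (a .py file whose cells are separated by "# COMMAND ----------" and whose non-default-language lines carry a "# MAGIC" prefix) and reports the language of each cell. The helper name cell_languages and the sample notebook are hypothetical, made up for illustration; this is not a Databricks API.

```python
# Hypothetical helper: infer each cell's language from its leading magic command.
HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"
MAGIC_PREFIX = "# MAGIC "
MAGIC_TO_LANGUAGE = {
    "%python": "python",
    "%sql": "sql",
    "%scala": "scala",
    "%r": "r",
    "%md": "markdown",
}


def cell_languages(source: str, default_language: str = "python") -> list[str]:
    """Return the language of each non-empty cell in a source-format export."""
    languages = []
    for cell in source.split(CELL_SEPARATOR):
        # Drop blank lines and the file header so the first real line is inspected.
        lines = [
            line for line in cell.splitlines()
            if line.strip() and line.strip() != HEADER
        ]
        if not lines:
            continue
        language = default_language
        if lines[0].startswith(MAGIC_PREFIX):
            parts = lines[0][len(MAGIC_PREFIX):].split()
            if parts:
                # Unknown magics (for example %pip) fall back to the default.
                language = MAGIC_TO_LANGUAGE.get(parts[0], default_language)
        languages.append(language)
    return languages


# Made-up three-cell notebook: Python (default), SQL, and Markdown.
SAMPLE_NOTEBOOK = """# Databricks notebook source
df = spark.read.table("sales.transactions")

# COMMAND ----------

# MAGIC %sql
# MAGIC SELECT count(*) FROM sales.transactions

# COMMAND ----------

# MAGIC %md
# MAGIC ## Notes
"""

print(cell_languages(SAMPLE_NOTEBOOK))
```

A cell with no magic command runs in the notebook's default language, which is why the first cell of the sample is reported as Python.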

Beyond running code, Databricks notebooks come with built-in features such as version history, real-time collaboration, widgets for parameterization, and integration with Lakeflow Jobs for scheduling. They also have a results pane that renders tables, charts, and visualizations directly below each cell.

For the exam, you should know the key capabilities of notebooks, including magic commands, how notebooks attach to clusters, how to parameterize them with widgets, and how they integrate with Git (Databricks Git Folders, formerly Repos).


Key Concepts


Code Examples

Using widgets to parameterize a notebook

# Create widgets
dbutils.widgets.text("env", "dev", "Environment")
dbutils.widgets.dropdown("region", "us-east-1", ["us-east-1", "us-west-2", "eu-west-1"], "Region")

# Retrieve widget values
env = dbutils.widgets.get("env")
region = dbutils.widgets.get("region")

print(f"Running in {env} environment, region: {region}")

# Use in queries
catalog = f"{env}_catalog"
df = spark.read.table(f"{catalog}.sales.transactions")

# Remove widgets when done
dbutils.widgets.removeAll()
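
Widget values come back from dbutils.widgets.get() as strings, and a text widget accepts arbitrary input, so it can be worth validating a value before interpolating it into a table name. The sketch below shows one possible approach; the helper qualified_table_name and the ALLOWED_ENVS allow-list are hypothetical names, and the list of environments is an assumption rather than anything Databricks defines.

```python
# Hypothetical allow-list; adjust to the environments your workspace actually uses.
ALLOWED_ENVS = {"dev", "staging", "prod"}


def qualified_table_name(env: str, schema: str, table: str) -> str:
    """Build '<env>_catalog.<schema>.<table>' after validating the widget value."""
    env = env.strip().lower()
    if env not in ALLOWED_ENVS:
        raise ValueError(
            f"Unexpected env {env!r}; expected one of {sorted(ALLOWED_ENVS)}"
        )
    return f"{env}_catalog.{schema}.{table}"


# In a notebook the value would come from dbutils.widgets.get("env");
# a literal stands in here so the snippet runs without a cluster.
print(qualified_table_name(" Dev ", "sales", "transactions"))
```

Failing fast on an unexpected value keeps a typo in the widget from surfacing later as a confusing "table not found" error.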

Running another notebook with %run vs dbutils.notebook.run()

# %run executes the target notebook inline, in the SAME context (shares variables).
# It must be in a cell by itself, so the steps below are separate cells.

# Cell 1
%run ./includes/setup

# Cell 2: after %run, variables from the 'setup' notebook are available here
print(config_variable)  # defined in the setup notebook

# Cell 3: dbutils.notebook.run() executes in a SEPARATE context.
# It returns a string (whatever the child passes to dbutils.notebook.exit())
# and does NOT share variables.
result = dbutils.notebook.run(
    "./transforms/process_orders",
    timeout_seconds=600,
    arguments={"date": "2025-01-15", "env": "prod"}
)
print(f"Notebook returned: {result}")
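
Because dbutils.notebook.run() hands back only a single string, a common way to return several values is to serialize them as JSON in the child notebook and decode them in the parent. The sketch below simulates that round trip in plain Python; the helper names build_exit_payload and parse_run_result and the payload fields are hypothetical, chosen only for illustration.

```python
import json


def build_exit_payload(status: str, rows_written: int) -> str:
    """Serialize a result for the child notebook to pass to dbutils.notebook.exit()."""
    return json.dumps({"status": status, "rows_written": rows_written})


def parse_run_result(result: str) -> dict:
    """Decode the string that dbutils.notebook.run() returned to the parent."""
    return json.loads(result)


# Child notebook, last cell:
#     dbutils.notebook.exit(build_exit_payload("ok", 1250))
# Parent notebook:
#     summary = parse_run_result(result)  # 'result' from dbutils.notebook.run()
# Simulated round trip (illustrative values) so the snippet runs without a cluster:
summary = parse_run_result(build_exit_payload("ok", 1250))
print(summary["status"], summary["rows_written"])
```

With %run this step is unnecessary, since the caller can read the other notebook's variables directly.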

Accessing widgets in SQL

-- Create a widget in SQL
CREATE WIDGET TEXT env DEFAULT 'dev';

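-- Use the widget value with getArgument()
-- (legacy syntax; newer Databricks Runtime versions document the named parameter marker :env instead)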
SELECT * FROM identifier(getArgument('env') || '_catalog.sales.orders')
WHERE order_date = current_date();