Using SQLMesh Model Blueprinting to optimize data teams, and improve data quality.
For central data teams, it's common to encounter repetitive patterns in data transformations. For example, consider a hospital network with multiple locations: each location needs the same set of reports (patient admissions, discharges, inventory usage, etc.), but for their specific location or department. Data teams may duplicate SQL logic without a scalable strategy, creating additional copies of the SQL logic for each location or department. This duplicative approach is tedious and risky, as inconsistencies can creep in over time. SQLMesh model blueprinting is a feature designed to solve this exact problem.
What is SQLMesh Model Blueprinting?
SQLMesh model blueprinting lets you define a single template model and automatically instantiate it as multiple models by substituting parameters. One model definition can generate many models by plugging in different values for certain variables. Each generated model (or "blueprint instance") has the same base logic, but is tailored to a specific context (like a particular region, client, or, in our case, hospital location).
Blueprinting works by using a special blueprints property in the model definition. You parameterize the model's name (and optionally parts of its query) with variables, then provide a list of key-value mappings for those variables. SQLMesh will create a separate model for each mapping in the list. All the heavy lifting of duplicating the logic with different parameters is handled for you.
Key points about blueprinting in SQLMesh:
- Template Model: You write one model definition with blueprint variables (e.g., @location in the name or SQL).
- Blueprint Mappings: In the model’s metadata, the blueprints property is a list of mappings, and each list entry is a set of key‑value pairs that bind every template variable to a value. For example:
blueprints (
(location := 'north', some_var := 'foo'),
(location := 'south', some_var := 'bar')
)
Each mapping (the tuple in parentheses) yields one model instance; the individual key‑value pairs inside it provide that instance’s variable values.
- Automatic Model Generation: SQLMesh expands the template for each mapping, creating distinct models (with unique names) that all share the template logic.
The template model’s name must include the variables used in blueprints. This ensures each generated model gets a unique name corresponding to its parameters.
{{join-slack}}
Why Does Blueprinting Matter?
Blueprinting directly addresses several challenges for data teams, especially as projects scale across multiple domains or regions:
- Redundant Code: Without blueprinting, teams often copy-paste SQL (or code) for each model variation. For example, you might maintain separate SQL files for hospital_north_admissions, hospital_south_admissions, etc., which are nearly identical. This duplication violates the DRY (Don't Repeat Yourself) principle and bloats your codebase.
- Inconsistent Logic: When the same logic is implemented in multiple places, there's a risk that it diverges over time. One location’s report might get a bug fix or an improvement that isn’t applied to others, leading to inconsistent metrics. Keeping many copies of a query in sync is error-prone.
- Maintenance Complexity & Overhead: As the number of variations grows (more locations, new departments, additional clients), maintaining separate models becomes a complex task. Onboarding a new hospital location would require creating and validating a new set of duplicate models. Refactoring or updating logic means changing it in every file. This slows down development and increases the chance of mistakes.
SQLMesh’s model blueprinting solves these problems by letting you maintain one source of truth for the logic. You update the template model once, and all variants are updated together. Adding a new location or domain is as simple as adding one entry to the blueprint mapping, instead of writing a brand new model. This brings huge gains in maintainability and consistency for the data pipeline.
The Hospital Network Scenario
Let's put this into a concrete scenario to see blueprinting in action. Imagine we have a healthcare network called Acme Hospitals with two locations: North Hospital and South Hospital. Each location maintains its own operational database. Every day, each hospital needs a report of how many patients were admitted, organized by date.
Without model blueprinting, we might create two separate models or SQL scripts:
- One for North Hospital Admissions (reading from North Hospital’s data).
- One for South Hospital Admissions (reading from South Hospital’s data).
Both reports do the same aggregation (count of daily admissions grouped by date), differing only in the data source (north vs south) and the output table name. Duplicating this logic is straightforward at first, but over time, it's easy to see how issues would arise: someone might change the North report’s logic (say, filtering out certain patient types) and forget to apply it to the South report, leading to mismatched results.
Now imagine scaling this to 5, 10, or 50 hospital locations – the redundant work grows linearly, and so does the potential for error.
Enter SQLMesh model blueprinting: we can define a single Admissions Summary model as a template and use blueprinting to generate one version per hospital location. This ensures all locations use identical logic for the admissions report and makes the system far easier to maintain.
SQL Blueprint Example: Consistent Reports for Multiple Hospitals
First, let's see how to implement the solution using a SQL-based model blueprint in SQLMesh. We'll create a model that produces a daily admissions summary for each hospital location.
SQLMesh typically defines SQL models in .sql files with a MODEL (...) configuration block followed by the query. We'll parameterize the model name with @location and use the blueprints property to list the location names we want to cover. Inside the query, we will also use @location to refer to each location's source table dynamically.
Below is the SQL model definition using blueprinting:
-- models/hospital_admissions_summary.sql
MODEL (
name @location.admissions_summary, -- The model name has a variable part for location
kind FULL,
blueprints (
(location := north),
(location := south)
)
);
SELECT
date,
COUNT(DISTINCT patient_id) AS total_admissions
FROM @location.patient_records
WHERE event_type = 'admission'
GROUP BY date;
Let's break down what's happening here:
- Parameterized Name: The model name is @location.admissions_summary. The @location portion is a variable that each blueprint value will replace. For our two mappings, this will produce two model names: north.admissions_summary and south.admissions_summary. We get two tables, one for each hospital location, created from this template.
- Blueprints List: In the blueprints property, we specify a list of mappings. Here we have two mappings: one where location = north and another where location = south. These are the values that will be substituted. You can think of this as instructing SQLMesh: "Create one model with location = north and another with location = south."
- Query with Blueprint Variables: The SELECT query uses @location in the FROM clause: FROM @location.patient_records. This means the location name will be inserted into the table reference for each generated model. For example, the North Hospital version of the query will read FROM north.patient_records, while the South version will read FROM south.patient_records. Apart from that, the logic (counting distinct patient IDs per date for admissions) is identical.
When SQLMesh processes this model, it will output two actual models under the hood:
- north.admissions_summary – using north.patient_records as input.
- south.admissions_summary – using south.patient_records as input.
Each behaves like a normal model (you can run them, reference them in other models, etc.), but you didn't have to write two separate SQL files; the blueprint did it for you.
If the hospital network expands (say we add "East Hospital"), incorporating it is as easy as adding another (location := east) entry to the blueprints list. The same SQL logic will automatically apply to produce east.admissions_summary. No copy-pasting required.
{{join-slack}}
Python Blueprint Example: Programmatic Model Generation
SQLMesh also allows defining models in Python (using the @model
decorator) for cases where Pythonic logic or dynamic SQL generation is needed. Blueprinting is equally supported in Python-based model definitions. The idea is the same: one function can generate multiple models by specifying blueprints in the decorator.
Let's write the same hospital admissions summary model as a Python-based SQL model.
# models/hospital_admissions_summary.py
from sqlmesh import model
@model(
"{branch}.admissions_summary",
kind="FULL",
blueprints=[
{"branch": "north"},
{"branch": "south"}
],
is_sql=True
)
def entrypoint(evaluator) -> str:
"""
evaluator: MacroEvaluator
"""
# Get the current blueprint value for the 'branch' variable
branch_name = evaluator.blueprint_var("branch") # e.g., "north" or "south"
# Build and return the SQL query for this branch
return f"""
SELECT
date,
COUNT(DISTINCT patient_id) AS total_admissions
FROM {branch_name}.patient_records
WHERE event_type = 'admission'
GROUP BY date
"""
Explanation:
- To turn this into a SQL-producing model, we decorate the
entrypoint
function with@model
.- Model name – The first argument is the parameterized name
"{branch}.admissions_summary"
; the{branch}
placeholder will be replaced with each blueprint’s value (north
,south
, etc.). - Blueprints list – We pass
blueprints=[{"branch": "north"}, {"branch": "south"}]
; each dictionary supplies values for every template variable, so each dictionary produces one model instance. is_sql=True
– This flag tells SQLMesh the function returns a SQL string (not a DataFrame).
- Model name – The first argument is the parameterized name
The entrypoint
function takes one argument —evaluator
(a MacroEvaluator
).Inside, we call evaluator.blueprint_var("branch")
to fetch the current value of branch
. When SQLMesh renders the North-branch instance, that call returns "north"
; for South, it returns "south"
.
We then return an f-string that builds the SQL query, injecting the branch_name
into the FROM
clause:
FROM {branch_name}.patient_records
The generated SQL is identical to the earlier SQL blueprint example, and SQLMesh parses and executes it like any other SQL model. The template yields two physical models—north.admissions_summary
and south.admissions_summary
—each reading from its own schema.
Functionally, the SQL and Python blueprint approaches are similar. The Python version becomes especially useful when you need extra logic (loops, conditionals, helper functions) while still enjoying SQLMesh’s full parsing, lineage, and blueprint capabilities.
Benefits of Blueprinting: Maintainability, Simplicity, Consistency
The hospital example demonstrates the power of model blueprinting. Here are the key benefits data teams experience from this feature:
- Maintainability: With one template powering multiple models, there’s only one place to make changes. If the admissions logic needs to be updated (for instance, if we decide to exclude certain facility codes from the count), we update the template model once. SQLMesh ensures all location-specific models reflect that change. This greatly reduces maintenance effort as your project grows.
- Simplicity: Blueprinting keeps your project DRY and clean. Instead of navigating dozens of nearly identical SQL files or functions, you have a single definition for a whole class of models. Adding a new hospital location or department is straightforward – just add a new blueprint entry – and there's no need to duplicate any SQL or copy-paste code. The overall model DAG (directed acyclic graph) remains easier to understand because the template logically groups what would otherwise be many separate nodes.
- Consistency: Since all generated models come from the same blueprint, they are guaranteed to apply the exact same transformations. This consistency means stakeholders across different locations or domains are truly “on the same page” – e.g. admissions are calculated the same way everywhere. It eliminates the risk of one team accidentally modifying their version of the logic and creating a discrepancy. Consistent logic also simplifies debugging and testing, as you can test the template once with various inputs rather than testing each copy of a query.
These benefits translate to trust and efficiency in the data transformation process. The data team can confidently say that North and South location reports are aligned. The reports scale from one location to many with minimal work.
Conclusion
SQLMesh model blueprinting is a powerful feature for any data team dealing with repetitive patterns. By turning models into reusable blueprints, SQLMesh helps you eliminate redundant code, ensure consistent business logic, and simplify the expansion of data transformations across new regions or use cases.
In the context of our hypothetical hospital network, blueprinting enables a clean and scalable solution for multi-location reporting. A single model definition easily produced multiple location-specific reports.
As you design your data models, look for opportunities to apply blueprinting when you see similar models expanding across dimensions like client, region, or in this case, hospital location. Adopting this approach will make your data models more maintainable and robust in the long run. SQLMesh’s blueprinting feature ensures that your codebase remains manageable as your data domain grows and your logic stays consistent.
Additional Resources
Introducing: Enterprise-ready Tobiko Cloud