# Changelog
## v1.9.5
### Materialized Lake View Support
#### New materialization: `materialized_lake_view`
dbt-fabricspark now supports [Materialized Lake Views](https://learn.microsoft.com/en-us/fabric/data-engineering/materialized-lake-views/materialized-lake-views) as a first-class materialization. MLVs are precomputed, incrementally-maintained views in Fabric lakehouses that accelerate queries over Delta tables without manual refresh pipelines.
When `mlv_on_demand: true`, the adapter triggers an immediate refresh via the Fabric Job Scheduler API and polls until the job reaches a terminal status.
When `mlv_schedule` is provided, the adapter creates or updates a refresh schedule via the Fabric REST API. The operation is idempotent — if a schedule already exists, it is updated in place.
Supported schedule types:
- **Cron** — `interval` in minutes
- **Daily** — list of `times` (e.g., `["06:00", "18:00"]`)
- **Weekly** — `weekdays` and `times`
The `endDateTime` field is mandatory in the schedule configuration. The adapter validates its presence before calling the API and raises a clear error if missing.
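This validation can be sketched as follows. The field names (`endDateTime`, `type`, `interval`, `times`, `weekdays`) come from the schedule configuration documented here; the function itself is an illustrative sketch, not the adapter's actual code:

```python
# Illustrative sketch of mlv_schedule validation -- not the adapter's implementation.
REQUIRED_FIELDS = {
    "Cron": ("interval",),            # interval in minutes
    "Daily": ("times",),              # e.g. ["06:00", "18:00"]
    "Weekly": ("weekdays", "times"),  # days of the week plus times
}

def validate_schedule(configuration: dict) -> None:
    """Raise ValueError if the schedule configuration is incomplete."""
    if "endDateTime" not in configuration:
        raise ValueError("mlv_schedule: 'endDateTime' is mandatory")
    schedule_type = configuration.get("type")
    if schedule_type not in REQUIRED_FIELDS:
        raise ValueError(f"mlv_schedule: unsupported type {schedule_type!r}")
    for field in REQUIRED_FIELDS[schedule_type]:
        if field not in configuration:
            raise ValueError(f"mlv_schedule: {schedule_type} requires {field!r}")
```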
---
#### Automatic lakehouse ID resolution
The adapter resolves the lakehouse name (from `database` config or `target.lakehouse`) to a lakehouse ID automatically via `GET /v1/workspaces/{workspaceId}/lakehouses`. Results are cached per workspace for the duration of the run. No manual `mlv_lakehouse_id` configuration is required.
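The per-workspace caching might look like this; `list_lakehouses` stands in for the `GET /v1/workspaces/{workspaceId}/lakehouses` call, and all names here are illustrative:

```python
from typing import Callable, Dict, List

# Per-workspace cache: workspace_id -> {lakehouse display name: lakehouse id}.
_lakehouse_cache: Dict[str, Dict[str, str]] = {}

def resolve_lakehouse_id(
    workspace_id: str,
    lakehouse_name: str,
    list_lakehouses: Callable[[str], List[dict]],
) -> str:
    """Resolve a lakehouse name to its ID, calling the API at most once per workspace."""
    if workspace_id not in _lakehouse_cache:
        _lakehouse_cache[workspace_id] = {
            lh["displayName"]: lh["id"] for lh in list_lakehouses(workspace_id)
        }
    try:
        return _lakehouse_cache[workspace_id][lakehouse_name]
    except KeyError:
        raise ValueError(
            f"Lakehouse {lakehouse_name!r} not found in workspace {workspace_id}"
        ) from None
```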
---
#### Preflight validation (connection open)
MLV prerequisites are validated eagerly at connection open time (after Spark version detection). The adapter checks:
1. **Not running in local/Docker mode** — MLV requires Fabric Runtime
2. **Spark version ≥ 3.5** — checked via `SELECT split(version(), ' ')[0]`
3. **Schema-enabled lakehouse** — detected automatically on connection open
If any check fails, a warning is logged immediately and the error is cached. When an MLV model executes, it reads the cached error and fails instantly with a clear message — no wasted time running models that cannot succeed. Non-MLV projects are completely unaffected.
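The check-once, fail-fast pattern could be sketched like this (class name, attribute names, and message wording are illustrative, not the adapter's code):

```python
# Sketch of eager preflight validation with a cached error (illustrative).
class MLVPreflight:
    def __init__(self) -> None:
        self.cached_error = None  # set once at connection open, read per model

    def run_checks(self, is_local: bool, spark_version: str, schema_enabled: bool) -> None:
        """Run once at connection open; cache the first failure."""
        major_minor = tuple(int(p) for p in spark_version.split(".")[:2])
        if is_local:
            self.cached_error = "MLV requires Fabric Runtime (local/Docker mode detected)"
        elif major_minor < (3, 5):
            self.cached_error = f"MLV requires Spark >= 3.5, found {spark_version}"
        elif not schema_enabled:
            self.cached_error = "MLV requires a schema-enabled lakehouse"

    def raise_if_invalid(self) -> None:
        """Run per MLV model; fail instantly from the cached result."""
        if self.cached_error is not None:
            raise RuntimeError(self.cached_error)
```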
---
#### Delta source validation
At model execution time (before `CREATE OR REPLACE`), the adapter checks that all upstream tables referenced by the MLV are Delta format. Non-Delta sources (e.g., views, CSV tables) cause an immediate model failure with a descriptive error.
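In outline (the real adapter inspects table metadata through Spark; this sketch takes a precomputed name-to-format mapping):

```python
from typing import Dict

def assert_delta_sources(source_formats: Dict[str, str]) -> None:
    """Fail fast if any upstream relation is not Delta format (illustrative)."""
    non_delta = sorted(
        name for name, fmt in source_formats.items() if fmt.lower() != "delta"
    )
    if non_delta:
        raise ValueError(
            "MLV sources must be Delta tables; non-Delta inputs: " + ", ".join(non_delta)
        )
```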
---
#### REST API error handling with retries
All Fabric REST API calls use automatic retries with exponential backoff:
- **3 attempts** per operation
- **Exponential backoff:** 2s, 4s, 8s between retries
Errors surface as `MLVApiError` (extends `DbtRuntimeError`) with the operation name, HTTP status, and parsed Fabric error details. Failed API calls always fail the model.
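A sketch of this retry policy — the `MLVApiError` name comes from the changelog entry above; everything else (signatures, message format) is illustrative:

```python
import time

class MLVApiError(RuntimeError):
    """Illustrative stand-in for the adapter's API error type."""
    def __init__(self, operation: str, status: int, detail: str) -> None:
        super().__init__(f"{operation} failed with HTTP {status}: {detail}")

def call_with_retries(operation, attempts: int = 3, base_delay: float = 2.0, sleep=time.sleep):
    """Retry `operation` with exponential backoff (2s, 4s, 8s, ...) between tries."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return operation()
        except MLVApiError as exc:
            last_exc = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # doubles each time
    raise last_exc
```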
---
The `dbt-fabricspark` package contains all of the code enabling dbt to work with Apache Spark in Microsoft Fabric. This adapter connects to Fabric Lakehouses via Livy endpoints and supports both **schema-enabled** and **non-schema** Lakehouse configurations.
### Key Features
- **Livy session management** with session reuse across dbt runs
Connect to Apache Spark in Microsoft Fabric through a Livy endpoint, configured in your `profiles.yml`.
### Connection Modes
The adapter supports two connection modes via the `livy_mode` setting:
- **Local mode** (`livy_mode: local`) — Connects to a self-hosted Spark instance running in a Docker container (contributed by @mdrakiburrahman). This mode supports the `reuse_session` flag and does not require Fabric compute, making it ideal for offline development and testing.
- **Fabric mode** (`livy_mode: fabric`, default) — Connects to Apache Spark in Microsoft Fabric via the Fabric Livy API. For development workflows, enable `reuse_session: true` to persist the Livy session ID to a local file (configured via `session_id_file`, defaults to `./livy-session-id.txt`). On subsequent `dbt` runs, the adapter reuses the existing session from the persisted file instead of creating a new one. If the file does not exist or the session has expired, a new session is created automatically.
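The reuse flow described above can be sketched as follows; the helpers `is_alive` and `create_session` are hypothetical placeholders for the adapter's actual Livy calls:

```python
from pathlib import Path

def get_or_create_session(session_id_file: str, is_alive, create_session) -> str:
    """Reuse a persisted Livy session ID if still alive (illustrative sketch)."""
    path = Path(session_id_file)
    if path.exists():
        session_id = path.read_text().strip()
        if session_id and is_alive(session_id):
            return session_id  # reuse the existing session
    session_id = create_session()  # otherwise start a fresh one
    path.write_text(session_id)    # persist for the next dbt run
    return session_id
```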
### Lakehouse without Schema
For standard Lakehouses (schema not enabled), use two-part naming, with the `schema` field set to the lakehouse name.
| Method | Value | Use case | Required settings |
|--------|-------|----------|-------------------|
| **Service Principal** | `SPN` | CI/CD and automation. Uses Azure AD app registration. | `client_id`, `tenant_id`, `client_secret` |
| **Fabric Notebook** | `fabric_notebook` | Running dbt inside a Fabric notebook. Uses `notebookutils.credentials`. | None (runs in Fabric runtime) |
### Materialized Lake Views
[Materialized lake views](https://learn.microsoft.com/en-us/fabric/data-engineering/materialized-lake-views/overview-materialized-lake-view) are a Fabric-native construct that materializes a SQL query as a Delta table in your lakehouse, with automatic lineage-based refresh managed by Fabric.
#### Prerequisites
- Schema-enabled lakehouse
- Fabric Runtime 1.3+
- Source tables must be Delta tables
#### Basic Usage
```sql
-- models/silver/silver_cleaned_orders.sql
{{ config(
    materialized='materialized_lake_view',
    database='silver',
    schema='dbo'
) }}

SELECT
    o.order_id,
    o.product_id,
    p.product_name,
    o.quantity,
    p.price,
    o.quantity * p.price AS revenue
FROM {{ ref('bronze_orders') }} o
JOIN {{ ref('bronze_products') }} p
    ON o.product_id = p.product_id
```
#### Configuration Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `materialized` | string | — | Must be `'materialized_lake_view'` |

Check constraints take two fields:

- `expression` — Boolean expression each row must satisfy
- `on_mismatch` — `DROP` (silently remove violating rows) or `FAIL` (stop the refresh with an error; default)
#### Change Data Feed
The adapter automatically enables [Change Data Feed](https://learn.microsoft.com/en-us/azure/databricks/delta/delta-change-data-feed) (CDF) on all upstream source tables referenced via `ref()` before creating the MLV. This enables [optimal incremental refresh](https://learn.microsoft.com/en-us/fabric/data-engineering/materialized-lake-views/refresh-materialized-lake-view). To disable:
```sql
{{ config(
    materialized='materialized_lake_view',
    enable_cdf=false
) }}
```
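CDF on a Delta table is controlled by the standard `delta.enableChangeDataFeed` table property. The SQL issued per source table looks roughly like this; the helper itself is an illustrative sketch, not the adapter's code:

```python
def enable_cdf_sql(table: str) -> str:
    """Build the ALTER TABLE statement that turns on Change Data Feed (illustrative)."""
    return f"ALTER TABLE {table} SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
```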
#### On-Demand Refresh
Trigger an immediate MLV lineage refresh after creation:
```sql
{{ config(
    materialized='materialized_lake_view',
    mlv_on_demand=true
) }}
```
This calls the Fabric Job Scheduler API:
```
POST /v1/workspaces/{workspaceId}/lakehouses/{lakehouseId}/jobs/RefreshMaterializedLakeViews/instances
```
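Polling until a terminal status might look like this; the status names and the helper are illustrative assumptions about the Job Scheduler API, not documented constants:

```python
# Assumed terminal statuses for a Fabric job instance (illustrative).
TERMINAL_STATUSES = {"Completed", "Failed", "Cancelled"}

def poll_job(get_status, max_polls: int = 60, interval: float = 5.0, sleep=None) -> str:
    """Poll `get_status` until a terminal status or `max_polls` is exhausted (sketch)."""
    import time
    sleep = sleep or time.sleep
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError("refresh job did not reach a terminal status in time")
```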
#### Scheduled Refresh
Create or update a periodic refresh schedule. The adapter uses the [Fabric Job Scheduler API](https://learn.microsoft.com/en-us/fabric/data-engineering/materialized-lake-views/materialized-lake-views-public-api) to manage schedules. Only one active schedule per lakehouse lineage is supported — the adapter automatically updates an existing schedule if one is found.
**Cron schedule** (interval in minutes):
```sql
{{ config(
    materialized='materialized_lake_view',
    mlv_schedule={
        "enabled": true,
        "configuration": {
            "startDateTime": "2026-04-10T00:00:00",
            "endDateTime": "2026-12-31T23:59:59",
            "localTimeZoneId": "Central Standard Time",
            "type": "Cron",
            "interval": 10
        }
    }
) }}
```
**Daily schedule** (specific times):
```sql
{{ config(
    materialized='materialized_lake_view',
    mlv_schedule={
        "enabled": true,
        "configuration": {
            "startDateTime": "2026-04-10T00:00:00",
            "endDateTime": "2026-12-31T23:59:59",
            "localTimeZoneId": "Central Standard Time",
            "type": "Daily",
            "times": ["06:00", "18:00"]
        }
    }
) }}
```
**Weekly schedule** (specific days and times):
```sql
{{ config(
    materialized='materialized_lake_view',
    mlv_schedule={
        "enabled": true,
        "configuration": {
            "startDateTime": "2026-04-10T00:00:00",
            "endDateTime": "2026-12-31T23:59:59",
            "localTimeZoneId": "Central Standard Time",
            "type": "Weekly",
            "weekdays": ["Monday", "Wednesday", "Friday"],
            "times": ["08:00"]
        }
    }
) }}
```
#### Full Example with All Options
```sql
{{ config(
    materialized='materialized_lake_view',
    database='gold',
    schema='dbo',
    partitioned_by=['product_type'],
    mlv_comment='Product sales summary with quality checks',
    ...
) }}
```

This renders DDL of the form:

```sql
CREATE OR REPLACE MATERIALIZED LAKE VIEW gold.dbo.product_sales_summary
(
    CONSTRAINT positive_revenue CHECK (total_revenue >= 0) ON MISMATCH DROP
)
PARTITIONED BY (product_type)
COMMENT 'Product sales summary with quality checks'
TBLPROPERTIES ("quality_tier"="gold")
AS
SELECT ...
```
#### Limitations
- **No ALTER on definition** — Changing the SELECT query, constraints, or partitioning requires drop + recreate. The adapter uses `CREATE OR REPLACE`, which handles this automatically.
- **Only RENAME via ALTER** — `ALTER MATERIALIZED LAKE VIEW ... RENAME TO ...` is the only supported ALTER operation.
- **No DML** — `INSERT`, `UPDATE`, and `DELETE` are not supported on MLVs.
- **No UDFs** — User-defined functions are not supported in the SELECT query.
- **No time-travel** — `VERSION AS OF` / `TIMESTAMP AS OF` syntax is not supported.
- **No temp views as sources** — The SELECT query can reference tables and other MLVs, but not temporary views.
- **Schedule is per-lakehouse** — One active schedule per lakehouse lineage, not per MLV.
## Reporting bugs and contributing code
- Want to report a bug or request a feature? Let us know on [Slack](http://community.getdbt.com/), or open [an issue](https://github.com/microsoft/dbt-fabricspark/issues/new)