perf: allocate CPU-offloaded params from runtime device pinned host buffer by fszontagh · Pull Request #1601 · leejet/stable-diffusion.cpp

fszontagh · 2026-06-03T09:50:11Z

Summary

When params_backend != runtime_backend (i.e. --offload-to-cpu), allocate the runner's params buffer from the runtime device's host buffer type (ggml_backend_dev_host_buffer_type) instead of the regular CPU buffer type. For the CUDA backend this is the pinned (page-locked) host buffer, so subsequent host-to-device (H2D) transfers can DMA directly from it instead of going through the driver's staging copy for pageable memory. Backends that don't implement get_host_buffer_type fall back to the existing path.
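The selection and fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `BufferType`, `Device`, `cpu_buffer_type` and `select_offloaded_params_buffer_type` are hypothetical stand-ins for the ggml types and calls, and a null `host_buffer_type` models a backend that does not implement get_host_buffer_type.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for ggml types; NOT the real ggml API.
struct BufferType {
    std::string name;
};

struct Device {
    // Null when the backend exposes no host buffer type
    // (models a backend without get_host_buffer_type).
    const BufferType* host_buffer_type;
};

// Stand-in for the regular CPU buffer type (the existing path).
inline const BufferType* cpu_buffer_type() {
    static const BufferType cpu{"cpu"};
    return &cpu;
}

// Choose the buffer type for CPU-offloaded params: prefer the runtime
// device's host buffer type, otherwise fall back to the CPU buffer type.
inline const BufferType* select_offloaded_params_buffer_type(const Device* runtime_device) {
    assert(runtime_device != nullptr);
    if (runtime_device->host_buffer_type != nullptr) {
        return runtime_device->host_buffer_type;  // e.g. pinned host memory on CUDA
    }
    return cpu_buffer_type();  // unchanged behaviour for other backends
}
```

In the real code the equivalent decision would presumably be made once, when the runner allocates its params buffer, so the fallback costs nothing on backends without a host buffer type.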

Related Issue / Discussion

Independent perf win on top of #1598 (now merged as a7f2e03).

Additional Information

Z-Image bf16, 512x512, 8 steps, --offload-to-cpu --stream-layers --max-vram 8 on RTX 3060:

| Config   | Before  | After   | Delta |
|----------|---------|---------|-------|
| no LoRA  | 34.72 s | 22.06 s | -36%  |
| LoRA 0.8 | 37.75 s | 24.36 s | -35%  |
| LoRA 1.5 | 37.14 s | 24.47 s | -34%  |

LoRA multiplier scaling is identical: the mean pixel diff between multipliers 0.8 and 1.5 is 26.05 both before and after the change, and bit-exact output is preserved.
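For reference, the Delta column is the relative change in wall time from Before to After. A small sketch of that arithmetic, with hypothetical helper names (`percent_delta`, `rounded_percent_delta`) that are not part of the project:

```cpp
#include <cassert>
#include <cmath>

// Relative change from before_s to after_s in percent (negative means faster).
inline double percent_delta(double before_s, double after_s) {
    assert(before_s > 0.0);
    return (after_s - before_s) / before_s * 100.0;
}

// Same value rounded to the nearest whole percent, as shown in the table.
inline long rounded_percent_delta(double before_s, double after_s) {
    return std::lround(percent_delta(before_s, after_s));
}
```

For the no-LoRA row, (22.06 - 34.72) / 34.72 is about -36.5%, which the table reports as -36%.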

Checklist

I have read and confirmed this PR follows the contribution guidelines.

fszontagh marked this pull request as draft June 3, 2026 09:51

fszontagh force-pushed the perf/pinned-host-params branch 2 times, most recently from 284af90 to 553197a Compare June 3, 2026 11:46

Commit 9d0a521: perf: allocate CPU-offloaded params from runtime device pinned host buffer

fszontagh force-pushed the perf/pinned-host-params branch from 553197a to 9d0a521 Compare June 3, 2026 16:19

fszontagh marked this pull request as ready for review June 3, 2026 16:19

fszontagh wants to merge 1 commit into leejet:master from fszontagh:perf/pinned-host-params
