| 7 |`storage, read`| CSR indices (`array<u32>`) |
+
 ## Context and Resource Lifetime

-- `WebGPUFlushContext` is created once per `FlushCompositions` execution.
-- The same command encoder is reused across all GPU passes in that flush.
-- Transient textures/buffers/bind-groups are tracked in the flush context and released on dispose.
-- Source image texture views are cached per flush context to avoid duplicate uploads.
+- `WebGPUFlushContext` is created once per `FlushCompositions` execution and disposed at the end.
+- The same command encoder is reused across all GPU operations in that flush.
+- Transient textures, texture views, buffers, and bind groups are tracked in the flush context and released on dispose.
+- Source image texture views are cached within the flush context to avoid duplicate uploads.
+- CPU-side edge geometry (`IMemoryOwner<GpuEdge>`) is allocated via `MemoryAllocator` and disposed within the flush.
+- Shared GPU buffers (edge buffer, CSR buffers, params buffer, dispatch config buffer) are managed by `DeviceState` with grow-only reuse across flushes.
+- Edge upload uses dirty-range detection: compares current data word-by-word against a cached copy, uploading only the changed byte range via `QueueWriteBuffer`.
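
The dirty-range detection described in the last bullet can be sketched roughly as follows. This is a minimal illustration in Python, not the backend's actual code: `find_dirty_byte_range`, `upload_edges`, and the `write_buffer(offset, data)` callback (a stand-in for `QueueWriteBuffer`) are hypothetical names, and a 4-byte word size is assumed.

```python
import struct


def find_dirty_byte_range(cached_words, current_words, word_size=4):
    """Return (byte_offset, byte_length) spanning every changed word, or None.

    Words past the end of the cached copy count as changed. If the current
    data is an unchanged prefix of the cache, nothing needs uploading.
    """
    common = min(len(cached_words), len(current_words))

    # Scan forward to the first word that differs.
    first = 0
    while first < common and cached_words[first] == current_words[first]:
        first += 1
    if first == len(current_words):
        return None

    # Scan backward to the last word that differs.
    last = len(current_words) - 1
    while last > first and last < len(cached_words) and cached_words[last] == current_words[last]:
        last -= 1

    return first * word_size, (last - first + 1) * word_size


def upload_edges(cached_words, current_words, write_buffer):
    """Upload only the changed range, then refresh the cache in place.

    `write_buffer` is a hypothetical stand-in for the real queue write call.
    Returns the number of bytes uploaded.
    """
    dirty = find_dirty_byte_range(cached_words, current_words)
    if dirty is None:
        return 0
    offset, length = dirty
    first, count = offset // 4, length // 4
    payload = struct.pack(f"<{count}I", *current_words[first:first + count])
    write_buffer(offset, payload)
    cached_words[:] = current_words
    return length
```

A single contiguous range keeps the upload to one write call; the trade-off is that two small changes far apart still upload everything between them.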

-## Destination Writeback and Flush Count
+## Destination Writeback

 - `FlushCompositions` performs one command-buffer submission (`QueueSubmit`) per scene flush.
-- Destination writeback to the render target is one copy from the fine output texture into composition bounds.
-- No destination storage init/blit pass is used in the active flush path.
-- CPU-region targets perform one additional texture->buffer copy and one map/read after the single submit.
+- NativeSurface targets: one GPU-side `CommandEncoderCopyTextureToTexture` from output into the target at composition bounds. No CPU stall.
+- CPU Region targets: readback from the output texture directly (skipping the output-to-target copy). Uses `CommandEncoderCopyTextureToBuffer`, `QueueSubmit`, synchronous `BufferMapAsync` with device polling, then copies mapped bytes to the CPU `Buffer2DRegion<TPixel>`.
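
One detail the CPU Region readback has to deal with is row pitch: WebGPU requires `bytesPerRow` in a texture-to-buffer copy to be a multiple of 256, so mapped rows are generally wider than the pixel data. The diff does not show how the backend handles this; the sketch below only illustrates the arithmetic, and `padded_bytes_per_row` and `unpad_rows` are hypothetical helper names.

```python
# WebGPU requires bytesPerRow to be a multiple of 256 for buffer<->texture copies.
COPY_BYTES_PER_ROW_ALIGNMENT = 256


def padded_bytes_per_row(width, bytes_per_pixel):
    """Round the tightly packed row size up to the required alignment."""
    unpadded = width * bytes_per_pixel
    align = COPY_BYTES_PER_ROW_ALIGNMENT
    return (unpadded + align - 1) // align * align


def unpad_rows(mapped, width, height, bytes_per_pixel):
    """Copy padded readback rows into a tightly packed byte string.

    `mapped` stands in for the bytes exposed by the mapped readback buffer.
    """
    stride = padded_bytes_per_row(width, bytes_per_pixel)
    row = width * bytes_per_pixel
    out = bytearray(row * height)
    for y in range(height):
        out[y * row:(y + 1) * row] = mapped[y * stride:y * stride + row]
    return bytes(out)
```

For a 100-pixel-wide RGBA8 region, the tight row is 400 bytes and the padded row is 512 bytes.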

 ## Fallback Behavior

-Fallback is scene-scoped:
+Fallback is scene-scoped and triggered when:
+- The pixel format has no supported WebGPU texture format mapping.
+- Any command uses an unsupported brush type (only `SolidBrush` and `ImageBrush` are GPU-composable).
+- Any GPU operation fails during the flush.
+
+Fallback path:
+- If target exposes a CPU region: run `DefaultDrawingBackend.FlushCompositions(...)` directly.
+- If target is native-surface only: rent CPU staging frame, run fallback on staging, upload staging pixels back to native target texture.
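
The trigger conditions and the two fallback paths above amount to a small decision procedure. The following sketch restates them in Python for illustration only; `needs_fallback`, `fallback_steps`, and the step labels are hypothetical and do not correspond to identifiers in the backend.

```python
# Brush types the added text names as GPU-composable.
GPU_COMPOSABLE_BRUSHES = {"SolidBrush", "ImageBrush"}


def needs_fallback(has_format_mapping, brush_types, gpu_failed=False):
    """Fallback is scene-scoped: one failing condition sends the whole scene to CPU."""
    if not has_format_mapping:
        return True
    if any(brush not in GPU_COMPOSABLE_BRUSHES for brush in brush_types):
        return True
    return gpu_failed


def fallback_steps(target_has_cpu_region):
    """Return the ordered steps of the fallback path as descriptive labels."""
    if target_has_cpu_region:
        return ["run CPU backend directly on target"]
    return [
        "rent CPU staging frame",
        "run CPU backend on staging",
        "upload staging pixels to native target texture",
    ]
```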
+
+## Shader Source
+
+`CompositeComputeShader` generates WGSL source per target texture format at runtime, substituting format-specific template tokens for texel decode/encode, backdrop/brush load, and output store. Generated source is cached by `TextureFormat` as null-terminated UTF-8 bytes.
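
As a rough picture of per-format generation with caching, the sketch below substitutes tokens into a WGSL template and stores the result as null-terminated UTF-8 bytes keyed by format. The template text, the token names (`__STORAGE_FORMAT__`, `__ENCODE_TEXEL__`), and the per-format expressions are invented for illustration; the real template and tokens are not shown in this diff.

```python
# Hypothetical template; the real shader template is not part of this diff.
_TEMPLATE = """
@group(0) @binding(0) var output_texture: texture_storage_2d<__STORAGE_FORMAT__, write>;

fn encode_texel(color: vec4<f32>) -> vec4<f32> {
    return __ENCODE_TEXEL__;
}
"""

# Hypothetical per-format token values.
_FORMAT_TOKENS = {
    "rgba8unorm": {
        "__STORAGE_FORMAT__": "rgba8unorm",
        "__ENCODE_TEXEL__": "clamp(color, vec4<f32>(0.0), vec4<f32>(1.0))",
    },
    "rgba16float": {
        "__STORAGE_FORMAT__": "rgba16float",
        "__ENCODE_TEXEL__": "color",
    },
}

_shader_cache = {}


def get_shader_source(texture_format):
    """Return null-terminated UTF-8 WGSL for the format, generating it once."""
    cached = _shader_cache.get(texture_format)
    if cached is not None:
        return cached
    source = _TEMPLATE
    for token, value in _FORMAT_TOKENS[texture_format].items():
        source = source.replace(token, value)
    encoded = source.encode("utf-8") + b"\x00"  # terminator for C-string style APIs
    _shader_cache[texture_format] = encoded
    return encoded
```

Caching the already-terminated bytes means that later lookups for the same format need no further string work.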

-- if target exposes a CPU region:
-  - run `DefaultDrawingBackend.FlushCompositions(...)` directly
-- if target is native-surface only:
-  - rent CPU staging frame
-  - run `DefaultDrawingBackend.FlushCompositions(...)` on staging
-  - upload staging pixels back to native target texture
+The following static WGSL shaders exist for the legacy CSR GPU pipeline but are not used in the current dispatch path (CSR is computed on CPU):
Static WGSL shaders are stored as null-terminated UTF-8 bytes (`U+0000` terminator required at call site), including:
+Coverage rasterization and compositing are fused into a single compute dispatch. Each 16x16 tile workgroup computes coverage inline using a fixed-point scanline rasterizer ported from `DefaultRasterizer`, operating on workgroup shared memory with atomic accumulation. This eliminates the coverage texture, its allocation, write/read bandwidth, and the pass barrier that a separate coverage dispatch would require.
Edge preparation (path flattening, fixed-point conversion, CSR construction) runs on the CPU. The `path.Flatten()` cost is shared with the CPU rasterizer pipeline. CSR construction is three passes over the edge set: count, prefix sum, scatter.
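
The count, prefix sum, scatter sequence is the standard way to build a CSR (compressed sparse row) index. The sketch below shows it in Python under an assumption the diff does not confirm: edges are binned by their vertical extent into 16-pixel row bands. `build_csr` and its parameters are hypothetical names.

```python
import math


def build_csr(edges, band_count, band_height=16):
    """Build CSR offsets and indices mapping each row band to its edges.

    `edges` is a list of (y0, y1) pixel extents. An edge that spans several
    bands is listed in each of them. Binning by row band is an assumption.
    """
    def bands(y0, y1):
        lo, hi = min(y0, y1), max(y0, y1)
        first = max(0, math.floor(lo) // band_height)
        last = min(band_count - 1, math.floor(hi) // band_height)
        return range(first, last + 1)  # empty when the edge is off-target

    # Pass 1: count the edges touching each band.
    counts = [0] * band_count
    for y0, y1 in edges:
        for band in bands(y0, y1):
            counts[band] += 1

    # Pass 2: exclusive prefix sum turns counts into start offsets.
    offsets = [0] * (band_count + 1)
    for band in range(band_count):
        offsets[band + 1] = offsets[band] + counts[band]

    # Pass 3: scatter edge indices into their band's slice.
    indices = [0] * offsets[band_count]
    cursor = offsets[:band_count]  # slicing copies, so offsets stays intact
    for edge_index, (y0, y1) in enumerate(edges):
        for band in bands(y0, y1):
            indices[cursor[band]] = edge_index
            cursor[band] += 1

    return offsets, indices
```

The edges for band `b` are then `indices[offsets[b]:offsets[b + 1]]`, which is the lookup a tile workgroup would perform against the CSR buffers.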

-`PreparedCompositeFine` is generated per target texture format and emitted as null-terminated UTF-8 bytes at runtime.
+For the benchmark workload (7200x4800 US states GeoJSON polygon, 2px stroke, ~262K edges), NativeSurface performance is at parity with the CPU rasterizer (~28ms).