feat(vitmatte): object-aware tiling and multi-object matte path #10
Open
johnwbrisbin wants to merge 1 commit into
Replace the full-image single-pass VitMatte inference with an object-aware tiled approach that processes only the regions that actually contain uncertain alpha (the trimap boundary zone around each segmented object).

vitmatte_backend.py:
- Add tiling constants (_TILE=512, _MIN_OV=64, _STRIDE=448)
- Remove NEEDS_TRIMAP: trimap is derived internally per object
- matte(): rewritten to use _matte_multi_frame for per-object bbox-focused tile selection; skips blank masks entirely
- matte_multi(): new method; runs VitMatte for N objects in one forward pass per shared tile, merges per-object alphas into a composite, and returns both the merged alpha and the object list
- _matte_multi_frame(): core tiled accumulation loop with cosine blend weights and trimap-presence masking per object
- Tile helpers: _global_tile_starts_1d, _optimal_tile_starts_1d, _best_tile_starts_1d, _object_tile_positions
- _verify_tile_logic(): offline smoke-test for tile geometry

node.py:
- Add use_multi gate in the matting section: when the upstream segmenter returns "object_masks"/"object_boxes" keys (e.g. from SAM 3.1 with multiplex tracking), and the matter backend supports matte_multi, route through the multi-object path; otherwise fall back transparently to the single-object matte() call.
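The cosine-weighted accumulation named in the commit message can be pictured in one dimension. The sketch below is only an illustration under stated assumptions: `blend_weight_1d` and `blend_tiles_1d` are hypothetical names, and the raised-cosine ramp is one common choice, not necessarily the formula used by `_blend_weight()` in this PR. Only the tile size (512) and overlap (64) come from the commit message.

```python
import math


def blend_weight_1d(tile=512, overlap=64):
    """Weight per pixel of a tile: raised-cosine ramp at both ends, 1.0 in the middle.

    Assumption: the ramp is sampled at half-pixel offsets so that no weight is
    exactly zero, which keeps pixels at the image border well defined.
    """
    w = [1.0] * tile
    for i in range(overlap):
        r = 0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / overlap)
        w[i] = r
        w[tile - 1 - i] = r
    return w


def blend_tiles_1d(length, tiles, tile=512, overlap=64):
    """Weighted average of overlapping tiles; `tiles` is a list of (start, values)."""
    w = blend_weight_1d(tile, overlap)
    num = [0.0] * length  # sum of weight * alpha
    den = [0.0] * length  # sum of weights
    for start, values in tiles:
        for i, v in enumerate(values):
            num[start + i] += w[i] * v
            den[start + i] += w[i]
    # Pixels that no tile touched keep alpha 0.
    return [n / d if d > 0 else 0.0 for n, d in zip(num, den)]


# Two tiles at offsets 0 and 448 (overlap of 64) with different constant alphas.
out = blend_tiles_1d(960, [(0, [0.2] * 512), (448, [0.8] * 512)])
```

Outside the overlap each tile's value is returned unchanged; inside it the result moves smoothly from 0.2 to 0.8 instead of jumping at a seam. Dividing by the accumulated weight means the result stays a proper average even where more than two tiles overlap.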
What this changes
nodes/mask_matting/matters/vitmatte_backend.py
Replaces the full-image single-pass inference with an object-aware tiled approach.
Why: VitMatte's fixed 512x512 input means large images currently get
downscaled before inference and upscaled after, losing edge detail precisely
where it matters most. Tiling processes the image at native resolution, one
512x512 patch at a time, covering only the trimap uncertainty band around each
object rather than the whole frame.
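To make the tile arithmetic concrete, here is a minimal sketch of how tile origins might be chosen for one object's bounding box. It is an illustration only: `tile_starts_1d`, `object_tile_positions`, and the `(y0, x0, y1, x1)` box layout are hypothetical, not the PR's `_optimal_tile_starts_1d` or `_object_tile_positions`, and it ignores the global-grid alternative the PR also considers. The 512px tile and 64px minimum overlap are the values stated in this PR.

```python
import math
from itertools import product

TILE = 512    # tile edge in pixels (the PR's _TILE)
MIN_OV = 64   # minimum overlap between neighbouring tiles (the PR's _MIN_OV)


def tile_starts_1d(lo, hi, size, tile=TILE, min_ov=MIN_OV):
    """Fewest tile start offsets along one axis covering [lo, hi) within [0, size).

    Hypothetical helper; assumes 0 <= lo < hi <= size.
    """
    if size <= tile:
        return [0]  # image no larger than one tile; the caller would have to pad
    span = hi - lo
    if span <= tile:
        # One tile suffices: centre it on the span, then clamp to the image.
        start = lo - (tile - span) // 2
        return [min(max(start, 0), size - tile)]
    stride = tile - min_ov                      # 448 with the defaults
    n = math.ceil((span - tile) / stride) + 1   # fewest tiles with step <= stride
    slack = span - tile                         # distance from first to last start
    # Integer spacing: the first tile starts at lo, the last ends exactly at hi.
    return [lo + (i * slack) // (n - 1) for i in range(n)]


def object_tile_positions(box, height, width):
    """(y, x) top-left corners of the tiles covering box = (y0, x0, y1, x1)."""
    y0, x0, y1, x1 = box
    ys = tile_starts_1d(y0, y1, height)
    xs = tile_starts_1d(x0, x1, width)
    return list(product(ys, xs))


# A 1000px-tall, 400px-wide object in a 2160x3840 frame needs three tiles.
positions = object_tile_positions((100, 200, 1100, 600), 2160, 3840)
```

With these numbers a full-frame grid would need dozens of 512x512 tiles, while the object-focused layout runs the model three times.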
Changes:
matte(): rewritten to locate the object bounding box, select the minimum
set of 512x512 tiles that covers it with at least 64px overlap, run
inference on each tile, and blend results back with cosine-weighted
accumulation. Empty masks short-circuit without any model call.
matte_multi(): new method for N-object batches. Tiles shared between
non-overlapping objects are processed once with a merged trimap; per-object
alphas are recovered by trimap-presence masking, then max-merged into a
composite alpha.
_matte_multi_frame(): core per-frame accumulation loop used by both paths.
Tile helpers: _global_tile_starts_1d, _optimal_tile_starts_1d, _best_tile_starts_1d, _object_tile_positions. Per-object tile count is the minimum of the object-optimal layout and the global grid covering the same region.
_blend_weight(): cosine ramp over the overlap band to suppress seams.
_verify_tile_logic(): offline geometry smoke-test (call manually or in CI).
NEEDS_TRIMAP removed: the trimap is derived internally per object from the coarse mask and edge_radius; callers no longer need to supply one.

nodes/mask_matting/node.py
Adds a
use_multi gate in the matting section. When a segmenter returns object_masks and object_boxes in its output dict (indicating it produced per-object segmentation, e.g. SAM 3.1 with multiplex tracking), the matter backend exposes matte_multi, and the batch size is 1, the node routes through matte_multi() instead of the single-object matte() call. All three conditions must hold; otherwise the existing single-object path is used unchanged. No change to RETURN_TYPES or any node interface.

Test plan
use_multi stays false and output is unchanged from before.
exceeds quality of the previous downscale/upscale path at object edges.
use_multi activates and each object receives an independent alpha.
without crashing.
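For reviewers checking the routing cases above, the three-part use_multi condition described for nodes/mask_matting/node.py could be expressed roughly as follows. This is a sketch, not the code in the PR: `should_use_multi`, the two stand-in backend classes, and their method signatures are hypothetical, and it assumes the check is for key presence in the segmenter's output dict.

```python
from typing import Any, Mapping


def should_use_multi(seg_out: Mapping[str, Any], matter: Any, batch_size: int) -> bool:
    """Return True only when all three routing conditions hold."""
    # 1. The segmenter produced per-object output (assumed: both keys present).
    has_objects = "object_masks" in seg_out and "object_boxes" in seg_out
    # 2. The matting backend exposes a callable matte_multi.
    supports_multi = callable(getattr(matter, "matte_multi", None))
    # 3. Only single-image batches take the multi-object path.
    return has_objects and supports_multi and batch_size == 1


class SingleOnly:
    """Stand-in backend with only the single-object method (hypothetical signature)."""

    def matte(self, image, mask):
        return mask


class MultiCapable(SingleOnly):
    """Stand-in backend that also offers matte_multi (hypothetical signature)."""

    def matte_multi(self, image, masks, boxes):
        return masks


seg = {"object_masks": [], "object_boxes": []}
print(should_use_multi(seg, MultiCapable(), 1))   # multi-object path
print(should_use_multi(seg, SingleOnly(), 1))     # falls back to matte()
```

Probing the backend with `getattr` keeps the fallback transparent: backends that never implement matte_multi need no changes.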