Drivers: hv: mshv_vtl: fix GUP into VTL0 device mappings by namancse · Pull Request #141 · microsoft/OHCL-Linux-Kernel

namancse · 2026-06-03T06:06:24Z

Restores GUP (get_user_pages) into VTL0 memory mappings, broken by the 6.15 ZONE_DEVICE / pte_devmap removal (aed877c, d3f7922). After that refactor, GUP only walks PTEs/PMDs/PUDs that point to a real folio with a held reference; the legacy pte_devmap fast-path is gone. mshv_vtl_low was still installing devmap PTEs via vmf_insert_pfn_*, so userspace pins on /dev/mshv_vtl_low mappings silently failed.

Two commits:

use folio-aware inserters for huge VTL0 mappings — switches the PMD/PUD fault paths to vmf_insert_folio_pmd / vmf_insert_folio_pud, resolving the pfn to its struct page / pgmap folio and verifying the folio order matches the fault order.
fix GUP into VTL0 mappings on the 4K fault path — adds a folio-aware 4K path using vmf_insert_page_mkwrite once the pgmap is live, with a pte_special fallback (via vmf_insert_mixed) for early faults before devm_memremap_pages has run. Captures the chardev address_space on first open (cmpxchg) and calls unmap_mapping_range for both the encrypted and DECRYPTED_MASK-aliased pfns after pgmap registration so any stale special PTEs are dropped and refaulted as folio-backed. VM_MIXEDMAP | VM_DONTEXPAND are set on the VMA.
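The per-order decision these two commits describe can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the driver code: `struct model_page`, `model_fault()` and the `MODEL_*` names are hypothetical, the order constants are the x86-64 values for 4 KiB base pages, and the real handler calls the kernel inserters instead of returning an enum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; 9 and 18 are the x86-64 orders with 4 KiB pages. */
enum { MODEL_PTE_ORDER = 0, MODEL_PMD_ORDER = 9, MODEL_PUD_ORDER = 18 };

enum model_action {
	MODEL_INSERT_PAGE,    /* stands for vmf_insert_page_mkwrite(): folio-backed 4K PTE */
	MODEL_INSERT_SPECIAL, /* stands for vmf_insert_mixed(): pte_special fallback */
	MODEL_INSERT_FOLIO,   /* stands for vmf_insert_folio_pmd()/vmf_insert_folio_pud() */
	MODEL_FALLBACK,       /* stands for VM_FAULT_FALLBACK: retry at a smaller order */
};

/* A resolved, pgmap-backed page; NULL means "not pgmap-backed yet". */
struct model_page {
	unsigned int folio_order;
};

enum model_action model_fault(const struct model_page *page, unsigned int fault_order)
{
	assert(fault_order == MODEL_PTE_ORDER || fault_order == MODEL_PMD_ORDER ||
	       fault_order == MODEL_PUD_ORDER);

	/* 4K path: folio-backed when resolved, special PTE for early faults. */
	if (fault_order == MODEL_PTE_ORDER)
		return page ? MODEL_INSERT_PAGE : MODEL_INSERT_SPECIAL;

	/* Huge paths: the folio order must match the fault order exactly. */
	if (!page || page->folio_order != fault_order)
		return MODEL_FALLBACK;

	return MODEL_INSERT_FOLIO;
}
```

In this model a PMD-order folio faulted at PUD order falls back rather than being inserted, which mirrors the order check described for the huge paths.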

Copilot

Pull request overview

This PR updates the mshv_vtl_low mmap fault paths so VTL0 ZONE_DEVICE mappings become GUP-pinnable again after the removal of the pte_devmap fast-path in 6.15. It does so by switching huge faults to folio-aware inserters and by making the 4K fault path insert a refcounted page once the pgmap exists, while providing an early pre-pgmap pte_special fallback and later zapping those stale PTEs.

Changes:

Add a pgmap-backed PFN→struct page resolver and use vmf_insert_page_mkwrite() (4K) / vmf_insert_folio_pmd() / vmf_insert_folio_pud() (huge) so faults install folio-backed entries suitable for GUP.
Capture the /dev/mshv_vtl_low address_space on first open and invalidate early-fault pte_special mappings after pgmap registration.
Tighten VMA flags for the mapping (VM_MIXEDMAP + VM_DONTEXPAND) to support the mixed fallback and keep the mapping size pinned.
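The "capture on first open" step is a publish-once pattern. A minimal userspace sketch with C11 atomics follows; it assumes nothing about the driver beyond what is stated above, and `struct model_mapping`, `model_low_mapping` and `model_capture_mapping()` are hypothetical names. The kernel code would use `cmpxchg()` on an `address_space` pointer instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct address_space. */
struct model_mapping {
	int id;
};

/* Starts out NULL (static storage); set at most once. */
static _Atomic(struct model_mapping *) model_low_mapping;

/* Publishes 'candidate' only if nothing is published yet.
 * Returns whichever mapping is published after the call. */
struct model_mapping *model_capture_mapping(struct model_mapping *candidate)
{
	struct model_mapping *expected = NULL;

	assert(candidate != NULL);
	if (atomic_compare_exchange_strong(&model_low_mapping, &expected, candidate))
		return candidate;

	/* On failure, 'expected' was overwritten with the current value. */
	return expected;
}
```

The first caller wins and later callers leave the stored pointer untouched. That is also why a single global can only ever describe one mapping, a limitation that one of the review findings on the second commit raises.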

namancse · 2026-06-03T06:54:59Z

Adding the bug link for future reference.
https://microsoft.visualstudio.com/OS/_workitems/edit/62398261/

namancse · 2026-06-08T05:21:33Z

Fixed KPA issues reported by Hardik offline.

=== [1/2] Drivers: hv: mshv_vtl: use folio-aware inserters for huge VTL0 mappings ===

[HIGH] mshv_vtl_low_resolve_page() accepts any user-selected MEMORY_DEVICE_GENERIC ZONE_DEVICE PFN without proving it belongs to mshv_vtl and without taking a live dev_pagemap reference, so the new huge fau…
Sources: review-prompts
Evidence: mshv_vtl_low_resolve_page() accepts any user-selected MEMORY_DEVICE_GENERIC
ZONE_DEVICE PFN without proving it belongs to mshv_vtl and
without taking a live dev_pagemap reference, so the new huge
fault path can map foreign device memory and race foreign pgmap
teardown.
Impact: Severity: High — matches "Use-after-free, NULL deref, or double-free reachable
by any reasonable user-space activity (e.g., bringing an interface
up after a probe failure, reading a debugfs file that points at
freed memory)" because a CAP_SYS_ADMIN process can choose the mmap
offset, fault a foreign generic devmap PFN through this misc
device, and race the owning device's removal while this helper uses
pgmap/page metadata without get_dev_pagemap(). The PFN is derived
directly from the VMA offset: drivers/hv/mshv_vtl_main.c:3723 sets
pfn = vmf->pgoff & ~DECRYPTED_MASK, and
drivers/hv/mshv_vtl_main.c:3682-3698 can only align the huge fault
within the VMA; it does not check a registered mshv_vtl range. The
resolver's success condition is only generic devmap type:
drivers/hv/mshv_vtl_main.c:3710-3718 checks pfn_valid(),
is_zone_device_page(), page_pgmap(page), and pgmap->type ==
MEMORY_DEVICE_GENERIC, then returns page with no owner/range check
and no get_dev_pagemap(). The structure explicitly has an owner
field for this purpose: include/linux/memremap.h:121-123 says
@owner is "Used by various helpers to make sure that no foreign
ZONE_DEVICE memory is accessed." Foreign generic pgmaps exist
in-tree, e.g. drivers/dax/device.c:445-449 sets pgmap->type =
MEMORY_DEVICE_GENERIC and calls devm_memremap_pages(). The proper
live-reference API is mm/memremap.c:399-415, where
get_dev_pagemap() takes a live percpu ref under RCU; this helper
does not use it. Teardown can proceed independently for foreign
pgmaps: mm/memremap.c:112-126 kills the pgmap ref, waits for
completion, unmaps ranges, and exits the ref, while
drivers/hv/mshv_vtl_main.c:3746-3753 and 3759-3766 can continue
from the unpinned page into page_folio(), folio_order(), and
vmf_insert_folio_pmd()/vmf_insert_folio_pud().
Suggested fix: (see Evidence)
Fixed
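The thread does not show how this finding was fixed. As an illustration only, an ownership-plus-range check of the kind the finding asks for could look like the userspace model below. `struct model_pgmap`, `model_mshv_owner` and `model_pfn_is_ours()` are hypothetical; the `owner` member only mirrors the role the finding attributes to `dev_pagemap::owner`, and the live-reference part of the finding (`get_dev_pagemap()`) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_type { MODEL_DEVICE_GENERIC, MODEL_DEVICE_OTHER };

/* Hypothetical, simplified stand-in for struct dev_pagemap. */
struct model_pgmap {
	enum model_type type;
	const void *owner;       /* who registered this pgmap */
	unsigned long start_pfn; /* inclusive */
	unsigned long end_pfn;   /* exclusive */
};

/* The address of this object serves as the driver's ownership cookie. */
static const char model_mshv_owner = 0;

bool model_pfn_is_ours(const struct model_pgmap *pgmap, unsigned long pfn)
{
	if (!pgmap)
		return false;
	assert(pgmap->start_pfn <= pgmap->end_pfn);

	/* The type alone is not enough: other drivers register generic pgmaps too. */
	if (pgmap->type != MODEL_DEVICE_GENERIC)
		return false;
	if (pgmap->owner != &model_mshv_owner)
		return false;

	return pfn >= pgmap->start_pfn && pfn < pgmap->end_pfn;
}
```

A foreign pgmap of the same type but with a different owner is rejected here even when the PFN falls inside its range.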
[HIGH] A fault can race MSHV_ADD_VTL0_MEMORY while devm_memremap_pages() has made PFNs valid and ZONE_DEVICE but before __init_zone_device_page() publishes page->pgmap, causing mshv_vtl_low_resolve_page() t…
Sources: review-prompts
Evidence: A fault can race MSHV_ADD_VTL0_MEMORY while devm_memremap_pages() has made
PFNs valid and ZONE_DEVICE but before __init_zone_device_page()
publishes page->pgmap, causing mshv_vtl_low_resolve_page() to
dereference the lru/pgmap union as a bogus dev_pagemap pointer.
Impact: Severity: High — matches "Race condition with a realistic concurrent access
pattern (probe vs IRQ, suspend vs ioctl, two CPUs hitting the same
hot path) that produces corruption or crash" because one thread can
register VTL0 memory while another faults the same user-selected
PFN through an existing mapping, and the helper dereferences
page_pgmap() without synchronization against page metadata
publication. Registration exposes a wide publication window:
mm/memremap.c:180-183 stores the pgmap in pgmap_array,
mm/memremap.c:221-234 calls arch_add_memory(),
move_pfn_range_to_zone(), and mem_hotplug_done(), then
mm/memremap.c:238-244 only later calls memmap_init_zone_device().
move_pfn_range_to_zone() initializes visible page metadata before
the pgmap backpointer: mm/memory_hotplug.c:776-784 notes a visible
range and calls memmap_init_range(); mm/mm_init.c:581-592 shows
__init_single_page() zeroes the page, sets zone links, initializes
ref/mapcount, and INIT_LIST_HEAD(&page->lru).
__init_zone_device_page() then sets the ZONE_DEVICE page's pgmap
only after __init_single_page(): mm/mm_init.c:1007-1029 calls
__init_single_page(page, pfn, zone_idx, nid), sets PageReserved,
and later assigns page_folio(page)->pgmap = pgmap. The fault
helper's sequence is unsafe in that window:
drivers/hv/mshv_vtl_main.c:3710-3718 checks pfn_valid(), then
is_zone_device_page(), then calls page_pgmap(page) and immediately
dereferences pgmap->type. include/linux/mmzone.h:1209-1213
implements page_pgmap() as return page_folio(page)->pgmap, while
include/linux/mm_types.h:381-394 shows pgmap shares the union with
struct list_head lru; after INIT_LIST_HEAD but before pgmap
assignment, a racing fault can read the list_head value as a
non-NULL pgmap and dereference pgmap->type.
Suggested fix: (see Evidence)
Fixed
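The union aliasing at the heart of this finding can be reproduced in a few lines of userspace C. The layout below is a deliberately simplified, hypothetical model (`struct model_folio` is not the kernel's `struct folio`); it only keeps the property the finding relies on, namely that the pgmap pointer shares storage with the start of the `lru` list head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_list_head {
	struct model_list_head *next, *prev;
};

/* Simplified model: pgmap overlays the first word of lru. */
struct model_folio {
	union {
		struct model_list_head lru;
		void *pgmap;
	};
};

/* Models the INIT_LIST_HEAD(&page->lru) step of page initialisation. */
void model_init_single_page(struct model_folio *f)
{
	assert(f != NULL);
	f->lru.next = &f->lru;
	f->lru.prev = &f->lru;
}

/* A bare NULL check cannot tell a real pgmap from an initialised list head. */
bool model_pgmap_looks_set(const struct model_folio *f)
{
	return f->pgmap != NULL;
}
```

After `model_init_single_page()` the `pgmap` member reads back as the address of the list head itself, so a reader that treats "non-NULL" as "published" would dereference page metadata as if it were a pgmap.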
[MEDIUM] PMD/PUD faults now install refcounted file-rmapped folios into a non-DAX VM_MIXEDMAP VMA, but the huge zap/split paths classify that VMA as special and skip the matching rmap, RSS, and folio-referenc…
Sources: review-prompts
Evidence: PMD/PUD faults now install refcounted file-rmapped folios into a non-DAX
VM_MIXEDMAP VMA, but the huge zap/split paths classify that VMA
as special and skip the matching rmap, RSS, and folio-reference
teardown.
Impact: Severity: Medium — matches "Resource leak with bounded blast radius (a few
pages, a single file descriptor, one workqueue) on a rare path; not
exploitable as a DoS" because the device is CAP_SYS_ADMIN-gated,
but every successful huge VTL0 mapping can leak the mapping
reference and stale rmap/accounting until the page lifetime ends.
The introduced path is reachable:
drivers/hv/mshv_vtl_main.c:3743-3753 handles PMD_ORDER by resolving
a page, requiring folio_order(folio) == PMD_ORDER, then calling
vmf_insert_folio_pmd(); drivers/hv/mshv_vtl_main.c:3755-3766 does
the same for PUD_ORDER. Those inserters add the resources the
commit relies on: mm/huge_memory.c:1440-1448 builds a normal folio
PMD, calls folio_get(), folio_add_file_rmap_pmd(), and
add_mm_counter(); mm/huge_memory.c:1562-1567 does the analogous PUD
folio_get(), folio_add_file_rmap_pud(), and add_mm_counter(). The
VMA remains a non-DAX VM_MIXEDMAP file VMA: mm/vma.c:2408 sets
vma->vm_file = get_file(map->file),
drivers/hv/mshv_vtl_main.c:3786-3788 assigns mshv_vtl_low_vm_ops
and sets VM_HUGEPAGE | VM_MIXEDMAP | VM_DONTEXPAND, and
include/linux/fs.h:3796-3799 defines vma_is_dax() as
file_is_dax(vma->vm_file) with no DAX setup in this driver.
include/linux/mm.h:4142-4145 makes vma_is_special_huge() true for
vma->vm_file && VM_MIXEDMAP. On teardown,
mm/huge_memory.c:2196-2200 makes zap_huge_pmd() take the
!vma_is_dax(vma) && vma_is_special_huge(vma) branch and unlock
without reaching mm/huge_memory.c:2211-2245, where
folio_remove_rmap_pmd(), add_mm_counter(..., -HPAGE_PMD_NR), and
tlb_remove_page_size() would run. mm/huge_memory.c:2709-2711 makes
zap_huge_pud() take the same special branch and skip
mm/huge_memory.c:2720-2726, where folio_remove_rmap_pud(), RSS
decrement, and tlb_remove_page_size() would run. The split path has
the same classification: mm/huge_memory.c:2851-2860 clears a
non-anonymous PMD and returns immediately for non-DAX special huge
VMAs, skipping mm/huge_memory.c:2869-2878, where the file folio
rmap/ref/accounting cleanup happens.
Suggested fix: (see Evidence)

: Added a comment; this is not practical with the current design of OpenVMM:
/*
 * Note on rmap/RSS accounting for huge VTL0 mappings:
 * vmf_insert_folio_{pmd,pud}() takes a folio reference, adds a file rmap,
 * and bumps mm RSS, but the matching teardown is skipped at zap/split time
 * because vma_is_special_huge() is true (VM_MIXEDMAP) while vma_is_dax() is
 * false (CONFIG_FS_DAX is not set in OHCL). The drift is theoretical for
 * OpenVMM/OpenHCL: VTL0 memory is mapped once per partition and held for
 * its lifetime - there is no map/unmap cycling, no partial munmap, and the
 * driver is not unloaded. Stale refs land on ZONE_DEVICE folios whose
 * pgmap is intentionally never released, no real bytes are leaked, and the
 * mm's inflated RSS is discarded with the mm at process exit.
 */
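The accounting drift discussed in this finding and in the reply can be followed with a small userspace model. Everything here is hypothetical (`struct model_vma`, `struct model_folio`, the `model_*` helpers), and the special-huge predicate is reduced to the two conditions quoted in the finding.

```c
#include <assert.h>
#include <stdbool.h>

struct model_vma {
	bool has_file;  /* models vma->vm_file != NULL */
	bool mixedmap;  /* models VM_MIXEDMAP being set */
	bool dax;       /* models vma_is_dax() */
	long rss_pages; /* models the mm RSS counter */
};

struct model_folio {
	int refcount;
	int mapcount;
};

static bool model_is_special_huge(const struct model_vma *vma)
{
	return vma->has_file && vma->mixedmap;
}

/* Models what the folio inserters add: a reference, an rmap entry, RSS. */
void model_insert_huge(struct model_vma *vma, struct model_folio *folio, long nr_pages)
{
	folio->refcount++;
	folio->mapcount++;
	vma->rss_pages += nr_pages;
}

/* Models the zap path: special non-DAX VMAs return before the teardown. */
void model_zap_huge(struct model_vma *vma, struct model_folio *folio, long nr_pages)
{
	if (!vma->dax && model_is_special_huge(vma))
		return;

	assert(folio->mapcount > 0);
	folio->mapcount--;
	folio->refcount--;
	vma->rss_pages -= nr_pages;
}
```

With `dax` false, one insert followed by one zap leaves the extra reference, the mapcount and the RSS in place, which is the drift the comment describes as theoretical for a map-once workload.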

[MEDIUM] The folio insertion path adds file rmap state for ZONE_DEVICE folios without initializing folio->mapping and folio->index to the mshv_vtl_low address_space, so object-based rmap walks cannot find map…
Sources: review-prompts
Evidence: The folio insertion path adds file rmap state for ZONE_DEVICE folios without
initializing folio->mapping and folio->index to the mshv_vtl_low
address_space, so object-based rmap walks cannot find mappings
that were accounted as file-rmapped.
Impact: Severity: Medium — matches "Partial-state hazard: failure path leaves the
device half-torn-down but a subsequent normal operation (remove,
reboot) cleans it up; user-visible symptom only if userspace
touches the half-state" because the fault path creates
file-rmap/mapcount/accounting state, but later rmap-based
operations on the folio have no file mapping/index to walk. The
driver resolves a ZONE_DEVICE page and directly inserts it as a
file folio: drivers/hv/mshv_vtl_main.c:3746-3753 calls
vmf_insert_folio_pmd(), and drivers/hv/mshv_vtl_main.c:3759-3766
calls vmf_insert_folio_pud(), with no assignment to folio->mapping
or folio->index. ZONE_DEVICE initialization sets pgmap metadata but
not the file mapping fields: mm/mm_init.c:1023-1029 assigns
page_folio(page)->pgmap = pgmap and page->zone_device_data = NULL,
while include/linux/mm_types.h:393-399 shows pgmap, mapping, and
index are distinct folio fields. The reference device-dax path does
the missing setup: drivers/dax/device.c:75-99 computes the file
offset and assigns folio->mapping = filp->f_mapping and
folio->index = pgoff + i before drivers/dax/device.c:173-176 calls
vmf_insert_folio_pmd() and drivers/dax/device.c:219-222 calls
vmf_insert_folio_pud(). The consequence is concrete in rmap:
mm/rmap.c:2937-2952 requires folio->mapping and returns immediately
if it is NULL, so the file-rmap state added by
mm/huge_memory.c:1446-1448 and 1565-1567 cannot be found by
object-based rmap walkers.
Suggested fix: (see Evidence)
: This issue will not happen in practice on OpenHCL, but adding the setup keeps parity with the DAX implementation. Adding it now.
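The setup the finding points to in device-dax (assigning a mapping and `index = pgoff + i` before inserting) can be sketched in userspace as follows. This is a hypothetical model, not the code added in the PR: `struct model_folio`, `model_associate_folios()` and `model_rmap_walk_can_start()` are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in keeping only the two fields under discussion. */
struct model_folio {
	const void *mapping; /* models folio->mapping */
	unsigned long index; /* models folio->index */
};

/* Associates 'nr' consecutive folios with a mapping, starting at 'pgoff'. */
void model_associate_folios(struct model_folio *folios, size_t nr,
			    const void *mapping, unsigned long pgoff)
{
	assert(mapping != NULL);
	for (size_t i = 0; i < nr; i++) {
		folios[i].mapping = mapping;
		folios[i].index = pgoff + i;
	}
}

/* Models the early return the finding cites: no mapping, nothing to walk. */
bool model_rmap_walk_can_start(const struct model_folio *folio)
{
	return folio->mapping != NULL;
}
```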
[LOW] missing Fixes:
Sources: llm-analysis
Evidence: missing Fixes: tag
Sources: review-prompts/kernel/missing-fixes-tag.md
Evidence: This is a bug fix: the patch addresses a WARN in
try_grab_folio() and -ENOMEM I/O failure after v6.15 GUP changes
stopped taking pgmap references for ZONE_DEVICE pages while huge
VTL0 mappings still used vmf_insert_pfn_{pmd,pud}() without
holding a folio reference. The commit message identifies the
v6.15 changes as the regression source, but no Fixes: trailer is
present.
Impact: (Phase 3 finding — see Evidence for full reasoning)
Suggested fix: add Fixes: aed877c2b425
Impact: lost attribution / incomplete stable backports
[LOW] [checkpatch:WARNING] line length of 103 exceeds 100 columns
Sources: tools
Evidence: checkpatch.pl (WARNING) on
0001-Drivers-hv-mshv_vtl-use-folio-aware-inserters-for-huge-VTL0-.patch:
WARNING: line length of 103 exceeds 100 columns
Impact: checkpatch warning: style/quality issue surfaced by the kernel's patch linter.
Suggested fix: Address the checkpatch.pl finding in the patch before submission.
[LOW] [checkpatch:WARNING] line length of 108 exceeds 100 columns
Sources: tools
Evidence: checkpatch.pl (WARNING) on
0001-Drivers-hv-mshv_vtl-use-folio-aware-inserters-for-huge-VTL0-.patch:
WARNING: line length of 108 exceeds 100 columns
Impact: checkpatch warning: style/quality issue surfaced by the kernel's patch linter.
Suggested fix: Address the checkpatch.pl finding in the patch before submission.

=== [2/2] Drivers: hv: mshv_vtl: fix GUP into VTL0 mappings on the 4K fault path ===

[HIGH] mshv_vtl_low_mapping stores only the first opened character-device inode's address_space without pinning its inode and without tracking other active inode mappings, so stale fallback PTEs can be miss…
Sources: review-prompts
Evidence: mshv_vtl_low_mapping stores only the first opened character-device inode's
address_space without pinning its inode and without tracking
other active inode mappings, so stale fallback PTEs can be missed
and a later ADD_VTL0_MEMORY can dereference freed inode storage.
Impact: Severity: High - matches the High row 'Use-after-free, NULL deref, or
double-free reachable by any reasonable user-space activity'
because a temporary or alternate character-device inode can be
opened to publish its embedded address_space, then closed/unlinked
and evicted before a later successful ioctl dereferences the raw
global pointer. The driver publishes only the first inode mapping
at drivers/hv/mshv_vtl_main.c:3668-3672 and has no .release hook in
drivers/hv/mshv_vtl_main.c:3817-3821 to clear it or drop a
reference. struct inode contains the address_space pointer and
embedded i_data at include/linux/fs.h:807 and
include/linux/fs.h:883, and do_dentry_open() initializes each file
from the opened inode at fs/open.c:911-913. VMAs are linked into
file->f_mapping at mm/vma.c:1774-1783, while
misc_open()/chrdev_open() pass the actual opened inode to the
device open path at drivers/char/misc.c:160-163 and
fs/char_dev.c:412-415, so alternate device nodes with the same
dev_t can have different address_space trees that the single global
will not zap. On inode lifetime, iput() calls iput_final() when
i_count reaches zero at fs/inode.c:1957-1966, evict() calls
destroy_inode() at fs/inode.c:809-834, and destroy_inode()
schedules the inode memory for freeing at fs/inode.c:389-401. The
ADD_VTL0_MEMORY success path later blindly uses the saved pointer
in unmap_mapping_pages() at drivers/hv/mshv_vtl_main.c:1220-1222.
Suggested fix: (see Evidence)
: This is not practical with the OpenVMM design. OpenVMM is the only userspace of the VTL2 kernel, and such attacks are not possible.
[HIGH] The new order-0 folio path accepts any MEMORY_DEVICE_GENERIC ZONE_DEVICE PFN selected by the mmap offset as if it belonged to mshv_vtl, allowing /dev/mshv_vtl_low to install normal pinnable PTEs for…
Sources: review-prompts
Evidence: The new order-0 folio path accepts any MEMORY_DEVICE_GENERIC ZONE_DEVICE PFN
selected by the mmap offset as if it belonged to mshv_vtl,
allowing /dev/mshv_vtl_low to install normal pinnable PTEs for
foreign device memory.
Impact: Severity: High - matches the High row 'Security boundary violation: missing
capability check on a privileged operation, namespace escape,
missing ns_capable for a write' because the driver bypasses the
owning device's mmap and lifetime policy by treating unrelated
generic dev_pagemap memory as VTL0 memory. The commit claims the
normal path is for an mshv_vtl pgmap, but
mshv_vtl_low_resolve_page() only checks pfn_valid(),
is_zone_device_page(), page_pgmap(page), and pgmap->type ==
MEMORY_DEVICE_GENERIC at drivers/hv/mshv_vtl_main.c:3710-3718. The
mshv registration code creates MEMORY_DEVICE_GENERIC pgmaps at
drivers/hv/mshv_vtl_main.c:1184-1188 but sets no owner and records
no range registry used by resolve_page(). struct dev_pagemap
documents owner as identifying the managing entity and preventing
foreign ZONE_DEVICE access at include/linux/memremap.h:121-123.
Other in-tree code creates MEMORY_DEVICE_GENERIC pgmaps, including
device DAX at drivers/dax/device.c:445-449 and Xen unpopulated
memory at drivers/xen/unpopulated-alloc.c:90-97. The order-0 fault
uses vmf->pgoff as the PFN at drivers/hv/mshv_vtl_main.c:3723 and
maps any non-NULL resolved page with vmf_insert_page_mkwrite() at
drivers/hv/mshv_vtl_main.c:3730-3741; mshv_vtl_low_mmap() performs
no range validation at drivers/hv/mshv_vtl_main.c:3784-3794.
Suggested fix: (see Evidence)
: Pre-existing issue, not practical with the existing OpenVMM design.
[HIGH] The new order-0 vmf_insert_page_mkwrite() path can make PFNs writable for MAP_PRIVATE mappings from an O_RDONLY /dev/mshv_vtl_low file descriptor, violating the read-only fd and private-mapping contr…
Sources: review-prompts
Evidence: The new order-0 vmf_insert_page_mkwrite() path can make PFNs writable for
MAP_PRIVATE mappings from an O_RDONLY /dev/mshv_vtl_low file
descriptor, violating the read-only fd and private-mapping
contract once the PFN resolves to a pgmap-backed page.
Impact: Severity: High - matches the High row 'Security boundary violation: missing
capability check on a privileged operation, namespace escape,
missing ns_capable for a write' because a read-only fd can be
delegated and then used for PROT_WRITE MAP_PRIVATE faults that
write the underlying VTL0/device page instead of creating a private
copy. mmap permission checks require FMODE_WRITE for
MAP_SHARED|PROT_WRITE at mm/mmap.c:445-450, but MAP_PRIVATE only
requires FMODE_READ at mm/mmap.c:466-468; vm_flags already include
the PROT_WRITE-derived VM_WRITE at mm/mmap.c:400-401.
mshv_vtl_low_open() checks CAP_SYS_ADMIN but not filp->f_mode at
drivers/hv/mshv_vtl_main.c:3668-3672, and mshv_vtl_low_mmap() does
not reject private or writable VMAs at
drivers/hv/mshv_vtl_main.c:3784-3794. The new fault path derives
write from FAULT_FLAG_WRITE at drivers/hv/mshv_vtl_main.c:3724 and
calls vmf_insert_page_mkwrite(vmf, page, write) at
drivers/hv/mshv_vtl_main.c:3741. insert_page_into_pte_locked()
makes the PTE dirty and maybe writable when mkwrite is true at
mm/memory.c:2306-2309, and maybe_mkwrite() sets write permission
whenever VM_WRITE is set at include/linux/mm.h:1298-1302.
Suggested fix: (see Evidence)
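The permission gap in this finding can be stated as two predicates. The sketch below is a userspace model: `model_core_mmap_allows()` restates the mmap checks as the finding describes them, and `model_driver_guard_allows()` is one hypothetical guard a driver could add in its mmap handler. Nothing in this thread says the PR adopted such a guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request description. */
struct model_req {
	bool fmode_read;  /* fd opened for reading */
	bool fmode_write; /* fd opened for writing */
	bool shared;      /* MAP_SHARED (otherwise MAP_PRIVATE) */
	bool prot_write;  /* PROT_WRITE requested */
};

/* Shared writable mappings need a writable fd; everything needs a readable fd. */
bool model_core_mmap_allows(const struct model_req *r)
{
	assert(r != NULL);
	if (r->shared && r->prot_write && !r->fmode_write)
		return false;
	return r->fmode_read;
}

/* Hypothetical driver guard: writes to this device always reach the underlying
 * page, so refuse writable private mappings outright. */
bool model_driver_guard_allows(const struct model_req *r)
{
	if (r->prot_write && !r->shared)
		return false;
	return model_core_mmap_allows(r);
}
```

A read-only fd with a writable private mapping passes the first predicate and fails the second, which is exactly the case the finding describes.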
[MEDIUM] A concurrent order-0 /dev/mshv_vtl_low fault can install a stale pte_special mapping after MSHV_ADD_VTL0_MEMORY has already registered and zapped the range, so later GUP can still fail on a freshly r…
Sources: review-prompts
Evidence: A concurrent order-0 /dev/mshv_vtl_low fault can install a stale pte_special
mapping after MSHV_ADD_VTL0_MEMORY has already registered and
zapped the range, so later GUP can still fail on a freshly
registered VTL0 chunk.
Impact: Severity: Medium - matches the Medium row 'Concurrency hazard that requires
unusual scheduling to trigger (very narrow window, requires
specific hardware behavior)' because it requires a fault racing the
registration ioctl, but the failure is concrete: stale pte_special
state remains and vm_normal_page()/GUP can still fail. The success
path registers the pgmap at drivers/hv/mshv_vtl_main.c:1204, then
performs a one-shot zap at drivers/hv/mshv_vtl_main.c:1216-1223.
The order-0 fault path decides once at
drivers/hv/mshv_vtl_main.c:3730-3739:
mshv_vtl_low_resolve_page(pfn) can return NULL, after which the
handler later calls vmf_insert_mixed(vmf->vma, vmf->address, pfn).
unmap_mapping_pages() only walks and zaps currently mapped PTEs
under i_mmap_lock_read at mm/memory.c:4218-4233; it does not
serialize future fault insertion. vmf_insert_mixed() reaches
insert_pfn(), which takes the PTE lock and installs the PTE at
mm/memory.c:2576-2612. Therefore the structurally possible
interleaving is: CPU0 resolves NULL before devm_memremap_pages() is
visible, CPU1 completes memremap and unmap_mapping_pages() while no
PTE exists, then CPU0 resumes and inserts the special PTE after the
only zap.
Suggested fix: (see Evidence)
: So: a real race with a narrow window, observable only as the original WARN re-firing. Not a crash, not a security hole. Same severity as the bug the series already addresses, just at a finer interleaving. I would prefer not to overcomplicate this path.
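The interleaving in this finding can be replayed deterministically with a three-flag model. This is a hypothetical userspace sketch (`struct model_state` and the `model_*` steps are illustrative); each function stands for one step named in the finding, and real locking is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_state {
	bool pgmap_live;  /* registration has completed */
	bool pte_present; /* a PTE is installed for the page */
	bool pte_special; /* the installed PTE is pte_special */
};

/* Fault step 1: decide once whether the PFN is pgmap-backed. */
bool model_fault_resolve(const struct model_state *s)
{
	assert(s != NULL);
	return s->pgmap_live;
}

/* Fault step 2: install a PTE based on the earlier decision. */
void model_fault_insert(struct model_state *s, bool resolved)
{
	s->pte_present = true;
	s->pte_special = !resolved;
}

/* ioctl path: register the pgmap, then zap whatever is mapped right now. */
void model_register_and_zap(struct model_state *s)
{
	s->pgmap_live = true;
	s->pte_present = false;
	s->pte_special = false;
}

/* GUP can only pin a present, non-special PTE in this model. */
bool model_gup_can_pin(const struct model_state *s)
{
	return s->pte_present && !s->pte_special;
}
```

Running resolve, then register-and-zap, then insert leaves a special PTE behind even though the pgmap is live; running the fault entirely after registration does not.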
[LOW] missing Fixes:
Sources: llm-analysis
Evidence: missing Fixes: tag
Sources: review-prompts/kernel/missing-fixes-tag.md
Evidence: This patch fixes a user-visible GUP/O_DIRECT failure in
the existing /dev/mshv_vtl_low 4K fault path caused by
vmf_insert_mixed() installing pte_special PTEs that
pin_user_pages*() cannot pin after MSHV_ADD_VTL0_MEMORY, but it
lacks a Fixes: tag. The affected VTL0 mapping/add_vtl0_mem path
appears to have been introduced by hyperv/mshv_vtl: Add SEV SNP
guest support.
Impact: (Phase 3 finding — see Evidence for full reasoning)
Suggested fix: add Fixes: be32e6590b6f
Impact: lost attribution / incomplete stable backports
[LOW] [checkpatch:WARNING] line length of 107 exceeds 100 columns
Sources: tools
Evidence: checkpatch.pl (WARNING) on
0002-Drivers-hv-mshv_vtl-fix-GUP-into-VTL0-mappings-on-the-4K-fau.patch:
WARNING: line length of 107 exceeds 100 columns
Impact: checkpatch warning: style/quality issue surfaced by the kernel's patch linter.
Suggested fix: Address the checkpatch.pl finding in the patch before submission.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Since v6.15 (aed877c, d3f7922), GUP no longer takes a pgmap reference for ZONE_DEVICE pages and walks huge entries through the unified folio path. With vmf_insert_pfn_{pmd,pud}() the mapping holds no folio reference, so a zap racing with pin_user_pages_fast() can briefly drop the folio refcount to 0 and trigger a WARN in try_grab_folio() with the I/O failing as -ENOMEM.

Switch the PMD/PUD fault paths to vmf_insert_folio_{pmd,pud}(), mirroring drivers/dax/device.c. Each map takes folio_get(); the matching folio_put() in zap keeps the refcount above 0.

Gate the huge inserters on pfn_valid() + ZONE_DEVICE + MEMORY_DEVICE_GENERIC via mshv_vtl_low_resolve_page(); fall back to VM_FAULT_FALLBACK when the folio order does not match PMD_ORDER/PUD_ORDER or the PFN is not yet pgmap-backed, so the core can retry at smaller order.

Add VM_DONTEXPAND to the VMA to block mremap() growth past the pgmap.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>

Extend the folio-aware fault path to the 4K case so GUP into /dev/mshv_vtl_low works after MSHV_ADD_VTL0_MEMORY has registered the range. With the previous vmf_insert_mixed() path the PTE was always pte_special, vm_normal_page() returned NULL during pin_user_pages*(), follow_pfn_pte() returned -EEXIST, and io_uring O_DIRECT surfaced it as "disk io error: io error: File exists (os error 17)" on the first DMA into a freshly-registered VTL0 chunk.

The 4K path now resolves the PFN via mshv_vtl_low_resolve_page(): when backed by an mshv_vtl pgmap the PTE is installed with vmf_insert_page_mkwrite(), giving GUP a normal pinnable page; otherwise it falls back to vmf_insert_mixed() so early CPU accesses (e.g. the VTL2 guest-memory self test reading GPA 0 before any add_vtl0_mem ioctl) still succeed instead of SIGBUSing.

Such fallback PTEs would persist across registration and break later GUP. Capture the cdev's address_space on first open and, on successful MSHV_ADD_VTL0_MEMORY, invalidate the file-offset range via unmap_mapping_range() for both the encrypted (pfn) and decrypted (pfn | DECRYPTED_MASK) aliases that mshv_vtl_low_mmap() exposes. The next access re-faults into the folio path and GUP works.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
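The two aliases mentioned in this commit message can be illustrated with a little bit arithmetic. The sketch below is a userspace model: the value of `MODEL_DECRYPTED_MASK` is a made-up placeholder (the driver's real DECRYPTED_MASK is not shown in this thread), and `struct model_range`, `model_pgoff_to_pfn()` and `model_zap_ranges()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit; NOT the driver's actual DECRYPTED_MASK value. */
#define MODEL_DECRYPTED_MASK (UINT64_C(1) << 40)

struct model_range {
	uint64_t first_pgoff; /* first file page offset to invalidate */
	uint64_t nr_pages;    /* number of pages in the range */
};

/* Fault side: both aliases of a file offset resolve to the same PFN. */
uint64_t model_pgoff_to_pfn(uint64_t pgoff)
{
	return pgoff & ~MODEL_DECRYPTED_MASK;
}

/* Registration side: compute both file-offset ranges that may hold stale
 * fallback PTEs for a newly registered PFN range. */
void model_zap_ranges(uint64_t start_pfn, uint64_t nr_pages, struct model_range out[2])
{
	assert((start_pfn & MODEL_DECRYPTED_MASK) == 0);

	out[0] = (struct model_range){ start_pfn, nr_pages };                        /* encrypted alias */
	out[1] = (struct model_range){ start_pfn | MODEL_DECRYPTED_MASK, nr_pages }; /* decrypted alias */
}
```

Invalidating only the first range would leave any mapping created through the decrypted alias untouched, which is why the commit message calls out both.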

Upgrade kernel used in OpenVMM to 6.18.0.6 release tag. This adds a fix for try_grab_folio warning in VTL2 kernel and associated Hyper-V GuestBVT test failure.

Kernel PRs: microsoft/OHCL-Linux-Kernel#141 microsoft/OHCL-Linux-Kernel#144
Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/62100614
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>

Upgrade kernel used in OpenHCL to 6.18.0.6 release tag. This adds a fix for try_grab_folio warning in VTL2 kernel and associated Hyper-V GuestBVT test failure.

Kernel PRs: microsoft/OHCL-Linux-Kernel#141 microsoft/OHCL-Linux-Kernel#144
Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/62100614
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Co-authored-by: Naman Jain <namjain@linux.microsoft.com>

Copilot AI review requested due to automatic review settings June 3, 2026 06:06

Copilot started reviewing on behalf of namancse June 3, 2026 06:06

Copilot AI reviewed Jun 3, 2026

Comment thread drivers/hv/mshv_vtl_main.c Outdated

namancse force-pushed the user/namjain/6.18-warn-fix-v3 branch from 09c2a53 to dce4fdc Compare June 3, 2026 06:41

namancse requested a review from hargar19 June 4, 2026 04:31

hargar19 reviewed Jun 5, 2026

Comment thread drivers/hv/mshv_vtl_main.c Outdated

hargar19 reviewed Jun 5, 2026

Comment thread drivers/hv/mshv_vtl_main.c Outdated

namancse force-pushed the user/namjain/6.18-warn-fix-v3 branch from dce4fdc to b727d48 Compare June 8, 2026 05:19

Copilot AI review requested due to automatic review settings June 8, 2026 05:19

Copilot started reviewing on behalf of namancse June 8, 2026 05:19

Copilot AI reviewed Jun 8, 2026

Comment thread drivers/hv/mshv_vtl_main.c

Comment thread Microsoft/hcl-x64.config

Comment thread Microsoft/hcl-arm64.config

Naman Jain added 2 commits June 8, 2026 05:31

namancse force-pushed the user/namjain/6.18-warn-fix-v3 branch from b727d48 to c79bbfd Compare June 8, 2026 05:32

hargar19 approved these changes Jun 9, 2026

namancse merged commit ac3cf3e into product/hcl-main/6.18 Jun 9, 2026
11 checks passed

namancse mentioned this pull request Jun 9, 2026

kernel upgrade: Update OpenHCL to use v6.18.0.6 microsoft/openvmm#3699

Merged
