Commit f8a0151

Kiryl Shutsemau authored and akpm00 committed
mm/khugepaged: do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
MADV_COLLAPSE on a file mapping behaves inconsistently depending on
whether a PMD page table is installed or not.

Consider the following example:

	p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
		 MAP_SHARED, fd, 0);
	err = madvise(p, 2UL << 20, MADV_COLLAPSE);

fd is a populated tmpfs file.

The result depends on the address that the kernel returns from mmap().
If it is located in an existing PMD table, the madvise() will succeed.
However, if the table does not exist, it will fail with -EINVAL.

This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
a page table is missing, which causes collapse_pte_mapped_thp() to fail.

SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
tables as needed.

Link: https://lkml.kernel.org/r/v5ivpub6z2n2uyemlnxgbilzs52ep4lrary7lm7o6axxoneb75@yfacfl5rkzeh
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Zach O'Keefe <zokeefe@google.com>
Cc: Barry Song <baohua@kernel.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
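The two-line example in the commit message can be expanded into a small userspace helper. This is only a sketch under stated assumptions: `try_collapse()` and `HPAGE_SIZE` are hypothetical names that do not appear in the commit, the fallback value 25 for `MADV_COLLAPSE` is taken from the Linux UAPI headers for toolchains whose libc headers predate it, and `fd` is expected to refer to a populated tmpfs file as described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* assumed: value from the Linux UAPI headers */
#endif

#define HPAGE_SIZE (2UL << 20)	/* 2 MiB, the length used in the commit message */

/*
 * Hypothetical helper: map the first len bytes of fd as shared memory and
 * ask the kernel to collapse the range into huge pages.
 *
 * Returns 0 on success, otherwise the errno reported by mmap() or madvise().
 */
int try_collapse(int fd, size_t len)
{
	void *p;
	int err = 0;

	/* This sketch only handles whole 2 MiB extents. */
	assert(len != 0 && len % HPAGE_SIZE == 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return errno;

	if (madvise(p, len, MADV_COLLAPSE))
		err = errno;	/* save before munmap() can overwrite errno */

	munmap(p, len);
	return err;
}
```

According to the commit message, on a kernel without this change the helper may return either 0 or `EINVAL` for the same file, depending on whether the address chosen by mmap() falls inside an existing PMD table. The outcome also depends on the system's THP configuration for shmem, which this sketch does not check.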
1 parent f7a741c commit f8a0151

1 file changed

Lines changed: 19 additions & 1 deletion

mm/khugepaged.c

@@ -1460,15 +1460,32 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
 static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
 			pmd_t *pmdp, struct folio *folio, struct page *page)
 {
+	struct mm_struct *mm = vma->vm_mm;
 	struct vm_fault vmf = {
 		.vma = vma,
 		.address = addr,
 		.flags = 0,
-		.pmd = pmdp,
 	};
+	pgd_t *pgdp;
+	p4d_t *p4dp;
+	pud_t *pudp;
 
 	mmap_assert_locked(vma->vm_mm);
 
+	if (!pmdp) {
+		pgdp = pgd_offset(mm, addr);
+		p4dp = p4d_alloc(mm, pgdp, addr);
+		if (!p4dp)
+			return SCAN_FAIL;
+		pudp = pud_alloc(mm, p4dp, addr);
+		if (!pudp)
+			return SCAN_FAIL;
+		pmdp = pmd_alloc(mm, pudp, addr);
+		if (!pmdp)
+			return SCAN_FAIL;
+	}
+
+	vmf.pmd = pmdp;
 	if (do_set_pmd(&vmf, folio, page))
 		return SCAN_FAIL;
 
@@ -1544,6 +1561,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	switch (result) {
 	case SCAN_SUCCEED:
 		break;
+	case SCAN_PMD_NULL:
 	case SCAN_PMD_NONE:
 		/*
 		 * All pte entries have been removed and pmd cleared.
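The allocate-on-demand pattern added to set_huge_pmd() can be illustrated with a toy userspace model. This is not kernel code: every `toy_` name is hypothetical, the two-level table stands in for the real PGD/P4D/PUD/PMD hierarchy, and `calloc()` stands in for `pmd_alloc()`. It only shows why a NULL `pmdp` from the lookup no longer has to be treated as a failure.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_ENTRIES 4

enum toy_result { TOY_FAIL, TOY_SUCCEED };	/* hypothetical, not the kernel enum */

/* Stands in for one PMD page table; a nonzero slot is an installed leaf. */
struct toy_pmd_table {
	int leaf[TOY_ENTRIES];
};

/* Stands in for the upper level that points at PMD tables. */
struct toy_pud {
	struct toy_pmd_table *tables[TOY_ENTRIES];
};

/* Lookup only: returns NULL when the table is missing (the SCAN_PMD_NULL case). */
int *toy_find_pmd(struct toy_pud *pud, unsigned int hi, unsigned int lo)
{
	struct toy_pmd_table *table = pud->tables[hi];

	return table ? &table->leaf[lo] : NULL;
}

/* Like the patched set_huge_pmd(): allocate the missing table, then install. */
enum toy_result toy_set_huge_pmd(struct toy_pud *pud, unsigned int hi,
				 unsigned int lo, int *pmdp)
{
	if (!pmdp) {
		if (!pud->tables[hi]) {
			pud->tables[hi] = calloc(1, sizeof(*pud->tables[hi]));
			if (!pud->tables[hi])
				return TOY_FAIL;
		}
		pmdp = &pud->tables[hi]->leaf[lo];
	}

	assert(*pmdp == 0);	/* the slot must be empty before installing a leaf */
	*pmdp = 1;
	return TOY_SUCCEED;
}
```

In this model, a caller that gets NULL from `toy_find_pmd()` can pass it straight to `toy_set_huge_pmd()`, and a second lookup then finds the installed leaf. That mirrors how the second hunk lets SCAN_PMD_NULL fall through to the SCAN_PMD_NONE handling.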
