[Deepin-Kernel-SIG] [linux 6.6.y] [Upstream] sync fixes for nested translation#1934
Conversation
Reviewer's GuideAdds nested translation support fixes in the Intel IOMMU implementation, including PSI-based IOTLB flushing for parent domains, proper dirty tracking propagation across nested domains, correct lifetime/linkage for nested domains, and various small correctness checks and cleanups in iommufd and PASID handling. Sequence diagram for dirty tracking propagation in nested translationsequenceDiagram
participant Caller as intel_iommu_set_dirty_tracking
participant Parent as dmar_domain_parent
participant Child as dmar_domain_s1
participant DevInfo as device_domain_info
participant IOMMU as intel_iommu
Caller->>Parent: intel_iommu_set_dirty_tracking(domain, enable)
activate Parent
Parent->>Parent: device_set_dirty_tracking(&devices, enable)
Parent-->>Caller: ret
alt Parent.nested_parent && ret == 0
Caller->>Parent: parent_domain_set_dirty_tracking(domain, enable)
activate Parent
loop for each s1_domain in s1_domains
Parent->>Child: device_set_dirty_tracking(&devices, enable)
loop for each info in devices
Child->>IOMMU: intel_pasid_setup_dirty_tracking(iommu, dev, IOMMU_NO_PASID, enable)
end
end
Parent-->>Caller: ret
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
This PR updates Intel VT-d nested translation behavior and IOMMUFD validation to improve correctness around IOTLB/TLB synchronization, dirty tracking, and user-facing HW pagetable allocation constraints.
Changes:
- Extend Intel VT-d nested translation flushing so S2 mapping changes trigger appropriate flushes for nested/parent relationships.
- Refactor and correct dirty-tracking enablement (derive DID from PASID entry; propagate to nested child domains).
- Tighten IOMMUFD ioctl validation and fix reserved IOVA unwind during device replacement.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| include/linux/iommu.h | Reorders NULL/type validation in user-data copy helper and clarifies kdoc wording. |
| drivers/iommu/iommufd/hw_pagetable.c | Adds validation for nested HWPT allocation (parent compatibility / user data constraints). |
| drivers/iommu/iommufd/device.c | Fixes reserved IOVA cleanup to unwind against the newly selected paging HWPT on error. |
| drivers/iommu/intel/pasid.h | Updates dirty-tracking helper prototype. |
| drivers/iommu/intel/pasid.c | Fixes dirty-tracking DID derivation and sets SSADE for nested setup when appropriate. |
| drivers/iommu/intel/nested.c | Updates nested attach/free/alloc to maintain parent tracking list and trigger IOTLB updates. |
| drivers/iommu/intel/iommu.h | Adds tracking/locking for S1 domains under a nested parent; exports domain_update_iotlb(). |
| drivers/iommu/intel/iommu.c | Adds parent flush logic, refactors PSI flush helper, adjusts blocked/nested domain lifecycle and dirty-tracking propagation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (parent->auto_domain || !parent->nest_parent || | ||
| parent->common.domain->owner != ops) | ||
| return ERR_PTR(-EINVAL); |
| static struct iommu_domain blocking_domain = { | ||
| .type = IOMMU_DOMAIN_BLOCKED, | ||
| .ops = &(const struct iommu_domain_ops) { | ||
| .attach_dev = blocking_domain_attach_dev, | ||
| .free = intel_iommu_domain_free | ||
| } | ||
| }; |
| xa_for_each(&s1_domain->iommu_array, i, info) | ||
| __iommu_flush_iotlb_psi(info->iommu, info->did, | ||
| pfn, pages, ih); | ||
|
|
aa74066 to
354e168
Compare
mainline inclusion from mainline-v6.7-rc1 category: bugfix This allows a driver to set a global static to an IDENTITY domain and the core code will automatically use it whenever an IDENTITY domain is requested. By making it always available it means the IDENTITY can be used in error handling paths to force the iommu driver into a known state. Devices implementing global static identity domains should avoid failing their attach_dev ops. To make global static domains simpler allow drivers to omit their free function and update the iommufd selftest. Convert rockchip to use the new mechanism. Tested-by: Steven Price <steven.price@arm.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Conflicts: include/linux/iommu.h (cherry picked from commit df31b29) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.7-rc1 category: bugfix This is used when the iommu driver is taking control of the dma_ops, currently only on S390 and power spapr. It is designed to preserve the original ops->detach_dev() semantic that these S390 was built around. Provide an opaque domain type and a 'default_domain' ops value that allows the driver to trivially force any single domain as the default domain. Update iommufd selftest to use this instead of set_platform_dma_ops Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Conflicts: include/linux/iommu.h (cherry picked from commit 1c68cbc) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.7-rc1 category: bugfix POWER is using the set_platform_dma_ops() callback to hook up its private dma_ops, but this is buired under some indirection and is weirdly happening for a BLOCKED domain as well. For better documentation create a PLATFORM domain to manage the dma_ops, since that is what it is for, and make the BLOCKED domain an alias for it. BLOCKED is required for VFIO. Also removes the leaky allocation of the BLOCKED domain by using a global static. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 2ad56ef) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.7-rc1 category: bugfix The global static should pre-define the type and the NOP free function can be now left as NULL. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/2-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 7b6dd84) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Today the parent domain (s2_domain) is unaware of which DID's are used by and which devices are attached to nested domains (s1_domain) nested on it. This leads to a problem that some operations (flush iotlb/devtlb and enable dirty tracking) on parent domain only apply to DID's and devices directly tracked in the parent domain hence are incomplete. This tracks the nested domains in list in parent domain. With this, operations on parent domain can loop the nested domains and refer to the devices and iommu_array to ensure the operations on parent domain take effect on all the affected devices and iommus. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-2-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 85ce8e1) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Add __iommu_flush_iotlb_psi() to do the psi iotlb flush with a DID input rather than calculating it within the helper. This is useful when flushing cache for parent domain which reuses DIDs of its nested domains. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-3-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 0455d31) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix If a domain is used as the parent in nested translation its mappings might be cached using DID of the nested domain. But the existing code ignores this fact to only invalidate the iotlb entries tagged by the domain's own DID. Loop the s1_domains list, if any, to invalidate all iotlb entries related to the target s2 address range. According to VT-d spec there is no need for software to explicitly flush the affected s1 cache. It's implicitly done by HW when s2 cache is invalidated. Fixes: b41e38e ("iommu/vt-d: Add nested domain allocation") Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-4-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 8219853) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Should call domain_update_iotlb() to update the has_iotlb_device flag of the domain after attaching device to nested domain. Without it, this flag is not set properly and would result in missing device TLB flush. Fixes: 9838f2b ("iommu/vt-d: Set the nested domain to a device") Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-5-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 29e1048) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix ATS-capable devices cache the result of nested translation. This result relies on the mappings in s2 domain (a.k.a. parent). When there are modifications in the s2 domain, the related nested translation caches on the device should be flushed. This includes the devices that are attached to the s1 domain. However, the existing code ignores this fact to only loops its own devices. As there is no easy way to identify the exact set of nested translations affected by the change of s2 domain. So, this just flushes the entire device iotlb on the device. As above, driver loops the s2 domain's s1_domains list and loops the devices list of each s1_domain to flush the entire device iotlb on the devices. Fixes: b41e38e ("iommu/vt-d: Add nested domain allocation") Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-6-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 5e54e86) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
…ing() mainline inclusion from mainline-v6.8-rc6 category: bugfix The only usage of input @Domain is to get the domain id (DID) to flush cache after setting dirty tracking. However, DID can be obtained from the pasid entry. So no need to pass in domain. This can make this helper cleaner when adding the missing dirty tracking for the parent domain, which needs to use the DID of nested domain. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-7-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 56ecaf6) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Add device_set_dirty_tracking() to loop all the devices and set the dirty tracking per the @enable parameter. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20240208082307.15759-8-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 0c7f249) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Setting dirty tracking for a s2 domain requires to loop all the related devices and set the dirty tracking enable bit in the PASID table entry. This includes the devices that are attached to the nested domains of a s2 domain if this s2 domain is used as parent. However, the existing dirty tracking set only loops s2 domain's own devices. It will miss dirty page logs in the parent domain. Now, the parent domain tracks the nested domains, so it can loop the nested domains and the devices attached to the nested domains to ensure dirty tracking on the parent is set completely. Fixes: b41e38e ("iommu/vt-d: Add nested domain allocation") Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240208082307.15759-9-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit f1e1610) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Should set the SSADE (Second Stage Access/Dirty bit Enable) bit of the pasid entry when attaching a device to a nested domain if its parent has already enabled dirty tracking. Fixes: 111bf85 ("iommu/vt-d: Add helper to setup pasid nested translation") Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20240208091414.28133-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> (cherry picked from commit 1f0198f) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
mainline inclusion from mainline-v6.8-rc6 category: bugfix Since the current design doesn't forward the data_type to the driver to check unless there is a data_len/uptr for a driver specific struct we should check and ensure that data_type is 0 if data_len is 0. Otherwise any value is permitted. Fixes: bd529db ("iommufd: Add a nested HW pagetable object") Link: https://lore.kernel.org/r/0-v1-9b1ea6869554+110c60-iommufd_ck_data_type_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> (cherry picked from commit 7adc0c1) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
354e168 to
b0ba61d
Compare
Fixes #1933
Summary by Sourcery
Improve Intel IOMMU nested translation handling, ensuring proper cache flushing and dirty tracking across parent and nested domains, and tighten validation in IOMMUFD user interfaces.
Bug Fixes:
Enhancements:
Documentation: