
[linux-nvidia-6.17-next] Add CXL Type-2 device support and CXL RAS error handling#323

Open
JiandiAnNVIDIA wants to merge 78 commits into NVIDIA:24.04_linux-nvidia-6.17-next from JiandiAnNVIDIA:cxl_2026-02-16

Conversation

JiandiAnNVIDIA commented Feb 17, 2026

Description

This patch series adds comprehensive CXL (Compute Express Link) support to the nvidia-6.17 kernel, including:

  1. CXL Type-2 device support - Enables accelerator devices (like GPUs and SmartNICs) to use CXL for coherent memory access
  2. CXL RAS (Reliability, Availability, Serviceability) error handling - Implements PCIe Port Protocol error handling and logging for CXL devices
  3. Prerequisite CXL driver updates - Cherry-picked commits from Linux v6.18 that are required dependencies

Key Features Added:

  • CXL Type-2 accelerator device registration and memory management
  • CXL region creation by Type-2 drivers
  • DPA (Device Physical Address) allocation interface for accelerators
  • HPA (Host Physical Address) free space enumeration
  • CXL protocol error detection, forwarding, and recovery
  • RAS register mapping for CXL Endpoints and Switch Ports

Justification

CXL Type-2 device support is critical for next-generation NVIDIA accelerators and data center workloads:

  • Enables coherent memory sharing between CPUs and accelerators
  • Supports firmware-provisioned CXL regions for accelerator memory
  • Provides proper error handling and reporting for CXL fabric errors
  • Required for upcoming NVIDIA hardware with CXL capabilities

Source

Patch Breakdown (80 commits total):

Category                          Count  Source
v6.18 CXL driver prerequisites    28     Upstream (cherry-picked from torvalds/linux v6.18)
Terry Bowman's CXL RAS series     25     Upstream (RESEND v13)
Alejandro Lucero's Type-2 series  25     Upstream (v22)
Revert old CXL reset              1      OOT (cleanup)
Config update                     1      OOT (build config)

Lore Links:

Upstream Status:

  • Terry Bowman's patches: Under active review, expected to merge in v6.19
  • Alejandro Lucero's patches: Under active review, expected to merge in v6.19/v6.20
  • v6.18 cherry-picks: Already merged upstream
  • ACPI IORT workaround: NVIDIA-specific, will evaluate upstream submission

Testing

Build Validation:

  • ✅ Built successfully for ARM64 4K page size kernel
  • ✅ Built successfully for ARM64 64K page size kernel

Config Verification:

All CXL configs enabled as expected:

CONFIG_CXL_BUS=y
CONFIG_CXL_PCI=y
CONFIG_CXL_MEM=y
CONFIG_CXL_PORT=y
CONFIG_CXL_REGION=y
CONFIG_CXL_RAS=y
CONFIG_CXL_FEATURES=y
CONFIG_SFC_CXL=y
CONFIG_PCIEAER_CXL=y
CONFIG_CXL_ACPI=m
CONFIG_CXL_PMEM=m
CONFIG_CXL_PMU=m
CONFIG_DEV_DAX_CXL=m
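As an aside, a check along these lines can be scripted; a minimal sketch (illustrative only — the config content below is a stand-in sample, not a real build output):

```shell
# Illustrative only: verify that a generated .config contains the
# expected CXL options. The sample config here is a stand-in.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_CXL_BUS=y
CONFIG_CXL_PCI=y
CONFIG_CXL_ACPI=m
EOF

missing=0
for opt in CONFIG_CXL_BUS=y CONFIG_CXL_PCI=y CONFIG_CXL_ACPI=m; do
  # -x matches the whole line, so CONFIG_CXL_BUS=y won't match =m
  grep -qx "$opt" "$cfg" || { echo "missing: $opt"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all expected CXL configs present"
```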

Runtime Testing:

  • Boot test on ARM64 system
  • CXL device enumeration test
  • CXL region creation test

Notes

  • CONFIG_CXL_BUS and CONFIG_CXL_PCI changed from tristate to bool by the Type-2 patches (intentional design change for built-in CXL support)
  • Kernel config annotations updated in debian.nvidia-6.17/config/annotations to reflect these changes

CONFIG_CXL_RAS note<'New config from Terry Bowman patches'>

CONFIG_CXL_RCH_RAS policy<{'amd64': 'n', 'arm64': 'n'}>
CONFIG_CXL_RCH_RAS note<'New config from Terry Bowman patches'>
Collaborator

The note should explain what the config is doing rather than who/which series introduced it.

#define PCI_DVSEC_CXL_REG_LOCATOR_BIR_MASK __GENMASK(2, 0)
#define PCI_DVSEC_CXL_REG_LOCATOR_BLOCK_ID_MASK __GENMASK(15, 8)
#define PCI_DVSEC_CXL_REG_LOCATOR_BLOCK_OFF_LOW_MASK __GENMASK(31, 16)

Collaborator

nit: Was it picked cleanly? The patch looks the same, but I think this portion was moved. Please mention such changes if there were any.

Collaborator

I think Jiandi followed my suggestion and picked (w/ b4) Terry's series first to 6.18 to establish context and then cherry-picked from there. Both picks (with b4 and with cherry-pick) were clean and this method (with the established context) allowed git to understand that there was additional content on the 6.17 branch after these defines.

Author

@nirmoy We are specifically picking the v13 RESEND of Terry Bowman's patch series, not his latest series. His v13 RESEND series base-commit is 211ddde, which is v6.18-rc2:

jan@jan-dev:~/sb/nv-kernels/NV-Kernels$ git describe --tags 211ddde
v6.18-rc2

The ~10 line offset (1244 -> 1234) came from commits and the state of the file (include/uapi/linux/pci_regs.h) already present in the 6.17.9 kernel that the 6.17-HWE kernel is based on, causing Terry's patch, which is based on v6.18-rc2, to apply at a different offset. The -3 option of git am enables a three-way merge, which handled this context shift automatically.
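The behavior can be reproduced in a throwaway repo; a minimal sketch (file names and contents are made up, not the actual kernel trees):

```shell
# Throwaway demo: apply a patch with 'git am -3' after the target
# branch has shifted the patched hunk to a different line offset.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

seq 1 20 > regs.h                     # stand-in for pci_regs.h
git add regs.h && git commit -qm base
sed -i 's/^15$/15-patched/' regs.h    # the "upstream" change
git commit -qam change
git format-patch -1 -o patches >/dev/null

git checkout -qb target HEAD~1        # branch that lacks the change
printf 'extra-a\nextra-b\n' | cat - regs.h > t && mv t regs.h
git commit -qam drift                 # shifts every line down by two

git am -q -3 patches/0001-*.patch     # three-way merge handles the shift
grep 15-patched regs.h                # the change landed at its new offset
```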

clsotog commented Feb 17, 2026

Apart from Nirmoy's comments, there are some patches that have a version changelog after the Signed-off-by, like these ones:
9b8f882
44ce9ff

Also, this commit has a cherry-picked-from line referencing 1ce7465, but I am not sure if Canonical has access to this git tree.

nvmochs commented Feb 19, 2026

bfbaa38 NVIDIA: VR: SAUCE: [Config] CXL config change for CXL type 2 and CXL RAS support

  • I agree with Nirmoy’s comment about the notes explaining what/why the kconfig is enabled.

  • Also, did you run "fakeroot debian/rules clean && fakeroot debian/rules check_config"? Did you try compiling for x86 as well?


1ce7465 NVIDIA: VR: SAUCE: WAR: acpi: arm64: set ACPI_IORT_MF_CANWBS

  • As discussed on Slack, I don’t think this one is needed.

44ce9ff NVIDIA: VR: SAUCE: cxl/pci: Remove unnecessary CXL Endpoint handling helper functions
9b8f882 NVIDIA: VR: SAUCE: CXL/PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h

  • As Carol mentioned, please remove trailing “Changes” log after your SOB.

432a7b6 Revert "NVIDIA: VR: SAUCE: cxl: add support for cxl reset"

  • Would be good to provide a bit more context in the revert commit message.

Something like: Reverted to allow "NVIDIA: VR: SAUCE: CXL/PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h" to apply cleanly and because this CXL reset method is outdated and will be replaced by the version that is being pursued upstream.

JiandiAnNVIDIA commented Feb 19, 2026

bfbaa38 NVIDIA: VR: SAUCE: [Config] CXL config change for CXL type 2 and CXL RAS support

* Also, did you run "fakeroot debian/rules clean && fakeroot debian/rules check_config"? Did you try compiling for x86 as well?

The check_config target doesn't exist in NV-Kernels debian/rules.

$ fakeroot debian/rules clean && fakeroot debian/rules check_config
...
...
make: *** No rule to make target 'check_config'.  Stop.

Do you mean the --check option of debian/scripts/misc/annotations script?

nvidia@localhost:/home/nvidia/jan/NV-Kernels-2026-02-16/NV-Kernels$ python3 debian/scripts/misc/annotations \
    --file debian.nvidia-6.17/config/annotations \
    --arch arm64 \
    --flavour nvidia-64k \
    --check debian/build/build-nvidia-64k/.config
check-config: loading annotations from debian.nvidia-6.17/config/annotations
check-config: all good

Did not try to compile x86. Will try.

44ce9ff NVIDIA: VR: SAUCE: cxl/pci: Remove unnecessary CXL Endpoint handling helper functions 9b8f882 NVIDIA: VR: SAUCE: CXL/PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h

* As Carol mentioned, please remove trailing “Changes” log after your SOB.

Carol's comment is not asking me to remove the trailing "Changes" log after my Signed-off-by. The following two patches in Terry Bowman's original series from the mailing list have a Changes log after their Signed-off-by, and I don't want to alter that.

https://lore.kernel.org/linux-cxl/20251104170305.4163840-2-terry.bowman@amd.com/
https://lore.kernel.org/linux-cxl/20251104170305.4163840-4-terry.bowman@amd.com/

She's pointing out a third patch from Terry Bowman that has the Changes log after the Signed-off-by lines, but my port of that patch somehow dropped it. She's asking me to add it back.

https://lore.kernel.org/linux-cxl/20251104170305.4163840-3-terry.bowman@amd.com/
vs
my port
560ab08

JiandiAnNVIDIA and others added 21 commits February 20, 2026 01:43
This reverts commit 0e06082.

The CXL reset implementation is being reverted to allow
"NVIDIA: VR: SAUCE: CXL/PCI: Move CXL DVSEC definitions into
uapi/linux/pci_regs.h" to apply cleanly.  The reset functionality will be
replaced by the version currently being pursued upstream.

Signed-off-by: Jiandi An <jan@nvidia.com>
Use the string choice helper function str_plural() to simplify the code.

Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250811122519.543554-1-zhao.xichao@vivo.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 22fb4ad)
Signed-off-by: Jiandi An <jan@nvidia.com>
Replace ternary operator with str_enabled_disabled() helper to enhance
code readability and consistency.

[dj: Fix spelling in commit log and subject. ]

Signed-off-by: Nai-Chen Cheng <bleach1827@gmail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250812-cxl-region-string-choices-v1-1-50200b0bc782@gmail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 733c4e9)
Signed-off-by: Jiandi An <jan@nvidia.com>
The root decoder's HPA to SPA translation logic was implemented using
a single function pointer. In preparation for additional per-decoder
callbacks, convert this into a struct cxl_rd_ops and move the
hpa_to_spa pointer into it.

To avoid maintaining a static ops instance populated with mostly NULL
pointers, allocate the ops structure dynamically only when a platform
requires overrides (e.g. XOR interleave decoding).

The setup can be extended as additional callbacks are added.

Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/818530c82c351a9c0d3a204f593068dd2126a5a9.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 524b2b7)
Signed-off-by: Jiandi An <jan@nvidia.com>
When DPA->SPA translation was introduced, it included a helper that
applied the XOR maps to do the CXL HPA -> SPA translation for XOR
region interleaves. In preparation for adding SPA->DPA address
translation, introduce the reverse callback.

The root decoder callback is defined generically and not all usages
may be self inverting like this XOR function. Add another root decoder
callback that is the spa_to_hpa function.

Rename the existing cxl_xor_hpa_to_spa() to cxl_apply_xor_maps(), a name
that reflects what it does without directionality; replace the hpa
parameter with a generic addr; and add code comments stating that the
function supports translation in either direction.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/79d9d72230c599cae94d7221781ead6392ae6d3f.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit b83ee96)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add infrastructure to translate System Physical Addresses (SPA) to
Device Physical Addresses (DPA) within CXL regions. This capability
will be used by follow-on patches that add poison inject and clear
operations at the region level.

The SPA-to-DPA translation process follows these steps:
1. Apply root decoder transformations (SPA to HPA) if configured.
2. Extract the position in region interleave from the HPA offset.
3. Extract the DPA offset from the HPA offset.
4. Use position to find endpoint decoder.
5. Use endpoint decoder to find memdev and calculate DPA from offset.
6. Return the result - a memdev and a DPA.

It is Step 1 above that makes this a driver level operation and not
work we can push to user space. Rather than exporting the XOR maps for
root decoders configured with XOR interleave, the driver performs this
complex calculation for the user.

Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
Implementation Note: Device Decode Logic.

These calculations mirror much of the logic introduced earlier in DPA
to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
reverse the spec defined 'Device Decode Logic'.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/422f0e27742c6ca9a11f7cd83e6ba9fa1a8d0c74.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit dc18117)
Signed-off-by: Jiandi An <jan@nvidia.com>
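For illustration, steps 2 and 3 above reduce to modular arithmetic in the simple (non-XOR, power-of-2 interleave) case; a sketch with made-up numbers and illustrative variable names, not the driver's code:

```shell
# Simplified sketch of the CXL 'Device Decode Logic' (steps 2 and 3):
# extract the interleave position and DPA offset from an HPA offset.
# Assumes power-of-2 ways and plain modulo (non-XOR) decode.
ig=256          # interleave granularity in bytes
iw=4            # interleave ways
hpa_off=4660    # HPA offset into the region (0x1234)

# Which endpoint in the interleave set owns this address.
pos=$(( (hpa_off / ig) % iw ))
# Collapse the other ways' chunks to get the offset within that device.
dpa_off=$(( (hpa_off / (ig * iw)) * ig + hpa_off % ig ))

echo "position=$pos dpa_offset=$dpa_off"
```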
The core functions that validate and send inject and clear commands
to the memdev devices require holding both the dpa_rwsem and the
region_rwsem.

In preparation for another caller of these functions that must hold
the locks upon entry, split the work into a locked and unlocked pair.

Consideration was given to moving the locking to both callers,
however, the existing caller is not in the core (mem.c) and cannot
access the locks.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/1d601f586975195733984ca63d1b5789bbe8690f.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 25a0207)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add CXL region debugfs attributes to inject and clear poison based
on an offset into the region. These new interfaces allow users to
operate on poison at the region level without needing to resolve
Device Physical Addresses (DPA) or target individual memdevs.

The implementation uses a new helper, region_offset_to_dpa_result()
that applies decoder interleave logic, including XOR-based address
decoding when applicable. Note that XOR decodes rely on driver
internal xormaps which are not exposed to userspace. So, this support
is not only a simplification of poison operations that could be done
using existing per memdev operations, but also it enables this
functionality for XOR interleaved regions for the first time.

New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
inject_poison and clear_poison. These are only exposed if all memdevs
participating in the region support both inject and clear commands,
ensuring consistent and reliable behavior across multi-device regions.

If tracing is enabled, these operations are logged as cxl_poison
events in /sys/kernel/tracing/trace.

The ABI documentation warns users of the significant risks that
come with using these capabilities.

A CXL Maturity Map update shows this user flow is now supported.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/f3fd8628ab57ea79704fb2d645902cd499c066af.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit c3dd676)
Signed-off-by: Jiandi An <jan@nvidia.com>
…fset()

0day reported warnings of:
drivers/cxl/core/region.c:3664:25: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=]

drivers/cxl/core/region.c:3671:37: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=]

Replace %#llx with %pr to emit resource_size_t arguments.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508160513.NAZ9i9rQ-lkp@intel.com/
Cc: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250818153953.3658952-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit e6a9530)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add clarification to comment for memory hotplug callback ordering as the
current comment does not provide clear language on which callback happens
first.

Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 6512886)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add helper function node_update_perf_attrs() to allow updates of node access
coordinates computed by an external agent such as CXL. The helper allows
updating the coordinates after the attributes have been created by HMAT.

Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit b57fc65)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ough HMAT

The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.

During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0

However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1

The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.

The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than having CXL share
the same level as SLAB_CALLBACK_PRI. The change results in the CXL
callback being executed after the HMAT callback.

With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.

A nodemask is introduced to keep track if a node has been updated and
prevents further updates.

Fixes: 067353a ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable@vger.kernel.org
Tested-by: Marc Herbert <marc.herbert@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 2e454fb)
Signed-off-by: Jiandi An <jan@nvidia.com>
Remove deadcode since CXL no longer calls hmat_update_target_coordinates().

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250829222907.1290912-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit e99ecbc)
Signed-off-by: Jiandi An <jan@nvidia.com>
Fixed the following typo errors

intersparsed ==> interspersed
in Documentation/driver-api/cxl/platform/bios-and-efi.rst

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250818175335.5312-1-rakuram.e96@gmail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit a414408)
Signed-off-by: Jiandi An <jan@nvidia.com>
ACPICA commit 710745713ad3a2543dbfb70e84764f31f0e46bdc

This has been renamed in more recent CXL specs, as
type3 (memory expanders) can also use HDM-DB for
device coherent memory.

Link: acpica/acpica@7107457
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250908160034.86471-1-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit c427290)
Signed-off-by: Jiandi An <jan@nvidia.com>
…olution

Add documentation on how to resolve conflicts between CXL Fixed Memory
Windows, Platform Low Memory Holes, intermediate Switch and Endpoint
Decoders.

[dj]: Fixed inconsistent spacing after '.'
[dj]: Fixed subject line from Alison.
[dj]: Removed '::' before table from Bagas.

Reviewed-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit c5dca38)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add a helper to replace the open-coded detection of the CXL device hierarchy
root, or the host bridge. The helper will be used for delayed downstream
port (dport) creation.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Tested-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 4fde895)
Signed-off-by: Jiandi An <jan@nvidia.com>
Refactor the code in reap_dports() out to provide a helper function that
reaps a single dport. This will be used later in the cleanup path for
allocating a dport. Rename to del_port() and del_dports() to mirror
devm_cxl_add_dport().

[dj] Fixed up subject per Robert

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 8330671)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add a cached copy of the hardware port-id list that is available at init
before all @DPORT objects have been instantiated. Change is in preparation
of delayed dport instantiation.

Reviewed-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 02edab6)
Signed-off-by: Jiandi An <jan@nvidia.com>
Group the decoder setup code in switch and endpoint port probe into a
single function for each to reduce the number of functions to be mocked
in cxl_test. Introduce devm_cxl_switch_port_decoders_setup() and
devm_cxl_endpoint_decoders_setup(). These two functions will be mocked
instead with some functions optimized out since the mock version does
not do anything. Remove devm_cxl_setup_hdm(),
devm_cxl_add_passthrough_decoder(), and devm_cxl_enumerate_decoders() in
cxl_test mock code. In turn, mock_cxl_add_passthrough_decoder() can be
removed since cxl_test does not setup passthrough decoders.
__wrap_cxl_hdm_decode_init() and __wrap_cxl_dvsec_rr_decode() can be
removed as well since they only return 0 when called.

[dj: drop 'struct cxl_port' forward declaration (Robert)]

Suggested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 68d5d97)
Signed-off-by: Jiandi An <jan@nvidia.com>
The current implementation enumerates the dports during the cxl_port
driver probe. Without an endpoint connected, the dport may not be
active during port probe. This scheme may prevent a valid hardware dport
id from being retrieved and MMIO registers from being read when an endpoint
is hot-plugged. Move the dport allocation and setup to after memdev
probe so the endpoint is guaranteed to be connected.

In the original enumeration behavior, there are 3 phases (or 2 if no CXL
switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
ACPI0017.N device. Through that it enumerates downstream ports composed
of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
uses add_host_bridge_uport() to create the ports that enumerate the PCI
RPs as the dports of these ports. Every time a port is created, the port
driver is attached, cxl_switch_porbe_probe() is called and
devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
the dports.

The second phase is if there are any CXL switches. When the pci endpoint
device driver (cxl_pci) calls probe, it will add a mem device and trigger
cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
and attempts to discover and create all the ports that represent CXL switches.
During this phase, a port is created per switch and the attached dports
are also enumerated and probed.

The last phase is creating the endpoint port, which happens for all endpoint
devices.

The new sequence, instead of creating all possible dports at initial
port creation, defers port instantiation until a memdev beneath that
dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
the creation and extension of ports with new dports as memory devices
arrive. As part of this rework, the switch decoder target list is amended
at runtime as dports show up.

While the decoders are allocated during the port driver probe, the
decoders must also be updated, since previously they were set up once
all the dports were set up. Now, every time a dport is set up per endpoint,
the switch target listing needs to be updated with the new dport. A
guard(rwsem_write) is used to update decoder targets. This is similar to
when decoder_populate_target() is called and the decoder programming
must be protected.

Also the port registers are probed the first time when the first dport
shows up. This ensures that the CXL link is established when the port
registers are probed.

[dj] Use ERR_CAST() (Jonathan)

Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
(cherry picked from commit 4f06d81)
Signed-off-by: Jiandi An <jan@nvidia.com>
ktbowman and others added 29 commits February 27, 2026 17:30
The AER driver now forwards CXL protocol errors to the CXL driver via a
kfifo. The CXL driver must consume these work items, initiate protocol
error handling, and ensure RAS mappings remain valid throughout processing.

Implement cxl_proto_err_work_fn() to dequeue work items forwarded by the
AER service driver and begin protocol error processing by calling
cxl_handle_proto_error().

Add a PCI device lock on &pdev->dev within cxl_proto_err_work_fn() to
keep the PCI device structure valid during handling. Locking an Endpoint
will also defer RAS unmapping until the device is unlocked.

For Endpoints, add a lock on CXL memory device cxlds->dev. The CXL memory
device structure holds the RAS register reference needed during error
handling.

Add lock for the parent CXL Port for Root Ports, Downstream Ports, and
Upstream Ports to prevent destruction of structures holding mapped RAS
addresses while they are in use.

Invoke cxl_do_recovery() for uncorrectable errors. Treat this as a stub for
now; implement its functionality in a future patch.

Export pci_clean_device_status() to enable cleanup of AER status following
error handling.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/)
[jan: remove extra pci_dev_get(wd.pdev) in the consumer in cxl_proto_err_work_fn()]
[jan: change to continue instead of return to avoid dropping kfifo entries in cxl_proto_err_work_fn()]
[jan: retrieve cxl_dev_state in if (is_pcie_endpoint()) in cxl_handle_proto_error()]
Signed-off-by: Jiandi An <jan@nvidia.com>
…rs_merge_result()

CXL uncorrectable errors (UCE) will soon be handled separately from the PCI
AER handling. The merge_result() function can be made common to use in both
handling paths.

Rename the PCI subsystem's merge_result() to be pci_ers_merge_result().
Export pci_ers_merge_result() to make available for the CXL and other
drivers to use.

Update pci_ers_merge_result() to support the recently introduced
PCI_ERS_RESULT_PANIC result.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…r recovery

Implement cxl_do_recovery() to handle uncorrectable protocol
errors (UCE), following the design of pcie_do_recovery(). Unlike PCIe,
all CXL UCEs are treated as fatal and trigger a kernel panic to avoid
potential CXL memory corruption.

Add cxl_walk_port(), analogous to pci_walk_bridge(), to traverse the
CXL topology from the error source through downstream CXL ports and
endpoints.

Introduce cxl_report_error_detected(), mirroring PCI's
report_error_detected(), and implement device locking for the affected
subtree. Endpoints require locking the PCI device (pdev->dev) and the
CXL memdev (cxlmd->dev). CXL ports require locking the PCI
device (pdev->dev) and the parent CXL port.

The device locks should be taken early where possible. The initially
reporting device is locked after kfifo dequeue. The remaining devices
are locked as they are iterated in cxl_report_error_detected(), except
for the first device, which has already been locked.

Export pci_aer_clear_fatal_status() for use when a UCE is not present.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/)
[jan: Fix NULL dereference in cxl_proto_err_work_fn() when get_cxl_port() returns NULL]
Signed-off-by: Jiandi An <jan@nvidia.com>
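The traversal and locking rule above can be modeled with a small sketch. Structure and names here are invented for illustration: cxl_walk_port() visits the error source and everything downstream, and cxl_report_error_detected() locks each visited device except the first one, which the caller already locked after the kfifo dequeue.

```c
#include <assert.h>
#include <stddef.h>

/* toy stand-in for a CXL port with downstream children */
struct toy_port {
	struct toy_port *child[4];
	int nchild;
};

static int toy_locks_taken;

static void toy_walk_port(struct toy_port *p, struct toy_port *source)
{
	/* the initially reporting device was locked at dequeue time */
	if (p != source)
		toy_locks_taken++;	/* stand-in for device_lock() */

	/* stand-in for cxl_report_error_detected(p) */

	for (int i = 0; i < p->nchild; i++)
		toy_walk_port(p->child[i], source);
}

static int toy_walk_demo(void)
{
	struct toy_port ep0 = { .nchild = 0 }, ep1 = { .nchild = 0 };
	struct toy_port root = { .child = { &ep0, &ep1 }, .nchild = 2 };

	toy_locks_taken = 0;
	toy_walk_port(&root, &root);
	return toy_locks_taken;	/* both endpoints locked, not the source */
}
```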
…t probe

CXL protocol errors are not enabled for all CXL devices after boot. These
must be enabled in order to process CXL protocol errors.

Introduce cxl_unmask_proto_interrupts() to call pci_aer_unmask_internal_errors().
pci_aer_unmask_internal_errors() expects pdev->aer_cap to be initialized,
but dev->aer_cap is not initialized for CXL Upstream Switch Ports and CXL
Downstream Switch Ports. Initialize dev->aer_cap if necessary. Enable AER
correctable internal errors and uncorrectable internal errors for all CXL
devices.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ing CXL Port cleanup

During CXL device cleanup the CXL PCIe Port device interrupts remain
enabled. This potentially allows unnecessary interrupt processing on
behalf of the CXL errors while the device is destroyed.

Disable CXL protocol errors by setting the CXL devices' AER mask register.

Introduce pci_aer_mask_internal_errors(), similar to pci_aer_unmask_internal_errors().
Add it to the AER service driver so other subsystems can use it.

Introduce cxl_mask_proto_interrupts() to call pci_aer_mask_internal_errors().
Add calls to cxl_mask_proto_interrupts() within CXL Port teardown for CXL
Root Ports, CXL Downstream Switch Ports, CXL Upstream Switch Ports, and CXL
Endpoints. Follow the same "bottom-up" approach used during CXL Port
teardown.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
(backported from https://lore.kernel.org/linux-cxl/20251104170305.4163840-1-terry.bowman@amd.com/)
[jan: add aer_cap check in cxl_mask_proto_interrupts()]
Signed-off-by: Jiandi An <jan@nvidia.com>
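The mask/unmask pair in the two commits above boils down to toggling the internal-error bits in the AER mask registers. The sketch below models only that bit manipulation; the PCI_ERR_UNC_INTN and PCI_ERR_COR_INTERNAL values match the PCIe AER definitions in the kernel's pci_regs.h, while the helper names are toy stand-ins (the real code reads and writes config space through the device's aer_cap offset).

```c
#include <assert.h>
#include <stdint.h>

/* AER mask bits for internal errors, per PCIe spec / linux pci_regs.h */
#define PCI_ERR_UNC_INTN	0x00400000u	/* uncorrectable internal error */
#define PCI_ERR_COR_INTERNAL	0x00004000u	/* correctable internal error */

/* unmask (enable reporting of) internal errors, as at CXL port probe */
static uint32_t toy_aer_unmask_internal(uint32_t uncor_mask)
{
	return uncor_mask & ~PCI_ERR_UNC_INTN;
}

/* mask them again during CXL port teardown */
static uint32_t toy_aer_mask_internal(uint32_t uncor_mask)
{
	return uncor_mask | PCI_ERR_UNC_INTN;
}
```

The correctable-error mask register is handled the same way with PCI_ERR_COR_INTERNAL.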
In preparation for CXL accelerator drivers that have a hard dependency on
CXL capability initialization, arrange for the endpoint probe result to be
conveyed to the caller of devm_cxl_add_memdev().

As it stands cxl_pci does not care about the attach state of the cxl_memdev
because all generic memory expansion functionality can be handled by the
cxl_core. For accelerators, the driver needs to perform driver-specific
initialization if CXL is available, or execute a fallback to PCIe-only
operation.

By moving devm_cxl_add_memdev() to cxl_mem.ko it removes async module
loading as one reason that a memdev may not be attached upon return from
devm_cxl_add_memdev().

The diff is busy as this moves cxl_memdev_alloc() down below the definition
of cxl_memdev_fops and introduces devm_cxl_memdev_add_or_reset() to
preclude needing to export more symbols from the cxl_core.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…attach

Make it so that upon return from devm_cxl_add_endpoint() that cxl_mem_probe() can
assume that the endpoint has had a chance to complete cxl_port_probe().
I.e. cxl_port module loading has completed prior to device registration.

MODULE_SOFTDEP() is not sufficient for this purpose, but a hard link-time
dependency is reliable.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ration

Allow for a driver to pass a routine to be called in cxl_mem_probe()
context. This ability mirrors the semantics of faux_device_create() and
allows the caller to run CXL-topology-attach-dependent logic.

This capability is needed for CXL accelerator device drivers that need to
make decisions about enabling CXL dependent functionality in the device, or
falling back to PCIe-only operation.

The probe callback runs after the port topology is successfully attached
for the given memdev.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
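The callback pattern described above can be sketched as follows. All names here are illustrative stand-ins: the core runs the driver-supplied function in probe context once the memdev's port topology is attached, and a non-zero return lets the driver fall back to PCIe-only operation.

```c
#include <assert.h>
#include <stddef.h>

/* driver-supplied routine run in (toy) cxl_mem_probe() context */
typedef int (*toy_probe_cb)(void *drvdata);

struct toy_memdev {
	toy_probe_cb probe;
	void *drvdata;
	int attached;
};

/* core side: attach the topology, then invoke the driver's callback */
static int toy_mem_probe(struct toy_memdev *md)
{
	md->attached = 1;	/* port topology attach succeeded */
	if (md->probe)
		return md->probe(md->drvdata);
	return 0;
}

/* accel driver side: decide between CXL mode and PCIe-only fallback */
static int toy_accel_cb(void *drvdata)
{
	return *(int *)drvdata ? 0 : -1;
}

static int toy_probe_demo(void)
{
	int cxl_ok = 1;
	struct toy_memdev md = { toy_accel_cb, &cxl_ok, 0 };

	return toy_mem_probe(&md) == 0 && md.attached;
}
```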
Differentiate CXL memory expanders (type 3) from CXL device accelerators
(type 2) with a new function for initializing cxl_dev_state and a macro
for helping accel drivers to embed cxl_dev_state inside a private
struct.

Move structs to include/cxl as the size of the accel driver private
struct embedding cxl_dev_state needs to know the size of this struct.

Use same new initialization with the type3 pci driver.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
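The embed-and-recover pattern the macro enables is the usual container_of() idiom, sketched below with invented names (toy_dev_state and toy_accel_dev stand in for cxl_dev_state and a driver's private struct): the accel driver embeds the core state in its own struct and recovers the outer struct from a pointer to the embedded member.

```c
#include <assert.h>
#include <stddef.h>

/* userspace stand-in for the kernel's container_of() */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_dev_state { int dvsec; };

struct toy_accel_dev {
	int accel_private;
	struct toy_dev_state cxlds;	/* embedded core state */
};

/* recover the driver's private struct from the embedded core state */
static struct toy_accel_dev *to_accel_dev(struct toy_dev_state *cxlds)
{
	return container_of_sketch(cxlds, struct toy_accel_dev, cxlds);
}

static int toy_roundtrip(void)
{
	struct toy_accel_dev ad;

	return to_accel_dev(&ad.cxlds) == &ad;
}
```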
Add CXL initialization based on new CXL API for accel drivers and make
it dependent on kernel CXL configuration.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Inside cxl/core/pci.c there are helpers for CXL PCIe initialization
meanwhile cxl/pci_drv.c implements the functionality for a Type3 device
initialization.

Move helper functions from cxl/core/pci_drv.c to cxl/core/pci.c in order
to be exported and shared with CXL Type2 device initialization.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Export cxl core functions for a Type2 driver being able to discover and
map the device component registers.

Use it in sfc driver cxl initialization.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
memdev state params which end up being used for DPA initialization.

Allow a Type2 driver to initialize DPA simply by giving the size of its
volatile hardware partition.

Move related functions to memdev.

Add sfc driver as the client.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
The current cxl core relies on a CXL_DEVTYPE_CLASSMEM type device when
creating a memdev, leading to problems when obtaining cxl_memdev_state
references from a CXL_DEVTYPE_DEVMEM type.

Modify check for obtaining cxl_memdev_state adding CXL_DEVTYPE_DEVMEM
support.

Make devm_cxl_add_memdev accessible from an accel driver.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Use cxl API for creating a cxl memory device using the type2
cxl_dev_state struct.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…tted decoder

A Type2 device configured by the BIOS can already have its HDM
committed. Add a cxl_get_committed_decoder() function for checking
so after memdev creation. A CXL region should have been created
during memdev initialization, therefore a Type2 driver can ask for
such a region for working with the HPA. If the HDM is not committed,
a Type2 driver will create the region after obtaining proper HPA
and DPA space.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
[jan: Change if (!endpoint) to if (IS_ERR_OR_NULL(endpoint)) in cxl_get_committed_decoder()]
[jan: Fix dangling pointer by removing the put_device(cxled_dev) from cxl_get_committed_decoder()]
[jan: Preserve DVSEC emulation for Type-3 devices that have unprogrammed HDM decoders]
Signed-off-by: Jiandi An <jan@nvidia.com>
A CXL region struct contains the physical address to work with.

Type2 drivers can create a CXL region but have no access to the
related struct as it is defined as private by the kernel CXL core.
Add a function for getting the cxl region range to be used for mapping
such memory range by a Type2 driver.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…ators

Add unregister_region() and cxl_decoder_detach() to the accelerator
driver API for a clean exit.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…mware

Check if device HDM is already committed during firmware/BIOS
initialization.

A CXL region should exist if so after memdev allocation/initialization.
Get HPA from region and map it.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
[jan: The SFC caller must add put_device(&cxled->cxld.dev) in cleanup path]
Signed-off-by: Jiandi An <jan@nvidia.com>
…enumeration

CXL region creation involves allocating capacity from Device Physical
Address (DPA) and assigning it to decode a given Host Physical Address
(HPA). Before determining how much DPA to allocate the amount of available
HPA must be determined. Also, not all HPA is created equal, some HPA
targets RAM, some targets PMEM, some is prepared for device-memory flows
like HDM-D and HDM-DB, and some is HDM-H (host-only).

In order to support Type2 CXL devices, wrap all of those concerns into
an API that retrieves a root decoder (platform CXL window) that fits the
specified constraints and the capacity available for a new region.

Add a complementary function for releasing the reference to such root
decoder.

Based on https://lore.kernel.org/linux-cxl/168592159290.1948938.13522227102445462976.stgit@dwillia2-xfh.jf.intel.com/

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
[jan: Fix HPA free space calculation reporting 2 extra bytes]
Signed-off-by: Jiandi An <jan@nvidia.com>
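The constraint matching described above can be sketched as a scan over platform CXL windows. The flag names, window table, and matching policy below are invented for illustration; the real cxl_get_hpa_freespace() additionally takes a device reference, returns the available capacity, and pins the root decoder until the complementary release function is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* toy constraint flags: what kind of memory a window targets */
enum toy_hpa_flags {
	TOY_HPA_RAM  = 1 << 0,
	TOY_HPA_PMEM = 1 << 1,
};

/* toy stand-in for a root decoder (platform CXL window) */
struct toy_root_decoder {
	uint32_t flags;		/* memory types this window targets */
	uint64_t avail;		/* unallocated HPA capacity in bytes */
};

/* return the first window matching the constraints with enough free HPA */
static struct toy_root_decoder *
toy_get_hpa_freespace(struct toy_root_decoder *wins, size_t n,
		      uint32_t want_flags, uint64_t want_size)
{
	for (size_t i = 0; i < n; i++)
		if ((wins[i].flags & want_flags) == want_flags &&
		    wins[i].avail >= want_size)
			return &wins[i];
	return NULL;
}

static int toy_hpa_demo(void)
{
	struct toy_root_decoder wins[] = {
		{ TOY_HPA_PMEM, 1 << 20 },	/* PMEM window, 1 MiB free */
		{ TOY_HPA_RAM,  1 << 30 },	/* RAM window, 1 GiB free */
	};
	struct toy_root_decoder *rd =
		toy_get_hpa_freespace(wins, 2, TOY_HPA_RAM, 1 << 20);

	return rd == &wins[1];	/* RAM constraint skips the PMEM window */
}
```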
Use cxl api for getting HPA (Host Physical Address) to use from a
CXL root decoder.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Region creation involves finding available DPA (device-physical-address)
capacity to map into HPA (host-physical-address) space.

In order to support CXL Type2 devices, define an API, cxl_request_dpa(),
that tries to allocate the DPA memory the driver requires to operate. The
memory requested should not be bigger than the max available HPA obtained
previously with cxl_get_hpa_freespace().

Based on https://lore.kernel.org/linux-cxl/168592158743.1948938.7622563891193802610.stgit@dwillia2-xfh.jf.intel.com/

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
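The cxl_request_dpa() contract above can be sketched with a toy bump allocator (wholly invented; the real interface works on an endpoint decoder and the device's partition layout): the request must fit both the free HPA previously obtained from cxl_get_hpa_freespace() and the device's remaining DPA capacity.

```c
#include <assert.h>
#include <stdint.h>

/* toy DPA pool standing in for a device's volatile partition */
struct toy_dpa_pool {
	uint64_t size;	/* total partition size */
	uint64_t used;	/* already allocated */
};

/* returns the allocated DPA base, or UINT64_MAX on failure */
static uint64_t toy_request_dpa(struct toy_dpa_pool *p, uint64_t want,
				uint64_t max_hpa_free)
{
	uint64_t base;

	/* must not exceed available HPA nor remaining DPA capacity */
	if (want > max_hpa_free || want > p->size - p->used)
		return UINT64_MAX;
	base = p->used;
	p->used += want;
	return base;
}

static int toy_dpa_demo(void)
{
	struct toy_dpa_pool pool = { .size = 4096, .used = 0 };

	if (toy_request_dpa(&pool, 1024, 2048) != 0)		/* fits, base 0 */
		return 0;
	if (toy_request_dpa(&pool, 4096, 8192) != UINT64_MAX)	/* pool exhausted */
		return 0;
	return 1;
}
```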
Use cxl api for getting DPA (Device Physical Address) to use through an
endpoint decoder.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only.
Support for Type2 implies region type needs to be based on the endpoint
type HDM-D[B] instead.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Region creation based on Type3 devices is triggered from user space
allowing memory combination through interleaving.

In preparation for kernel driven region creation, that is Type2 drivers
triggering region creation backed with its advertised CXL memory, factor
out a common helper from the user-sysfs region setup for interleave ways.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
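The core of such a common helper is validating the interleave-ways value itself. The sketch below checks the value set the CXL specification allows for HDM decoders (powers of two up to 16, plus the 3-way 3/6/12 variants); the helper name is illustrative and the real code also converts the value to its encoded register form.

```c
#include <assert.h>

/* accept only interleave-ways values the CXL spec defines */
static int toy_ways_valid(int ways)
{
	switch (ways) {
	case 1: case 2: case 4: case 8: case 16:
	case 3: case 6: case 12:
		return 1;
	default:
		return 0;
	}
}
```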
Region creation based on Type3 devices is triggered from user space
allowing memory combination through interleaving.

In preparation for kernel driven region creation, that is Type2 drivers
triggering region creation backed with its advertised CXL memory, factor
out a common helper from the user-sysfs region setup for interleave
granularity.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Creating a CXL region requires userspace intervention through the cxl
sysfs files. Type2 support should allow accelerator drivers to create
such cxl region from kernel code.

Add that functionality and integrate it with the current support for
memory expanders.

Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
[jan: Fix hardcoded position 0 for all interleave ways in cxl_create_region()]
Signed-off-by: Jiandi An <jan@nvidia.com>
By definition a type2 cxl device will use the host managed memory for
specific functionality, therefore it should not be available to other
uses.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <daves@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
(backported from https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/)
[jan: Fix pio_write_vi_base not set in CXL PIO path]
[jan: Fix region leak in efx_cxl_init() error path]
Signed-off-by: Jiandi An <jan@nvidia.com>
…RAS support

CONFIG_CXL_BUS:     Changed to bool for CXL Type-2 device support
CONFIG_CXL_PCI:     Changed to bool for CXL Type-2 device support
CONFIG_CXL_MEM:     Changed to y due to CXL_BUS being bool
CONFIG_CXL_PORT:    Changed to y due to CXL_BUS being bool
CONFIG_FWCTL:       Selected by CXL_BUS when bool
CONFIG_CXL_RAS:     CXL RAS error handling support
CONFIG_CXL_RCH_RAS: CXL Restricted CXL Host protocol error handling
CONFIG_SFC_CXL:     Solarflare SFC9100-family CXL Type-2 device support
CONFIG_ACPI_APEI_EINJ: Required for CONFIG_ACPI_APEI_EINJ_CXL
CONFIG_ACPI_APEI_EINJ_CXL: CXL protocol error injection support via APEI EINJ

Signed-off-by: Jiandi An <jan@nvidia.com>