I'm having an issue with my Raspberry Pi 5 (8GB) using the official Raspberry Pi M.2 HAT+ and Raspberry Pi SSD, powered by the official 27W USB-C power supply. I followed the full setup instructions carefully, including checking the ribbon cable connection multiple times. Initially, the first SSD provided wasn't detected at all. To verify the setup, I tried the NVMe drive from my laptop in the HAT+, and it was detected immediately, so the hardware setup seemed fine.

I was then sent a replacement official Raspberry Pi SSD, which was detected — but only after I enabled PCIe Gen 3 with dtparam=pciex1_gen=3 in /boot/config.txt. However, Gen 3 was extremely unstable, and after just a few minutes of use the SSD became completely undetectable again.
I've tried the following steps:
Verified connections and reseated the FFC ribbon cable multiple times.
Switched between Gen 1, Gen 2, and Gen 3 PCIe settings in /boot/config.txt.
Disabled power-saving features by appending the following kernel parameters to /boot/cmdline.txt:
pcie_aspm.policy=performance nvme_core.default_ps_max_latency_us=0
Confirmed the firmware and OS are fully up to date (EEPROM + kernel).
Rebooted between changes and re-tested the SSD's presence using lsblk and dmesg.
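For reference, the steps above boil down to the following settings and checks (a sketch of what I ran, not verbatim shell history; note that on recent Raspberry Pi OS releases the boot files live under /boot/firmware/ rather than /boot/):

```shell
# In /boot/config.txt (or /boot/firmware/config.txt on recent releases):
#   dtparam=pciex1
#   dtparam=pciex1_gen=3    # also tried =1 and =2
#
# Appended to the single line in /boot/cmdline.txt (or /boot/firmware/cmdline.txt):
#   pcie_aspm.policy=performance nvme_core.default_ps_max_latency_us=0

# After each reboot, re-check whether the drive is present and sized correctly:
lsblk -d -o NAME,SIZE,MODEL
dmesg | grep -iE 'nvme|pcie'
```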
When the SSD is detected, it often shows as 0B in size, and eventually drops off entirely. dmesg frequently reports errors such as:
controller is down; will reset
Unable to change power state from D3cold to D0
Disabling device after reset failure
Repeated I/O errors and buffer read failures on nvme0n1
[ 0.549867] nvme nvme0: pci function 0001:01:00.0
[ 0.549880] nvme 0001:01:00.0: enabling device (0000 -> 0002)
[ 0.553647] nvme nvme0: D3 entry latency set to 8 seconds
[ 0.567105] nvme nvme0: failed to allocate host memory buffer.
[ 0.574788] nvme nvme0: 4/0/0 default/read/poll queues
[ 3.160857] nvme nvme0: using unchecked data buffer
[ 33.785628] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
[ 33.785638] nvme nvme0: Does your device have a faulty power saving mode enabled?
[ 33.785640] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
[ 33.829722] nvme0n1: I/O Cmd(0x2) @ LBA 500118016, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 33.829735] I/O error, dev nvme0n1, sector 500118016 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 33.881750] nvme 0001:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 33.881763] nvme nvme0: Disabling device after reset failure: -19
[ 33.909649] Buffer I/O error on dev nvme0n1, logical block 31257376, async page read
nvme0n1: rw=0, sector=500118032, nr_sectors = 16 limit=0
[ 33.909667] Buffer I/O error on dev nvme0n1, logical block 31257377, async page read
Additionally, the SSD gets extremely hot, even under Gen 1 speeds.
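In case it's useful, during the brief windows when the drive is visible these commands can confirm the negotiated link speed and the controller temperature (this assumes the nvme-cli package is installed; the PCI address 0001:01:00.0 is the one reported in the dmesg output above):

```shell
# Negotiated PCIe link speed/width for the SSD (compare LnkSta against LnkCap):
sudo lspci -vv -s 0001:01:00.0 | grep -E 'LnkCap:|LnkSta:'

# Controller temperature and error counters from the drive's SMART log:
sudo nvme smart-log /dev/nvme0 | grep -iE 'temperature|critical_warning|media_errors'
```

I haven't been able to capture this output consistently because the drive drops off so quickly.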
I'm now at the point where even the replacement SSD is not being detected at all. Given that a known-good SSD worked in the same setup, I’m wondering if both official SSDs were faulty, or if this points to a broader compatibility or power issue. I’d appreciate help diagnosing whether this is a firmware, power delivery, or hardware fault — and whether this is a known issue with the official SSD or HAT+.
Statistics: Posted by sap144 — Thu May 15, 2025 9:39 pm