Skip to content

drivers/usb/hub.c: delay hub_hc_release_resources() until address0_mutex is held #5

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
wants to merge 226 commits into
base: bazzite-6.14
Choose a base branch
from

Conversation

ovflowd
Copy link

@ovflowd ovflowd commented May 15, 2025

The call to hub_hc_release_resources() was incorrectly made before acquiring hcd->address0_mutex. Since the reset_device callback can put the device into the default state, this sequence risks violating the intended serialization. Moving the call ensures proper synchronization.

See: https://bugzilla.kernel.org/show_bug.cgi?id=220069
Especially, Comment torvalds#43, torvalds#48, torvalds#51

antheas and others added 30 commits May 14, 2025 21:01
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
…lly"

This reverts commit 891f889.

Revert because of compatibility issues of MS Surface devices and GRUB
with NX. In short, these devices get stuck on boot with NX advertised.
So to not advertise it, add the respective option back in.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Patchset: secureboot
On some Surface 3, the DMI table gets corrupted for unknown reasons
and breaks existing DMI matching used for device-specific quirks.

This commit adds the (broken) DMI data into dmi_system_id tables used
for quirks so that each driver can enable quirks even on the affected
systems.

On affected systems, DMI data will look like this:
    $ grep . /sys/devices/virtual/dmi/id/{bios_vendor,board_name,board_vendor,\
    chassis_vendor,product_name,sys_vendor}
    /sys/devices/virtual/dmi/id/bios_vendor:American Megatrends Inc.
    /sys/devices/virtual/dmi/id/board_name:OEMB
    /sys/devices/virtual/dmi/id/board_vendor:OEMB
    /sys/devices/virtual/dmi/id/chassis_vendor:OEMB
    /sys/devices/virtual/dmi/id/product_name:OEMB
    /sys/devices/virtual/dmi/id/sys_vendor:OEMB

Expected:
    $ grep . /sys/devices/virtual/dmi/id/{bios_vendor,board_name,board_vendor,\
    chassis_vendor,product_name,sys_vendor}
    /sys/devices/virtual/dmi/id/bios_vendor:American Megatrends Inc.
    /sys/devices/virtual/dmi/id/board_name:Surface 3
    /sys/devices/virtual/dmi/id/board_vendor:Microsoft Corporation
    /sys/devices/virtual/dmi/id/chassis_vendor:Microsoft Corporation
    /sys/devices/virtual/dmi/id/product_name:Surface 3
    /sys/devices/virtual/dmi/id/sys_vendor:Microsoft Corporation

Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>
Patchset: surface3-oemb
The most recent firmware of the 88W8897 card reports a hardcoded LTR
value to the system during initialization, probably as an (unsuccessful)
attempt of the developers to fix firmware crashes. This LTR value
prevents most of the Microsoft Surface devices from entering deep
powersaving states (either platform C-State 10 or S0ix state), because
the exit latency of that state would be higher than what the card can
tolerate.

Turns out the card works just the same (including the firmware crashes)
no matter if that hardcoded LTR value is reported or not, so it's kind
of useless and only prevents us from saving power.

To get rid of those hardcoded LTR reports, it's possible to reset the
PCI bridge device after initializing the cards firmware. I'm not exactly
sure why that works, maybe the power management subsystem of the PCH
resets its stored LTR values when doing a function level reset of the
bridge device. Doing the reset once after starting the wifi firmware
works very well, probably because the firmware only reports that LTR
value a single time during firmware startup.

Patchset: mwifiex
Currently, mwifiex fw will crash after suspend on recent kernel series.
On Windows, it seems that the root port of wifi will never enter D3 state
(stay on D0 state). And on Linux, disabling the D3 state for the
bridge fixes fw crashing after suspend.

This commit disables the D3 state of root port on driver initialization
and fixes fw crashing after suspend.

Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>
Patchset: mwifiex
The Marvell 88W8897 combined wifi and bluetooth card (pcie+usb version)
is used in a lot of Microsoft Surface devices, and all those devices
suffer from very low 2.4GHz wifi connection speeds while bluetooth is
enabled. The reason for that is that the default passive scanning
interval for Bluetooth Low Energy devices is quite high in Linux
(interval of 60 msec and scan window of 30 msec, see hci_core.c), and
the Marvell chip is known for its bad bt+wifi coexisting performance.

So decrease that passive scan interval and make the scan window shorter
on this particular device to allow for spending more time transmitting
wifi signals: The new scan interval is 250 msec (0x190 * 0.625 msec) and
the new scan window is 6.25 msec (0xa * 0,625 msec).

This change has a very large impact on the 2.4GHz wifi speeds and gets
it up to performance comparable with the Windows driver, which seems to
apply a similar quirk.

The interval and window length were tested and found to work very well
with a lot of Bluetooth Low Energy devices, including the Surface Pen, a
Bluetooth Speaker and two modern Bluetooth headphones. All devices were
discovered immediately after turning them on. Even lower values were
also tested, but they introduced longer delays until devices get
discovered.

Patchset: mwifiex
Some Surface devices, specifically the Surface Go and AMD version of the
Surface Laptop 3 (wich both come with QCA6174 WiFi chips), work better
with a different board file, as it seems that the firmeware included
upstream is buggy.

As it is generally not a good idea to randomly overwrite files, let
alone doing so via packages, we add module parameters to override those
file names in the driver. This allows us to package/deploy the override
via a modprobe.d config.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Patchset: ath10k
Signed-off-by: Dorian Stoll <dorian.stoll@tmsp.io>
Patchset: ipts
Adds a quirk so that IOMMU uses passthrough mode for the IPTS device.
Otherwise, when IOMMU is enabled, IPTS produces DMAR errors like:

DMAR: [DMA Read NO_PASID] Request device [00:16.4] fault addr
0x104ea3000 [fault reason 0x06] PTE Read access is not set

This is very similar to the bug described at:
https://bugs.launchpad.net/bugs/1958004

Fixed with the following patch which this patch basically copies:
https://launchpadlibrarian.net/586396847/43255ca.diff

Signed-off-by: Dorian Stoll <dorian.stoll@tmsp.io>
Patchset: ipts
Based on linux-surface/intel-precise-touch@8abe268

Signed-off-by: Dorian Stoll <dorian.stoll@tmsp.io>
Patchset: ipts
Signed-off-by: Dorian Stoll <dorian.stoll@tmsp.io>
Patchset: ithc
Based on quo/ithc-linux@34539af4726d.

Signed-off-by: Maximilian Stoll <luzmaximilian@gmail.com>
Patchset: ithc
Microsoft Surface Pro 4 and Book 1 devices access the MSHW0030 I2C
device via a generic serial bus operation region and RawBytes read
access. On the Surface Book 1, this access is required to turn on (and
off) the discrete GPU.

Multiple things are to note here:

a) The RawBytes access is device/driver dependent. The ACPI
   specification states:

   > Raw accesses assume that the writer has knowledge of the bus that
   > the access is made over and the device that is being accessed. The
   > protocol may only ensure that the buffer is transmitted to the
   > appropriate driver, but the driver must be able to interpret the
   > buffer to communicate to a register.

   Thus this implementation may likely not work on other devices
   accessing I2C via the RawBytes accessor type.

b) The MSHW0030 I2C device is an HID-over-I2C device which seems to
   serve multiple functions:

   1. It is the main access point for the legacy-type Surface Aggregator
      Module (also referred to as SAM-over-HID, as opposed to the newer
      SAM-over-SSH/UART). It has currently not been determined on how
      support for the legacy SAM should be implemented. Likely via a
      custom HID driver.

   2. It seems to serve as the HID device for the Integrated Sensor Hub.
      This might complicate matters with regards to implementing a
      SAM-over-HID driver required by legacy SAM.

In light of this, the simplest approach has been chosen for now.
However, it may make more sense regarding breakage and compatibility to
either provide functionality for replacing or enhancing the default
operation region handler via some additional API functions, or even to
completely blacklist MSHW0030 from the I2C core and provide a custom
driver for it.

Replacing/enhancing the default operation region handler would, however,
either require some sort of secondary driver and access point for it,
from which the new API functions would be called and the new handler
(part) would be installed, or hard-coding them via some sort of
quirk-like interface into the I2C core.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Patchset: surface-sam-over-hid
Add driver exposing the discrete GPU power-switch of the  Microsoft
Surface Book 1 to user-space.

On the Surface Book 1, the dGPU power is controlled via the Surface
System Aggregator Module (SAM). The specific SAM-over-HID command for
this is exposed via ACPI. This module provides a simple driver exposing
the ACPI call via a sysfs parameter to user-space, so that users can
easily power-on/-off the dGPU.

Patchset: surface-sam-over-hid
The power button on the AMD variant of the Surface Laptop uses the
same MSHW0040 device ID as the 5th and later generation of Surface
devices, however they report 0 for their OEM platform revision.  As the
_DSM does not exist on the devices requiring special casing, check for
the existance of the _DSM to determine if soc_button_array should be
loaded.

Fixes: c394159 ("Input: soc_button_array - add support for newer surface devices")
Co-developed-by: Maximilian Luz <luzmaximilian@gmail.com>

Signed-off-by: Sachi King <nakato@nakato.io>
Patchset: surface-button
The AMD variant of the Surface Laptop report 0 for their OEM platform
revision.  The Surface devices that require the surfacepro3_button
driver do not have the _DSM that gets the OEM platform revision.  If the
method does not exist, load surfacepro3_button.

Fixes: 64dd243 ("platform/x86: surfacepro3_button: Fix device check")
Co-developed-by: Maximilian Luz <luzmaximilian@gmail.com>

Signed-off-by: Sachi King <nakato@nakato.io>
Patchset: surface-button
The touchpad on the Type-Cover of the Surface Go 3 is sometimes not
being initialized properly. Apply USB_QUIRK_DELAY_INIT to fix this
issue.

More specifically, the device in question is a fairly standard modern
touchpad with pointer and touchpad input modes. During setup, the device
needs to be switched from pointer- to touchpad-mode (which is done in
hid-multitouch) to fully utilize it as intended. Unfortunately, however,
this seems to occasionally fail silently, leaving the device in
pointer-mode. Applying USB_QUIRK_DELAY_INIT seems to fix this.

Link: linux-surface/linux-surface#1059
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Patchset: surface-typecover
The Type Cover for Microsoft Surface devices supports a special usb
control request to disable or enable the built-in keyboard backlight.
On Windows, this request happens when putting the device into suspend or
resuming it, without it the backlight of the Type Cover will remain
enabled for some time even though the computer is suspended, which looks
weird to the user.

So add support for this special usb control request to hid-multitouch,
which is the driver that's handling the Type Cover.

The reason we have to use a pm_notifier for this instead of the usual
suspend/resume methods is that those won't get called in case the usb
device is already autosuspended.

Also, if the device is autosuspended, we have to briefly autoresume it
in order to send the request. Doing that should be fine, the usb-core
driver does something similar during suspend inside choose_wakeup().

To make sure we don't send that request to every device but only to
devices which support it, add a new quirk
MT_CLS_WIN_8_MS_SURFACE_TYPE_COVER to hid-multitouch. For now this quirk
is only enabled for the usb id of the Surface Pro 2017 Type Cover, which
is where I confirmed that it's working.

Patchset: surface-typecover
The Surface Pro Type Cover has several non standard HID usages in it's
hid report descriptor.
I noticed that, upon folding the typecover back, a vendor specific range
of 4 32 bit integer hid usages is transmitted.
Only the first byte of the message seems to convey reliable information
about the keyboard state.

0x22 => Normal (keys enabled)
0x33 => Folded back (keys disabled)
0x53 => Rotated left/right side up (keys disabled)
0x13 => Cover closed (keys disabled)
0x43 => Folded back and Tablet upside down (keys disabled)
This list may not be exhaustive.

The tablet mode switch will be disabled for a value of 0x22 and enabled
on any other value.

Patchset: surface-typecover
Work around buggy EFI firmware: On some Microsoft Surface devices
(Surface Pro 9 and Surface Laptop 5) the EFI ResetSystem call with
EFI_RESET_SHUTDOWN doesn't function properly. Instead of shutting the
system down, it returns and the system stays on.

It turns out that this only happens after PCI shutdown callbacks ran for
specific devices. Excluding those devices from the shutdown process
makes the ResetSystem call work as expected.

TODO: Maybe we can find a better way or the root cause of this?

Not-Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Patchset: surface-shutdown
Add the lid GPE used by the Surface Pro 9.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Patchset: surface-gpe
… device

The clk and regulator frameworks expect clk/regulator consumer-devices
to have info about the consumed clks/regulators described in the device's
fw_node.

To work around cases where this info is not present in the firmware tables,
which is often the case on x86/ACPI devices, both frameworks allow the
provider-driver to attach info about consumers to the clks/regulators
when registering these.

This causes problems with the probe ordering wrt drivers for consumers
of these clks/regulators. Since the lookups are only registered when the
provider-driver binds, trying to get these clks/regulators before then
results in a -ENOENT error for clks and a dummy regulator for regulators.

One case where we hit this issue is camera sensors such as e.g. the OV8865
sensor found on the Microsoft Surface Go. The sensor uses clks, regulators
and GPIOs provided by a TPS68470 PMIC which is described in an INT3472
ACPI device. There is special platform code handling this and setting
platform_data with the necessary consumer info on the MFD cells
instantiated for the PMIC under: drivers/platform/x86/intel/int3472.

For this to work properly the ov8865 driver must not bind to the I2C-client
for the OV8865 sensor until after the TPS68470 PMIC gpio, regulator and
clk MFD cells have all been fully setup.

The OV8865 on the Microsoft Surface Go is just one example, all X86
devices using the Intel IPU3 camera block found on recent Intel SoCs
have similar issues where there is an INT3472 HID ACPI-device, which
describes the clks and regulators, and the driver for this INT3472 device
must be fully initialized before the sensor driver (any sensor driver)
binds for things to work properly.

On these devices the ACPI nodes describing the sensors all have a _DEP
dependency on the matching INT3472 ACPI device (there is one per sensor).

This allows solving the probe-ordering problem by delaying the enumeration
(instantiation of the I2C-client in the ov8865 example) of ACPI-devices
which have a _DEP dependency on an INT3472 device.

The new acpi_dev_ready_for_enumeration() helper used for this is also
exported because for devices, which have the enumeration_by_parent flag
set, the parent-driver will do its own scan of child ACPI devices and
it will try to enumerate those during its probe(). Code doing this such
as e.g. the i2c-core-acpi.c code must call this new helper to ensure
that it too delays the enumeration until all the _DEP dependencies are
met on devices which have the new honor_deps flag set.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Patchset: cameras
Intel IPU(Image Processing Unit) has its own (IO)MMU hardware,
The IPU driver allocates its own page table that is not mapped
via the DMA, and thus the Intel IOMMU driver blocks access giving
this error: DMAR: DRHD: handling fault status reg 3 DMAR:
[DMA Read] Request device [00:05.0] PASID ffffffff
fault addr 76406000 [fault reason 06] PTE Read access is not set
As IPU is not an external facing device which is not risky, so use
IOMMU passthrough mode for Intel IPUs.

Change-Id: I6dcccdadac308cf42e20a18e1b593381391e3e6b
Depends-On: Iacd67578e8c6a9b9ac73285f52b4081b72fb68a6
Tracked-On: #JIITL8-411
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: zouxiaoh <xiaohong.zou@intel.com>
Signed-off-by: Xu Chongyang <chongyang.xu@intel.com>
Patchset: cameras
The TPS68470 PMIC has an I2C passthrough mode through which I2C traffic
can be forwarded to a device connected to the PMIC as though it were
connected directly to the system bus. Enable this mode when the chip
is initialised.

Signed-off-by: Daniel Scally <djrscally@gmail.com>
Patchset: cameras
ACPI _HID INT347E represents the OmniVision 7251 camera sensor. The
driver for this sensor expects a single pin named "enable", but on
some Microsoft Surface platforms the sensor is assigned a single
GPIO who's type flag is INT3472_GPIO_TYPE_RESET.

Remap the GPIO pin's function from "reset" to "enable". This is done
outside of the existing remap table since it is a more widespread
discrepancy than that method is designed for. Additionally swap the
polarity of the pin to match the driver's expectation.

Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
Patchset: cameras
Update the control ID for the gain control in the ov7251 driver to
V4L2_CID_ANALOGUE_GAIN.

Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
Patchset: cameras
antheas and others added 23 commits May 14, 2025 22:39
By moving the Display On/Off calls outside of the suspend sequence and
introducing a slight delay after Display Off, the ROG Ally controller
functions exactly as it does in Windows.

Therefore, remove the quirk that fixed the controller only when the
mcu_powersave attribute was disabled, while adding a large amount of
delay to the suspend and wake process.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Windows enters the s2idle screen_off state before hibernation, which
turns off e.g., the keyboard light and in some devices pulses the
suspend light. It may also turn off auxiliaries prior to hibernation.
These auxiliaries are 1) not used to maintain the swap of the system
and 2) can misbehave during the wake-ups in hibernation.

One example is the OneXPlayer X1 controller, which will have its
vibration motor stuck on, if it is not first deactivated by entering
the sleep state.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Add a sysfs attribute to allow informing the kernel about the current
standby state, those being: "active", "screen_off", "sleep", and
"resume" (to prepare turning the display on). The final modern
standby state DRIPS is omitted, as that is entered during the kernel
suspend process and userspace will never see it.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Certain OneXPlayer and GPD devices contain a hibernation feature where
the EC wakes up the device and pretends to overheat, causing Windows to
hibernate. This is a three part process, where after the device enters
S2idle through calling any _DSM callback (e.g., Sleep Entry, LPS0
Entry), they begin a counter that measures how much battery is drained.
Once it is at 5%, the EC of the device wakes it up if its in LPS0 by
pressing the powerbutton, and once it is in Modern Standby again, it
triggers the overheat event.

For some reason, this notification is missed in Linux after wake-up.
So call _WAK when exiting s2idle, which sets OSFG to 1 and has the
EC resend overheat notification by triggering Query 0x58.

The three OXP devices have no side effects, they only set OSFG. For
GPD, we do S3, which has also sets some other variables.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Modify msi_wmi_platform_query() to reuse the input buffer for
returning the result of a WMI method call. Using a separate output
buffer to return the result is unnecessary because the WMI interface
requires both buffers to have the same length anyway.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Adds fan curve support for the MSI platform. These devices contain
support for two fans, where they are named CPU and GPU but in the
case of the Claw series just map to left and right fan.

Co-developed-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
MSI uses the WMI interface as a passthrough for writes to the EC
and uses a board name match and a quirk table from userspace on
Windows. Therefore, there is no auto-detection functionality and
we have to fallback to a quirk table.

Introduce it here, prior to starting to add features to it.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
MSI's version of platform profile in Windows is called shift mode.
Introduce it here, and add a profile handler to it.

It has 5 modes: sport, comfort, green, eco, and user.
Confusingly, for the Claw, MSI only uses sport, green, and eco,
where they correspond to performance, balanced, and low-power.
Therefore, comfort is mapped to balanced-performance, and user to
custom.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
…ibutes

Adds PL1, and PL2 support through the firmware attributes interface.
The min and max values are quirked, and the attributes are only defined
if they are set to a non-zero value. These values are meant to be set
in conjunction with shift mode, where shift mode automatically sets
an upper bound on PL1/PL2 (e.g., low-power would be used with 8W).

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
The battery of MSI laptops supports charge threshold. Add support for it.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Currently, the platform driver always exposes 4 fans, since the
underlying WMI interface reads 4 values from the EC. However, most
devices only have two fans. Therefore, at least in the case of the
Claw series, quirk the driver to only show two hwmon fans.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
… unload

MSI software is a bit weird in that even when the manual fan curve is
disabled, the fan speed is still somewhat affected by the curve. So
we have to restore the fan curves on unload and PWM disable, as it
is done in Windows.

Suggested-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Using a crit message breaks seamless boot by making the kernel
wipe the framebuffer. This leads to an unfavorable user experience.

Therefore, lower to err.

Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
May break USB devices after a while if powersave mode is enabled.

This reverts commit 59bfeaf.
This reverts commit 3a9626c.

This breaks S4 because we end up setting the s3/s0ix flags
even when we are entering s4 since prepare is used by both
flows.  The causes both the S3/s0ix and s4 flags to be set
which breaks several checks in the driver which assume they
are mutually exclusive.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
the resource evictions properly in pm prepare based on whether
we are suspending or hibernating.  Drop the eviction as processes
are not frozen at this time, we we can end up getting stuck trying
to evict VRAM while applications continue to submit work which
causes the buffers to get pulled back into VRAM.

v2: Move suspend flags out of pm notifier (Mario)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes: 2965e63 ("drm/amd: Add Suspend/Hibernate notification callback support")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suspend and hibernate notifications are available specifically when
the sequence starts and finishes.  However there are no notifications
during the process when tasks have been frozen.

Introduce two new events `PM_SUSPEND_POST_FREEZE` and
`PM_HIBERNATE_POST_FREEZE` that drivers can subscribe to and take
different actions specifically knowing userspace is frozen.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
commit 2965e63 ("drm/amd: Add Suspend/Hibernate notification
callback support") introduced a VRAM eviction earlier in the PM
sequences when swap was still available for evicting to. This helped
to fix a number of memory pressure related bugs but also exposed a
new one. If a userspace process is actively using the GPU when suspend
starts then a deadlock could occur. For this reason, it was reverted.

Re-introduce it and instead of going off the prepare notifier, use the
PM notifiers that occur after processes have been frozen to do evictions.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes: 2965e63 ("drm/amd: Add Suspend/Hibernate notification callback support")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
…tex is held

The call to hub_hc_release_resources() was incorrectly made before acquiring hcd->address0_mutex. Since the reset_device callback can put the device into the default state, this sequence risks violating the intended serialization. Moving the call ensures proper synchronization.
@ovflowd
Copy link
Author

ovflowd commented May 15, 2025

(This PR is solely with the purpose of verifying this patch;) @antheas is there a way I can build the Bazzite Kernel locally and get GRUB loading it?

I'm unfamiliar if replacing a kernel on a ostree-based system is even safe; Apologies, I'm a newcomer when it comes on building the Kernel. (I could just build the source Kernel with my patch; But I'm unfamiliar if a given flavor of the Kernel is used on my system, based on bazzite-kernel)

@antheas
Copy link
Member

antheas commented May 15, 2025

I don't know how to replace the kernel either. I have a dev ssd. There is a draft sync script that I never managed to finish.

What is this fixing?

@ovflowd
Copy link
Author

ovflowd commented May 15, 2025

I don't know how to replace the kernel either. I have a dev ssd. There is a draft sync script that I never managed to finish.

What is this fixing?

Apparently a bug was introduced on torvalds@2b66ef84d0d2 that is similar to the bug report I initially opened a while ago; The bug freezes the xhci controller. Comments 42, 43 on https://bugzilla.kernel.org/show_bug.cgi?id=220069 explain it a bit more.

@antheas
Copy link
Member

antheas commented May 15, 2025

The easiest way for you to test this is install an arch Linux distro such as endeavoros, then patch and build the stock 6.14 kernel and see if that fixes it.

If it does I will include the fix and it will go into testing

You can build the kernel with make pacman-0kg -j $(ncores) and then install with sudo pacman -U linux-upstream*

@antheas antheas force-pushed the bazzite-6.14 branch 3 times, most recently from eed3fb0 to d4364ec Compare May 20, 2025 18:43
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.