Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Fix detach request callback being treated as manifest request callback #41

Merged
merged 6 commits into from
Sep 11, 2021

Conversation

twelho
Copy link
Contributor

@twelho twelho commented Sep 6, 2021

Hello @devanlai! 👋 First of all thank you for creating dapboot and dap42, these have been invaluable tools in establishing a modern embedded debugging workflow. Over the past days I've put in some work to have dapboot replace a proprietary bootloader solution on a STM32F103-based board, but its flashing system and firmware required some options/flexibility not currently present in dapboot. In order to contribute back to this awesome project I've managed to implement these features, such as per-target backup register and magic value overrides for entering the bootloader and alternate setting support (as described here) for the exposed USB DFU interface. Together with the new board support (it has 256 KiB flash, I've added a high memory option for that as well) this is quite a large changeset, so I'll split it up into small(ish) PRs for easier review. While working on the bootloader I also uncovered and fixed some bugs, with these fixes for example dfu-util no longer throws errors after flashing (which stopped the update mechanism of the new board I was working on).

First one: For the dfu_setup function there is a callback called on_manifest_request, which in the header was called on_detach_request. Subsequently target_manifest_app (which acts as a detach callback, resetting the whole MCU) was passed in as a manifest callback and the board reset itself every time "dfuMANIFEST-SYNC" was supposed to complete. I have fixed the naming discrepancy by assuming that the callback was meant to be on_manifest_request, as that is how it's being used in dfu.c. This prevents the bootloader from detaching prematurely after download and makes it compliant with the DFU 1.1 spec (chapter 7. Manifestation Phase). Additionally many flashing routines do not expect an automatic detach, so I've assumed the DFU spec option to return to idle after download as what was originally intended (an automatic reset can still be achieved by passing the -R flag to dfu-util for example), in other words the bootloader was intended to have the USB_DFU_MANIFEST_TOLERANT attribute since there is no usage of STATE_DFU_MANIFEST_WAIT_RESET in the code.

Confirmed working with dfu-util 0.9 and 0.10 when flashing an STM32F103RC.

This prevents the bootloader from detaching prematurely after download and
makes it compliant with the DFU 1.1 spec. Many flashing routines do not
expect an automatic detach, so I've assumed the DFU spec option to return
to idle after download as what was originally intended (an automatic reset
can still be achieved by passing the -R flag to dfu-util for example), in
other words the bootloader was intended to have the USB_DFU_MANIFEST_TOLERANT
attribute since there is no STATE_DFU_MANIFEST_WAIT_RESET in the code.
@devanlai
Copy link
Owner

devanlai commented Sep 7, 2021

Hi @twelho, I'm glad you've found it useful. Your changes look well thought out, but there are some implications that I need to work through before I would be comfortable merging them.

Some of the internal names / state transitions weren't correct - I apologize for that. It indeed should have been on_manifest_request in the header as well as the body. And as you've worked out it's not consistent for the device to transition to the dfuMANIFEST-SYNC but then automatically reset itself.

Instead, it should have been transitioning to the dfuMANIFEST-WAIT-RESET state. That would then be consistent for the device's self-declared bManifestationTolerant=0 and bWillDetach=1.

Your changes are correct for changing the behavior to await an external reset with bManifestationTolerant=1 and bWillDetach=0 and indeed works with dfu-util for me. It makes sense that there are cases where this is preferable, as it would allow for reading back the firmware or continuing on to interact with other DFU interfaces.

However, when using WebUSB to drive implement the DFU protocol, there are some platforms where the USB reset doesn't work correctly and it's necessary for the device to reset itself, rather than awaiting an external reset signal. For that reason, I think I would need to add a configuration option to select between the old auto-reset behavior and the new manifestation tolerant behavior you're adding here.

…y default

Enabling DFU_WILL_DETACH will set bitManifestationTolerant = 0 and
bitWillDetach = 1, disabling it will set the opposite. If DFU_WILL_DETACH
is set, the target will automatically detach and reset following a successful
reprogramming, and if it's not set, the target will enter the dfuIDLE state.
Both of these behaviors are described in the DFU 1.1 spec. DFU_WILL_DETACH is
enabled by default for WebUSB compatibility, as some platforms will apparently
not instruct the target to reset after flashing.
@twelho
Copy link
Contributor Author

twelho commented Sep 10, 2021

I've now added DFU_WILL_DETACH to toggle between the old and new behavior, with it being enabled by default to mimic the old behavior out of the box. Setting it to zero enables the new non-detaching/resetting dfuMANIFEST-SYNC behavior. Note that for simplicity I set both bitManifestationTolerant = 0 and bitWillDetach = 1 when DFU_WILL_DETACH is enabled, since based on your description it won't work for WebUSB if the device enters dfuMANIFEST-WAIT-RESET without detaching/resetting itself as well, and needing to wait for a host-initiated reset by essentially halting the CPU seems like a rarely used feature anyways.

Could you test if this works as intended with the problematic WebGPU platforms @devanlai?

Always enter dfuMANIFEST-SYNC, but subsequently reset to manifest if the firmware is valid.
Call the target manifest callback if provided, but reset in case the callback is absent or fails to reset the device.
Per the DFU spec, a USB reset should cause the device to either reenumerate as the application or enter an error state if invalid.
Register a USB reset callback to detect and handle USB resets, taking care not to trigger the first time the bootloader is enumerated.
Run the manifest callback if present when manifesting via USB reset.
If the firmware has been written to at least once, treat a detach/USB reset as an implicit manifestation request and run the manifest request callback.
Otherwise treat it as a detach request and simply reset.
@devanlai
Copy link
Owner

Hi @twelho, I've made some adjustments to your changes.

  1. I decided to overhaul the manifestation logic a bit so that it still transitions to the dfuMANIFEST-SYNC state even when DFU_WILL_DETACH is set since that gives it a chance to run through the validation logic and enter the error state if the firmware is obviously not valid (admittedly, the firmware validation logic is rather primitive).
  2. It occurred to me that there wasn't any logic that actually explicitly handled the notion of waiting for a USB reset and then resetting to the application, so I was curious why it worked. It looks like dfu-util is actually triggering the re-enumeration via a DFU detach request, rather than a USB bus reset alone. I've added a USB reset handler that detects USB bus resets and treats them as re-enumeration requests.
  3. A DFU detach request or USB bus reset will trigger the manifest request callback if the download command was used to write to flash. This is a mostly invisible change that only applies if the manifest request callback does something visible like lighting up an LED pattern.

With these changes, the USB bus reset now works correctly to exit the bootloader with DFU_WILL_DETACH=0 with my WebUSB implementation on Linux, though Windows is still not able to generate the USB bus reset in general.

Let me know if you encounter any issues with these changes or have any suggestions, otherwise I'll go ahead and merge these combined changes.

The initial behavior of entering dfuMANIFEST-SYNC was incorrect, sorry about
that. I've now thoroughly read the DFU spec, which actually states that
dfuMANIFEST-SYNC shall be entered *twice* after a download operation if
bitMainfestationTolerant = 1 (i.e. DFU_WILL_DETACH == 0), once after the
download completes, which begins manifestation by transitioning to dfuMANIFEST,
and again after manifestation has completed successfully, and only after that
should the transition to dfuIDLE be made. This means that there should be *two*
DFU_GETSTATUS calls from the host, and indeed dfu-util does exactly this
(verified using Wireshark).

Manifesting on USB reset or on detach is not according to spec. It should only
happen right after a download operation has finished. For this I've overhauled
the dfu_on_manifest_request() function to work as it should with both
DFU_WILL_DETACH == 1 and DFU_WILL_DETACH == 0. Validation should actually be
part of manifestation, and I've now added it as the manifestation callback
(since there is no other manifestation work to be done). The scb_reset_system()
call in the USB reset callback is what you need to perform detach-by-USB-reset,
but to avoid having random reset calls in multiple places it makes much more
sense to call the actual dfu_on_detach_complete() callback which will then
perform the reset. The USB reset callback should also not validate, e.g. a
blank flash with just the bootloader should not trigger a firmware error
status, so I've removed that logic.
@twelho
Copy link
Contributor Author

twelho commented Sep 11, 2021

It did work somewhat reliably, but not correctly and fully according to the spec. Partially my fault as well. I've now spent many hours looking at the USB DFU 1.1 spec, and came to the conclusion that state transitions around manifestation were still incorrect when bitManifestationTolerant = 1. Here are all the changes I made explained:

First, the dfuMANIFEST-SYNC state needs to be entered twice when not detaching after manifestation to be fully spec compliant. Here's what the spec says about the meaning of that state:

Device has received the final block of firmware from the host 
and is waiting for receipt of DFU_GETSTATUS to begin the 
Manifestation phase; or device has completed the 
Manifestation phase and is waiting for receipt of 
DFU_GETSTATUS. (Devices that can enter this state after 
the Manifestation phase set bmAttributes bit 
bitManifestationTolerant to 1.) 

The correct flow should be the following (for both DFU_WILL_DETACH configurations):

  • Download finishes, device enters dfuMANIFEST-SYNC
  • Host issues DFU_GETSTATUS, device transitions to dfuMANIFEST
  • Device performs manifestation, which in this case is solely the application validation
    • validate_application should be the manifest_request_callback
  • If DFU_WILL_DETACH is set, the device technically transitions to dfuMANIFEST-WAIT-RESET, but because bitWillDetach = 1 it should issue a detach, which means calling dfu_on_detach_complete, which then resets the MCU. Done!
  • If DFU_WILL_DETACH is not set, the device should transition to dfuMANIFEST-SYNC again
  • Host issues DFU_GETSTATUS again, device transitions to dfuIDLE
  • Done!

This is what the correct behavior outputs on dfu-util:

Copying data from PC to DFU device
Download        [=========================] 100%        23280 bytes
Download done.
state(7) = dfuMANIFEST, status(0) = No error condition is present
state(2) = dfuIDLE, status(0) = No error condition is present
Done!

Note the transition to dfuMANIFEST where the manifestation (= validation in this case) takes place.


Additionally the USB reset handling you implemented did the trick for WebUSB due to resetting the system like the DFU_DETACH handler does, essentially it's a substitute for that as you said. I've kept the reset behavior (by actually calling dfu_on_detach_complete here, I don't like random MCU reset calls all over the code), but removed the validation and manifestation, since those do not follow the spec. Also manifestation during DFU_DETACH is not according to spec. The only place where manifestation (= validation) should happen is after the download procedure completes.

With these changes I hope that the state transitions now follow the specification. I've tested both DFU_WILL_DETACH configurations with success, but you should check still check it with your WebUSB configuration @devanlai.

@devanlai
Copy link
Owner

Hi @twelho, thanks for the followup work on the manifestation behavior. I agree with your reading of the spec and from my testing it works the way that we currently expect it to. I have some followup changes to add a new hook for the bootloader to do something just before detaching, but since that's a new feature, I won't burden your pull request with more changes.

@devanlai devanlai merged commit cf8eb38 into devanlai:master Sep 11, 2021
@twelho twelho deleted the fix-manifest-reset branch September 13, 2021 14:37
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants