Conversation
|
@benedekkupper Please let me know when I should test a CI build. |
c094435 to
f2f5ac6
Compare
kareltucek
left a comment
There was a problem hiding this comment.
As for initial review: looks great. Thanks for all those refactors - they are spot on. I didn't expect you to go this deep into our side of code.
Are there any areas that should I focus on specifically with review?
I mean, are there any notable changes in logic, or potentially controversial areas? (I have tried to read through all the changes, but it is easy to miss things among minor changes, mass renames, etc....)
|
|
||
| static void addBasicScancode(uint8_t scancode, macro_usb_keyboard_reports_t* reports) | ||
| { | ||
| if (!scancode) { |
There was a problem hiding this comment.
I am wondering if it is safe to remove these...
There was a problem hiding this comment.
It is safe to remove them, because the call chain checks for zero values too. It does mean that zero values are processed longer, so if you want to speed up the error handling, I can restore these checks.
| EventVector_KeyboardLedState = 1 << 10, | ||
| EventVector_UsbMacroCommandWaitingForExecution = 1 << 11, | ||
| EventVector_ProtocolChanged = 1 << 12, | ||
| // EventVector_ProtocolChanged = 1 << 12, // unused |
There was a problem hiding this comment.
Where is the protocol change (nkro vs 6kro) now handled?
There was a problem hiding this comment.
Although, I think it is correct that the usb_report_udpater logic shouldn't be needed these days, as everything is stored in nkro format, so indeed this should be a no issue.
There was a problem hiding this comment.
Now the UHK60 handles this internally, just like the UHK80 already does.
What exactly do you have in mind? What comes to mind given this prompt:
Whether and how the reports should be queued - I assume a queue of length 1 - the switching of active and "inactive" report buffers:
Assume it takes 1ms to construct a report. (Well, I should make some statistics of this.) |
UHK60 crashes somewhere in initUsb. What is the state of testing on your end @benedekkupper . Can you efficiently test and debug with actual UHK60 hardware, or is that up to me? |
|
I have pushed a minor PID fix of west_agent.py. |
| //return; | ||
| } | ||
| std::copy(buffer.begin(), buffer.end(), in_buffer_.data() + sizeof(in_id_)); | ||
| auto result = send_report(in_buffer_); |
There was a problem hiding this comment.
Have you been dropping all errors here all along?
The key areas are the
Now it will return
This is doable, but it needs to split handling USB and BLE report sending, so I'd do it in another round.
Don't design too much around this high latency, there is a new BLE spec version already supported by nRF Connect SDK, that can drastically reduce this (LE Shorter Connection Intervals)
I have debugged the firmware a bit, but not in a production environment, I flashed the firmware without bootloader present. The keys and mouse movement is working in general, I didn't test further so far due to the limited time availability. Do you have a JTAG/SWD debugger attached, or you use some kind of log and trace functionality? Did yo flash the CI build that crashed, or your local build? Make sure you run the usual west command sequence before build: |
I tried to attach to a gdb via an swd, but failed. Can look into it deeper on monday I think. It was a local build. |
|
As for those crashes:
|
Is there any specific concern regarding this? It seems to me that the UsbReportUpdateSemaphore takes care of things quite optimally for now, and so can be checked off. I went through the usb_report_updater and hid/transport.cpp, and I don't see any obvious issues. I am running this firmware on uhk80 and it works like a charm so far. |
|
I have lost the ability to flash uhk80 via Agent with this PR. |
5e777d3 to
e61c346
Compare
What error message are you getting? |
037c4a4 to
a074135
Compare
|
It wasn't able to find the bootloader. |
|
(I mean the Agent.) |
|
Is the USB HID index of the agent command interface still hardcoded in the agent? |
|
No iirc, but I have no hard data for it. |
|
@kareltucek I can reproduce the crash, but I'm unable to debug it, as the crash only happens when the bootloader starts the application, and not the debugger. Can you provide some further information about the crash site? Maybe a wireshark capture of the USB communication up to the point of the crash? |
|
@kareltucek @mondalaci @pcooke9 can you sum up the list of regressions on your side in this PR compared to master? |
|
I am now aware only of this, but given that only dongle is affected, I suspect that it is not a c2usb issue:
I am a bit apprehensive about the hogp freezes that I reported above, as I am not aware of any related fixes, but it looks like Phil is not affected, and the last time I tested, I didn't run into it. |
This appears to be the only issue I'm aware of now. |
I don't recall any freezes here. Was it a hard freeze, requiring a reboot/reset? Can you give more details to help me try to reproduce it? Did you also have a dongle, USB, or more than one BLE device connected when it froze? |
Actually, after some time (10 or so seconds) the board would usually have crashed, so I don't recall having to hard reset it often. But yes, it would require power-cycle. The freeze kicked in right after a hogp host connected. (Either right after pairing, or right after connection if it was already paired.)
Argh. I didn't write that down. 🙈 |
|
Once a host is successfully paired, things seem to work reliably. However, often after a "successful" pairing, the device ends up in a non-functional state, returning some of -16, -104 and -128 error codes. The only solution is to forget the connection on the device and repair again :-(. Just disconnecting such connection after we detect the problem doesn't seem to suffice. This seems like a regression to me. Not sure about its source. One such failed attempt (android device): Steps to reproduce:
|
b66442e to
66431d5
Compare
On Windows after pairing and left connected with cable, the device shows as "UHK 80 Right NUS", which is confusing for the user.
The c2usb error type is just an alias to the C++ standard
Before I was testing with the right half only, not knowing that it doesn't advertise anymore if it doesn't have a connection to the left side yet. Now I could connect over BLE to Windows 11 PC, and haven't found any issues. I will test other OSes on Friday.
On my side there hasn't been any noteworthy changes to BLE, other than the report sending, which now doesn't block but returns an error instead. |
|
Regarding hogp freezes; I tested with both v16.2.0-2ff017c and 16.2.1-acebaf1, but the UHK80 never froze or crashed. I tried to reproduce it for at least thirty minutes or so.🤷🏻♂️ The following is the only unresponsive mouse output that I observed in this testing (only happens occasionally): 16.2.1-acebaf1 After pairing phone, the mouse cursor will temporarily be visible on the phone, but unresponsive. Mouse keys and touchpad output don't work, but other keystrokes and functions are fine. Switching host to USB works as normal, but mouse output remains unresponsive when switching back to phone. Rebooting the UHK80 resolves it. UART adapter output: Even more rare, but similar variation to the above; sometimes the OLED shows the phone (BLE) as the active host, but output is going to the USB host instead. (lost UART output for this one, sorry) |
|
In case I'm missing something; is simply having |
|
Not at the moment. It is possible to tweak log levels using Anyways, this is one variation of what I am seeing - one of them is busy, one of them is connection reset by peer and the last one is key has been revoked. I hope Benedek will be able to shed some light on it, otherwise it is going to be a couple daunting sessions for me in order to trace the origins of those messages. |
When you get into this state, what happens if you switchHost to usb and then again to the BLE? |
I've seen this as well.
I can't remember for sure, but I'll let you know if I see it again. |
Right. That explains why only dongle operation is affected. It is also not a regression. My guess is that the Hid_MouseScrollResolutionsChanged doesn't get called when the instance().ms_enum_.msos2_support() state changes in the above workaround. (Or doesn't get called with the correct arguments.) I am adding a naive update check into the main loop for it... I guess it is not worth to spend more time on it. |
|
I just noticed a couple of issues that are present with FW v16.2.1-3db9354, but not v16.2.0-2ff017c (so something done on April 7th?).
When reaching the Windows login screen, the LEDs turn white (if left-half USB isn't connected, left side LEDs aren't affected) and then OLED shows triangle warning icon. Agent error panel shows:
EDIT: Also happens when clicking the "Sign in" button on the Epic Games website https://store.epicgames.com/en-US (Google Chrome or Microsoft Edge browsers. Doesn't happen with Firefox.) Logs collected when using FW v16.2.1-2cbc849: UART LogAfter closing the Epic Games app or affected internet browser, the UHK80 settles down and displays the following error in Agent: I'm assuming this might be related to #1169. |
|
FW v16.2.1-2cbc849 does appear to resolve #1406 via dongle with Fast Startup enabled. But I can't be sure if it resolves it when resuming from S3 or hibernation states, since I was never able to reproduce it that way. |
|
@pcooke9 these crashes are on me, can you retry them with the new build? |
Ok, FW v16.2.1-6bd914a resolves the error when booting the PC, and the crash loop.😎 The |
This is expected (error log printing without any negative effect), the good news is that zephyr is migrating to a different allocation strategy in its upcoming release, which will eliminate this issue. |
|
I tested with Windows 11, mac and Android, they work fine for me at least for a 5 minute test each. From your logs I could reconstruct the problem of freezing BLE input, I pushed the fix here, feel free to test again. |
|
Thanks! You are the best! Will test on monday. (I am away from my uhks now.) |
major refactoring of UHK60 right USB interface
This PR is work in progress, the following tasks are still ahead:
investigatetry to reproduce uhk80 hogp crashes @kareltucekusb_report_updater@kareltucek @benedekkupper