feat(linux): Add Wayland support for text and window title capture #229
This commit introduces comprehensive Wayland support for text and active window title capture, running alongside the existing X11 implementation.

A GUI backend system has been implemented to abstract the underlying display server, with separate backends for X11 and Wayland. The application now detects the session type via the XDG_SESSION_TYPE environment variable and loads the appropriate backend.

The new WaylandBackend uses wl-paste (from wl-clipboard) for text capture. For retrieving the active window title, it sequentially attempts to use:

- wlrctl for wlroots-based compositors (Sway, Hyprland).
- kwin5 for KDE Plasma.
- A D-Bus call to a GNOME Shell extension for GNOME (Mutter).

The dbus-next library has been added to requirements.txt to support the D-Bus integration on GNOME. Documentation has been added to the README.md explaining the new Wayland support and the necessary system dependencies.
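For readers skimming the description, the session detection and the sequential fallback for the window title could look roughly like the sketch below. This is not the PR's actual code: the class bodies, the `detect_session_type`, `load_backend`, and `first_successful` names, and the choice to fall back to X11 for unrecognized sessions are all assumptions made for illustration.

```python
import os


def detect_session_type(env=None):
    """Return 'wayland', 'x11', or 'unknown' based on XDG_SESSION_TYPE."""
    env = os.environ if env is None else env
    value = env.get("XDG_SESSION_TYPE", "").strip().lower()
    return value if value in ("wayland", "x11") else "unknown"


class X11Backend:
    """Placeholder for the existing X11 implementation (hypothetical body)."""
    name = "x11"


class WaylandBackend:
    """Placeholder for the new Wayland implementation (hypothetical body)."""
    name = "wayland"


def load_backend(env=None):
    """Pick a backend for the current session.

    Assumption: anything that is not Wayland falls back to X11.
    """
    if detect_session_type(env) == "wayland":
        return WaylandBackend()
    return X11Backend()


def first_successful(strategies):
    """Try (name, callable) pairs in order; return the first non-empty result.

    A strategy that raises or returns an empty value is skipped, which mirrors
    the "sequentially attempts" wording in the description.
    """
    for name, strategy in strategies:
        try:
            result = strategy()
        except Exception:
            continue
        if result:
            return name, result
    return None, None
```

In this sketch, each compositor-specific lookup (wlrctl, KDE, GNOME D-Bus) would be passed to `first_successful` as one callable, so adding another compositor means adding one more entry to the list.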
Hello @momokrono @theJayTea. Please consider this an initial draft. I tested it on KDE Wayland (and KDE X11) and it works flawlessly. Of course, for Wayland I could NOT find Python packages to I need help with testing this upgrade in various Linux DEs. Also, I use uv for dependency management, hence the addition of Oh, I also ran
One more thought occurred: we might want to use litellm instead of these libraries. This would allow us to support more LLMs easily.
Hi @CsBigDataHub! This is a really interesting approach, thanks for your contribution. As for LiteLLM, that’s something I'll consider for the future!
@momokrono, you might find this interesting :) (PS - how've you been!)
Hi and thank you for the draft, really appreciate the effort you've put into adding Wayland support!

While reviewing the changes, I noticed that a significant portion of the diff comes from stylistic updates, like switching string quotes from

    logging.debug('Initializing WritingToolApp')

to

    logging.debug("Initializing WritingToolApp")

or reformatting single-line expressions into multi-line blocks:

    self.providers = [GeminiProvider(self), OpenAICompatibleProvider(self), OllamaProvider(self)]

to

    self.providers = [
        GeminiProvider(self),
        OpenAICompatibleProvider(self),
        OllamaProvider(self),
    ]

These changes aren’t necessarily bad (in fact, they might improve consistency or readability), but because they’re mixed in with the actual functional changes, the PR has ballooned to +2,740 −888 lines. That makes it way harder to review the actual Wayland-related code (which is only around 200 lines). GitHub doesn't even load the full diff by default.

To keep things focused, could you split this into two PRs? One for the style/formatting cleanup (which could be discussed with input from @theJayTea, since I believe they should have the final word about the code style), and another that contains just the Wayland support. That way, we can review and merge the feature without unrelated changes, and ensure any style changes are intentional and agreed upon separately (perhaps using something like Thanks again!
Hello @momokrono, yes I used I am traveling right now and will try to split this PR up in two weeks' time. Please have a look at
@CsBigDataHub @momokrono thank you for helping with this! Sorry about the formatting hassle; I'll add a formatter recommendation in the contribution section later so this doesn't happen again. (I used autopep8, which isn't anywhere near as aggressive about changing every quote lol. I like using single quotes; it kept that.) @CsBigDataHub, as momokrono suggested, would you be able to copy and paste just the actual Wayland-related code into a new PR? (it's okay if that section has That way, the PR can actually be reviewed haha.
…and window management

* Windows_and_Linux/WritingToolApp.py (paste_simulation, _handle_wayland_paste): Add robust clipboard backup and restore mechanism
* Windows_and_Linux/backends/wayland_backend.py: Implement paste_text method with multi-fallback approach using pyperclip as primary method
* Windows_and_Linux/WritingToolApp.py (_handle_wayland_paste, _try_comprehensive_replacement, _store_original_window): Add window management and clipboard synchronization improvements
* Windows_and_Linux/backends/wayland_backend.py: Add KDE-specific paste methods for enhanced compatibility

fixed way
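The clipboard backup and restore idea mentioned in the commit above can be illustrated with a small context manager. This is only a sketch under stated assumptions: `preserved_clipboard` and its `read_clipboard` / `write_clipboard` parameters are hypothetical names, not functions from the PR. In practice the two callables might be `pyperclip.paste` and `pyperclip.copy`, but they are injected here so the example has no external dependency.

```python
from contextlib import contextmanager


@contextmanager
def preserved_clipboard(read_clipboard, write_clipboard):
    """Save the clipboard on entry and put the saved value back on exit.

    read_clipboard: zero-argument callable returning the current contents.
    write_clipboard: one-argument callable that sets the contents.
    Both are hypothetical hooks supplied by the caller.
    """
    saved = read_clipboard()
    try:
        yield saved
    finally:
        # Restore even if the paste step raised, so the user's own
        # clipboard contents are not lost.
        if saved is not None:
            write_clipboard(saved)
```

A caller would wrap the paste step in `with preserved_clipboard(read, write):`, write the replacement text to the clipboard inside the block, trigger the paste, and rely on the `finally` clause to restore the original contents.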
…ucture

* Windows_and_Linux/backends/wayland_backend_new.py (func1, func2): Add multi-compositor support for window title detection with fallback mechanisms
* Windows_and_Linux/test_comprehensive_wayland.py: Implement full environment simulation and tool availability checks
* Windows_and_Linux/test_enhanced_wayland.py: Validate ydotool daemon responsiveness and keyboard simulation methods
* Windows_and_Linux/test_final_complete.py: Test all replacement strategies for Wayland input with fallback handling
* Windows_and_Linux/test_final_solution.py: Verify timeout resolution and performance improvements
* Windows_and_Linux/WritingToolApp.py (func1, func2): Add security safeguards to ydotool key command execution with length limits and character validation
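As an illustration of what "length limits and character validation" could mean before text is handed to an input-simulation tool, here is a minimal sketch. The `MAX_TEXT_LENGTH` value, the `is_safe_to_type` name, and the exact set of allowed characters are assumptions for this example; the PR's real limits and rules may differ.

```python
# Hypothetical upper bound; the PR's actual limit is not stated in the thread.
MAX_TEXT_LENGTH = 10_000

# Control characters that are still reasonable to type (assumption).
ALLOWED_CONTROL_CHARS = {"\n", "\t"}


def is_safe_to_type(text, max_length=MAX_TEXT_LENGTH):
    """Return True if text is non-empty, within the limit, and printable.

    Rejects NUL bytes, escape sequences, and other non-printable characters
    that should not reach a keyboard-simulation command.
    """
    if not text or len(text) > max_length:
        return False
    return all(ch.isprintable() or ch in ALLOWED_CONTROL_CHARS for ch in text)
```

Independently of this check, passing the text to `subprocess.run` as one element of an argument list (rather than building a shell string) avoids shell interpretation of the user's text.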
instructions`

This commit adds documentation for installing and configuring the ydotool service, which is now required for the project. The changes include:

1. Updated package installation commands across all supported distributions to include ydotool
2. Added detailed instructions for setting up the ydotool service
3. Included troubleshooting tips for common permission issues
4. Provided alternative approaches for service configuration

The documentation explains how to properly configure ydotool as a system service rather than a user service, which was causing issues in the default installation.