@CsBigDataHub

This commit introduces comprehensive Wayland support for text and active window
title capture, running alongside the existing X11 implementation.

A GUI backend system has been implemented to abstract the underlying display
server, with separate backends for X11 and Wayland. The application now detects
the session type via the XDG_SESSION_TYPE environment variable and loads the
appropriate backend.
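As a rough illustration of the detection step described above, here is a minimal sketch. The class bodies and the `select_backend` helper are hypothetical stand-ins, not the PR's actual code, and falling back to X11 for any value other than `wayland` is an assumption.

```python
import os


class X11Backend:
    """Placeholder for the existing X11 implementation (hypothetical body)."""
    name = "x11"


class WaylandBackend:
    """Placeholder for the new Wayland implementation (hypothetical body)."""
    name = "wayland"


def select_backend(environ=None):
    """Pick a GUI backend from XDG_SESSION_TYPE.

    Assumption: any value other than "wayland" (including an unset
    variable) falls back to the X11 backend.
    """
    env = os.environ if environ is None else environ
    session_type = env.get("XDG_SESSION_TYPE", "").strip().lower()
    if session_type == "wayland":
        return WaylandBackend()
    return X11Backend()
```

Passing the environment as a parameter keeps the selection logic testable without touching the real session.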

The new WaylandBackend uses wl-paste (from wl-clipboard) for text capture. For
retrieving the active window title, it sequentially attempts to use:

  • wlrctl for wlroots-based compositors (Sway, Hyprland).
  • kwin5 for KDE Plasma.
  • A D-Bus call to a GNOME Shell extension for GNOME (Mutter).
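The "try each tool in turn" behavior listed above could be sketched roughly as follows. This is not the PR's implementation: `run_tool`, `get_selected_text`, and `first_successful` are hypothetical names, reading the primary selection is an assumption, and the compositor-specific title lookups are left as caller-supplied callables rather than guessed at here.

```python
import shutil
import subprocess


def run_tool(argv, timeout=2.0):
    """Return the stripped stdout of argv, or None if the tool is missing or fails."""
    if shutil.which(argv[0]) is None:
        return None
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def get_selected_text():
    """Read the primary selection via wl-paste (from wl-clipboard).

    Assumption: the selected text is taken from the primary selection.
    """
    return run_tool(["wl-paste", "--primary", "--no-newline"])


def first_successful(strategies):
    """Call each strategy in order and return the first non-empty result."""
    for strategy in strategies:
        try:
            value = strategy()
        except Exception:
            # A failing strategy should not prevent the next one from running.
            continue
        if value:
            return value
    return None
```

A caller would pass one callable per compositor (wlroots, KDE, GNOME) to `first_successful`, so adding another desktop environment only means appending to the list.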

The dbus-next library has been added to requirements.txt to support the D-Bus
integration on GNOME. Documentation has been added to the README.md explaining
the new Wayland support and the necessary system dependencies.

@CsBigDataHub
Author

CsBigDataHub commented Sep 27, 2025

Hello @momokrono @theJayTea

Please consider this an initial draft. I tested it on KDE Wayland (and KDE X11) and it works flawlessly.

Of course, for Wayland I could NOT find Python packages to implement get_selected_text and get_active_window_title, so I leveraged a few system tools, chosen by desktop environment, plus D-Bus. I called this out in the README (a temporary one I created in the Windows_and_Linux directory).

I need help testing this upgrade on various Linux DEs.

Also, I use uv for dependency management, hence the addition of pyproject.toml and uv.lock. These files can be deleted before merging if you are NOT comfortable with them.

Oh, I also ran uv format on the codebase, hence the changes to existing code.

@CsBigDataHub
Author

Once merged, we should be able to close:

#93
#99

@CsBigDataHub
Author

One more thought occurred to me: we might want to use litellm instead of these libraries:

google-generativeai
openai
ollama

This would allow us to support more LLMs easily.
It is quite popular and used in many projects; Aider uses it.

@theJayTea
Owner

Hi @CsBigDataHub! This is a really interesting approach, thanks for your contribution.
I have ongoing exams until 11 days from now, so I'll take a look at this then and we can discuss merging it (looking forward to merging it! I just have to see if I should merge this before or after my own upcoming update changes).

As for LiteLLM, that’s something I'll consider for the future!

@theJayTea
Owner

@momokrono , you might find this interesting :) (PS - how've you been!)

@momokrono
Collaborator

Hi and thank you for the draft, really appreciate the effort you've put into adding Wayland support!

While reviewing the changes, I noticed that a significant portion of the diff comes from stylistic updates, like switching string quotes from ' to ":

logging.debug('Initializing WritingToolApp')

to

logging.debug("Initializing WritingToolApp")

or reformatting single-line expressions into multi-line blocks:

self.providers = [GeminiProvider(self), OpenAICompatibleProvider(self), OllamaProvider(self)]

to

self.providers = [
    GeminiProvider(self),
    OpenAICompatibleProvider(self),
    OllamaProvider(self),
]

These changes aren’t necessarily bad (in fact, they might improve consistency or readability) but because they’re mixed in with the actual functional changes, the PR has ballooned to +2,740 −888 lines. That makes it way harder to review the actual Wayland-related code (which is only around 200 lines). GitHub doesn't even load the full diff by default.

To keep things focused, could you split this into two PRs? One for the style/formatting cleanup (which could be discussed with input from @theJayTea, since I believe they should have the final word about the code style), and another that contains just the Wayland support.

That way, we can review and merge the feature without unrelated changes, and ensure any style changes are intentional and agreed upon separately (perhaps using something like editorconfig? IDK).

Thanks again!

@CsBigDataHub
Author

CsBigDataHub commented Oct 2, 2025

Hello @momokrono ,

Yes, I used uv format to format the files; it reformatted them according to the formatter's default rules. uv format uses ruff in the background.

I am traveling right now and will try to split this PR up in two weeks' time.

Please have a look at the uv project; it has quickly become a standard for Python development in the community.

@theJayTea
Owner

@CsBigDataHub @momokrono thank you for helping with this!

Sorry about the formatting hassle; I'll add a formatter recommendation in the contribution section later so this doesn't happen again.

(I used autopep8, which isn't anywhere near as aggressive about changing every quote lol. I like using single quotes; it kept that.)


@CsBigDataHub , as momokrono suggested, would you be able to copy and paste just the actual Wayland-related code into a new PR? (it's okay if that section has ruff's different formatting!)

That way, the PR can actually be reviewed haha.

…and window management

Windows_and_Linux/WritingToolApp.py (paste_simulation, _handle_wayland_paste): Add robust clipboard backup and restore mechanism
Windows_and_Linux/backends/wayland_backend.py: Implement paste_text method with multi-fallback approach using pyperclip as primary method

Windows_and_Linux/WritingToolApp.py (_handle_wayland_paste, _try_comprehensive_replacement, _store_original_window): Add window management and clipboard synchronization improvements
Windows_and_Linux/backends/wayland_backend.py: Add KDE-specific paste methods for enhanced compatibility
fixed way
…ucture

* Windows_and_Linux/backends/wayland_backend_new.py (func1, func2): Add multi-compositor support for window title detection with fallback mechanisms
* Windows_and_Linux/test_comprehensive_wayland.py: Implement full environment simulation and tool availability checks
* Windows_and_Linux/test_enhanced_wayland.py: Validate ydotool daemon responsiveness and keyboard simulation methods
* Windows_and_Linux/test_final_complete.py: Test all replacement strategies for Wayland input with fallback handling
* Windows_and_Linux/test_final_solution.py: Verify timeout resolution and performance improvements

Windows_and_Linux/WritingToolApp.py (func1, func2): Add security safeguards to ydotool key command execution with length limits and character validation
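For illustration only, the kind of safeguard this commit message describes (a length limit plus character validation before invoking `ydotool key`) might look like the sketch below. The function name, the limit of 64 events, and the assumption that key events use the `keycode:state` form of recent ydotool versions are all hypothetical and not taken from the PR.

```python
import re

# Hypothetical limit; the PR's actual value is not shown in this conversation.
MAX_KEY_EVENTS = 64

# Assumed argument form: "<keycode>:<state>", e.g. "29:1" (press) or "29:0" (release).
_KEY_EVENT = re.compile(r"[0-9]{1,3}:[01]")


def build_ydotool_key_argv(events):
    """Validate key events and return an argv list for subprocess.run.

    Raises ValueError on an empty list, too many events, or any event
    containing characters outside the expected "digits:0|1" form.
    """
    events = list(events)
    if not events:
        raise ValueError("no key events given")
    if len(events) > MAX_KEY_EVENTS:
        raise ValueError("too many key events")
    for event in events:
        if not isinstance(event, str) or _KEY_EVENT.fullmatch(event) is None:
            raise ValueError(f"invalid key event: {event!r}")
    # An argv list (never a shell string) avoids shell interpretation entirely.
    return ["ydotool", "key", *events]
```

For example, `build_ydotool_key_argv(["29:1", "47:1", "47:0", "29:0"])` builds a Ctrl+V sequence using Linux input keycodes 29 (left Ctrl) and 47 (V), while anything resembling shell syntax is rejected before a process is spawned.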
instructions`

This commit adds documentation for installing and configuring the
ydotool service, which is now required for the project. The changes
include:

1. Updated package installation commands across all supported
   distributions to include ydotool
2. Added detailed instructions for setting up the ydotool service
3. Included troubleshooting tips for common permission issues
4. Provided alternative approaches for service configuration

The documentation explains how to properly configure ydotool as a
system service rather than a user service, which was causing issues
in the default installation.