Summary
I would like to propose improving Element Web/Desktop's screenshot/media send workflow for cases where a user pastes or drops multiple screenshots while explaining a bug, support case, design issue, or technical conversation.
The desired product experience is the full flow, not only a partial composer tweak:
- stage pasted/dropped screenshots in a bounded composer tray before sending;
- use the normal composer text as the shared context/caption for the screenshot set;
- send Matrix-compatible standard media events under the hood;
- render same-batch screenshots as one coherent visual group in the timeline where supported.
I have a working local prototype and can prepare a cleaned PR if this direction is aligned with Element's product/design expectations.
User problem
When a user needs to send several screenshots together, the context is usually shared across the set:
- "Here are the three screens where the issue happens."
- "The first image is before, the second is after, the third is the error."
- "This is the flow I followed; the issue is visible across these screenshots."
Today this can feel fragmented: the screenshots are sent as separate visible items, and the composer offers no staging step where one normal message explains the whole screenshot set.
This is especially noticeable on desktop where screenshots are frequently pasted or drag-dropped from OS screenshot tools.
Proposed experience
Composer
- Pasted/dropped images are staged above the normal composer input as bounded thumbnails.
- The thumbnail tray scrolls or wraps safely for multiple images and large screenshots.
- Each thumbnail has an accessible in-frame remove control.
- The normal composer text remains the single shared context/caption for the set.
- Pressing send submits the staged media set together from the user's perspective.
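To make the staging behavior above concrete, here is a minimal state sketch. It is not taken from the prototype or from Element's codebase; `ComposerState`, `StagedImage`, `stageImages`, `removeStaged`, and `takeSubmission` are hypothetical names used only for illustration.

```typescript
// Hypothetical model of the composer tray; not Element's actual state shape.
interface StagedImage {
    id: string;   // local identifier for the thumbnail
    name: string; // file name shown to the user
}

interface ComposerState {
    text: string;          // the normal composer text (shared caption)
    staged: StagedImage[]; // images waiting in the tray, in paste/drop order
}

interface Submission {
    caption: string;
    images: StagedImage[];
}

// Pasting or dropping appends to the tray and keeps the existing order.
function stageImages(state: ComposerState, added: StagedImage[]): ComposerState {
    return { ...state, staged: [...state.staged, ...added] };
}

// The per-thumbnail remove control drops exactly one staged image.
function removeStaged(state: ComposerState, id: string): ComposerState {
    return { ...state, staged: state.staged.filter((img) => img.id !== id) };
}

// Pressing send takes the caption and all staged images together,
// then resets the composer. Returns null when nothing is staged.
function takeSubmission(state: ComposerState): { submission: Submission | null; next: ComposerState } {
    if (state.staged.length === 0) {
        return { submission: null, next: state };
    }
    return {
        submission: { caption: state.text, images: state.staged },
        next: { text: "", staged: [] },
    };
}
```

Keeping the state transitions pure like this would make the send behavior straightforward to cover in Jest, independent of the React tray component.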
Event/model compatibility
The proposal should preserve interoperability:
- continue sending standard Matrix m.room.message media events such as m.image;
- avoid a custom bundled event type that other Matrix clients cannot understand;
- optionally add namespaced content metadata to same-send media events so Element clients can identify the batch, for example an object containing a batch id, index, and count.
Other Matrix clients would still see normal media events. Element Web/Desktop could use the metadata to improve the rendering.
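As an illustration of the metadata idea, the sketch below attaches a batch object to otherwise ordinary m.image content. The key name `io.example.media_batch` and the helper `buildBatchContents` are placeholders invented for this example; the actual namespace and field names would need to be agreed with maintainers.

```typescript
// Placeholder namespaced key; the real name is an open question for maintainers.
const BATCH_KEY = "io.example.media_batch";

interface MediaBatchInfo {
    id: string;    // shared by every event in the same send
    index: number; // zero-based position within the batch
    count: number; // total number of images in the batch
}

// Simplified m.image content; real events carry more fields (info, file, etc.).
interface ImageContent {
    msgtype: "m.image";
    body: string;
    url: string;
    [key: string]: unknown;
}

// Hypothetical helper: builds one standard m.image content object per image
// and tags each with the batch id, its index, and the batch size.
function buildBatchContents(
    batchId: string,
    images: { name: string; mxcUrl: string }[],
): ImageContent[] {
    return images.map((img, index): ImageContent => {
        const info: MediaBatchInfo = { id: batchId, index, count: images.length };
        return {
            msgtype: "m.image",
            body: img.name,
            url: img.mxcUrl,
            [BATCH_KEY]: info,
        };
    });
}
```

A client that does not know the extra key would simply ignore it and render each event as a normal image.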
Timeline rendering
When several same-sender, consecutive image events belong to the same send batch, Element could render them as one coherent media group:
- one visual bundle in the timeline;
- image order preserved;
- shared caption/context shown once;
- standard per-image events still exist underneath for protocol compatibility.
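The grouping rule above could be sketched roughly as follows. This is an assumption-laden illustration rather than the prototype's code: `TimelineImage`, `TimelineItem`, and `groupBatches` are hypothetical names, and the `batch` field stands in for whatever namespaced metadata is eventually agreed.

```typescript
// Hypothetical, simplified view of an image event in the timeline.
interface TimelineImage {
    eventId: string;
    sender: string;
    batch?: { id: string; index: number; count: number };
}

type TimelineItem =
    | { kind: "single"; event: TimelineImage }
    | { kind: "group"; batchId: string; events: TimelineImage[] };

// Collapses runs of consecutive events that share a sender and a batch id
// into one group; everything else stays a single item.
function groupBatches(events: TimelineImage[]): TimelineItem[] {
    const items: TimelineItem[] = [];
    let i = 0;
    while (i < events.length) {
        const first = events[i];
        const batchId = first.batch?.id;
        const run: TimelineImage[] = [first];
        if (batchId !== undefined) {
            let j = i + 1;
            while (
                j < events.length &&
                events[j].sender === first.sender &&
                events[j].batch?.id === batchId
            ) {
                run.push(events[j]);
                j++;
            }
        }
        if (batchId !== undefined && run.length > 1) {
            // Preserve the sender's intended order even if events arrived shuffled.
            run.sort((a, b) => (a.batch?.index ?? 0) - (b.batch?.index ?? 0));
            items.push({ kind: "group", batchId, events: run });
        } else {
            items.push({ kind: "single", event: first });
        }
        i += run.length;
    }
    return items;
}
```

Because grouping happens only at render time, the underlying per-image events are untouched, and an interrupted run (for example, another user's message in between) falls back to ungrouped display.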
Why this matters
This would make Element feel much better for:
- support/debugging conversations;
- design review;
- product/QA workflows;
- technical teams sharing logs/screenshots;
- any desktop workflow where multiple screenshots need one explanation.
It also reduces accidental context loss: the user writes the explanation once in the normal composer, and that context stays visually attached to the screenshot set.
Questions for maintainers/design
- Is this full product direction aligned with Element Web/Desktop?
- Would namespaced batch metadata on standard media events be acceptable for Element-side grouped rendering?
- Should this be contributed as one feature PR, or split into reviewable stages such as:
  - composer staging/shared caption first;
  - timeline grouping/batch metadata second?
- Are there existing design patterns or product constraints I should follow before preparing the PR?
Contribution readiness
I can contribute a cleaned implementation against develop with:
- TypeScript/React changes for composer staging and timeline grouping;
- Jest coverage for send behavior, metadata, and rendering;
- happy-path Playwright coverage for pasted/dropped multiple screenshots;
- before/after screenshots using synthetic test images;
- no private/local deployment artifacts.
I am opening this issue first because the contribution docs ask contributors to align new feature direction before opening a PR.