Improve multi-screenshot paste/drop workflow with shared composer context #3231

@pmbemax

Summary

I propose improving the screenshot/media send workflow in Element Web/Desktop for cases where a user pastes or drops multiple screenshots while explaining a bug, support case, design issue, or technical discussion.

The desired product experience is the full flow, not only a partial composer tweak:

  1. stage pasted/dropped screenshots in a bounded composer tray before sending;
  2. use the normal composer text as the shared context/caption for the screenshot set;
  3. send Matrix-compatible standard media events under the hood;
  4. render same-batch screenshots as one coherent visual group in the timeline where supported.

I have a working local prototype and can prepare a cleaned PR if this direction is aligned with Element's product/design expectations.

User problem

When a user needs to send several screenshots together, the context is usually shared across the set:

  • "Here are the three screens where the issue happens."
  • "The first image is before, the second is after, the third is the error."
  • "This is the flow I followed; the issue is visible across these screenshots."

Today this can feel fragmented: each screenshot is sent as a separate visible item, and the composer offers no staging step where one normal message explains the whole screenshot set.

This is especially noticeable on desktop where screenshots are frequently pasted or drag-dropped from OS screenshot tools.

Proposed experience

Composer

  • Pasted/dropped images are staged above the normal composer input as bounded thumbnails.
  • The thumbnail tray scrolls or wraps safely for multiple images and large screenshots.
  • Each thumbnail has an accessible in-frame remove control.
  • The normal composer text remains the single shared context/caption for the set.
  • Pressing send submits the staged media set together from the user's perspective.
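To make the composer behavior concrete, here is a minimal sketch of the staging state as plain functions, independent of React. Every name in it (`ComposerStaging`, `stageFiles`, `removeStaged`, `takeForSend`, and the cap `MAX_STAGED_IMAGES`) is hypothetical and not taken from Element's codebase; the cap of 10 is an arbitrary assumption standing in for "bounded".

```typescript
// Hypothetical staging model for the composer tray (not Element's real code).
interface StagedImage {
    id: number; // local-only key, used by the thumbnail's remove control
    name: string;
    mimeType: string;
}

interface ComposerStaging {
    images: StagedImage[];
    caption: string; // the normal composer text, shared by the whole set
}

const MAX_STAGED_IMAGES = 10; // assumed bound; the proposal does not fix a number
let nextStagedId = 0;

// Stage pasted/dropped files: keep only images and respect the tray bound.
function stageFiles(
    state: ComposerStaging,
    files: { name: string; mimeType: string }[],
): ComposerStaging {
    const room = Math.max(0, MAX_STAGED_IMAGES - state.images.length);
    const accepted = files
        .filter((f) => f.mimeType.startsWith("image/"))
        .slice(0, room)
        .map((f) => ({ id: nextStagedId++, name: f.name, mimeType: f.mimeType }));
    return { ...state, images: [...state.images, ...accepted] };
}

// Remove a single thumbnail without touching the caption.
function removeStaged(state: ComposerStaging, id: number): ComposerStaging {
    return { ...state, images: state.images.filter((img) => img.id !== id) };
}

// On send: hand the whole set plus caption to the sender and reset the tray.
function takeForSend(state: ComposerStaging): {
    batch: ComposerStaging;
    cleared: ComposerStaging;
} {
    return { batch: state, cleared: { images: [], caption: "" } };
}
```

Keeping the state transitions pure like this would make the send behavior straightforward to cover in Jest before any UI work.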

Event/model compatibility

The proposal should preserve interoperability:

  • continue sending standard Matrix m.room.message media events such as m.image;
  • avoid a custom bundled event type that other Matrix clients cannot understand;
  • optionally add namespaced content metadata to same-send media events so Element clients can identify the batch, for example an object containing a batch id, index, and count.

Other Matrix clients would still see normal media events. Element Web/Desktop could use the metadata to improve the rendering.
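As a rough illustration of the metadata idea, the sketch below builds one standard `m.image` content object per uploaded file and attaches a batch object with an id, index, and count. The key `org.example.media_batch` is a placeholder namespace, and `UploadedImage` and `buildBatchContents` are hypothetical names; the real key and field names would need to be agreed with maintainers. How the shared caption is carried is left out of this sketch.

```typescript
// Placeholder namespaced key; the actual namespace is an open question.
const BATCH_KEY = "org.example.media_batch";

interface BatchMetadata {
    batch_id: string;
    index: number; // zero-based position within the batch
    count: number; // total number of media events in the batch
}

// Standard m.image content plus the optional batch metadata.
interface ImageContent {
    msgtype: "m.image";
    body: string;
    url: string;
    info?: { mimetype?: string; size?: number };
    [BATCH_KEY]?: BatchMetadata;
}

// Hypothetical shape of an already-uploaded file.
interface UploadedImage {
    mxcUrl: string;
    filename: string;
    mimetype: string;
    size: number;
}

// Build one m.room.message content object per image, in send order.
function buildBatchContents(uploads: UploadedImage[], batchId: string): ImageContent[] {
    return uploads.map(
        (u, index): ImageContent => ({
            msgtype: "m.image",
            body: u.filename,
            url: u.mxcUrl,
            info: { mimetype: u.mimetype, size: u.size },
            [BATCH_KEY]: { batch_id: batchId, index, count: uploads.length },
        }),
    );
}
```

A client that does not know the extra key would simply ignore it and render each event as an ordinary image, which is the interoperability property the proposal relies on.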

Timeline rendering

When several same-sender, consecutive image events belong to the same send batch, Element could render them as one coherent media group:

  • one visual bundle in the timeline;
  • image order preserved;
  • shared caption/context shown once;
  • standard per-image events still exist underneath for protocol compatibility.
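The grouping rule could look roughly like the sketch below: walk the timeline in order and merge consecutive `m.image` events that share a sender and a batch id. The event shape, the placeholder key `org.example.media_batch`, and the names `TimelineEvent`, `TimelineItem`, and `groupBatches` are assumptions made for illustration, not Element's actual timeline types.

```typescript
// Placeholder namespaced key carrying { batch_id, index, count }.
const BATCH_KEY = "org.example.media_batch";

// Simplified, hypothetical event shape.
interface TimelineEvent {
    eventId: string;
    sender: string;
    content: Record<string, unknown>;
}

type TimelineItem =
    | { kind: "single"; event: TimelineEvent }
    | { kind: "group"; batchId: string; events: TimelineEvent[] };

// Return the batch id if the event is an m.image with well-formed metadata.
function readBatchId(event: TimelineEvent): string | null {
    if (event.content.msgtype !== "m.image") return null;
    const raw = event.content[BATCH_KEY];
    if (typeof raw !== "object" || raw === null) return null;
    const batchId = (raw as Record<string, unknown>).batch_id;
    return typeof batchId === "string" ? batchId : null;
}

// Merge consecutive same-sender events of one batch; timeline order is kept.
function groupBatches(events: TimelineEvent[]): TimelineItem[] {
    const items: TimelineItem[] = [];
    for (const event of events) {
        const batchId = readBatchId(event);
        const last = items[items.length - 1];
        if (
            batchId !== null &&
            last !== undefined &&
            last.kind === "group" &&
            last.batchId === batchId &&
            last.events[0].sender === event.sender
        ) {
            last.events.push(event);
        } else if (batchId !== null) {
            items.push({ kind: "group", batchId, events: [event] });
        } else {
            items.push({ kind: "single", event });
        }
    }
    // A "group" of one renders like any other single event.
    return items.map(
        (item): TimelineItem =>
            item.kind === "group" && item.events.length === 1
                ? { kind: "single", event: item.events[0] }
                : item,
    );
}
```

Because the grouping is derived purely at render time, events with missing or malformed metadata fall back to today's per-image rendering.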

Why this matters

This would make Element noticeably easier to use for:

  • support/debugging conversations;
  • design review;
  • product/QA workflows;
  • technical teams sharing logs/screenshots;
  • any desktop workflow where multiple screenshots need one explanation.

It also reduces accidental context loss: the user writes the explanation once in the normal composer, and that context stays visually attached to the screenshot set.

Questions for maintainers/design

  1. Is this full product direction aligned with Element Web/Desktop?
  2. Would namespaced batch metadata on standard media events be acceptable for Element-side grouped rendering?
  3. Should this be contributed as one feature PR, or split into reviewable stages such as:
    • composer staging/shared caption first;
    • timeline grouping/batch metadata second?
  4. Are there existing design patterns or product constraints I should follow before preparing the PR?

Contribution readiness

I can contribute a cleaned implementation against develop with:

  • TypeScript/React changes for composer staging and timeline grouping;
  • Jest coverage for send behavior, metadata, and rendering;
  • happy-path Playwright coverage for pasted/dropped multiple screenshots;
  • before/after screenshots using synthetic test images;
  • no private/local deployment artifacts.

I am opening this issue first because the contribution docs ask contributors to align new feature direction before opening a PR.
