Extracted facts lose context when compressing multi-fact statements #33

@brgsk

Problem

When the extraction pipeline compresses conversation context into atomic facts, it sometimes drops critical contextual details that were "obvious" in the original conversation but are essential for answering questions at retrieval time.

Example (LongMemEval 51a45a95)

Original conversation turns:

USER: I like the idea of using a binder with labeled sections. I've been using the
      Cartwheel app from Target and it's been really helpful...
USER: I actually redeemed a $5 coupon on coffee creamer last Sunday, which was a
      nice surprise since I didn't know I had it in my email inbox.
USER: I shop at Target pretty frequently, maybe every other week.

Extracted facts:

  1. User redeemed a $5 coupon on coffee creamer on 2023-05-28, which was a surprise since User didn't know they had it in their email inbox.
  2. User shops at Target pretty frequently, approximately every other week.

Question: "Where did I redeem a $5 coupon on coffee creamer?"
Gold answer: Target
Got: In your email inbox.

The extractor kept the surprising detail ("found in email inbox") but dropped the location ("at Target" / "using Cartwheel app"), which appears only in the surrounding turns and therefore seemed obvious from conversation context. At retrieval time that context is gone; only the extracted facts remain.

Root Cause

The extraction prompt emphasizes atomicity and conciseness. When a user turn contains multiple pieces of information, the extractor compresses them and makes implicit judgments about what's "important." Details that seem redundant in conversation context (the store name, when the whole conversation is about Target) become essential when the fact stands alone.
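
To make the failure mode concrete, here is a minimal sketch of a check that flags facts missing an entity that recurs across the source turns. It is a heuristic for illustration, not part of the current pipeline: the function names `salient_entities` and `missing_context` are hypothetical, and the capitalized-word regex is a crude stand-in for real entity recognition.

```python
import re

def salient_entities(turns, min_turns=2):
    """Return capitalized mid-sentence words that occur in at least `min_turns` turns."""
    counts = {}
    for turn in turns:
        # A capitalized word preceded by a lowercase letter or comma plus a space
        # is treated as a candidate entity (assumption: crude proxy for NER).
        found = set(re.findall(r"(?<=[a-z,] )[A-Z][a-z]+", turn))
        for name in found:
            counts[name] = counts.get(name, 0) + 1
    return {name for name, n in counts.items() if n >= min_turns}

def missing_context(fact, entities):
    """List the salient entities that the standalone fact never mentions."""
    return sorted(name for name in entities if name.lower() not in fact.lower())

turns = [
    "I like the idea of using a binder with labeled sections. I've been using the "
    "Cartwheel app from Target and it's been really helpful...",
    "I actually redeemed a $5 coupon on coffee creamer last Sunday, which was a "
    "nice surprise since I didn't know I had it in my email inbox.",
    "I shop at Target pretty frequently, maybe every other week.",
]
facts = [
    "User redeemed a $5 coupon on coffee creamer on 2023-05-28, which was a surprise "
    "since User didn't know they had it in their email inbox.",
    "User shops at Target pretty frequently, approximately every other week.",
]

entities = salient_entities(turns)
for fact in facts:
    print(missing_context(fact, entities), "<-", fact[:40])
```

On the example above, only "Target" recurs across turns, and only the coupon fact fails to mention it. A check like this cannot decide whether the entity belongs in the fact; it only surfaces candidates for a second extraction pass or for review.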

Impact

This is a fundamental tension in extract-then-retrieve architectures: extraction must decide what to keep. Over-compressing loses context; under-compressing creates noise. The current prompts lean toward compression, which hurts retrieval accuracy.

Observed in the LongMemEval benchmark, where it contributes to errors on single-session-user questions.

Possible Approaches

  1. Prompt tuning: Instruct the extractor to make facts fully self-contained, e.g. "each fact should be understandable without any surrounding context".
  2. Entity co-reference: When a fact involves an action (redeemed coupon), ensure the location/entity from the conversation context is included in the fact.
  3. Episode-level retrieval fallback: Store and retrieve raw episode text alongside extracted facts, so the answer LLM has both compressed facts and original context (already in PLAN.md parking lot).
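
Approach 3 could look roughly like the sketch below. It assumes each fact keeps a link to the episode it was extracted from; the `Fact` class, the `episode_id` field, and `build_answer_context` are hypothetical names for illustration and do not reflect the project's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    text: str
    episode_id: str  # hypothetical link back to the source episode

def build_answer_context(retrieved, episodes):
    """Combine retrieved facts with the raw text of the episodes they came from."""
    lines = ["Facts:"]
    lines += [f"- {fact.text}" for fact in retrieved]
    lines.append("Source episodes:")
    seen = set()
    for fact in retrieved:
        # Include each episode once, even if several facts point to it.
        if fact.episode_id in seen:
            continue
        seen.add(fact.episode_id)
        lines.append(episodes[fact.episode_id])
    return "\n".join(lines)

episodes = {
    "ep1": (
        "USER: I've been using the Cartwheel app from Target...\n"
        "USER: I actually redeemed a $5 coupon on coffee creamer last Sunday...\n"
        "USER: I shop at Target pretty frequently, maybe every other week."
    ),
}
retrieved = [
    Fact("User redeemed a $5 coupon on coffee creamer on 2023-05-28.", "ep1"),
    Fact("User shops at Target pretty frequently.", "ep1"),
]

context = build_answer_context(retrieved, episodes)
print(context)
```

The trade-off is the one described under Impact: raw episodes restore the dropped location, but they also lengthen the answer prompt, so in practice the fallback would likely need a cap on how many episodes are attached.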
