unreliable test: memory integration test is flaky #432

@Hweinstock

ex. https://github.com/aws/bedrock-agentcore-sdk-python/actions/runs/24842115061/job/72718841096

FAILED tests_integ/memory/integrations/test_session_manager.py::TestAgentCoreMemorySessionManager::test_session_manager_with_retrieval_config_adds_context - assert '<user_context>' in '[{\'role\': \'user\', \'content\': [{\'text\': \'I like sushi with tuna\'}]}, {\'role\': \'assistant\', \'content\': [{\'text\': "That\'s great! Tuna is one of the most popular sushi ingredients. There are several delicious varieties you might enjoy:\\n\\n- **Akami** - The deep red, lean cut from the sides and back\\n- **Chutoro** - Medium fatty tuna from the belly, with a buttery texture\\n- **Otoro** - The fattiest, most prized cut that melts in your mouth\\n- **Spicy tuna** - Minced tuna mixed with spicy mayo, often in rolls\\n\\nDo you have a favorite style of tuna sushi? Are you into traditional nigiri, or do you prefer it in rolls like spicy tuna rolls or tuna avocado rolls?"}], \'metadata\': {\'usage\': {\'inputTokens\': 20, \'outputTokens\': 159, \'totalTokens\': 179}, \'metrics\': {\'latencyMs\': 2876, \'timeToFirstByteMs\': 902}}}, {\'role\': \'user\', \'content\': [{\'text\': \'What do I like to eat?\'}]}, {\'role\': \'assistant\', \'content\': [{\'text\': "Based on what you\'ve told me, I know you like sushi with tuna! But that\'s the only food preference you\'ve shared so far. \\n\\nI don\'t have information about your other food likes or dislikes. Would you like to tell me more about what you enjoy eating? I\'d be happy to discuss other foods, suggest recipes, or talk about cuisines you\'re interested in!"}], \'metadata\': {\'usage\': {\'inputTokens\': 189, \'outputTokens\': 86, \'totalTokens\': 275}, \'metrics\': {\'latencyMs\': 2135, \'timeToFirstByteMs\': 973}}}]'
 +  where '[{\'role\': \'user\', \'content\': [{\'text\': \'I like sushi with tuna\'}]}, {\'role\': \'assistant\', \'content\': [{\'text\': "That\'s great! Tuna is one of the most popular sushi ingredients. There are several delicious varieties you might enjoy:\\n\\n- **Akami** - The deep red, lean cut from the sides and back\\n- **Chutoro** - Medium fatty tuna from the belly, with a buttery texture\\n- **Otoro** - The fattiest, most prized cut that melts in your mouth\\n- **Spicy tuna** - Minced tuna mixed with spicy mayo, often in rolls\\n\\nDo you have a favorite style of tuna sushi? Are you into traditional nigiri, or do you prefer it in rolls like spicy tuna rolls or tuna avocado rolls?"}], \'metadata\': {\'usage\': {\'inputTokens\': 20, \'outputTokens\': 159, \'totalTokens\': 179}, \'metrics\': {\'latencyMs\': 2876, \'timeToFirstByteMs\': 902}}}, {\'role\': \'user\', \'content\': [{\'text\': \'What do I like to eat?\'}]}, {\'role\': \'assistant\', \'content\': [{\'text\': "Based on what you\'ve told me, I know you like sushi with tuna! But that\'s the only food preference you\'ve shared so far. \\n\\nI don\'t have information about your other food likes or dislikes. Would you like to tell me more about what you enjoy eating? I\'d be happy to discuss other foods, suggest recipes, or talk about cuisines you\'re interested in!"}], \'metadata\': {\'usage\': {\'inputTokens\': 189, \'outputTokens\': 86, \'totalTokens\': 275}, \'metrics\': {\'latencyMs\': 2135, \'timeToFirstByteMs\': 973}}}]' = str([{'content': [{'text': 'I like sushi with tuna'}], 'role': 'user'}, {'content': [{'text': "That's great! Tuna is one of the most popular sushi ingredients. 
There are several delicious varieties you might enjoy:\n\n- **Akami** - The deep red, lean cut from the sides and back\n- **Chutoro** - Medium fatty tuna from the belly, with a buttery texture\n- **Otoro** - The fattiest, most prized cut that melts in your mouth\n- **Spicy tuna** - Minced tuna mixed with spicy mayo, often in rolls\n\nDo you have a favorite style of tuna sushi? Are you into traditional nigiri, or do you prefer it in rolls like spicy tuna rolls or tuna avocado rolls?"}], 'metadata': {'metrics': {'latencyMs': 2876, 'timeToFirstByteMs': 902}, 'usage': {'inputTokens': 20, 'outputTokens': 159, 'totalTokens': 179}}, 'role': 'assistant'}, {'content': [{'text': 'What do I like to eat?'}], 'role': 'user'}, {'content': [{'text': "Based on what you've told me, I know you like sushi with tuna! But that's the only food preference you've shared so far. \n\nI don't have information about your other food likes or dislikes. Would you like to tell me more about what you enjoy eating? I'd be happy to discuss other foods, suggest recipes, or talk about cuisines you're interested in!"}], 'metadata': {'metrics': {'latencyMs': 2135, 'timeToFirstByteMs': 973}, 'usage': {'inputTokens': 189, 'outputTokens': 86, 'totalTokens': 275}}, 'role': 'assistant'}])

Acceptance Criteria:

  • investigate what mechanisms currently exist to avoid this flakiness.
  • look into adding more to make this test reliable.
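
One possible mechanism, sketched under assumptions: if the flakiness turns out to come from memory records not yet being retrievable when the second prompt runs (the issue does not state the cause), the test could poll for a readiness condition before asserting instead of asserting immediately. The helper names `wait_until` and `context_was_injected` below are hypothetical and are not part of the SDK or the existing test suite; only the `<user_context>` marker and the `str(messages)` check come from the failing assertion above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
) -> T:
    """Call `probe` until it returns a non-None value or `timeout_s` elapses.

    Hypothetical helper: `probe` would wrap whatever call the test uses to
    check that the stored memory is retrievable.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s} seconds")
        time.sleep(interval_s)


def context_was_injected(messages: list) -> bool:
    """Mirror the failing assertion: look for the marker in str(messages)."""
    return "<user_context>" in str(messages)
```

In the test, the probe could return the retrieved records once they are non-empty and `None` otherwise, so the second prompt is only sent after retrieval has something to inject. Whether this addresses the failure depends on the investigation called for in the first criterion.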
