Say can #7

Open
libgoncalv wants to merge 5 commits into main from SayCan
@libgoncalv (Collaborator) commented Feb 23, 2026

Can be merged after hybrid-intelligence/SHARPIE#76

libgoncalv and others added 3 commits February 22, 2026 23:43
Introduce saycan experiment package for language-conditioned robotic manipulation. Adds README and modules to integrate CLIPort (TransporterNet + CLIP), ViLD object detection, an Ollama-backed LLM interface, PyBullet UR5e environment, dataset utilities, and helper functions. Includes asset downloader and configuration (config.py), a CLIPort wrapper with checkpoint loading/migration logic (cliport.py), a SHARPIE-friendly environment wrapper that combines LLM planning, affordance scoring and CLIPort execution (environment.py), dataset collection code (datasets.py), and utility helpers (helpers.py).
Make environment and model behavior more robust and production-friendly:

- README: Update example defaults (wait_for_inputs -> False, inputs_type -> 'other').
- cliport.py: Remove interactive/demo run_cliport helper to avoid shipping debug UI code.
- config.py: Call download_assets() on import rather than only when executed as __main__.
- environment.py:
  - Add cv2 import and convert cached frames / camera images to RGB in render().
  - Replace ad-hoc 'task:'/plan handling with run_task(), which sets a task and automatically plans+executes steps up to a limit, returning a summary (steps, total_reward, termination_reason) and caching video frames for rendering.
  - Adjust step handling to return run_task results for 'task:' actions.
- vild.py:
  - Replace eager CLIP/TF loading with lazy getters (get_clip_model, get_tf_session) and a cleanup_models() helper to free resources.
  - Add VILD_LAZY_LOAD env var to opt into lazy loading; otherwise behavior remains backward-compatible.
  - Update build_text_embedding to obtain the CLIP model lazily, move embeddings to CPU promptly, and clear GPU cache after use to reduce memory spike.

Tested on an NVIDIA A16, but ViLD still crashes when run multiple times.
@florisdenhengst (Collaborator) left a comment

Please consider:

  • adding a Runner object insert to the README
  • factoring the SayCan-specific logic in environment.py out into a separate file

Comment thread saycan/README.md
Assets (robot URDFs, ViLD model, CLIPort checkpoint) are downloaded automatically on first run.

## Setting on the Webserver

for completeness it would be nice to have a runner here too

Comment thread saycan/environment.py

this file mixes (a lot of) saycan-specific logic with (some) SHARPIE logic.

Perhaps it would be more illustrative to separate these in order to clarify what SHARPIE needs vs what SayCan needs.

See this PR for an (untested) suggestion/direction: SayCan...saycan-split-env
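
To make the proposed separation concrete, here is a rough sketch of one way the split could look. It is not the content of the linked branch: the class names `SayCanCore` and `SharpieEnv`, the `plan_next` and `execute` callables, the `max_steps` default, and the termination-reason strings are all hypothetical. Only `run_task`, the 'task:' action prefix, and the summary keys (steps, total_reward, termination_reason) come from the commit message above.

```python
from typing import Any, Callable, Dict, List, Optional


class SayCanCore:
    """SayCan-side logic only: plan a step, execute it, repeat up to a limit."""

    def __init__(
        self,
        plan_next: Callable[[str, List[str]], Optional[str]],
        execute: Callable[[str], float],
    ) -> None:
        self.plan_next = plan_next  # returns the next step, or None when done
        self.execute = execute      # runs one step and returns its reward

    def run_task(self, task: str, max_steps: int = 10) -> Dict[str, Any]:
        history: List[str] = []
        total_reward = 0.0
        reason = "max_steps"  # assumed label when the step limit is reached
        for _ in range(max_steps):
            step = self.plan_next(task, history)
            if step is None:
                reason = "done"  # assumed label when the planner stops
                break
            total_reward += self.execute(step)
            history.append(step)
        return {
            "steps": history,
            "total_reward": total_reward,
            "termination_reason": reason,
        }


class SharpieEnv:
    """Thin SHARPIE-facing wrapper: only routes actions to the core."""

    def __init__(self, core: SayCanCore) -> None:
        self.core = core

    def step(self, action: str) -> Dict[str, Any]:
        if action.startswith("task:"):
            return self.core.run_task(action[len("task:"):].strip())
        raise ValueError(f"unsupported action: {action!r}")
```

Keeping the planner and executor as injected callables means the core can be exercised without PyBullet, CLIPort, or an LLM, which may also help narrow down the repeated-run crashes.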

Proposal to factor out base env and SHARPIE env wrapper
Add directory to path for imports in environment.py