Some changes to make things easier and faster#10

Open
siilats wants to merge 2 commits into fikrikarim:main from siilats:main

Conversation


siilats commented Apr 6, 2026

Changed the host to localhost; 0.0.0.0 does not work when clicked. Added .env to .gitignore. Installed litert-lm with uv so it gets used in Python. Changed the model to 4B.

@fikrikarim
Owner

Hi. Several things:

  1. We intentionally use 0.0.0.0 instead of localhost so it's exposed on the local network. For example, you can open the website from your phone if it's on the same network.
  2. I also want to keep the E2B as the default model, so that people with lower-end machines can still run this.
  3. Could you explain why we need the litert-lm-api-nightly?

@siilats
Author

siilats commented Apr 8, 2026

I got a red error about not using libLiteRtMetalAccelerator. You have in the README that you need to symlink it:

"The pip package only ships libLiteRtWebGpuAccelerator.dylib. The C++ build includes native Metal
(libLiteRtMetalAccelerator.dylib) which shows ~6x faster prefill in the C++ binary. However:

The nightly fixes the red errors and does look faster, though not 6x.
Maybe there is a way to show a blue, clickable hyperlink that isn't 0.0.0.0 when the server starts? Opening 0.0.0.0 gives a websocket error that isn't clear. You could also just print a message: don't type 0.0.0.0 into Chrome, the websocket will not work.
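A minimal sketch of that suggestion, assuming a Python server and a hypothetical port (the LAN-IP discovery trick and the `startup_urls` helper are my own illustration, not code from the repo):

```python
# Sketch: keep binding to 0.0.0.0 so the server stays reachable on the
# local network, but print localhost (and a best-effort LAN IP) URLs,
# since pasting 0.0.0.0 into a browser breaks the websocket connection.
import socket

HOST = "0.0.0.0"  # assumed bind address
PORT = 8000       # assumed port

def startup_urls(port: int) -> list[str]:
    urls = [f"http://localhost:{port}"]
    # Best-effort LAN IP discovery via a UDP "connect" (no packets are sent).
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.connect(("8.8.8.8", 80))
            urls.append(f"http://{s.getsockname()[0]}:{port}")
    except OSError:
        pass  # offline: localhost URL is still valid
    return urls

if __name__ == "__main__":
    for url in startup_urls(PORT):
        print(f"Open: {url}")
    print("Note: don't open 0.0.0.0 directly; the websocket will not connect.")
```

Printing both URLs keeps the 0.0.0.0 bind (phone on the same network still works) while giving users a link that actually opens.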
E4B is a lot better; maybe leave it as a commented-out option or in .env. Someone made a fork of this repo with Qwen; I have been using mlx-community/Qwen3.5-35B-A3B-4bit, which is also real-time on the M5 Max and much nicer.
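One way to make the model configurable while keeping E2B as the default, as a hedged sketch (the `MODEL_ID` variable name and the default identifier string are assumptions, not taken from the repo):

```python
# Hypothetical config helper: keep E2B as the default so lower-end
# machines still work, but let users override it via an environment
# variable, e.g. loaded from a .env file.
import os

DEFAULT_MODEL = "gemma-3n-E2B"  # assumed default model identifier

def resolve_model() -> str:
    # A line like MODEL_ID=gemma-3n-E4B in .env (loaded into the
    # environment) would switch models without editing any code.
    return os.environ.get("MODEL_ID", DEFAULT_MODEL)
```

This also fits the .gitignore change in this PR: per-user model choices live in the untracked .env instead of in committed source.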

Overall an amazing project. I set a system prompt of

"You are a friendly AI talking to a 3 year old. She is listening "
"through a microphone and you see her through the camera. "
"She likes to talk about Rapunzel hair. Do not use emoticons or bold * symbols."

and she loves it :)

@fikrikarim
Owner

fikrikarim commented Apr 9, 2026

and she loves it :)

Ohhh that's lovely :)

I personally prefer the E2B model since I'm on an M3 Pro and the speed is much better with it. That's a good point, though, that people should be able to easily use a different model.

Regarding libLiteRtMetalAccelerator, it's not in the README; it's in the artifacts docs that I used when developing the repo. Those might contain mistakes since they're AI-generated. Based on the pip package pages, the nightly is behind litert-lm: the nightly is still on 0.10.0.dev while litert-lm is on 0.10.1.

Ok I see what you're saying that the websocket doesn't work with 0.0.0.0.

@siilats
Author

siilats commented Apr 14, 2026

Regarding libLiteRtMetalAccelerator, it's not in the README; it's in the artifacts docs that I used when developing the repo. Those might contain mistakes since they're AI-generated. Based on the pip package pages, the nightly is behind litert-lm: the nightly is still on 0.10.0.dev while litert-lm is on 0.10.1.

I think one is the CLI and the other is the Python bindings.

…n (sometimes missing audio end message on barge)
@siilats
Author

siilats commented Apr 16, 2026

I added another commit to fix the HTML issues where, on barge, it would not speak.
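The commit message mentions a "sometimes missing audio end message on barge". The fix could look roughly like this, as a sketch under my own assumptions about the message protocol (the `interrupted`/`audio_end` message names and the `handle_barge_in` helper are hypothetical, not from the repo):

```python
# Hypothetical barge-in handler: always send an explicit audio-end
# control message when the user interrupts, so the client never waits
# for audio that will not arrive (the reported symptom: the end message
# was sometimes missing on barge, and playback never resumed).
import json

def handle_barge_in(send, stream_id: str) -> None:
    """Notify the client that playback was interrupted and has ended."""
    send(json.dumps({"type": "interrupted", "stream": stream_id}))
    # Emit audio_end unconditionally, even if no audio chunk was in flight.
    send(json.dumps({"type": "audio_end", "stream": stream_id}))

# Usage with a collecting sender standing in for a websocket:
sent = []
handle_barge_in(sent.append, "utt-1")
```

The key point is that the end message is sent unconditionally rather than only when an audio chunk happens to be in flight.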
