feat: enhance chat CLI with readline history, line editing, and distributed support#1382
Vlor999 wants to merge 5 commits into
def broadcast_string(
Why exactly is this needed?
      while True:
    -     query = ui.prompt()
    +     query = ui.prompt() if rank == 0 else ""
    +     query = broadcast_string(query, group).strip()
If I understand correctly, here we communicate the prompt to the other ranks, but I am not sure I understand what the goal is.
The goal of that section is to make interactive chat work correctly in distributed mode.
What problem it solves:
- In distributed execution, several processes/ranks are running at once.
- We do not want every rank to ask the user for input.
- We want only rank 0 to read the prompt from the terminal.
- Then we want that same prompt to be sent to all the other ranks so they all generate from the exact same user message.
So the flow is:
- The user enters "Hello".
- Rank 0 reads "Hello".
- Rank 1, rank 2, etc. start with "".
- broadcast_string(...) sends "Hello" from rank 0 to everyone.
- After that, every rank has "Hello".
Without this, distributed chat would break in one of these ways:
- every rank would try to read from stdin
- non-root ranks would hang
- different ranks could end up with inconsistent input
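The flow above can be sketched in plain Python. This is only an illustration of the idea, not the PR's `broadcast_string` implementation: the helper names (`encode_for_rank`, `all_sum`, `simulate_broadcast`) are hypothetical, and the collective is simulated in a single process as an element-wise sum, under the assumption that non-root ranks contribute zeros so the sum equals rank 0's data.

```python
from typing import List


def all_sum(buffers: List[List[int]]) -> List[int]:
    """Stand-in for a collective sum: element-wise sum over every rank's buffer."""
    return [sum(column) for column in zip(*buffers)]


def encode_for_rank(text: str, rank: int, length: int) -> List[int]:
    """Rank 0 contributes the UTF-8 bytes of the prompt; other ranks contribute zeros."""
    if rank != 0:
        return [0] * length
    return list(text.encode("utf-8"))


def simulate_broadcast(inputs: List[str]) -> List[str]:
    """inputs[r] is what rank r holds before the broadcast (hypothetical helper)."""
    # Step 1: agree on the payload length. Only rank 0 reports a non-zero length.
    lengths = [
        [len(text.encode("utf-8")) if rank == 0 else 0]
        for rank, text in enumerate(inputs)
    ]
    length = all_sum(lengths)[0]

    # Step 2: sum the buffers. Zeros from the other ranks leave rank 0's bytes intact.
    buffers = [encode_for_rank(text, rank, length) for rank, text in enumerate(inputs)]
    payload = bytes(all_sum(buffers)).decode("utf-8")

    # Every rank now holds the same decoded prompt.
    return [payload for _ in inputs]


if __name__ == "__main__":
    print(simulate_broadcast(["Hello", "", ""]))
```

In a real distributed run each rank would execute the same two steps itself and call the framework's collective instead of the simulated `all_sum`.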
The prompt is read separately on each rank from its own stdin. Could you please clarify why non-root ranks would hang and why different ranks could end up with inconsistent input?
A simple failure might look like this:
- rank 0 reads "Hello"
- rank 1 is still blocked in ui.prompt()
- rank 0 starts generation or reaches a collective operation
- rank 1 has not reached the same point yet
Now one rank is waiting for compute synchronization while the other is still waiting for terminal input.
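The scenario described above can be modeled with threads standing in for ranks. This is a rough sketch under stated assumptions, not a reproduction with MLX: a `threading.Barrier` plays the role of the collective operation, the function names are hypothetical, and a timeout is used only so the example terminates (a real collective would typically keep waiting).

```python
import threading
from typing import List, Optional


def wait_at_collective(barrier: threading.Barrier, timeout: float) -> str:
    """Model one rank reaching a collective and waiting for its peers."""
    try:
        barrier.wait(timeout=timeout)
        return "synchronized"
    except threading.BrokenBarrierError:
        # Not every rank arrived in time, e.g. a peer is still blocked on input.
        return "stalled"


def run(ranks_arriving: int, world_size: int = 2, timeout: float = 0.5) -> List[Optional[str]]:
    """Start `ranks_arriving` ranks; any remaining ranks never reach the collective."""
    barrier = threading.Barrier(world_size)
    results: List[Optional[str]] = [None] * ranks_arriving

    def worker(rank: int) -> None:
        results[rank] = wait_at_collective(barrier, timeout)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(ranks_arriving)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run(ranks_arriving=2))  # both ranks reach the collective
    print(run(ranks_arriving=1))  # rank 1 never arrives, so rank 0 stalls
```

Whether this situation can actually occur in distributed chat depends on how each rank's stdin is wired up, which is the question raised in this thread.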
Did you have this issue?
No, this has not happened to me.
I added it because a friend and I discussed it and decided that it was safer!
Thanks for your contribution! Adding a history file sounds like a great idea. I am not sure about the default markdown prompt; I need some time to think about it. Do you mind recording 2 demo videos for single and distributed mlx_lm chat?
Here is the first demo using single mlx_chat.
Thanks! Single chat looks great, could you do the same for the distributed setting?
Here is the version with distributed mlx_lm chat.
Description
This PR improves the mlx_lm.chat interactive interface by adding standard terminal features and ensuring compatibility across different execution environments.
Changes
- … ~/.mlx_lm_chat_history.
- … max_tokens limits.
Motivation
The current chat interface is a basic loop that lacks standard CLI ergonomics. Users cannot easily correct typos or recall previous instructions, which hinders the local testing workflow. These changes bring mlx-lm closer to the UX of modern LLM toolkits.
Future Work / UI Enhancements
I have also prototyped an enhanced UI version using the rich library (see screenshot below) which supports Markdown rendering and syntax highlighting for code blocks. I kept this PR minimal to avoid adding new dependencies, but I am open to integrating a "soft dependency" version if the maintainers are interested in a more polished visual output.
Closes #818