feat: enhance chat CLI with readline history, line editing, and distributed support by Vlor999 · Pull Request #1382 · ml-explore/mlx-lm

Vlor999 · 2026-06-09T13:01:24Z

Description

This PR improves the mlx_lm.chat interactive interface by adding standard terminal features and ensuring compatibility across different execution environments.

Changes

Persistent History: Integrated readline to support command history (Up/Down arrows) saved to ~/.mlx_lm_chat_history.
Line Editing: Enabled full cursor movement (Left/Right arrows) and in-place text editing for prompts.
Distributed Support: Implemented broadcast_string logic to ensure the chat loop remains synchronized across all ranks in multi-GPU/distributed mode.
Graceful Fallbacks: Added safety checks and EOFError handling (Ctrl+D to exit) for a smoother terminal experience.
User Feedback: Added an explicit warning message when model output is truncated due to max_tokens limits.
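For readers unfamiliar with the standard-library pieces involved, here is a minimal sketch of persistent history plus Ctrl+D handling. It is not the code from this PR: `setup_history`, `read_prompt`, and the `max_entries` default are hypothetical names and values chosen for illustration; only the history path comes from the description above.

```python
import atexit
import os

try:
    import readline  # missing on some platforms, hence the guarded import
except ImportError:
    readline = None

HISTORY_PATH = os.path.expanduser("~/.mlx_lm_chat_history")


def setup_history(path=HISTORY_PATH, max_entries=1000):
    """Load saved history and register a hook that writes it back on exit.

    Returns False when readline is unavailable, so the caller can fall
    back to plain input() without history or line editing.
    """
    if readline is None:
        return False
    try:
        readline.read_history_file(path)
    except OSError:
        pass  # first run: no history file yet
    readline.set_history_length(max_entries)
    atexit.register(readline.write_history_file, path)
    return True


def read_prompt(prompt=">> ", reader=input):
    """Return the stripped user line, or None when the user presses Ctrl+D."""
    try:
        return reader(prompt).strip()
    except EOFError:
        return None
```

A chat loop would call `setup_history()` once at startup and leave the loop when `read_prompt()` returns `None`. Importing `readline` is what makes `input()` pick up arrow-key editing and history recall.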

Motivation

The current chat interface is a basic loop that lacks standard CLI ergonomics. Users cannot easily correct typos or recall previous instructions, which hinders the local testing workflow. These changes bring mlx-lm closer to the UX of modern LLM toolkits.

Future Work / UI Enhancements

I have also prototyped an enhanced UI version using the rich library (see screenshot below) which supports Markdown rendering and syntax highlighting for code blocks. I kept this PR minimal to avoid adding new dependencies, but I am open to integrating a "soft dependency" version if the maintainers are interested in a more polished visual output.
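One possible shape for such a "soft dependency" is a guarded import with a plain-text fallback. This is a sketch under assumptions, not the prototype mentioned above: `render_reply` is a hypothetical helper, and only `rich.console.Console` and `rich.markdown.Markdown` are real `rich` APIs.

```python
try:
    from rich.console import Console
    from rich.markdown import Markdown

    _console = Console()
except ImportError:
    # rich is optional: without it we fall back to plain print()
    _console = None


def render_reply(text):
    """Print a model reply and return the name of the backend that was used."""
    if _console is not None:
        _console.print(Markdown(text))  # Markdown rendering, highlighted code
        return "rich"
    print(text)
    return "plain"
```

With this layout the core package gains no hard requirement; users who install `rich` separately get the richer output automatically.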

Closes #818


nastya236 · 2026-06-09T14:02:04Z

+)
+
+
+def broadcast_string(


Why exactly is this needed?

nastya236 · 2026-06-09T14:05:09Z

        while True:
-            query = ui.prompt()
+            query = ui.prompt() if rank == 0 else ""
+            query = broadcast_string(query, group).strip()


If I understand correctly, here we communicate the prompt to the other ranks, but I am not sure I understand what the goal is.

The goal of that section is to make interactive chat work correctly in distributed mode.

What problem it solves:

In distributed execution, several processes/ranks are running at once.

We do not want every rank to ask the user for input.

We want only rank 0 to read the prompt from the terminal.

Then we want that same prompt to be sent to all the other ranks so they all generate from the exact same user message.

So the flow is:
User enters "Hello"
Rank 0 reads "Hello"
Rank 1, rank 2, etc. read ""
broadcast_string(...) sends "Hello" from rank 0 to everyone
After that, every rank has "Hello"

Without this, distributed chat would break in one of these ways:

every rank would try to read from stdin

non-root ranks would hang

different ranks could end up with inconsistent input
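The flow described above can be sketched without any distributed runtime. The following is a pure-Python simulation, not the `broadcast_string` from this PR: it assumes the broadcast is built on a sum collective, where rank 0 contributes the encoded prompt and every other rank contributes zeros, so the sum seen by each rank equals rank 0's data. All helper names (`encode`, `decode`, `all_sum`, `broadcast_from_root`) and the fixed buffer `width` are hypothetical.

```python
def encode(text, width):
    """Length-prefixed, zero-padded list of UTF-8 byte values."""
    data = text.encode("utf-8")
    if len(data) > width:
        raise ValueError("prompt longer than the fixed buffer")
    return [len(data)] + list(data) + [0] * (width - len(data))


def decode(buffer):
    """Inverse of encode: read the length prefix, then that many bytes."""
    length = buffer[0]
    return bytes(buffer[1:1 + length]).decode("utf-8")


def all_sum(buffers):
    """Stand-in for a sum collective: every rank gets the elementwise total."""
    total = [sum(column) for column in zip(*buffers)]
    return [list(total) for _ in buffers]


def broadcast_from_root(per_rank_text, width=64):
    """Simulate one broadcast; index i of per_rank_text is rank i's local input."""
    contributions = [
        # Only rank 0's text is used; every other rank contributes all zeros.
        encode(text if rank == 0 else "", width)
        for rank, text in enumerate(per_rank_text)
    ]
    return [decode(buffer) for buffer in all_sum(contributions)]


print(broadcast_from_root(["Hello", "", ""]))
```

A real implementation would replace `all_sum` with the collective provided by the distributed backend and would need either a fixed buffer size, as assumed here, or a separate first step that shares the payload length.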

The prompt is read separately on each rank from its own stdin. Could you please clarify why non-root ranks would hang and why different ranks could end up with inconsistent input?

A simple failure might look like this:

rank 0 reads "Hello"

rank 1 is still blocked in ui.prompt()

rank 0 starts generation or reaches a collective operation

rank 1 has not reached the same point yet

Now one rank is waiting for compute synchronization while the other is still waiting for terminal input
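The scenario above can be imitated with two threads, purely as an illustration. This sketch makes several assumptions: a `threading.Barrier` stands in for a collective operation, a `threading.Event` stands in for terminal input arriving on rank 1, and timeouts are added so the example terminates (a real collective would simply hang). The function name `simulate` is hypothetical, and the sketch says nothing about whether the failure occurs in practice.

```python
import threading


def simulate(rank1_gets_input, timeout=0.2):
    """Run two fake ranks and report how each one ended up."""
    barrier = threading.Barrier(2)   # stand-in for a collective operation
    stdin_ready = threading.Event()  # stand-in for rank 1's terminal input
    if rank1_gets_input:
        stdin_ready.set()
    outcome = {}

    def reach_collective(rank):
        try:
            barrier.wait(timeout=timeout)
            outcome[rank] = "synced"
        except threading.BrokenBarrierError:
            outcome[rank] = "stalled"

    def rank0():
        # Rank 0 already has its prompt and goes straight to the collective.
        reach_collective(0)

    def rank1():
        # Rank 1 is still waiting for input and may never reach the collective.
        if not stdin_ready.wait(timeout=2 * timeout):
            outcome[1] = "no input"
            return
        reach_collective(1)

    threads = [threading.Thread(target=fn) for fn in (rank0, rank1)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcome


print(simulate(rank1_gets_input=False))
print(simulate(rank1_gets_input=True))
```

When rank 1 never receives input, rank 0 stalls at the barrier; when both ranks have their input, both pass it.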

Did you have this issue?

No, it has not happened to me.
I added this because a friend and I were discussing it and decided it was safer!

nastya236 · 2026-06-09T14:14:38Z

Thanks for your contribution! Adding a history file sounds like a great idea. I am not sure about the default markdown prompt; I need some time to think about it. Do you mind recording 2 demo videos, one for single and one for distributed mlx_lm chat?

Vlor999 · 2026-06-09T14:47:32Z

Here is the first demo using single mlx_chat.

nastya236 · 2026-06-09T14:55:11Z

Thanks! Single chat looks great. Could you do the same for the distributed setting?

Vlor999 · 2026-06-09T15:07:48Z

Here is the version with distributed mlx_lm chat.
Sorry for the delay, I caught a small issue in the distributed version!

Vlor999 and others added 4 commits June 9, 2026 11:27

chat: sync distributed input on rank 0

8675d56

chat: restore rich interactive ui

0383cdb

Merge pull request #1 from Vlor999/pr-841-refresh

e887f88

Pr 841 refresh

fix(sample_utils): correct top_k error message to the exclusive (0, v…

aed9b59

…ocab_size) bound (ml-explore#1377)

nastya236 requested changes Jun 9, 2026


Restore rich chat UI under distributed launch

d2ac7bd

Vlor999 wants to merge 5 commits into ml-explore:main from Vlor999:main

3 participants
