Hello @MikukuOvO, I'm a PhD in autonomic computing and runtime adaptation and I got super interested in your work. I read the paper "The Vision of Autonomic Computing: Can LLMs Make It a Reality?", watched the video, and tried to experiment with the LLM-based multi-agents. However, I couldn't reproduce the experiments; the outcomes are quite different from the paper and the video.
I tried to run auto-kube against my local LLM (Ollama -- deepseekr1-1.4 and llama3.2) and it didn't work: it is missing cloudgpt_aoai.py. Then I tried the code from paper_artifact_arXiv_2407_14402, and that one seems to work.
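For reference, this is the kind of shim I would use in place of the missing cloudgpt_aoai.py to point the agents at my local Ollama server. It is only a minimal sketch using the OpenAI-compatible endpoint that Ollama exposes; the model name, system prompt, and function name are just illustrative and not taken from your repo:

```python
# Minimal sketch (my assumption, not the repo's actual client): calling a local
# Ollama server through its OpenAI-compatible API instead of Azure OpenAI.
from openai import OpenAI

# Ollama serves an OpenAI-compatible API at /v1; the api_key is ignored by
# Ollama but the client requires a non-empty string.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def get_chat_completion(messages, model="llama3.2"):
    # Deterministic sampling to make the agent runs easier to compare.
    response = client.chat.completions.create(
        model=model,  # e.g. "deepseek-r1:32b"
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message.content

print(get_chat_completion([
    {"role": "system", "content": "You are a Kubernetes maintenance agent."},
    {"role": "user", "content": "report CPU from your component --component catalogue"},
]))
```

If that is roughly the right shape, I can adapt it; otherwise, a pointer to the expected interface of cloudgpt_aoai.py would already help a lot.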
I tried a few examples as in the README:
- on working_mechanism_1: report CPU from your component --component catalogue, and scale your component to three replicas --component catalogue
- on working_mechanism_2: Reduce the total P99 latency of catalogue and front-end to under 400 ms --components catalogue,front-end
In all cases, I observed the agents reasoning and trying many different things, but they never converge to what I asked. For example, when I request report CPU from your component --component catalogue on working_mechanism_1, the agent diverges after a dozen iterations and starts trying to fetch the response time of the service instead of the CPU, stopping after 50 iterations with no result.
I've tried different models (deepseek-r1:32b, deepseek-r1:14b, llama3.2, etc.), and in all cases the outcome diverges from the original task. I'm not sure whether the issue is the models, which are quite generic, or the setup prompts, which may need to be fine-tuned.
Could you provide some guidance? Am I missing anything?
Besides GPT, do you have any suggestions about which other models could perform better? -- I don't have access to GPT models.
Regarding the prompts, have you tried variations on them? Do you have any hints on how I could customize them to achieve better results?
Thx!