I’ve been working on benchmarking agentic systems for industrial reasoning, and recently came across your GitHub.
I’m building a benchmark called AssetOpsBench that focuses on predictive maintenance, anomaly detection, and failure-mode reasoning over real industrial datasets.
I would really value your feedback on the benchmark design and task formulations: https://github.com/IBM/AssetOpsBench
If you find the project useful, a GitHub star ⭐ would also mean a lot and would help others in the community discover it.
Thanks so much for your work. I’d be happy to return feedback on your projects as well.
Regards,
Dhaval