LactChain: Language Action Chain Reinforcement Learning

This repo is a template for building a Reinforcement Learning (RL) system. The system is meant to be general-purpose, with multiple possible applications.

TODOs:

Add things needed to enforce structure of any subsequent code

Make generic baseclass for actor (policy) network
Make generic baseclass for critic (value) network
Make generic baseclass for reward function
Finish designing the generic lactchain baseclass. The flow is state-->action-->state, but what is an action? Does an action take in a fluid (free-form) prompt? A choice from a prompt menu? Something else?
Write unit tests
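As one possible starting point for the baseclasses listed above, here is a minimal sketch using Python's `abc` module. Every class and method name in it (`Actor`, `Critic`, `RewardFunction`, `LactChain`, `select_action`, `value`, `step`) is a hypothetical placeholder rather than existing code in this repo. The open question of what an action actually is stays open: the sketch treats both state and action as opaque type parameters.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

# Placeholders (assumptions): the concrete state and action types are undecided.
State = TypeVar("State")
Action = TypeVar("Action")


class Actor(ABC, Generic[State, Action]):
    """Policy interface: maps a state to an action."""

    @abstractmethod
    def select_action(self, state: State) -> Action:
        ...


class Critic(ABC, Generic[State]):
    """Value interface: estimates how good a state is."""

    @abstractmethod
    def value(self, state: State) -> float:
        ...


class RewardFunction(ABC, Generic[State, Action]):
    """Reward interface: scores a single transition."""

    @abstractmethod
    def __call__(self, state: State, action: Action, next_state: State) -> float:
        ...


class LactChain(ABC, Generic[State, Action]):
    """One state --> action --> state transition."""

    @abstractmethod
    def step(self, state: State, action: Action) -> State:
        ...
```

Because the methods are abstract, instantiating any of these classes directly raises `TypeError`, and so does instantiating a subclass that forgets to implement one of them. That is one way to enforce structure on subsequent code.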

Build out specific use cases

Draw schematic of simple use case
Add plausibly useful language action chains using lactchain class
Add code extractor and other functions in state class
Add other extractors to lactchains for pulling specific content (like code) out of GPT-4 responses
Define example format for textblock in state class
Define Policy and Value Function networks
Define Actor-Critic teaching moments (probably temporal-difference (TD) learning updates)
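Two of the items above can be sketched in a few lines. The helper names `extract_code_blocks` and `td_error` are hypothetical, and the sketch assumes that model responses wrap code in triple-backtick fences and that the "teaching moment" is a one-step TD update. Neither assumption is settled in this repo.

```python
import re
from typing import List

# Opening fence, optional language tag, newline, body, closing fence.
_CODE_FENCE = re.compile(r"```[ \t]*([\w+-]*)[ \t]*\n(.*?)```", re.DOTALL)


def extract_code_blocks(response: str) -> List[str]:
    """Return the bodies of all fenced code blocks found in a response string."""
    return [body.strip() for _lang, body in _CODE_FENCE.findall(response)]


def td_error(reward: float, value: float, next_value: float,
             gamma: float = 0.99, done: bool = False) -> float:
    """One-step TD error: r + gamma * V(s') - V(s).

    When the episode has ended (done=True) there is no next state to
    bootstrap from, so the target is just the reward.
    """
    target = reward if done else reward + gamma * next_value
    return target - value
```

In a standard actor-critic setup, the critic is trained to shrink this TD error, and the same quantity is used as the advantage signal that scales the actor's policy update.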
