
Project Period: Jan 13, 2026 – present

This project is under active development. Some components are incomplete or unstable, as the current focus is on validating the overall agent pipeline design rather than full execution.


Title: LangChain Framework for Content Generation Agent Development

Practice implementation of a LangChain-based FFmpeg agent


Overview

The goal of this practice project is not to build a fully automated video editing system. Instead, it focuses on exploring an agent-based pipeline that translates high-level, natural language content creation goals (e.g., “Create a YouTube Shorts video”) into executable commands.

The primary objective is to observe where and why the pipeline succeeds or fails, and to analyze failure cases not as simple errors, but as signals indicating missing structural information, insufficient intermediate representations, or inadequate learning signals.

At the current stage, the focus is on validating the agent flow and chain structure rather than on model performance itself. Therefore, the entire pipeline is initially implemented using a well-pretrained general-purpose language model (e.g., gpt-4o-mini) via API. Once the behavior and failure points of the current agent pipeline are sufficiently observed, the model will be replaced with locally executable open-weight LLMs or SLMs (e.g., via Ollama or Hugging Face).


[First Try] Design (Jan 13, 2026)

  • a_chain: Goal Decomposition and Task Planning

    • The purpose of a_chain is to transform a user’s high-level, natural language goal into an ordered sequence of concrete editing tasks.
    • Formally, the chain is defined as:
    a_chain = a_chain1 | a_chain2
    a_chain1 = user_goal | LLM | planning
    a_chain2 = planning | LLM | task_sequence
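The sequential composition above can be sketched in plain Python. The `Pipe` wrapper below is a minimal stand-in for LangChain's LCEL `|` operator, and the `fake_*_llm` functions are placeholders for the real LLM calls; all names here are illustrative, not part of the actual repo.

```python
class Pipe:
    """A tiny LCEL-style composable step: (f | g).invoke(x) == g(f(x))."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        return Pipe(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Stand-ins for the two LLM calls: goal -> plan, plan -> ordered task list.
def fake_plan_llm(user_goal: str) -> str:
    return f"plan: outline editing steps to {user_goal}"

def fake_task_llm(plan: str) -> list:
    return ["extract highlight clips", "concatenate clips", "add subtitles"]

a_chain1 = Pipe(fake_plan_llm)   # user_goal -> planning
a_chain2 = Pipe(fake_task_llm)   # planning  -> task_sequence
a_chain = a_chain1 | a_chain2    # mirrors: a_chain = a_chain1 | a_chain2

tasks = a_chain.invoke("create a YouTube Shorts video")
print(tasks)
```

In the real implementation the pipe stages would be prompt templates and a chat model rather than plain functions, but the data flow (goal in, ordered task list out) is the same.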
    
  • b_chain: Capability Analysis and Intermediate Representation

    • The b_chain is responsible for interpreting each task generated by a_chain and determining whether it can be directly executed using FFmpeg.
    • It consists of two parallel sub-chains:
    b_chain = b_chain1 ⊕ b_chain2
    b_chain1 = task | LLM | capability_analysis
    b_chain2 = task | LLM | structured_representation
    
    • b_chain1 (Capability Analysis)
      • Determines whether a given task is executable via FFmpeg without additional human input.
      • For example, tasks such as “review the video” or “decide the clip order” are explicitly classified as non-executable.
    • b_chain2 (Structured Representation)
      • For tasks deemed executable, the LLM generates a loosely structured intermediate representation.
      • Notably, no fixed DSL schema is predefined.
      • This design choice allows observation of whether FFmpeg commands can still be generated from inconsistent or partially structured representations.
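The parallel composition (the ⊕ above) can be sketched as a fan-out that feeds the same task to both sub-chains and merges their outputs, analogous to LangChain's `RunnableParallel`. The keyword-based classifier below is a hypothetical stand-in for the real capability-analysis prompt; field names are assumptions for illustration.

```python
# Hypothetical heuristic standing in for the LLM-based capability analysis.
NON_EXECUTABLE_HINTS = ("review", "decide")

def capability_analysis(task: str) -> dict:
    """b_chain1: is this task executable via FFmpeg without human input?"""
    executable = not any(hint in task.lower() for hint in NON_EXECUTABLE_HINTS)
    return {"task": task, "executable": executable}

def structured_representation(task: str) -> dict:
    """b_chain2: a loosely structured form -- deliberately no fixed DSL schema."""
    return {"action": task.split()[0], "target": "input.mp4"}

def b_chain(task: str) -> dict:
    # Parallel fan-out over the same input, merged result (b_chain1 ⊕ b_chain2).
    return {
        "capability": capability_analysis(task),       # b_chain1
        "structure": structured_representation(task),  # b_chain2
    }

print(b_chain("trim the first 10 seconds"))
print(b_chain("review the video"))
```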
  • c_chain: Command Synthesis

    • The final stage, c_chain, is responsible for synthesizing executable FFmpeg commands.
    c_chain = execution_context | LLM | ffmpeg_command
    
    • The input to c_chain is an execution context, which aggregates:

      • the original task,
      • the capability analysis result,
      • the structured intermediate representation,
      • and information about previously generated files.

      Using this aggregated context, the LLM generates a raw FFmpeg command without executing it itself.
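The execution context can be sketched as a plain dictionary that aggregates the four inputs listed above. The field names and the fake command synthesizer below are assumptions for illustration; in the real chain, synthesis is delegated to the LLM and the resulting command string is never executed at this stage.

```python
def pack_context(task, capability, structure, prior_files):
    """Aggregate everything c_chain needs into one execution context."""
    return {
        "task": task,
        "capability": capability,
        "structure": structure,
        "prior_files": prior_files,
    }

def fake_command_llm(ctx: dict) -> str:
    """Stand-in for the LLM call that emits a raw FFmpeg command string."""
    src = ctx["prior_files"][-1] if ctx["prior_files"] else "input.mp4"
    return f"ffmpeg -i {src} -t 10 -c copy clip_000.mp4"

ctx = pack_context(
    task="trim the first 10 seconds",
    capability={"executable": True},
    structure={"action": "trim", "start": 0, "duration": 10},
    prior_files=["raw_footage.mp4"],
)
command = fake_command_llm(ctx)
print(command)  # returned as text only; execution happens elsewhere
```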

  • Project Structure

Practice_LangChain/
├── agent/
│   ├── a_chain.py          # Goal decomposition & task planning
│   ├── b_chain.py          # Capability analysis & intermediate representation
│   ├── c_chain.py          # FFmpeg command synthesis
│   └── agent_runner.py     # Orchestrates the full multi-chain pipeline
│
├── prompts/
│   ├── a_chain_prompts.py  # Prompts for planning and task generation
│   ├── b_chain_prompts.py  # Prompts for capability analysis & structuring
│   └── c_chain_prompts.py  # Prompts for FFmpeg command generation
│
├── utils/
│   ├── task_parser.py      # Parses task sequences from a_chain output
│   └── context_packing.py  # Aggregates execution context across chains
│
├── tools/
│   └── ffmpeg_executor.py  # Executes FFmpeg from generated commands
│
├── configs/
│   └── llm.py              # LLM configuration (API / local model switchable)
│
└── README.md
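Putting the three chains together, the orchestration that `agent_runner.py` performs might look like the sketch below: plan tasks, drop the ones flagged as non-executable, and collect one command per remaining task. Every function here is a simplified stand-in, not the repo's actual implementation.

```python
def plan(goal: str) -> list:
    """Stand-in for a_chain: goal -> ordered task list."""
    return ["review the video", "trim the first 10 seconds"]

def is_executable(task: str) -> bool:
    """Stand-in for b_chain1's capability analysis."""
    return "review" not in task and "decide" not in task

def synthesize(task: str) -> str:
    """Stand-in for c_chain's command synthesis."""
    return f"ffmpeg -i input.mp4 -c copy out.mp4  # for: {task}"

def run(goal: str) -> list:
    commands = []
    for task in plan(goal):           # a_chain
        if not is_executable(task):   # b_chain1: skip human-judgment tasks
            continue
        commands.append(synthesize(task))  # c_chain
    return commands

print(run("create a YouTube Shorts video"))
```

Commands are only collected here, matching the current design in which generation and execution (`tools/ffmpeg_executor.py`) are kept separate.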
