2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -69,7 +69,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- OpenTelemetry-based tracing with OTLP export
- Distributed caching with Rails.cache backend and stampede protection
- Prompt management (text and chat) with Mustache templating
- In-memory caching with TTL and LRU eviction
- In-memory caching with TTL and bounded expiration-ordered eviction
- Fallback prompt support
- Global configuration pattern with `Langfuse.configure`

91 changes: 87 additions & 4 deletions docs/API_REFERENCE.md
@@ -43,8 +43,8 @@ Block receives a configuration object with these properties:
| `cache_max_size` | Integer | No | `1000` | Max cached prompts |
| `cache_backend` | Symbol | No | `:memory` | `:memory` or `:rails` |
| `cache_lock_timeout` | Integer | No | `10` | Lock timeout (seconds) |
| `cache_stale_while_revalidate` | Boolean | No | `false` | Enable SWR (requires stale TTL) |
| `cache_stale_ttl` | Integer | No | `0` | Stale TTL (seconds, >0 enables) |
| `cache_stale_while_revalidate` | Boolean | No | `false` | Advisory SWR intent flag (effective activation depends on `cache_stale_ttl`) |
| `cache_stale_ttl` | Integer or `:indefinite` | No | `0` | Stale TTL (seconds, `>0` enables SWR) |
| `cache_refresh_threads` | Integer | No | `5` | Background refresh threads |
| `batch_size` | Integer | No | `50` | Score + trace export batch size |
| `flush_interval` | Integer | No | `10` | Score + trace export interval (s) |
@@ -149,7 +149,7 @@ get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
| `name` | String | Yes | Prompt name |
| `version` | Integer | No | Specific version (mutually exclusive with `label`) |
| `label` | String | No | Version label (e.g., "production") |
| `fallback` | String | No | Fallback template if not found |
| `fallback` | String or Array<Hash> | No | Fallback prompt if not found (`String` for text, `Array<Hash>` for chat) |
| `type` | Symbol | Conditional | `:text` or `:chat` (required if `fallback` provided) |

**Returns:** `TextPromptClient` or `ChatPromptClient`
@@ -163,7 +163,7 @@ get_prompt(name, version: nil, label: nil, fallback: nil, type: nil)
**Examples:**

```ruby
# Latest version
# API default selection (no version/label sent)
prompt = client.get_prompt("greeting")

# Specific version
@@ -218,6 +218,89 @@ messages = client.compile_prompt("chat-assistant",
# => [{ role: :system, content: "..." }, { role: :user, content: "..." }]
```

### `Client#create_prompt`

Create a new prompt (or a new version if the name already exists).

**Signature:**

```ruby
create_prompt(name:, prompt:, type:, config: {}, labels: [], tags: [], commit_message: nil)
```

**Parameters:**

| Parameter | Type | Required | Description |
| ---------------- | ------------------ | -------- | ------------------------------------------------------------------------ |
| `name` | String | Yes | Prompt name |
| `prompt` | String or Array<Hash> | Yes | Prompt content (String for text, array of role/content hashes for chat) |
| `type` | Symbol | Yes | Prompt type (`:text` or `:chat`) |
| `config` | Hash | No | Prompt config metadata (for example model parameters) |
| `labels` | Array<String> | No | Labels to assign (for example `["production"]`) |
| `tags` | Array<String> | No | Tags for categorization |
| `commit_message` | String | No | Optional commit message |

**Returns:** `TextPromptClient` or `ChatPromptClient`

**Raises:**

- `ArgumentError` for missing/invalid prompt type or content
- `UnauthorizedError` if credentials invalid
- `ApiError` on network/server errors

**Example:**

```ruby
prompt = client.create_prompt(
name: "support-assistant",
prompt: [
{ role: "system", content: "You are a helpful assistant for {{product}}" },
{ role: "user", content: "{{question}}" }
],
type: :chat,
labels: ["staging"],
tags: ["support"],
config: { model: "gpt-4o-mini" }
)
```

### `Client#update_prompt`

Update labels for an existing prompt version.

**Signature:**

```ruby
update_prompt(name:, version:, labels:)
```

**Parameters:**

| Parameter | Type | Required | Description |
| --------- | ------------- | -------- | ----------------------------------------- |
| `name` | String | Yes | Prompt name |
| `version` | Integer | Yes | Prompt version to update |
| `labels` | Array<String> | Yes | Replacement labels for that prompt version |

**Returns:** `TextPromptClient` or `ChatPromptClient`

**Raises:**

- `ArgumentError` if `labels` is not an array
- `NotFoundError` if prompt/version not found
- `UnauthorizedError` if credentials invalid
- `ApiError` on network/server errors

**Example:**

```ruby
prompt = client.update_prompt(
name: "support-assistant",
version: 3,
labels: ["production"]
)
```

### `Client#list_prompts`

List all prompts in the project.
11 changes: 9 additions & 2 deletions docs/ARCHITECTURE.md
@@ -95,7 +95,14 @@ end
HTTP layer with Faraday:

```ruby
api_client = Langfuse::ApiClient.new(config, cache)
api_client = Langfuse::ApiClient.new(
public_key: config.public_key,
secret_key: config.secret_key,
base_url: config.base_url,
timeout: config.timeout,
logger: config.logger,
cache: cache
)
prompt_data = api_client.get_prompt("name")
```

@@ -146,7 +153,7 @@ cached = cache.get(key)
**Features:**
- Thread-safe with Monitor
- TTL expiration
- LRU eviction
- Bounded expiration-ordered eviction

#### RailsCacheAdapter (Distributed)

20 changes: 10 additions & 10 deletions docs/CACHING.md
@@ -19,14 +19,14 @@ For configuration options, see [CONFIGURATION.md](CONFIGURATION.md).

The Langfuse Ruby SDK provides two caching backends to optimize prompt fetching:

1. **In-Memory Cache** (default) - Thread-safe, local cache with TTL and LRU eviction
1. **In-Memory Cache** (default) - Thread-safe, local cache with TTL and bounded expiration-ordered eviction
2. **Rails.cache Backend** - Distributed caching with Redis/Memcached

Both backends support TTL-based expiration and automatic stampede protection (Rails.cache only).
Both backends support TTL-based expiration and stale-while-revalidate (SWR). Distributed stampede protection via locking is specific to the Rails.cache backend; the in-memory backend mitigates stampedes within a single process using Monitor-based single-flight locks.
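
The in-process mitigation described above can be sketched as follows. This is a minimal illustration of Monitor-based single-flight caching; the class and method names are assumptions, and the SDK's real lock granularity differs:

```ruby
require 'monitor'

# Minimal single-flight sketch. Class and method names here are assumptions
# for illustration; the SDK's real implementation differs in detail.
class SingleFlightCache
  def initialize
    @store = {}
    @lock = Monitor.new
  end

  # Only the first caller for a missing key runs the expensive block;
  # concurrent callers wait on the monitor and then hit the cache.
  def fetch(key)
    @lock.synchronize do
      return @store[key] if @store.key?(key)

      @store[key] = yield
    end
  end
end

api_calls = 0
cache = SingleFlightCache.new

threads = 10.times.map do
  Thread.new do
    cache.fetch("greeting") do
      sleep 0.01        # simulate a slow API fetch
      api_calls += 1
      "Hello, world"
    end
  end
end

values = threads.map(&:value)
puts api_calls     # => 1
puts values.uniq   # => ["Hello, world"]
```

Ten concurrent requests result in a single fetch; the other nine threads block briefly and are served from the populated cache.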

## In-Memory Cache (Default)

The default caching backend stores prompts in memory with automatic TTL expiration and LRU eviction.
The default caching backend stores prompts in memory with automatic TTL expiration and bounded eviction when the cache reaches max size.

### Configuration

@@ -42,7 +42,7 @@ end

- **Thread-safe**: Uses Monitor-based synchronization
- **TTL-based expiration**: Automatically expires after configured TTL
- **LRU eviction**: Removes least recently used prompts when max_size is reached
- **Bounded eviction**: When max_size is reached, removes the entry with earliest expiration (`stale_until`)
- **Zero dependencies**: No external services required
- **Fast**: ~1ms cache hits
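
The bounded eviction policy can be sketched like this. Class and field names (e.g. `stale_until`) are assumptions for illustration; the SDK's actual in-memory cache differs in detail:

```ruby
# Hypothetical sketch of bounded, expiration-ordered eviction; not the
# SDK's real implementation.
Entry = Struct.new(:value, :stale_until)

class BoundedCache
  def initialize(max_size:)
    @max_size = max_size
    @store = {}
  end

  def set(key, value, stale_until)
    evict_earliest_expiring if !@store.key?(key) && @store.size >= @max_size
    @store[key] = Entry.new(value, stale_until)
  end

  def get(key)
    entry = @store[key]
    entry && entry.value
  end

  def size
    @store.size
  end

  private

  # Unlike LRU, the victim is the entry closest to expiring.
  def evict_earliest_expiring
    key, _entry = @store.min_by { |_k, entry| entry.stale_until }
    @store.delete(key)
  end
end

cache = BoundedCache.new(max_size: 2)
now = Time.now.to_f
cache.set("a", 1, now + 60)
cache.set("b", 2, now + 600)
cache.set("c", 3, now + 300)   # full: "a" has the earliest stale_until, so it is evicted

puts cache.get("a").inspect                    # => nil
puts [cache.get("b"), cache.get("c")].inspect  # => [2, 3]
```

Note that `"a"` is evicted even though it was neither the oldest insertion nor the least recently used entry: only its expiration time matters.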

@@ -179,8 +179,8 @@ Total latency: ~1ms
Langfuse.configure do |config|
config.cache_backend = :memory # Works with both :memory and :rails
config.cache_ttl = 300 # Fresh for 5 minutes
config.cache_stale_while_revalidate = true # Enable SWR
config.cache_stale_ttl = 300 # Serve stale for up to 5 minutes
config.cache_stale_while_revalidate = true # Advisory intent flag
config.cache_stale_ttl = 300 # `> 0` activates SWR; serve stale for up to 5 minutes
end
```

@@ -419,7 +419,7 @@ puts "Cached #{results[:success].size} prompts"
# Warm with different label
results = warmer.warm_all(default_label: "staging")

# Warm latest versions (no label)
# Warm without a label (API-determined selection)
results = warmer.warm_all(default_label: nil)
```

@@ -536,8 +536,8 @@ See [CONFIGURATION.md](CONFIGURATION.md) for all cache-related configuration options.

**In-Memory Cache:**

- TTL expiration + LRU eviction
- Evicts least recently used when max_size reached
- TTL expiration + bounded eviction
- Evicts the entry with the earliest expiration (`stale_until`) when max_size is reached

**Rails.cache:**

Expand All @@ -564,7 +564,7 @@ config.cache_stale_while_revalidate = !Rails.env.development?

# Production: enabled for best performance
if Rails.env.production?
config.cache_stale_ttl = config.cache_ttl # Auto-set, but can customize
config.cache_stale_ttl = config.cache_ttl # Set explicitly (common default)
end
```

20 changes: 10 additions & 10 deletions docs/CONFIGURATION.md
@@ -121,26 +121,25 @@ See [CACHING.md](CACHING.md#stampede-protection) for details.

- **Type:** Boolean
- **Default:** `false`
- **Description:** Enable stale-while-revalidate caching pattern
- **Description:** Advisory SWR intent flag (effective SWR behavior is controlled by `cache_stale_ttl`)

```ruby
config.cache_stale_while_revalidate = true # Enable SWR
config.cache_stale_while_revalidate = true # Optional intent flag
```

When enabled, serves stale cached data immediately while refreshing in the background. This dramatically reduces P99 latency by avoiding synchronous API waits after cache expiration.
This flag does not independently turn SWR on or off. SWR activates when `cache_stale_ttl > 0`; the flag exists only as an advisory indicator of intent.

**Behavior:**
**Behavior (driven by `cache_stale_ttl`):**

- `false` (default): Cache expires at TTL, next request waits for API (~100ms)
- `true`: After TTL, serves stale data instantly (~1ms) + refreshes in background
- `cache_stale_ttl <= 0` (default): Cache expires at TTL, next request waits for API (~100ms)
- `cache_stale_ttl > 0`: After TTL, serves stale data instantly (~1ms) + refreshes in background

**Important:** SWR only activates when `cache_stale_ttl` is a positive value. Set it explicitly (typically equal to `cache_ttl`).
**Important:** To activate SWR, set `cache_stale_ttl` to a positive value (typically equal to `cache_ttl`).

**Compatibility:**

- ✅ Works with `:memory` backend
- ✅ Works with `:rails` backend
- Set `cache_stale_ttl` to a positive value to activate SWR (often the same as `cache_ttl`)
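
The resulting fresh/stale/expired behavior can be modeled as follows. The struct fields and state names are assumptions for illustration, not the SDK's internals:

```ruby
# Simplified model of the fresh/stale/expired decision. Field and state
# names are assumptions, not the SDK's internal API.
CacheEntry = Struct.new(:value, :fresh_until, :stale_until)

def swr_state(entry, now)
  return :fresh if now < entry.fresh_until    # serve cached value directly
  return :stale if now < entry.stale_until    # serve stale + background refresh
  :expired                                    # synchronous API fetch required
end

now = Time.now
# cache_ttl = 300 and cache_stale_ttl = 300 => stale window ends at now + 600
entry = CacheEntry.new("Hello", now + 300, now + 600)

puts swr_state(entry, now)        # => fresh
puts swr_state(entry, now + 400)  # => stale
puts swr_state(entry, now + 700)  # => expired
```

With `cache_stale_ttl <= 0`, `stale_until` coincides with `fresh_until`, so the `:stale` window disappears and expiration goes straight to `:expired`.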

See [CACHING.md](CACHING.md#stale-while-revalidate-swr) for detailed usage.

@@ -414,7 +413,8 @@ Langfuse.configure do |config|
config.secret_key = Rails.application.credentials.dig(:langfuse, :secret_key)
config.cache_ttl = 300 # Longer TTL for stability
config.cache_backend = :rails # Shared cache
config.cache_stale_while_revalidate = true # Enable SWR for best latency
config.cache_stale_while_revalidate = true # Advisory intent flag (SWR activates via cache_stale_ttl > 0)
config.cache_stale_ttl = 300 # Activates SWR
config.timeout = 10 # Handle network variability
config.logger = Rails.logger
end
@@ -435,7 +435,7 @@ Langfuse.configure do |config|
config.public_key = 'pk-lf-test'
config.secret_key = 'sk-lf-test'
config.cache_backend = :memory # Isolated per-process cache
config.cache_stale_while_revalidate = false # Disable SWR for predictable tests
config.cache_stale_ttl = 0 # Disable SWR for predictable tests
end
```

8 changes: 4 additions & 4 deletions docs/PROMPTS.md
@@ -183,13 +183,13 @@ prompt = client.get_prompt("greeting", label: "production") # => version 3

### Best Practices

1. **Default to production:** Omitting `version`/`label` fetches the `production`-labeled prompt (matching JS/Python SDK behavior)
2. **Use labels in production:** Pin to `production` label for stability
1. **Be explicit in production:** Always pass `label: "production"` for deterministic selection
2. **Treat implicit selection as API-defined:** Omitting both `version` and `label` sends no selector — the Langfuse API decides which version to return
3. **Version for rollback:** Keep version numbers for emergency rollbacks

```ruby
# Development
prompt = client.get_prompt("greeting") # Latest version
# Development (API-defined selection)
prompt = client.get_prompt("greeting")

# Production
prompt = client.get_prompt("greeting", label: "production") # Stable
```
2 changes: 1 addition & 1 deletion lib/langfuse/cache_warmer.rb
@@ -85,7 +85,7 @@ def warm(prompt_names, versions: {}, labels: {})
# @example Warm with a different default label
# results = warmer.warm_all(default_label: "staging")
#
# @example Warm without any label (latest versions)
# @example Warm without any label (API-determined selection)
# results = warmer.warm_all(default_label: nil)
#
# @example With specific versions for some prompts