Merged
46 changes: 46 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,46 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [3.0.0] - 2026-01-30

### Security

- **CRITICAL**: Removed client-side URL fetching to prevent SSRF vulnerabilities
- URLs are now passed to the server for secure server-side fetching
- Restricted `sign()` method to local files only (API limitation)

### Changed

- **BREAKING**: `sign()` only accepts local files (paths, bytes, file objects) - no URLs
- **BREAKING**: Most methods now accept `FileInputWithUrl` - URLs passed to server
- **BREAKING**: Removed client-side PDF parsing; the client now relies on the API's negative index support
- Methods like `rotate()`, `split()`, and `delete_pages()` now support negative indices (-1 = last page)
- All methods except `sign()` accept URLs that are passed securely to the server

### Removed

- **BREAKING**: Removed `process_remote_file_input()` from public API (security risk)
- **BREAKING**: Removed `get_pdf_page_count()` from public API (client-side PDF parsing)
- **BREAKING**: Removed `is_valid_pdf()` from public API (internal use only)
- Removed ~200 lines of client-side PDF parsing code

### Added

- SSRF protection documentation in README
- Migration guide (docs/MIGRATION.md)
- Security best practices for handling remote files
- Support for negative page indices in all page-based methods

## [2.0.0] - 2025-01-09

- Initial stable release with full API coverage
- Async-first design with httpx and aiofiles
- Comprehensive type hints and mypy strict mode
- Workflow builder with staged pattern
- Error hierarchy with typed exceptions
75 changes: 75 additions & 0 deletions docs/MIGRATION.md
@@ -0,0 +1,75 @@
# Migration Guide: v2.x to v3.0

## Overview

Version 3.0.0 introduces SSRF protection and removes client-side PDF parsing.

## Key Changes

### 1. `sign()` No Longer Accepts URLs (API Limitation)

**Before (v2.x)**:
```python
result = await client.sign('https://example.com/document.pdf', {...})
```

**After (v3.0)** - Fetch file first:
```python
import httpx

async with httpx.AsyncClient() as http:
    url = 'https://example.com/document.pdf'

    # IMPORTANT: Validate URL
    if not url.startswith('https://trusted-domain.com/'):
        raise ValueError('URL not from trusted domain')

    response = await http.get(url, timeout=10.0)
    response.raise_for_status()
    pdf_bytes = response.content

result = await client.sign(pdf_bytes, {...})
```
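
The prefix check above relies on the trailing slash to stop look-alike hosts such as `trusted-domain.com.evil.example`. A sturdier variant parses the URL and compares the hostname against an allowlist. The following is a minimal sketch, not part of `nutrient_dws`; `TRUSTED_HOSTS` and `is_trusted_url` are hypothetical names chosen for illustration:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: replace with the hosts your application actually trusts.
TRUSTED_HOSTS = {"trusted-domain.com"}


def is_trusted_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose hostname is on the allowlist."""
    parts = urlsplit(url)
    # parts.hostname is lowercased and excludes userinfo and port,
    # so "https://trusted-domain.com@evil.example/" resolves to "evil.example".
    return parts.scheme == "https" and parts.hostname in TRUSTED_HOSTS
```

Calling `is_trusted_url(url)` before `http.get(url, ...)` could replace the `startswith` check in the example above.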

### 2. Most Methods Now Accept URLs (Passed Directly to DWS)

Good news! These methods now accept URLs, which are passed to DWS for server-side fetching:
- `rotate()`, `split()`, `add_page()`, `duplicate_pages()`, `delete_pages()`
- `set_page_labels()`, `set_metadata()`, `optimize()`
- `flatten()`, `apply_instant_json()`, `apply_xfdf()`
- All redaction methods
- `convert()`, `ocr()`, `watermark_*()`, `extract_*()`, `merge()`, `password_protect()`

**Example**:
```python
# This now works!
result = await client.rotate('https://example.com/doc.pdf', 90, pages={'start': 0, 'end': 5})
```

### 3. Negative Page Indices Now Supported

Use negative indices for "from end" references:
- `-1` = last page
- `-2` = second-to-last page
- etc.

**Examples**:
```python
# Rotate last 3 pages
await client.rotate(pdf, 90, pages={'start': -3, 'end': -1})

# Delete first and last pages
await client.delete_pages(pdf, [0, -1])

# Split: keep middle pages, excluding first and last
await client.split(pdf, [{'start': 1, 'end': -2}])
```
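
The server resolves negative indices, so the client no longer needs the page count. As a rough mental model only, assuming zero-based indices and inclusive range ends (which the examples above imply), the mapping might look like the sketch below. `resolve_page_index` and `resolve_page_range` are hypothetical helpers for illustration, not DWS or `nutrient_dws` functions:

```python
def resolve_page_index(index: int, page_count: int) -> int:
    """Map a possibly negative page index to a zero-based absolute index."""
    resolved = index + page_count if index < 0 else index
    if not 0 <= resolved < page_count:
        raise IndexError(f"page index {index} out of range for {page_count} pages")
    return resolved


def resolve_page_range(start: int, end: int, page_count: int) -> tuple[int, int]:
    """Resolve an inclusive {'start', 'end'} range against a known page count."""
    first = resolve_page_index(start, page_count)
    last = resolve_page_index(end, page_count)
    if first > last:
        raise ValueError(f"empty range: start={start}, end={end}")
    return first, last
```

Under these assumptions, `{'start': -3, 'end': -1}` on a 10-page document covers pages 7 through 9, and `{'start': 1, 'end': -2}` covers pages 1 through 8.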

### 4. Removed from Public API

- `process_remote_file_input()` - No longer needed (URLs passed to server)
- `get_pdf_page_count()` - Use negative indices instead
- `is_valid_pdf()` - Let server validate (internal use only)

**Still Available:**
- `is_remote_file_input()` - Helper to detect if input is a URL (still public)
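
Code that called `get_pdf_page_count()` only to address pages from the end of a document can usually drop the call and build the range from negative indices instead. A minimal sketch, assuming the `{'start': ..., 'end': ...}` range shape used in the examples above; `last_n_pages` is a hypothetical helper, not part of `nutrient_dws`:

```python
def last_n_pages(n: int) -> dict[str, int]:
    """Build a page range covering the last n pages without knowing the page count."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Previously: {'start': page_count - n, 'end': page_count - 1}
    return {"start": -n, "end": -1}
```

For example, `await client.rotate(pdf, 90, pages=last_n_pages(3))` would express the "rotate last 3 pages" case from section 3.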
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -18,7 +18,7 @@ nutrient_dws_scripts = [

 [project]
 name = "nutrient-dws"
-version = "2.0.0"
+version = "3.0.0"
 description = "Python client library for Nutrient Document Web Services API"
 readme = "README.md"
 requires-python = ">=3.10"
@@ -112,7 +112,7 @@ ignore = [
 convention = "google"

 [tool.ruff.lint.per-file-ignores]
-"tests/*" = [] # Don't require docstrings in tests, allow asserts
+"tests/*" = ["D102"] # Don't require docstrings in tests

 [tool.mypy]
 python_version = "3.10"
8 changes: 6 additions & 2 deletions src/nutrient_dws/__init__.py
@@ -12,24 +12,28 @@
     ValidationError,
 )
 from nutrient_dws.inputs import (
+    FileInput,
+    LocalFileInput,
+    UrlFileInput,
     is_remote_file_input,
     process_file_input,
-    process_remote_file_input,
     validate_file_input,
 )
 from nutrient_dws.utils import get_library_version, get_user_agent

 __all__ = [
     "APIError",
     "AuthenticationError",
+    "FileInput",
+    "LocalFileInput",
     "NetworkError",
     "NutrientClient",
     "NutrientError",
+    "UrlFileInput",
     "ValidationError",
     "get_library_version",
     "get_user_agent",
     "is_remote_file_input",
     "process_file_input",
-    "process_remote_file_input",
     "validate_file_input",
 ]
2 changes: 1 addition & 1 deletion src/nutrient_dws/builder/builder.py
@@ -85,7 +85,7 @@ def _register_asset(self, asset: FileInput) -> str:
         """Register an asset in the workflow and return its key for use in actions.

         Args:
-            asset: The asset to register
+            asset: The asset to register (must be local, not URL)

         Returns:
             The asset key that can be used in BuildActions