Releases: PSPDFKit/nutrient-dws-client-python

v3.0.0 - SSRF Security Hardening

18 Feb 04:01
7457a52

[3.0.0] - 2026-01-30

Security

  • CRITICAL: Removed client-side URL fetching to prevent SSRF vulnerabilities
  • URLs are now passed to the server for secure server-side fetching
  • Restricted sign() method to local files only (API limitation)
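
Because sign() now takes local files only, a caller that starts from a URL has to download the file itself before signing. The following is a minimal sketch of a pre-download check, not part of the library: is_allowed_url and ALLOWED_HOSTS are hypothetical names, and the allowlist policy is an assumption. It does not cover redirects or DNS rebinding, which a real fetcher would also need to handle.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your application trusts.
ALLOWED_HOSTS = {"files.example.com"}


def is_allowed_url(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # Malformed URL, e.g. an unterminated IPv6 literal.
    if parts.scheme != "https" or parts.hostname is None:
        return False
    host = parts.hostname.lower()
    try:
        # Reject IP literals outright (127.0.0.1, 169.254.169.254, ::1, ...).
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass  # Not an IP literal; fall through to the allowlist check.
    return host in allowed_hosts
```

Only after this check passes would the caller download the file and hand the resulting bytes or path to sign().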

Changed

  • BREAKING: sign() only accepts local files (paths, bytes, file objects) - no URLs
  • BREAKING: Most methods now accept FileInputWithUrl - URLs passed to server
  • BREAKING: Removed client-side PDF parsing - leverage API's negative index support
  • Methods like rotate(), split(), deletePages() now support negative indices (-1 = last page)
  • All methods except sign() accept URLs that are passed securely to the server
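
To illustrate what negative indices mean, here is a small sketch of the resolution the server now performs in place of the client. It assumes zero-based indices and Python-style counting from the end; the release notes only confirm that -1 is the last page. resolve_page_index is a hypothetical helper, not a library function.

```python
def resolve_page_index(index: int, page_count: int) -> int:
    """Map a possibly negative page index to a zero-based position."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    # Negative indices count back from the end: -1 is the last page.
    resolved = index + page_count if index < 0 else index
    if not 0 <= resolved < page_count:
        raise IndexError(f"page index {index} out of range for {page_count} pages")
    return resolved
```

Since the API resolves these indices itself, the client no longer needs to know the page count, which is why the client-side PDF parsing could be dropped.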

Removed

  • BREAKING: Removed process_remote_file_input() from public API (security risk)
  • BREAKING: Removed get_pdf_page_count() from public API (client-side PDF parsing)
  • BREAKING: Removed is_valid_pdf() from public API (internal use only)
  • Removed ~200 lines of client-side PDF parsing code
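
Code that relied on is_valid_pdf() for a quick sanity check can do a similar check locally. The sketch below is an assumption about what such a check might look like, not the removed implementation: looks_like_pdf is a hypothetical name, and it only inspects the file header, so it says nothing about whether the rest of the file is well-formed.

```python
from pathlib import Path
from typing import Union

# Every PDF file begins with this marker, followed by the version number.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: Union[bytes, str, Path]) -> bool:
    """Cheap header check: True if the input starts with the PDF marker."""
    if isinstance(data, (str, Path)):
        # Treat strings and Path objects as file paths and read the header.
        with open(data, "rb") as fh:
            head = fh.read(len(PDF_MAGIC))
    else:
        head = bytes(data[: len(PDF_MAGIC)])
    return head == PDF_MAGIC
```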

Added

  • SSRF protection documentation in README
  • Migration guide (docs/MIGRATION.md)
  • Security best practices for handling remote files
  • Support for negative page indices in all page-based methods

v1.0.2 - Major Feature Release

03 Jul 17:40

What's Changed

This release adds significant new functionality with 13 new Direct API methods and numerous stability improvements.

✨ New Features

Direct API Methods

  • create_redactions_preset() - Create redactions using predefined patterns (SSN, email, phone, etc.)
  • create_redactions_regex() - Create redactions using custom regex patterns
  • create_redactions_text() - Create redactions for specific text strings
  • optimize_pdf() - Optimize PDF file size and performance
  • password_protect_pdf() - Add password protection to PDFs
  • set_pdf_metadata() - Update PDF metadata (title, author)
  • split_pdf() - Split PDFs into multiple files based on page ranges
  • duplicate_pdf_pages() - Duplicate specific pages within a PDF
  • delete_pdf_pages() - Remove specific pages from a PDF
  • add_page() - Insert blank pages at specific positions
  • apply_instant_json() - Apply PSPDFKit Instant JSON annotations
  • apply_xfdf() - Apply XFDF annotations to PDFs
  • set_page_label() - Set custom page labels (Roman numerals, letters, etc.)
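
Before passing a custom pattern to create_redactions_regex(), it can help to try it on sample text locally. This sketch uses Python's re module only; the pattern and sample are illustrative, find_matches is a hypothetical helper, and the regex dialect the API accepts may differ from Python's.

```python
import re

# Illustrative pattern for US Social Security numbers in NNN-NN-NNNN form.
SSN_PATTERN = r"\b\d{3}-\d{2}-\d{4}\b"


def find_matches(pattern: str, text: str) -> list[str]:
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)


sample = "Applicant SSN: 123-45-6789, phone: 555-010-1234."
matches = find_matches(SSN_PATTERN, sample)
print(matches)  # ['123-45-6789']
```

The phone number is not matched because its middle group has three digits, which shows the pattern is narrow enough to avoid that false positive.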

Enhancements

  • 🖼️ Image file support for watermark_pdf() method - now accepts PNG/JPEG images as watermarks
  • 🧪 Improved CI/CD integration test strategy with better error reporting
  • 📈 Enhanced test coverage for all new Direct API methods

🐛 Bug Fixes

  • Fixed critical API compatibility issues in the Direct API integration
  • Fixed Python 3.9 and 3.10 syntax compatibility across the codebase
  • Resolved the remaining CI failures
  • Aligned integration tests with actual API behavior
  • Fixed Ruff linting and formatting issues throughout the project
  • Fixed MyPy type checking errors and improved type annotations
  • Removed unsupported parameters from API calls
  • Fixed page range handling in split_pdf with proper defaults
  • Resolved runtime errors with isinstance union syntax
  • Updated test fixtures to use valid PNG images
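
The isinstance union issue is a common one when code has to run on older interpreters. The release notes do not show the affected code, so the following is only a sketch of the general pattern, with a hypothetical function name: the tuple form of isinstance works on every Python 3 version, while the `X | Y` form needs Python 3.10 or later.

```python
def is_path_or_bytes(value: object) -> bool:
    """Type check written with a tuple so it also runs on Python 3.9."""
    # isinstance(value, str | bytes) requires Python 3.10+. On 3.9, evaluating
    # `str | bytes` raises TypeError before isinstance is even called.
    return isinstance(value, (str, bytes))
```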

📋 Requirements

  • Python 3.10+ (maintained as per project design)
  • requests>=2.25.0,<3.0.0

📦 Installation

pip install nutrient-dws==1.0.2

📚 Documentation

See the README for usage examples of the new features.

Full Changelog: v1.0.1...v1.0.2

v1.0.1 - Critical Documentation Fix and Testing Improvements

20 Jun 17:46

Release Notes - v1.0.1

Release Date: June 20, 2024

🐛 Critical Bug Fixes

Documentation Error Fixed

  • Fixed README.md: Corrected documentation to use NutrientTimeoutError instead of TimeoutError in import examples and exception handling
  • Resolved Import Error: Users following README examples will no longer get ImportError: cannot import name 'TimeoutError'

CI/Testing Stability

  • Test Collection: Fixed pytest collection failures in CI environments
  • TOML Configuration: Removed duplicate setuptools configuration causing installation errors
  • Type Checking: Resolved mypy errors across all modules
  • Linting: Fixed all ruff linting issues (W292, W293, RUF034, SIM115, B017, E501)

✨ New Features

Testing Infrastructure

  • 31 Comprehensive Unit Tests: Added full test coverage for all major components
    • HTTP client tests (5 tests)
    • File handler tests (5 tests)
    • Builder API tests (5 tests)
    • Exception handling tests
    • Client functionality tests
  • Integration Test Framework: New CI workflow for testing against live API
    • Runs on all Python versions (3.8-3.12)
    • Secure API key handling via GitHub secrets
    • Automatic config cleanup
    • Basic smoke test for API connectivity
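
One way integration tests can consume a key supplied through CI secrets is to read it from the environment and skip when it is absent. This is a sketch under assumptions, not the project's actual test setup: the variable name NUTRIENT_API_KEY and the helper load_api_key are hypothetical.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; a CI workflow would map a repository secret to it.
API_KEY_ENV_VAR = "NUTRIENT_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the API key from the environment, or None if unset or blank."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    return key or None
```

A test module can call load_api_key() once and skip its live-API tests when the result is None, so that forks and local runs without the secret still pass.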

Development Quality

  • Repository Enhancement: Added badges, issue templates, and documentation
  • CI Pipeline: Improved workflow with better error handling and debugging

🔧 Technical Improvements

  • All tests pass on Python 3.8-3.12
  • CI pipeline is stable and reliable
  • Integration tests provide continuous API validation
  • Code coverage and quality metrics tracked
  • Type safety enhanced with better annotations

📋 What's Changed

Full Changelog: v1.0.0...v1.0.1

This patch release fixes a critical documentation bug: the README examples imported TimeoutError, which the library does not export, so users who followed them hit an ImportError. It also adds significant testing infrastructure and stability improvements, based on 29 commits of fixes and enhancements.

Upgrade recommended for all users to avoid import errors.

v1.0.0 - First Stable Release

17 Jun 18:54

Release Notes - v1.0.0

Release Date: June 17, 2024

We are excited to announce the first release of the official Python client library for Nutrient Document Web Services (DWS) API! This library provides a comprehensive, Pythonic interface for document processing operations including PDF manipulation, OCR, watermarking, and more.

🎉 Highlights

Dual API Design

The library offers two complementary ways to interact with the Nutrient API:

  1. Direct API - Simple method calls for single operations
  2. Builder API - Fluent interface for complex, multi-step workflows

Automatic Office Document Conversion

A major discovery during development: the Nutrient API automatically converts Office documents (DOCX, XLSX, PPTX) to PDF when processing them. This means you can:

  • Apply any PDF operation directly to Office documents
  • Mix PDFs and Office documents in merge operations
  • Skip explicit conversion steps in your workflows

Enterprise-Ready Features

  • Robust Error Handling: Comprehensive exception hierarchy for different error scenarios
  • Automatic Retries: Built-in retry logic for transient failures
  • Connection Pooling: Optimized performance for multiple requests
  • Large File Support: Automatic streaming for files over 10MB
  • Type Safety: Full type hints for better IDE support
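
To make "retry logic for transient failures" concrete, here is a generic sketch of retrying with exponential backoff. It is not the library's implementation: retry_with_backoff, its parameters, and the choice of retryable exceptions are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter is injectable so the backoff schedule can be tested without waiting. Errors outside retry_on, such as an authentication failure, propagate immediately because retrying them would not help.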

📦 Installation

pip install nutrient-dws

🚀 Quick Start

from nutrient_dws import NutrientClient

# Initialize client
client = NutrientClient(api_key="your-api-key")

# Direct API - Single operation
client.rotate_pages("document.pdf", output_path="rotated.pdf", degrees=90)

# Convert Office document to PDF (automatic!)
client.convert_to_pdf("report.docx", output_path="report.pdf")

# Builder API - Complex workflow
(
    client.build(input_file="scan.pdf")
    .add_step("ocr-pdf", {"language": "english"})
    .add_step("watermark-pdf", {"text": "CONFIDENTIAL"})
    .add_step("flatten-annotations")
    .execute(output_path="processed.pdf")
)

# Merge PDFs and Office documents together
client.merge_pdfs([
    "chapter1.pdf",
    "chapter2.docx",
    "appendix.xlsx"
], output_path="complete_document.pdf")

🔧 Supported Operations

  • convert_to_pdf - Convert Office documents to PDF
  • flatten_annotations - Flatten form fields and annotations
  • rotate_pages - Rotate specific or all pages
  • ocr_pdf - Make scanned PDFs searchable (English & German)
  • watermark_pdf - Add text or image watermarks
  • apply_redactions - Apply redaction annotations
  • merge_pdfs - Combine multiple documents

🛡️ Error Handling

The library provides specific exceptions for different error scenarios:

from nutrient_dws import NutrientClient, AuthenticationError, ValidationError

try:
    client = NutrientClient(api_key="your-api-key")
    result = client.ocr_pdf("scan.pdf")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print(f"Invalid parameters: {e.errors}")

📚 Documentation

🧪 Quality Assurance

  • Test Coverage: 92.46% with 82 unit tests
  • Type Checking: Full mypy compliance
  • Code Quality: Enforced with ruff and pre-commit hooks
  • CI/CD: Automated testing on Python 3.8-3.12

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

📝 License

This project is licensed under the MIT License.

🙏 Acknowledgments

Special thanks to the Nutrient team for their excellent API and documentation.


Note: This is the initial release. We're actively working on additional features including more language support for OCR, additional file format support, and performance optimizations. Stay tuned!

For questions or support, please open an issue.