[FEATURE] RFC: Regex Routing Support for PivotPHP #1

@code-cfernandes

Summary

Add support for regex-based route constraints in PivotPHP to enable more precise URL matching and parameter validation at the routing level.

Motivation

Currently, PivotPHP uses a simple :parameter syntax in which each parameter matches any run of non-slash characters. Developers therefore have to validate parameters in handlers or middleware, which leads to:

  • Duplicate validation logic across routes
  • Less precise route matching
  • Potential security issues from unvalidated inputs
  • Performance overhead from unnecessary handler execution

Proposed Solution

1. Extended Parameter Syntax (Recommended)

Extend the current syntax with optional regex constraints while maintaining backward compatibility:

// Current syntax remains unchanged
Router::get('/users/:id', $handler);

// New syntax with inline constraints
Router::get('/users/:id<\d+>', $handler);                    // Only digits
Router::get('/posts/:year<\d{4}>/:month<\d{2}>', $handler); // Year: 4 digits, Month: 2 digits
Router::get('/files/:filename<[\w-]+>\.:ext<jpg|png|gif>', $handler); // Literal dot (escaped), then specific extensions
Router::get('/api/:version<v\d+>/users', $handler);         // Version format: v1, v2, etc.
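
To make the effect of an inline constraint concrete, here is a minimal sketch of how a `:name<constraint>` segment could expand into a capture group. `sketchCompile()` is a hypothetical helper written for this illustration, not an existing PivotPHP function, and it ignores shortcuts and full regex blocks.

```php
<?php
// Hypothetical helper for illustration only (not part of PivotPHP).
// Expands ":name<constraint>" into "(constraint)" and falls back to
// "[^/]+" when no constraint is given, mirroring the current behavior.
function sketchCompile(string $route): string
{
    $body = preg_replace_callback(
        '/:([a-zA-Z_][a-zA-Z0-9_]*)(?:<([^>]+)>)?/',
        fn (array $m): string => '(' . ($m[2] ?? '[^/]+') . ')',
        $route
    );

    return '#^' . $body . '$#';
}

$unconstrained = sketchCompile('/users/:id');      // #^/users/([^/]+)$#
$constrained   = sketchCompile('/users/:id<\d+>'); // #^/users/(\d+)$#

var_dump(preg_match($unconstrained, '/users/abc')); // int(1)
var_dump(preg_match($constrained, '/users/abc'));   // int(0)
var_dump(preg_match($constrained, '/users/42'));    // int(1)
```

With the constraint in place, `/users/abc` never reaches the handler, which is the behavior this RFC is asking for.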

2. Common Constraint Shortcuts

Provide built-in shortcuts for common patterns:

Router::get('/users/:id<int>', $handler);        // Alias for \d+
Router::get('/posts/:slug<slug>', $handler);     // Alias for [a-z0-9-]+
Router::get('/search/:query<alpha>', $handler);  // Alias for [a-zA-Z]+
Router::get('/files/:name<alnum>', $handler);    // Alias for [a-zA-Z0-9]+
Router::get('/api/:uuid<uuid>', $handler);       // Alias for a lowercase-hex 8-4-4-4-12 UUID pattern

3. Full Regex Support (Advanced)

For complex patterns, support full regex between curly braces:

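// Full regex pattern (no ^ or $ inside the braces: the compiled route is anchored as a whole)
Router::get('/archive/{(\d{4})/(\d{2})/(.+)}', function($req, $res) {
    [$year, $month, $slug] = $req->captures(); // e.g. ['2024', '01', 'my-post'] (captures are strings)
});

// Mixed syntax (the slash before the block is part of the literal path)
Router::get('/blog/:category<\w+>/{(.+\.html)}', $handler);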

Implementation Details

Changes to RouteCache::compilePattern()

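private static function compilePattern(string $pattern): array
{
    $params = [];
    $position = 0;

    // Handle :name<constraint> and {...} in a single pass. With two passes,
    // the {...} pass would rewrite quantifiers such as {4} that the first
    // pass had just emitted for :year<\d{4}>.
    $pattern = preg_replace_callback(
        '/:([a-zA-Z_][a-zA-Z0-9_]*)(?:<([^>]+)>)?|\{((?:[^{}]++|\{[^{}]*+\})+)\}/',
        function ($matches) use (&$params, &$position) {
            if (isset($matches[3])) {
                // Full regex block (one level of nested braces is allowed).
                // Strip its own ^ and $: the route is anchored as a whole.
                $raw = preg_replace('/^\^|\$$/', '', $matches[3]);
                $position += self::countCaptureGroups($raw);
                return $raw;
            }

            $paramName = $matches[1];
            $constraint = $matches[2] ?? '[^/]+'; // Default constraint

            // Handle shortcuts
            $constraint = self::resolveConstraintShortcut($constraint);

            $params[] = ['name' => $paramName, 'position' => $position];
            // Skip this group plus any groups nested inside the constraint
            $position += 1 + self::countCaptureGroups($constraint);
            return '(' . $constraint . ')';
        },
        $pattern
    );

    // Note: '#' is the delimiter, so constraints must not contain a bare '#'
    return [
        'regex' => '#^' . $pattern . '/?$#',
        'params' => $params
    ];
}

private static function resolveConstraintShortcut(string $constraint): string
{
    $shortcuts = [
        'int' => '\d+',
        'slug' => '[a-z0-9-]+',
        'alpha' => '[a-zA-Z]+',
        'alnum' => '[a-zA-Z0-9]+',
        'uuid' => '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}',
    ];

    return $shortcuts[$constraint] ?? $constraint;
}

// Approximate count of capturing groups: an unescaped "(" that does not
// start a "(?...)" construct. Named groups and a "(" inside a character
// class are not handled by this sketch.
private static function countCaptureGroups(string $regex): int
{
    return (int) preg_match_all('/(?<!\\\\)\((?!\?)/', $regex);
}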
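
For context, this is roughly how a caller might turn the compiled result back into named parameters. It is a sketch only: `extractParams()` is a hypothetical name, and it assumes that `position` is the zero-based index of the parameter's capture group, as in the array shape returned above.

```php
<?php
// Hypothetical consumer of the ['regex' => ..., 'params' => ...] shape.
// Returns null when the path does not match the route.
function extractParams(array $compiled, string $path): ?array
{
    if (preg_match($compiled['regex'], $path, $m) !== 1) {
        return null;
    }

    $values = [];
    foreach ($compiled['params'] as $param) {
        // $m[0] is the whole match, so capture groups are 1-based in $m,
        // while 'position' is assumed to be 0-based.
        $values[$param['name']] = $m[$param['position'] + 1] ?? null;
    }

    return $values;
}

// Hand-written stand-in for what '/posts/:year<\d{4}>/:month<\d{2}>'
// would compile to.
$compiled = [
    'regex'  => '#^/posts/(\d{4})/(\d{2})/?$#',
    'params' => [
        ['name' => 'year',  'position' => 0],
        ['name' => 'month', 'position' => 1],
    ],
];

var_dump(extractParams($compiled, '/posts/2024/01')); // year => "2024", month => "01"
var_dump(extractParams($compiled, '/posts/24/1'));    // NULL
```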

Performance Considerations

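  • Compiled patterns are cached (existing behavior), so constraints add no per-request compilation cost
  • The docs should warn that constraints with nested quantifiers, such as (a+)+, can cause catastrophic backtracking (ReDoS)
  • Consider enforcing limits on regex complexity or matching effort in production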
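
One possible way to bound matching effort is PHP's `pcre.backtrack_limit` setting combined with `preg_last_error()`. The sketch below assumes a hypothetical `safeMatch()` wrapper and an arbitrary limit of 10000; how the limit is counted depends on the PCRE build and on whether JIT is enabled, so the exact threshold would need benchmarking.

```php
<?php
// Hypothetical wrapper: matches with a temporary backtrack limit and
// treats "limit exceeded" as "no match" instead of failing the request.
function safeMatch(string $regex, string $path, int $backtrackLimit = 10000): bool
{
    $previous = ini_get('pcre.backtrack_limit');
    ini_set('pcre.backtrack_limit', (string) $backtrackLimit);

    try {
        $result = @preg_match($regex, $path);

        if ($result === false && preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
            // The pattern was too expensive for this input; a real router
            // would probably log this before moving to the next route.
            return false;
        }

        return $result === 1;
    } finally {
        // Always restore the previous setting.
        ini_set('pcre.backtrack_limit', (string) $previous);
    }
}

// A pattern with nested quantifiers and an input that cannot match.
// The call returns false whether or not the limit is hit.
var_dump(safeMatch('#^(a+)+$#', str_repeat('a', 30) . 'b', 1000));
var_dump(safeMatch('#^/users/(\d+)$#', '/users/42'));
```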

Error Handling

When a route doesn't match due to constraints:

  • Continue to next route (current behavior)
  • Optionally log constraint failures for debugging
  • Return 404 if no routes match
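
The fall-through behavior described above could look like the following sketch. The route table, the `dispatch()` function and its `[status, body]` return value are all hypothetical and exist only to show the control flow.

```php
<?php
// Hypothetical dispatcher: $routes maps a compiled regex to a handler.
function dispatch(array $routes, string $path): array
{
    foreach ($routes as $regex => $handler) {
        if (preg_match($regex, $path, $m) === 1) {
            // Pass the captures (without the full match) to the handler.
            return [200, $handler(array_slice($m, 1))];
        }
        // Constraint did not match: continue with the next route.
    }

    // No route matched at all.
    return [404, 'Not Found'];
}

$routes = [
    '#^/users/(\d+)/?$#'    => fn (array $c): string => 'user ' . $c[0],
    '#^/users/([a-z]+)/?$#' => fn (array $c): string => 'profile ' . $c[0],
];

var_dump(dispatch($routes, '/users/42'));      // [200, "user 42"]
var_dump(dispatch($routes, '/users/alice'));   // [200, "profile alice"]
var_dump(dispatch($routes, '/users/Alice-1')); // [404, "Not Found"]
```

Because a failed constraint is just a non-match, two routes can share a prefix and be told apart purely by their constraints.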

Examples

Basic Usage

// User profile - ID must be numeric
Router::get('/users/:id<\d+>', function($req, $res) {
    $userId = $req->param('id'); // Guaranteed to be a non-empty string of digits, e.g. "42"
    return $res->json(['user_id' => $userId]);
});

// Blog post with date constraints
Router::get('/blog/:year<\d{4}>/:month<\d{2}>/:slug<[a-z0-9-]+>', function($req, $res) {
    $year = $req->param('year');   // e.g., "2024"
    $month = $req->param('month'); // e.g., "01"
    $slug = $req->param('slug');   // e.g., "my-first-post"
});

// API versioning
Router::get('/api/:version<v\d+>/users', function($req, $res) {
    $version = $req->param('version'); // e.g., "v1", "v2"
});

Advanced Patterns

// ISBN validation
Router::get('/books/:isbn<\d{3}-\d{10}>', $handler); // Format: 978-1234567890

// Email-like pattern in URL
Router::get('/contact/:email<[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+>', $handler);

// Complex path matching with captures
Router::get('/files/{(.+)/([^/]+)\.(jpg|png|gif)}', function($req, $res) {
    [$path, $filename, $extension] = $req->captures();
});

Migration Path

  1. Phase 1: Implement constraint syntax (backward compatible)
  2. Phase 2: Add common shortcuts
  3. Phase 3: Add full regex support
  4. Phase 4: Deprecation notices for potential conflicts

Alternative Approaches Considered

Fluent API (Laravel-style)

Router::get('/users/:id', $handler)->where('id', '\d+');

Rejected because: Breaks the current static method pattern and requires significant API changes.

Separate constraint registration

Router::constraint('id', '\d+');
Router::get('/users/:id', $handler);

Rejected because: Less intuitive and requires global state management.

Backward Compatibility

  • Existing routes without constraints continue to work unchanged
  • No breaking changes to current API
  • Performance impact minimal due to existing regex usage

Security Considerations

  • Document regex complexity limits to prevent ReDoS
  • Validate regex patterns during route registration
  • Consider adding max constraint length
  • Escape user input when building dynamic routes
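
A registration-time check along these lines might cover the length limit and the pattern validation. `assertValidConstraint()` and the 128-character default are assumptions made for this sketch; the check relies on `preg_match()` returning `false` when a pattern fails to compile. The last lines show `preg_quote()` for escaping user-supplied text before it goes into a route.

```php
<?php
// Hypothetical registration-time guard for a single constraint.
function assertValidConstraint(string $constraint, int $maxLength = 128): void
{
    if (strlen($constraint) > $maxLength) {
        throw new InvalidArgumentException(
            'Constraint exceeds ' . $maxLength . ' characters'
        );
    }

    // Wrap the constraint the same way the router would, then try to
    // compile it. preg_match() returns false on a compile error.
    $regex = '#^(' . $constraint . ')$#';
    if (@preg_match($regex, '') === false) {
        throw new InvalidArgumentException('Invalid constraint regex: ' . $constraint);
    }
}

assertValidConstraint('\d+'); // passes silently

try {
    assertValidConstraint('[a-z'); // unterminated character class
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}

// Escaping user input that becomes a literal part of a route.
$segment = preg_quote('v1.0', '#'); // v1\.0
```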

Testing Plan

  1. Unit tests for pattern compilation with constraints
  2. Integration tests for route matching
  3. Performance benchmarks comparing constrained vs unconstrained routes
  4. Security tests for ReDoS prevention
  5. Backward compatibility tests
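
As a starting point for the unit tests, a table-driven check of the proposed shortcut patterns could look like this. It uses plain PHP rather than a test framework to stay self-contained; `constraintAccepts()` is a hypothetical helper, and the patterns are copied from the shortcut table in this RFC.

```php
<?php
// Hypothetical helper: does $value satisfy $constraint as a whole segment?
function constraintAccepts(string $constraint, string $value): bool
{
    return preg_match('#^(' . $constraint . ')$#', $value) === 1;
}

$uuid = '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}';

// [constraint, input, expected result]
$cases = [
    ['\d+',          '42',            true],
    ['\d+',          '4a',            false],
    ['[a-z0-9-]+',   'my-first-post', true],
    ['[a-z0-9-]+',   'My_Post',       false],
    ['[a-zA-Z]+',    'query',         true],
    ['[a-zA-Z0-9]+', 'file_1',        false],
    [$uuid, '123e4567-e89b-12d3-a456-426614174000', true],
    [$uuid, 'not-a-uuid',                           false],
];

$failures = [];
foreach ($cases as [$constraint, $value, $expected]) {
    if (constraintAccepts($constraint, $value) !== $expected) {
        $failures[] = $constraint . ' vs ' . $value;
    }
}

echo $failures === [] ? 'all cases pass' : implode(PHP_EOL, $failures), PHP_EOL;
```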

Documentation Updates

  • Update routing guide with constraint examples
  • Add security best practices for regex patterns
  • Include performance considerations
  • Provide migration guide from validation-in-handler pattern

Open Questions

  1. Should we limit regex complexity by default?
  2. Should constraint validation errors be logged?
  3. Should we provide a debug mode showing why routes didn't match?
  4. Should we support named capture groups in full regex mode?
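
On question 4, PCRE already supports named groups, so one option is to let full regex blocks use `(?P<name>...)` and read the values by name. The snippet below only illustrates what `preg_match()` returns in that case; how PivotPHP would expose it is left open.

```php
<?php
// A route regex written with named groups instead of positional ones.
$regex = '#^/archive/(?P<year>\d{4})/(?P<month>\d{2})/?$#';

$named = [];
if (preg_match($regex, '/archive/2024/01', $m) === 1) {
    // $m holds both numeric and string keys; keep only the named ones.
    $named = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}

var_dump($named); // year => "2024", month => "01"
```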
