devninja-in/postgres-data-migration

PostgreSQL Data Migration Tool

A modern web-based tool for migrating data between PostgreSQL databases with an intuitive UI for selecting tables and monitoring progress.

Features

  • Web-based UI: Modern React frontend with responsive design
  • Database Connection Testing: Test connections before migration
  • Table Selection: Browse and select specific tables to migrate
  • Incremental Migration: Process large tables in configurable chunks to reduce memory usage
  • Memory Management: Automatic memory monitoring and chunk splitting
  • Streaming Support: Server-side cursors for very large tables (500K+ rows)
  • Data Safety: Truncates target tables before insertion to ensure clean data
  • Real-time Statistics: Detailed migration progress with chunk-level tracking
  • Resume Capability: Resume failed migrations from the last successful chunk
  • Error Handling: Comprehensive error reporting and validation
  • Flexible Processing: Both file-based and direct migration strategies with incremental support

Architecture

  • Backend: FastAPI with PostgreSQL connectivity using psycopg2
  • Frontend: React with TypeScript and Tailwind CSS
  • Database: PostgreSQL (source and target)

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • PostgreSQL databases (source and target)
  • Network access between the tool and both databases

Installation & Setup

1. Clone or Download

# If you have the project files, navigate to the directory
cd postgres-data-migration

2. Backend Setup

# Install Python dependencies
pip install -r requirements.txt

3. Frontend Setup

# Install Node.js dependencies
npm install

Running the Application

1. Start the Backend Server

# Run the FastAPI server
python main.py

The API will be available at http://localhost:8000

2. Start the Frontend

# In a new terminal, start the React development server
npm start

The web application will open at http://localhost:3000

Usage

1. Database Connection

  • Enter credentials for your source database (where data will be copied from)
  • Click "Test Connection" to verify connectivity
  • Enter credentials for your target database (where data will be copied to)
  • Click "Test Connection" to verify connectivity

2. Table Selection

  • After a successful source connection, the available tables are displayed
  • Select the tables you want to migrate
  • Review table row counts to understand data volume

3. Migration Process

  • Click "Migrate" to start the data transfer
  • ⚠️ Warning: Selected tables in the target database will be truncated (all existing data deleted)
  • Monitor real-time progress and statistics
  • Review detailed results after completion

API Endpoints

  • POST /test-connection - Test database connectivity
  • POST /list-tables - Get list of tables with row counts
  • POST /migrate-data - Perform data migration
  • GET /health - Health check endpoint
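The endpoints can also be driven directly, e.g. for scripting. A hedged example using `requests`; the request payload fields mirror the connection form and are assumptions, not a documented API contract:

```python
import requests

API_URL = "http://localhost:8000"  # matches REACT_APP_API_URL below


def url(path: str, base: str = API_URL) -> str:
    """Join the API base URL and an endpoint path."""
    return base.rstrip("/") + path


def backend_is_healthy() -> bool:
    """GET /health and report whether the backend responded."""
    return requests.get(url("/health"), timeout=5).status_code == 200


def list_tables(source_cfg: dict):
    """POST source connection details to /list-tables."""
    resp = requests.post(url("/list-tables"), json=source_cfg, timeout=30)
    resp.raise_for_status()
    return resp.json()
```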

Configuration

Database Configuration (YAML)

The application supports predefined database configurations using a YAML file. This allows you to:

  • Store multiple source and destination database configurations
  • Select databases from a dropdown instead of manual entry
  • Keep sensitive credentials in a secure configuration file
  • Set default migration settings

Setup Configuration

  1. Copy the sample configuration:

    cp config.sample.yaml config.yaml
  2. Edit config.yaml with your database details:

    databases:
      sources:
        local_source:
          name: "Local Source Database"
          host: "localhost"
          port: 5432
          database: "source_db"
          username: "readonly_user"
          password: "your_password"
          description: "Local source database - READ ONLY"
        
        prod_source:
          name: "Production Source Database"
          host: "prod-db.company.com"
          port: 5432
          database: "production_db"
          username: "migration_readonly"
          password: "secure_password"
          description: "Production source database - READ ONLY"
    
      destinations:
        local_target:
          name: "Local Target Database"
          host: "localhost"
          port: 5432
          database: "target_db"
          username: "postgres"
          password: "your_password"
          description: "Local target database"
    
    default_settings:
      chunk_size: 50000
      max_connections: 2
      enable_incremental: true
      analyze_dependencies: true
  3. Security Note: The config.yaml file is excluded from version control to protect sensitive credentials.
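Reading the file is a one-liner with PyYAML; the sketch below shows how the keys from the sample above would be accessed (the function name is illustrative):

```python
import yaml  # PyYAML


def load_config(path: str = "config.yaml") -> dict:
    """Parse the YAML configuration with safe_load (no arbitrary objects)."""
    with open(path) as fh:
        return yaml.safe_load(fh)

# Example, using the keys from the sample configuration:
#   cfg = load_config()
#   cfg["databases"]["sources"]["local_source"]["host"]
#   cfg["default_settings"]["chunk_size"]
```

`yaml.safe_load` is preferred over `yaml.load` because it refuses to construct arbitrary Python objects from untrusted input.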

Using Configuration

  • Database Selection: Choose from predefined databases in the dropdown
  • Custom Entry: Select "Custom Configuration" to manually enter credentials
  • Auto-populate: Selecting a predefined database automatically fills in all connection details
  • Reload: Use the "Reload Configurations" button to refresh after updating the YAML file

Environment Variables

You can set the following environment variables:

# Frontend API URL (default: http://localhost:8000)
REACT_APP_API_URL=http://localhost:8000

Database Requirements

  • Both source and target databases must be PostgreSQL
  • Target database should have the same table schemas as source
  • User accounts need appropriate permissions:
    • Source: SELECT permissions only (recommended: use readonly user)
    • Target: TRUNCATE and INSERT permissions on tables

🔒 Security Best Practices

Source Database Access:

  • Always use a readonly user for source databases to prevent accidental data modification
  • Create a dedicated readonly user with minimal permissions:
    -- Create readonly user for source database
    CREATE USER migration_readonly WITH PASSWORD 'secure_password';
    GRANT CONNECT ON DATABASE your_database TO migration_readonly;
    GRANT USAGE ON SCHEMA public TO migration_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO migration_readonly;
    GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO migration_readonly;
  • This ensures your source data remains safe during migration operations

Security Considerations

  • Database credentials are only stored in memory during the session
  • No persistent storage of sensitive information
  • Use strong database passwords and limit network access
  • Consider using connection pooling for production environments

Performance Tips

  • Incremental Migration: Large tables (100K+ rows) are automatically processed in chunks (default 50K rows per chunk)
  • Memory Efficiency: Configurable memory limits prevent out-of-memory errors
  • Streaming: Tables with 500K+ rows use server-side cursors for minimal memory usage
  • Chunk Processing: Each chunk is processed independently with immediate cleanup
  • Adaptive Sizing: Chunk sizes are automatically adjusted based on table characteristics
  • Pause Control: Configurable pauses between chunks help reduce database load
  • Resume Support: Failed migrations can be resumed from the last successful chunk
  • Consider running migrations during low-traffic periods
  • Monitor database performance during large migrations

Troubleshooting

Common Issues

  1. Connection Failed

    • Verify database host, port, and credentials
    • Check network connectivity and firewall settings
    • Ensure PostgreSQL is running and accepting connections
  2. Permission Denied

    • Verify user has necessary permissions on source and target tables
    • Check that target user can TRUNCATE and INSERT
  3. Schema Mismatch

    • Ensure target tables exist with compatible schemas
    • Check column names and data types match between source and target
  4. Migration Timeout

    • For very large tables, consider breaking down the migration
    • Check network stability and database performance

Logs

  • Backend logs are displayed in the terminal running python main.py
  • Frontend errors appear in the browser console
  • Check browser network tab for API request details

Development

Project Structure

postgres-data-migration/
├── main.py                 # FastAPI backend
├── requirements.txt        # Python dependencies
├── package.json            # Node.js dependencies
├── src/
│   ├── App.tsx             # Main React component
│   ├── components/         # React components
│   ├── types.ts            # TypeScript interfaces
│   └── api.ts              # API client
└── README.md

Adding Features

  • Backend logic: Modify main.py
  • Frontend components: Add to src/components/
  • API client: Update src/api.ts
  • Types: Define in src/types.ts

License

This project is provided as-is for educational and practical use. Please ensure you have appropriate backups before running migrations on production data.
