Add setup wizard for mass-ingest configuration #54
Draft
pstreef wants to merge 34 commits into
Introduces a CLI wizard that generates customized Dockerfiles for Moderne mass-ingest deployments through an interactive questionnaire.

Features:
- JDK version selection (8, 11, 17, 21, 25) with multi-version support
- Moderne CLI configuration (Maven Central download or local JAR)
- Build tool support (Maven with custom settings, Gradle)
- Additional language runtimes (Android, Bazel, Node, Python, .NET)
- Self-signed certificate configuration with keytool integration
- Git authentication (GitHub, GitLab, Bitbucket, Azure DevOps) via SSH or HTTPS
- AWS Batch support with ECS integration
- Template-based architecture for maintainable Docker generation

UX improvements:
- Confirmation prompts on complex pages with restart capability
- Configuration summaries before each confirmation
- Clear input hints for defaults and expected values
- Choice menus instead of sequential y/n questions
- Duplicate provider detection

Technical:
- Bash 3.2 compatible (the Bash version shipped with macOS)
- Modular template system with placeholder replacement
- Smoke tests and comprehensive test suite included
Documents the design, implementation, and usage of the interactive Dockerfile generator wizard including architecture decisions, template structure, UX improvements, and technical details.
## Problem

Users could not review or redo configuration choices in each wizard section, making the process error-prone and frustrating.

## Solution

Added "Is this correct?" confirmation prompts with retry loops to all sections:
- JDK Versions
- Moderne CLI Configuration
- Maven Configuration
- Gradle Configuration
- Other Build Tools
- Language Runtimes
- Scalability Options
- Security Configuration
- Git Authentication
- Runtime Configuration

Each section now:
- Shows a summary of the selected configuration
- Asks "Is this correct?"
- Allows users to redo the section if they answer "no"
- Resets variables on retry to ensure clean state

Also fixed the git-credentials filename to use the correct `.git-credentials` format (with leading dot) throughout the script.
Enhanced the Dockerfile generator wizard with deployment configuration:
- Added docker-compose.yml and .env.example file generation for simplified container management
- Added optional data directory mounting with configurable path (default: ./data)
- Made Gradle version configurable (default: 8.14)
- Separated configuration summary into "Dockerfile contents" and "Deployment configuration" sections
- Updated next steps to show appropriate commands based on docker-compose choice
- Fixed Git authentication volume mounts to be commented out in generated docker-compose.yml

The wizard now asks users whether they want Docker Compose files and whether to persist data, providing clearer separation between what goes in the Dockerfile and what belongs to runtime deployment configuration.
The generator script dynamically creates the base section instead of using a template file. Only the CLI-specific templates (00-modcli-download and 00-modcli-local) are actually used.
## Problem

The interactive wizard had several UX inconsistencies and error handling gaps that made troubleshooting failures difficult.

## Solution

Enhanced user experience and error handling:

**Input handling:**
- Fix input trimming to prevent whitespace issues
- Fix secret input returning newlines by redirecting cosmetic output to stderr
- Validate that all required fields reject empty values

**Error messaging:**
- Add context-specific guidance to all provider error handlers
- Parse error output to show specific failure reasons (auth, permissions, 404, timeout)
- Display fetch summary screen showing success/failure per provider

**Consistency improvements:**
- Standardize SSH/HTTPS prompts across all providers (default to HTTPS)
- Move authentication help URLs before input prompts
- Update Bitbucket Cloud to use API tokens instead of deprecated app passwords
- Fix Azure DevOps to match other providers' SSH/HTTPS prompt wording

**CSV normalization:**
- Remove forced padding to 8 org columns
- Use actual max depth for cleaner output
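The secret-input fix above can be illustrated with a minimal sketch. It assumes a hypothetical `ask_secret` helper (the wizard's real function names and prompts may differ): the prompt and the cosmetic newline go to stderr, so a caller using command substitution captures only the trimmed value.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the wizard's actual code.
ask_secret() {
    local prompt="$1" value
    printf '%s: ' "$prompt" >&2      # prompt goes to stderr, not into $(...)
    IFS= read -r -s value            # -s hides typed input on a terminal
    printf '\n' >&2                  # cosmetic newline also goes to stderr
    # Trim leading and trailing whitespace with parameter expansion only.
    value="${value#"${value%%[![:space:]]*}"}"
    value="${value%"${value##*[![:space:]]}"}"
    printf '%s' "$value"             # the only thing written to stdout
}

# Non-interactive demo: feed the "typed" input through a pipe.
token=$(printf '  s3cret  \n' | ask_secret "API token" 2>/dev/null)
echo "captured ${#token} characters"
```

Because nothing but the value reaches stdout, the captured string contains no prompt text and no stray newline.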
Users can now use the repos.csv generator without installing the gh or az CLI tools. When the CLIs are unavailable, the wizard prompts for API tokens and uses the REST APIs directly.

Key improvements:
- GitHub and Azure DevOps: Automatic API fallback when CLI missing
- Organization columns: Exclude repository names from hierarchy
- GitLab: Fix URL encoding for nested groups and improve error messages
- Zero repository handling: Show clear errors instead of success
- Preview text: Dynamic based on actual count (1 vs 2+ repos)
- Test mode: FORCE_GITHUB_API_MODE and FORCE_AZURE_API_MODE flags
- Exit codes: Explicit exit 0 in all fetcher scripts
- Summary parsing: Fix delimiter issue with provider names containing colons
- Normalize CSV by padding at beginning (ALL in same column for Excel filtering)
- Clarify GitHub prompt: "Organization name (or GitHub username for personal repos)"
- Fix ANSI escape codes in docker command example output
## Problem

The mass-ingest setup process required manual configuration of multiple files and lacked a cohesive user experience. Additionally, several Docker and repository fetcher issues needed fixing.

## Solution

Add comprehensive setup wizard (scripts/setup-wizard.sh) that guides users through:
- Repository discovery from multiple SCM providers
- Docker environment configuration
- Moderne CLI setup with artifact repository integration
- Support for environment variable references in credentials

Fix Dockerfile template issues:
- Use printf instead of echo -e for mod wrapper script (POSIX compatibility)
- Fix staging CLI download condition (use = instead of ==)
- Remove .NET SDK 6.0 (reached end of support)
- Fix AWS batch chunk.sh path reference
- Fix SSH directory permissions to handle empty directories

Improve repository fetcher scripts:
- Reduce curl timeout from 30s to 10s per request
- Add better error messages and debugging output
- Fix Bitbucket Data Center project filtering
- Add response validation and error handling
The separate generate-repos-csv.sh and generate-dockerfile.sh scripts have been replaced by the comprehensive setup-wizard.sh that handles the complete end-to-end configuration experience.
- Add step progress indicators to Phase 2 (Step X/12)
- Add phase indicators (Phase 1 of 2, Phase 2 of 2)
- Add Ctrl+C tip to welcome screen
- Show curl debug commands when validation fails (Moderne token and artifact repository)
- Redesign welcome screen with clearer structure and momentum
- Remove "Complete" from title and avoid overpromising ("all repositories")
- Better "Next steps" messaging
Major UX improvements to reduce question count and improve flow:

Multi-select improvements:
- Fix spacebar toggle on macOS using dd + stty -icanon
- Add JDK versions multi-select (reduces 5 questions to 1)
- Add Language Runtimes multi-select (reduces 4 questions to 1)
- Add Git authentication multi-select (reduces 2 questions to 1)

Combined questions:
- Moderne CLI: 4 options (stable/staging/specific/jar) instead of 2 questions
- Moderne token: 3 options (direct/env-var/skip) instead of 2 questions
- Maven: 3 options (default/specific/skip) instead of 2 questions
- Gradle: 3 options (default/specific/skip) instead of 2 questions

Other improvements:
- Clarify Maven settings.xml applies to mvnw wrapper projects too
- Better context explanations for all combined questions
- Improved terminal state management for multi-select
…anced visual design

- Combined build tools (Maven, Gradle, Bazel) into single multi-select, reducing from 12 to 10 steps
- Combined AWS integrations (CLI, Batch) into single multi-select
- Simplified GitHub and GitLab URL questions to single input with defaults
- Improved text clarity for tokens, platform descriptions, and technical concepts
- Updated color scheme: progress indicators now CYAN for better visibility
…exibility

## Problem

The wizard had several UX issues:
- Confusing yes/no labels that didn't explain what each option would do
- Script would exit silently when entering non-existent environment variables
- No way to add multiple SCM providers without going through each provider's menu
- Normalization question appeared even when not using hierarchical org structure
- Build tools explanation was unbalanced (focused too much on Maven wrapper)

## Solution

Enhanced question labels:
- Made ask_yes_no() configurable with custom labels for each question
- Updated all questions with context-appropriate labels (e.g., "Yes, continue" vs "Yes, use existing")
- Removed version numbers from build tools multi-select (versions asked after selection)

Fixed set -e compatibility issues:
- Fixed expand_env_var() to handle missing environment variables gracefully
- Added error handling to grep and stty restore operations
- Wrapped show_repos_summary with set +e/set -e to capture return codes

Improved workflow flexibility:
- Added "Done adding SCM providers" option after first provider is configured
- Changed repos.csv validation to 3-option choice: continue, add another SCM, or start over
- Skip normalization question when using "none" or "simple" org structure

Text improvements:
- Balanced Maven/Gradle wrapper explanations in build tools section
- Clarified Bazelisk auto-detection behavior
- Distinguished between "reused" vs "generated" repos.csv in messages
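As an illustration of the missing-variable handling described above, here is a minimal sketch. The body of `expand_env_var` below is hypothetical (only the function name comes from the commit message), and `DEMO_TOKEN` and `DEMO_MISSING_VAR` are made-up names. The idea is that the lookup signals failure through its return code, and the caller tests it inside an `if`, where `set -e` does not abort the script.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical implementation: print the value of the named variable,
# or return non-zero if the name is invalid or the variable is unset/empty.
expand_env_var() {
    local name="$1"
    # Reject anything that is not a valid variable name before indirect expansion.
    case "$name" in
        ''|[0-9]*|*[!A-Za-z0-9_]*) return 2 ;;
    esac
    local value="${!name:-}"         # indirect expansion with an empty default
    [ -n "$value" ] || return 1
    printf '%s\n' "$value"
}

export DEMO_TOKEN="abc123"
for var in DEMO_TOKEN DEMO_MISSING_VAR; do
    # A failing command in an `if` condition does not trigger set -e.
    if value=$(expand_env_var "$var"); then
        echo "$var resolved (${#value} characters)"
    else
        echo "$var is not set; the wizard could re-prompt here instead of exiting"
    fi
done
```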
## Problem

Users had to manually type organization names, which was error-prone and tedious. There was also no way to undo adding an organization if a mistake was made.

## Solution

GitHub organization discovery:
- Auto-discovers user's organizations using GitHub API or gh CLI
- Shows single-select menu with all discovered orgs + "Manually add" option
- Caches discovered orgs to avoid re-fetching when adding multiple
- Falls back to manual entry if no orgs discovered
- Works with both github.com and GitHub Enterprise

Undo functionality:
- Added "Remove last added organization (name)" option to "What's next?" menu
- Removes org from tracking array, deletes CSV file, cleans up fetch results
- Returns to SCM selection menu after removal
- Only shows "Done adding SCM providers" when actual data exists (not just flag)
- Clears ENABLE_GITHUB flag when no orgs remain

Bash 3.2 compatibility:
- Fixed negative array indexing ([-1]) to use calculated index
- Rebuilt arrays properly instead of using unset (which leaves holes in bash 3.2)
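The two Bash 3.2 workarounds above can be sketched as follows. The array name `orgs` and its contents are illustrative only; the wizard's tracking array may be named and populated differently.

```shell
#!/usr/bin/env bash
# Illustrative data, not taken from the wizard.
orgs=("alpha" "beta" "gamma")

# Bash 3.2 does not support ${orgs[-1]}; compute the last index instead.
last_index=$(( ${#orgs[@]} - 1 ))
last_org="${orgs[$last_index]}"
echo "removing: $last_org"

# Rebuild the array without the last element so indices stay contiguous,
# rather than relying on unset.
rebuilt=()
i=0
while [ "$i" -lt "$last_index" ]; do
    rebuilt+=("${orgs[$i]}")
    i=$((i + 1))
done
orgs=("${rebuilt[@]}")

echo "remaining: ${#orgs[@]} (${orgs[*]})"
```

Rebuilding costs a loop, but the resulting array always has indices 0..n-1, which keeps later index arithmetic simple.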
- Added missing step 6: Scalability options (AWS CLI and AWS Batch)
- Fixed step 7: Changed from incorrect 'SAST scanning with ShiftLeft' to 'SSL certificates for HTTPS connections'
- Added missing step 10: Docker Compose configuration
- Now matches the actual 10 wizard steps in Phase 2
…etion screen

- Add wait_for_enter() function to ignore arrow key input and prevent escape sequences from appearing on screen
- Clear screen before generation progress for better visual separation
- Remove casual closing message from completion screen for more professional tone
…d heap size

## Problem

The wizard was generating Dockerfile.generated, which required manual renaming, and used a fixed 4GB heap size that doesn't scale with container memory.

## Solution

- Generate directly as Dockerfile with automatic backup of existing file to Dockerfile.original
- Change default JVM memory from -Xmx4g to -XX:MaxRAMPercentage=60.0 for better container resource utilization
…on menu

When users accidentally select 'Add another GitHub organization', they can now select 'Done adding GitHub organizations' from the org selection menu to return to the SCM selection screen without having to add another organization.
## Problem

The Moderne CLI configuration page combined both CLI version/source selection and tenant authentication, making it a long page with two distinct concerns.

## Solution

- Split ask_modcli_config() into two separate functions:
  - ask_modcli_config() - handles CLI version and source only (Step 2/11)
  - ask_moderne_tenant_config() - handles tenant and token configuration (Step 3/11)
- Update Phase 2 introduction to show 11 steps instead of 10
- Update all step numbers in main flow (now 1/11 through 11/11)
- Remove duplicated configuration summary from final success page
When token validation fails and user selects 'Yes, try again', the wizard now re-prompts for the input method (direct vs environment variable) instead of forcing the same method. This allows correcting an accidental selection.
When connection validation fails for Moderne tenant or Artifactory, the debug curl commands now display as plain text without visible escape sequences.
Use printf instead of echo for stderr output with color codes. This properly interprets the escape sequences so they display as gray text instead of showing literal \033[0m characters.
Changed \\\\ to \\ in printf statements to display single backslash for proper line continuation in multi-line curl commands.
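The three escape-sequence fixes above can be combined in one small sketch. The function name `print_debug_cmd`, the color value, and the URL are assumptions for illustration; the wizard's actual debug output may look different. The sketch relies only on standard `printf` behavior: `%b` interprets escape sequences in its argument, and `\\` in the format string prints a single backslash.

```shell
#!/usr/bin/env bash
# Hypothetical helper that prints a multi-line curl command in gray to stderr.
print_debug_cmd() {
    local gray='\033[0;90m' reset='\033[0m'
    {
        printf '%b' "$gray"          # %b turns \033 into a real ESC byte
        printf 'curl -sS \\\n'       # \\ prints one backslash (line continuation)
        printf '  %s\n' "$1"         # the URL is data, passed through %s untouched
        printf '%b' "$reset"
    } >&2
}

print_debug_cmd "https://example.com/health"
```

Passing the URL through `%s` rather than embedding it in the format string means any `%` or backslash characters in it are printed verbatim.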
When discovering many organizations, only show the first 20 in the selection menu to avoid overwhelming the user. Users can still manually add any org not shown in the list using the 'Manually add organization/user' option.
Format Java options as separate quoted arguments with escaped leading dashes:

`mod config java options edit "\\-XX:MaxRAMPercentage=60.0" "\\-Xss3m"`

This ensures the mod config command properly interprets each JVM option.
Incorporates changes from origin/main commit f46c18e to ensure keytool commands run non-interactively without prompting for confirmation.

Changes:
- Dockerfile: Add -noprompt -storepass changeit to all 5 keytool commands
- setup-wizard.sh: Update generate_certs_section() to use same flag order
Adds ability to choose between Debian and Alpine base images in the wizard, with automatic template selection based on user choice.

Template structure:
- shared/ - 9 templates identical for both Debian and Alpine
- debian/ - 5 Debian-specific templates (apt-get, deb packages)
- alpine/ - 5 Alpine-specific templates (apk, different installations)

Changes:
- Add ask_base_image_type() - New wizard step for base image selection
- Add get_template() helper - Checks shared/ first, falls back to debian/alpine/
- Update generate_base_section() - Uses BASE_IMAGE_SUFFIX for image names
- Update generate_dockerfile() - Uses get_template() for all template loads
- Reorganize templates into shared/debian/alpine structure (28→19 files)
- Fix ask_optional_path() to expand ~ (tilde) to HOME directory

Alpine-specific templates:
- Node.js: Uses apk add nodejs npm
- Python: Uses Alpine's python3 package
- .NET: Manual SDK download with Alpine dependencies
- Android: Adds bash dependency for SDK scripts
- AWS CLI: Uses pip install instead of bundled installer

Reduces template duplication by 32% while enabling flexible base image choice.
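The shared-first lookup described above might look roughly like the sketch below. Only the `get_template` name and the shared/debian/alpine directory names come from the commit message; `TEMPLATE_DIR`, `BASE_IMAGE_TYPE`, and the template file names are hypothetical, and the demo builds a throwaway directory tree rather than using the repository's real templates.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: resolve a template path, preferring shared/ and
# falling back to the directory for the selected base image.
get_template() {
    local name="$1"
    local base="${BASE_IMAGE_TYPE:-debian}"
    local candidate
    for candidate in "$TEMPLATE_DIR/shared/$name" "$TEMPLATE_DIR/$base/$name"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "Template not found: $name (base image: $base)" >&2
    return 1
}

# Demo with a temporary directory tree and made-up file names.
TEMPLATE_DIR="$(mktemp -d)"
mkdir -p "$TEMPLATE_DIR/shared" "$TEMPLATE_DIR/debian" "$TEMPLATE_DIR/alpine"
touch "$TEMPLATE_DIR/shared/git.Dockerfile" \
      "$TEMPLATE_DIR/debian/node.Dockerfile" \
      "$TEMPLATE_DIR/alpine/node.Dockerfile"

BASE_IMAGE_TYPE=alpine
get_template git.Dockerfile     # resolves under shared/
get_template node.Dockerfile    # resolves under alpine/
```

Keeping the fallback order in one helper means the generator never needs to know which templates are distribution-specific.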
ecec31e to 4cd3efa
Root cause: Arithmetic commands such as ((selected++)) and ((selected--)) return exit status 1 when the expression evaluates to 0, causing the script to exit under set -e.

Changes:
- Add set +e at start of interactive functions and set -e before returns as safety layer
- Applies to: ask_choice, ask_yes_no, ask_multi_select, ask_input_or_env_var, ask_secret_or_env_var_ref

This fixes the issue where pressing arrow keys would cause the wizard to exit without warning in Google Cloud Console.

Note: Arithmetic operation fixes (selected=$((selected +/- 1))) were already present in this branch.
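The root cause can be reproduced in a few lines. This is a standalone sketch, not code from the wizard: the `((...))` command's exit status reflects the value of the expression, and a post-increment from 0 evaluates to 0, which Bash reports as status 1. The assignment form always succeeds.

```shell
#!/usr/bin/env bash
set -e

# Run the risky form in a subshell and capture its status; the `||` keeps
# set -e from aborting this demo.
status=0
( selected=0; ((selected++)) ) || status=$?
echo "((selected++)) starting from 0 exits with status $status"

# Safe form: an assignment with arithmetic expansion returns status 0
# regardless of the resulting value.
selected=0
selected=$((selected + 1))
echo "after increment: $selected"
selected=$((selected - 1))
echo "after decrement: $selected"
```

Without the subshell and `||` guard, the first arithmetic command would terminate the script silently, which matches the behavior described in the commit message.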
Added three improvements for better user experience and debugging:

1. URL normalization with ask_url() helper
   - Automatically prepends https:// if no scheme provided
   - Applied to GitHub, GitLab, and Bitbucket Data Center URLs
   - Supports default values for cloud providers
2. Interactive error debugging with show_error_details()
   - Prompts "Show full error details?" on API failures
   - Shows request URL, HTTP status, curl errors, and response body
   - Applied to organization discovery in setup wizard
3. Enhanced repo fetcher error handling
   - Updated github.sh and gitlab.sh to capture full error context
   - Uses curl -sS flag to properly capture DNS and connection errors
   - Prompts for error details on HTTP failures
   - Shows HTTP status codes, curl stderr, and response bodies

Also added FORCE_GITHUB_API_MODE environment variable to disable gh CLI for testing API error handling.
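The URL normalization in item 1 can be sketched as a pure function. `normalize_url` is a hypothetical name for the non-interactive part of what ask_url() is described as doing; the real helper presumably also handles prompting, which is omitted here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply a default for empty input and prepend https://
# when the value has no http:// or https:// scheme.
normalize_url() {
    local url="$1"
    local default="${2:-}"
    if [ -z "$url" ]; then
        url="$default"               # empty input falls back to the default
    fi
    case "$url" in
        '') ;;                       # nothing to normalize
        http://*|https://*) ;;       # scheme already present, keep as-is
        *) url="https://$url" ;;
    esac
    printf '%s\n' "$url"
}

normalize_url "gitlab.example.com"
normalize_url "http://bitbucket.internal:7990"
normalize_url "" "https://github.com"
```

An explicit `http://` is left untouched in this sketch, on the assumption that a user who typed a scheme meant it.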
Summary
Introduces a comprehensive setup wizard that guides users through the complete mass-ingest configuration process, from repository discovery to Docker environment setup.
Problem
Setting up mass-ingest currently requires:
This creates a steep learning curve and increases the likelihood of setup issues.
Solution
Add a unified interactive wizard (`scripts/setup-wizard.sh`) that consolidates the entire setup process into a single guided experience:

Output:
Status
This is a proof of concept for review and testing. Feedback welcome on the user experience and feature set before finalizing.
Usage
Follow the interactive prompts to configure your mass-ingest environment.
Known Limitations
Platform support:
Workflow:
Security:
Development & Testing