SageMaker-compatible ML container for deploying Triton models using the FIL backend.

Generated on 2026-03-20T01-32-30 using ML Container Creator.
## Quick Start

```bash
./do/build
```

Builds a Docker image tagged as `quick-inference-container:latest`.
```bash
# Start the container
./do/run

# In another terminal, test the endpoints
./do/test
```

```bash
./do/push
```

Pushes the image to Amazon ECR in the us-east-1 region.
```bash
./do/deploy <your-sagemaker-execution-role-arn>
```

Creates a SageMaker endpoint named `quick-inference-container-endpoint`.
```bash
./do/test quick-inference-container-endpoint
```

## Project Structure

```
quick-inference-container/
├── do/                      # do-framework lifecycle scripts
│   ├── build                # Build Docker image
│   ├── push                 # Push to Amazon ECR
│   ├── deploy               # Deploy to SageMaker
│   ├── run                  # Run container locally
│   ├── test                 # Test container or endpoint
│   ├── clean                # Clean up resources
│   ├── submit               # Submit build to CodeBuild
│   ├── config               # Configuration variables
│   └── README.md            # Detailed do-framework documentation
├── code/                    # Model serving code
│   ├── model_handler.py     # Model loading and inference
│   └── serve.py             # FIL server
├── deploy/                  # Legacy scripts (deprecated)
│   ├── build_and_push.sh    # Use ./do/build && ./do/push instead
│   └── deploy.sh            # Use ./do/deploy instead
├── sample_model/            # Sample training code
│   ├── train_abalone.py     # Train sample model
│   └── test_inference.py    # Test inference
├── test/                    # Test suite
│   ├── test_endpoint.sh     # Test SageMaker endpoint
│   └── test_local_image.sh  # Test local container
├── Dockerfile               # Container definition
├── requirements.txt         # Python dependencies
└── README.md                # This file
```
## Configuration

All deployment configuration is centralized in `do/config`:
```bash
# Project identification
PROJECT_NAME="quick-inference-container"
DEPLOYMENT_CONFIG="triton-fil"

# AWS configuration
AWS_REGION="us-east-1"
INSTANCE_TYPE="ml.g5.12xlarge"

# Framework configuration
FRAMEWORK="triton"
MODEL_SERVER="fil"
```
You can override these values by setting environment variables before running do scripts.
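A common way scripts support this is with shell parameter-expansion defaults. This is a sketch of that mechanism, not the verbatim contents of `do/config` — check that file for the exact behavior:

```shell
# Hypothetical sketch: assign a default only when the variable is not
# already set in the environment, so exported values win.
AWS_REGION="${AWS_REGION:-us-east-1}"
INSTANCE_TYPE="${INSTANCE_TYPE:-ml.g5.12xlarge}"

echo "deploying to ${AWS_REGION} on ${INSTANCE_TYPE}"
```

With that pattern, `AWS_REGION=eu-west-1 ./do/build` would target eu-west-1 without editing `do/config`.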
```bash
# Build and test locally
./do/build
./do/run &
./do/test

# When satisfied, push to ECR
./do/push
```

```bash
# Submit build to CodeBuild (builds and pushes to ECR)
./do/submit
```
```bash
# Deploy to SageMaker
./do/deploy <role-arn>

# Test the endpoint
./do/test quick-inference-container-endpoint
```

```bash
# Remove local images
./do/clean local

# Remove ECR images
./do/clean ecr

# Delete SageMaker endpoint
./do/clean endpoint

# Clean everything
./do/clean all
```

## do-framework Commands

This project uses the do-framework for standardized container lifecycle management.
| Command | Description |
|---|---|
| `./do/build` | Build Docker image locally |
| `./do/push` | Push image to Amazon ECR |
| `./do/deploy <role-arn>` | Deploy to SageMaker endpoint |
| `./do/run` | Run container locally on port 8080 |
| `./do/test [endpoint]` | Test local container or SageMaker endpoint |
| `./do/clean <target>` | Clean up resources (local/ecr/endpoint/all) |
| `./do/submit` | Submit build to AWS CodeBuild |
For detailed documentation on each command, see `do/README.md`.
## Health Check

SageMaker calls the `/ping` endpoint to verify container health:

```bash
curl http://localhost:8080/ping
```

Expected response: `200 OK`
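While the container is starting, `/ping` may fail for a few seconds. A small retry helper can make local test scripts more robust; this is a sketch, and `wait_until` is not part of the generated project:

```shell
# Hypothetical helper: retry a command until it succeeds or the
# attempt budget is exhausted; returns non-zero if it never succeeds.
wait_until() {
  attempts="$1"; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example: wait up to ~30s for the local container's health check.
# wait_until 30 curl -sf http://localhost:8080/ping
```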
## Inference

Send prediction requests to the `/invocations` endpoint:

```bash
curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [[1.0, 2.0, 3.0, 4.0]]
  }'
```

## IAM Permissions

The SageMaker execution role needs these permissions:
- `ecr:GetAuthorizationToken`
- `ecr:BatchCheckLayerAvailability`
- `ecr:GetDownloadUrlForLayer`
- `ecr:BatchGetImage`
- `s3:GetObject` (if using S3 for model artifacts)
- `logs:CreateLogGroup`
- `logs:CreateLogStream`
- `logs:PutLogEvents`
See IAM_PERMISSIONS.md for detailed permission requirements.
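As a starting point, the actions listed above can be collected into a policy document. This is a sketch, not the project's canonical policy: `Resource` is left as `*` for brevity and should be scoped to your ECR repository, S3 bucket, and log groups in production.

```shell
# Write a hypothetical execution-role policy covering the actions above.
cat > execution-role-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "s3:GetObject",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```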
Ensure AWS CLI is configured with appropriate credentials:
```bash
aws configure
```

Or use environment variables:

```bash
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_DEFAULT_REGION=us-east-1
```

## Troubleshooting

**Docker Not Found**
Install Docker: https://docs.docker.com/get-docker/
**Permission Denied**

Add your user to the docker group:

```bash
sudo usermod -aG docker $USER
```

**ECR Push Failed**
Check AWS credentials and IAM permissions:

```bash
aws sts get-caller-identity
```

**Endpoint Creation Failed**
- Verify the execution role ARN is correct
- Check IAM permissions
- Ensure the instance type is available in your region
**Endpoint Stuck in Creating**
Check CloudWatch logs:

```bash
aws logs tail /aws/sagemaker/Endpoints/quick-inference-container-endpoint --follow
```

**Container Exits Immediately**
Check container logs:

```bash
docker logs $(docker ps -a | grep quick-inference-container | awk '{print $1}')
```

**Out of Memory**
Increase the instance size or optimize the model:

```bash
# Edit do/config
INSTANCE_TYPE="ml.m5.2xlarge"  # Larger instance
```

## Migrating from Legacy Scripts

If you're familiar with the old `deploy/` scripts, see MIGRATION.md for a command mapping guide.
**Quick Reference:**
| Legacy Command | do-framework Command |
|---|---|
| `./deploy/build_and_push.sh` | `./do/build && ./do/push` |
| `./deploy/deploy.sh <role>` | `./do/deploy <role>` |
| `./deploy/submit_build.sh` | `./do/submit` |
The legacy scripts are still available but deprecated. They will display warnings and forward to do-framework commands.
## Support

For issues or questions:

- Check `do/README.md` for detailed command documentation
- Review CloudWatch logs for deployment issues
- See MIGRATION.md if migrating from legacy scripts
- Open an issue on the ML Container Creator repository
This generated project is provided as starter code. Modify as needed for your use case.