Changes from all commits (18 commits)
7fe74f0
docs: sync CONTRIBUTING.md with latest code
docsalot-app[bot] Sep 27, 2025
7af6dc9
docs: sync about.mdx with latest code
docsalot-app[bot] Sep 27, 2025
8399206
docs: sync concepts/contributing.mdx with latest code
docsalot-app[bot] Sep 27, 2025
694126a
docs: sync concepts/deployment.mdx with latest code
docsalot-app[bot] Sep 27, 2025
f0ad541
docs: sync concepts/fine-tuning.mdx with latest code
docsalot-app[bot] Sep 27, 2025
4ed3823
docs: sync concepts/models.mdx with latest code
docsalot-app[bot] Sep 27, 2025
99c78e2
docs: sync configuration/AWS.mdx with latest code
docsalot-app[bot] Sep 27, 2025
55aa235
docs: sync configuration/Azure.mdx with latest code
docsalot-app[bot] Sep 27, 2025
a50eff0
docs: sync configuration/Environment.mdx with latest code
docsalot-app[bot] Sep 27, 2025
d8c15ba
docs: sync configuration/GCP.mdx with latest code
docsalot-app[bot] Sep 27, 2025
c882c5e
docs: sync getting_started.md with latest code
docsalot-app[bot] Sep 27, 2025
4121be7
docs: sync installation.mdx with latest code
docsalot-app[bot] Sep 27, 2025
4de6bd7
docs: sync mint.json with latest code
docsalot-app[bot] Sep 27, 2025
c742311
docs: sync tutorials/deploying-llama-3-to-aws.mdx with latest code
docsalot-app[bot] Sep 27, 2025
d90a7bf
docs: sync tutorials/deploying-llama-3-to-azure.mdx with latest code
docsalot-app[bot] Sep 27, 2025
6c88ee7
docs: sync tutorials/deploying-llama-3-to-gcp.mdx with latest code
docsalot-app[bot] Sep 27, 2025
f311e23
docs: sync updated_readme.md with latest code
docsalot-app[bot] Sep 27, 2025
7a67f4b
docs: create tutorials/openai-compatible-proxy.mdx
docsalot-app[bot] Sep 27, 2025
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -65,4 +65,4 @@ By contributing, you agree that your contributions will be licensed under the Ap

## Questions?

Feel free to contact us at [support@slashml.com](mailto:support@slashml.com) if you have any questions about contributing!
13 changes: 8 additions & 5 deletions about.mdx
@@ -6,7 +6,11 @@ description: Deploy open source AI models to AWS, GCP, and Azure in minutes

## About Magemaker

Magemaker is a Python tool that simplifies the process of deploying open source AI models to your preferred cloud provider. Instead of spending hours digging through documentation, Magemaker lets you deploy Hugging Face models directly to AWS SageMaker, Google Cloud Vertex AI, or Azure Machine Learning.
Magemaker is a Python tool that simplifies the process of deploying open-source AI models to your preferred cloud provider. Instead of spending hours digging through documentation, Magemaker lets you deploy Hugging Face models directly to AWS SageMaker, Google Cloud Vertex AI, or Azure Machine Learning.

<Note>
New in the latest release: Magemaker now ships with an optional FastAPI **OpenAI-compatible proxy server** (`server.py`). You can spin up the proxy to expose any deployed endpoint behind the familiar `/v1/chat/completions` interface—perfect for drop-in replacement of OpenAI keys in existing applications. See the dedicated tutorial for details.
</Note>
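To make the drop-in replacement concrete, here is a minimal sketch of a chat-completions request against the proxy. The host, port, and model name are assumptions (adjust them to wherever you run `server.py` and whatever endpoint you deployed); only the `/v1/chat/completions` path comes from the docs above.

```python
import json
from urllib import request as urlrequest

# Assumed local address for the proxy; adjust to your deployment.
BASE_URL = "http://localhost:8000/v1"

# Standard OpenAI-style chat payload. The `model` value here is a
# hypothetical endpoint name, not one Magemaker creates for you.
payload = {
    "model": "llama3-jumpstart",
    "messages": [
        {"role": "user", "content": "Say hello in one sentence."}
    ],
    "max_tokens": 64,
}

req = urlrequest.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment to send once the proxy is running:
# with urlrequest.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)  # http://localhost:8000/v1/chat/completions
```

Because the request shape matches OpenAI's, existing applications can usually switch over by changing only the base URL.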

## What we're working on next

@@ -22,10 +26,9 @@ Do submit your feature requests at https://magemaker.featurebase.app/
- Querying within Magemaker currently only works with text-based models
- Deleting a model is not instant, it may show up briefly after deletion
- Deploying the same model within the same minute will break
- Hugging-face models on Azure have different Ids than their Hugging-face counterparts. Follow the steps specified in the quick-start guide to find the relevant models
- For Azure deploying models other than Hugging-face is not supported yet.
- Python3.13 is not supported because of an open-issue by Azure. https://github.com/Azure/azure-sdk-for-python/issues/37600

- Hugging Face models on Azure have different IDs than their Hugging Face counterparts. Follow the steps specified in the quick-start guide to find the relevant models.
- For Azure, deploying models other than Hugging Face is not supported yet.
- Python 3.13 is not supported because of an open issue by Azure. https://github.com/Azure/azure-sdk-for-python/issues/37600

If there is anything we missed, do point them out at https://magemaker.featurebase.app/

2 changes: 1 addition & 1 deletion concepts/contributing.mdx
Expand Up @@ -165,4 +165,4 @@ We are committed to providing a welcoming and inclusive experience for everyone.

## License

By contributing to Magemaker, you agree that your contributions will be licensed under the Apache 2.0 License.
12 changes: 6 additions & 6 deletions concepts/deployment.mdx
@@ -62,7 +62,7 @@ deployment: !Deployment
destination: gcp
endpoint_name: opt-125m-gcp
instance_count: 1
machine_type: n1-standard-4
instance_type: n1-standard-4 # machine type
accelerator_type: NVIDIA_TESLA_T4
accelerator_count: 1

@@ -83,7 +83,7 @@

models:
- !Model
id: facebook-opt-125m
id: facebook-opt-125m # Azure uses different model IDs
source: huggingface
```

@@ -112,7 +112,8 @@ deployment: !Deployment
endpoint_name: test-llama3-8b
instance_count: 1
instance_type: ml.g5.12xlarge
num_gpus: 4
num_gpus: 4 # Optional – override default GPU count
quantization: bitsandbytes # Optional – 4/8-bit quantisation

models:
- !Model
@@ -202,10 +203,9 @@ Choose your instance type based on your model's requirements:
4. Set up monitoring and alerting for your endpoints

<Warning>
Make sure you setup budget monitory and alerts to avoid unexpected charges.
Make sure you set up budget monitoring and alerts to avoid unexpected charges.
</Warning>


## Troubleshooting Deployments

Common issues and their solutions:
@@ -225,4 +225,4 @@ Common issues and their solutions:
- Verify model ID and version
- Check instance memory requirements
- Validate Hugging Face token if required
- Endpoing deployed but deployment failed. Check the logs, and do report this to us if you see this issue.
- Endpoint deployed but deployment failed. Check the logs and report the issue if it persists.
107 changes: 59 additions & 48 deletions concepts/fine-tuning.mdx
@@ -5,31 +5,41 @@ description: Guide to fine-tuning models with Magemaker

## Fine-tuning Overview

Fine-tuning allows you to adapt pre-trained models to your specific use case. Magemaker simplifies this process through YAML configuration.
Fine-tuning allows you to adapt pre-trained models to your specific use case.
Magemaker currently supports **AWS SageMaker** fine-tuning (support for GCP & Azure is on the roadmap).
The process is fully driven by a YAML configuration file and a single CLI command.

### Basic Command
> **Basic Command**
> ```sh
> magemaker --train .magemaker_config/train-config.yaml
> ```

```sh
magemaker --train .magemaker_config/train-config.yaml
```
---

## Configuration

### Basic Training Configuration

```yaml
training: !Training
destination: aws
instance_type: ml.p3.2xlarge
destination: aws # Only “aws” is supported today
instance_type: ml.p3.2xlarge # GPU instance for training
instance_count: 1
training_input_path: s3://your-bucket/training-data.csv

models:
- !Model
id: your-model-id
id: your-model-id # e.g. google-bert/bert-base-uncased
source: huggingface
```

• **destination** – cloud provider (must be `aws` for now)
• **training_input_path** – S3 URI pointing to your training dataset
• **instance_type / count** – SageMaker training cluster specs

### Automatic Hyper-parameters (zero-config)
If **`hyperparameters`** are omitted, Magemaker will generate sensible defaults based on the model **task** (e.g. text-classification, text-generation) using the logic in `magemaker/sagemaker/fine_tune_model.py`. This is helpful for quick experiments.
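The zero-config behaviour can be pictured as a small lookup keyed by task. This is a simplified, illustrative sketch; the real defaults live in `magemaker/sagemaker/fine_tune_model.py` and the numbers below are assumptions, not the tool's actual values.

```python
# Illustrative task-based defaults (assumed values, not Magemaker's real ones).
DEFAULTS = {
    "text-classification": {"epochs": 3, "learning_rate": 2e-5, "batch_size": 32},
    "text-generation": {"epochs": 1, "learning_rate": 5e-5, "batch_size": 8},
}

def default_hyperparameters(task: str) -> dict:
    """Return sensible defaults for a task, falling back to a generic set."""
    generic = {"epochs": 3, "learning_rate": 3e-5, "batch_size": 16}
    return DEFAULTS.get(task, generic)

print(default_hyperparameters("text-generation")["batch_size"])  # 8
```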

### Advanced Configuration

```yaml
@@ -49,82 +59,83 @@ training: !Training
save_steps: 1000
```

You can supply any Hugging Face training argument accepted by the Transformers library. Values can be:

• **Scalars** (as above)
• **Ranges / Lists** for SageMaker Hyperparameter Tuning Jobs *(coming soon)*

---

## Data Preparation

### Supported Formats

<CardGroup>
<Card title="CSV Format" icon="file-csv">
- Simple tabular data
- Easy to prepare
- Good for classification tasks
<Card title="CSV" icon="file-csv">
- Column-based datasets<br/>
- Good for classic NLP tasks
</Card>

<Card title="JSON Lines" icon="file-code">
- Flexible data format
- Good for complex inputs
- Supports nested structures
- One JSON object per line<br/>
- Flexible structure for complex inputs
</Card>
</CardGroup>

### Data Upload
### Uploading Data

<Steps>
<Step title="Prepare Data">
Format your data according to model requirements
Clean & format according to the model task (e.g. columns `text,label` for classification).
</Step>
<Step title="Upload to S3">
Use AWS CLI or console to upload data
```bash
aws s3 cp local_file.csv s3://your-bucket/training-data.csv
```
</Step>
<Step title="Configure Path">
Specify S3 path in training configuration
<Step title="Reference in YAML">
Point `training_input_path` to the S3 URI you just uploaded.
</Step>
</Steps>
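For a classification dataset, the CSV from step one might be produced like this. The `text,label` column names follow the example above; they are not a requirement of every model.

```python
import csv

# Tiny illustrative dataset for a binary sentiment task.
rows = [
    ("I loved this movie", 1),
    ("Terrible acting and plot", 0),
]

with open("training-data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "label"])  # header matching the example task
    writer.writerows(rows)
```

Upload the resulting file to S3 as shown in step two, then reference its URI in `training_input_path`.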

## Instance Selection
---

### Training Instance Types
## Instance Selection

Choose based on:
- Dataset size
- Model size
- Training time requirements
- Cost constraints
Training can be expensive—choose wisely based on dataset & model size:

Popular choices:
- ml.p3.2xlarge (1 GPU)
- ml.p3.8xlarge (4 GPUs)
- ml.p3.16xlarge (8 GPUs)
• **ml.p3.2xlarge** – 1× V100 GPU (entry level)
• **ml.p3.8xlarge** – 4× V100 GPUs
• **ml.p3.16xlarge** – 8× V100 GPUs

## Hyperparameter Tuning
If you need A100 GPUs, use the p4 family (ensure you have service-quota approval).
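A rough way to sanity-check an instance choice is the usual memory heuristic: parameters times bytes per parameter, plus an overhead factor. The overhead value below is an assumption for illustration, and real training footprints (with optimizer state) can be several times larger.

```python
def estimated_gpu_memory_gb(num_params_b: float, bytes_per_param: int = 2,
                            overhead: float = 1.2) -> float:
    """Very rough memory-footprint estimate in GB.

    num_params_b: model size in billions of parameters.
    bytes_per_param: 2 for fp16/bf16, 4 for fp32.
    overhead: fudge factor for activations etc. (an assumption).
    """
    return num_params_b * bytes_per_param * overhead

# An 8B model in fp16 needs roughly ~19 GB just for weights plus overhead,
# i.e. more than one 16 GB V100 (ml.p3.2xlarge) but well within 4x V100s.
print(round(estimated_gpu_memory_gb(8), 1))
```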

### Basic Parameters
---

```yaml
hyperparameters: !Hyperparameters
epochs: 3
learning_rate: 2e-5
batch_size: 32
```
## Hyperparameter Tuning (coming soon)

### Advanced Tuning
Magemaker will expose SageMaker HPO jobs. YAML will accept parameter ranges:

```yaml
hyperparameters: !Hyperparameters
epochs: 3
learning_rate:
learning_rate:
min: 1e-5
max: 1e-4
scaling: log
batch_size:
values: [16, 32, 64]
```

Stay tuned for this feature in an upcoming release.
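Under the hood, a YAML spec like the one above would plausibly be translated into SageMaker-style parameter ranges. The sketch below shows that translation with plain dicts standing in for the SDK classes; the shape is an assumption about a feature that has not shipped yet.

```python
def to_parameter_ranges(spec: dict) -> dict:
    """Translate a YAML-style hyperparameter spec into range descriptors.

    Scalars stay fixed; {min, max} becomes a continuous range; {values}
    becomes a categorical range. Key names are illustrative, not the SDK's.
    """
    fixed, ranges = {}, {}
    for name, value in spec.items():
        if isinstance(value, dict) and {"min", "max"} <= value.keys():
            ranges[name] = {"type": "continuous", "min": value["min"],
                            "max": value["max"],
                            "scaling": value.get("scaling", "linear")}
        elif isinstance(value, dict) and "values" in value:
            ranges[name] = {"type": "categorical", "values": value["values"]}
        else:
            fixed[name] = value
    return {"fixed": fixed, "ranges": ranges}

spec = {
    "epochs": 3,
    "learning_rate": {"min": 1e-5, "max": 1e-4, "scaling": "log"},
    "batch_size": {"values": [16, 32, 64]},
}
result = to_parameter_ranges(spec)
print(result["ranges"]["learning_rate"]["scaling"])  # log
```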

---

## Monitoring Training

### CloudWatch Metrics
Once the job starts, you can:

1. Open the SageMaker console: Training Jobs → **Logs**
2. View real-time metrics in **CloudWatch**: loss, learning rate, GPU utilization

Available metrics:
- Loss
- Learning rate
- GPU utilization
<Note>
Training jobs run until completion even if the terminal session closes. Make sure to stop any jobs you no longer need to avoid unnecessary charges.
</Note>
73 changes: 46 additions & 27 deletions concepts/models.mdx
@@ -6,7 +6,12 @@ description: Guide to supported models and their requirements
## Supported Models

<Note>
Currently, Magemaker supports deployment of Hugging Face models only. Support for cloud provider marketplace models is coming soon!
Magemaker currently supports two model sources:

1. Hugging Face models (all clouds)
2. Amazon SageMaker JumpStart models (AWS only)

Support for cloud-provider marketplace models on GCP and Azure is coming soon!
</Note>

### Hugging Face Models
@@ -26,15 +31,30 @@ Currently, Magemaker supports deployment of Hugging Face models only. Support fo
</Card>
</CardGroup>

### Future Support

We plan to add support for the following model sources:
### Amazon SageMaker JumpStart Models (AWS)

<CardGroup>
<Card title="AWS SageMaker" icon="aws">
Models from AWS Marketplace and SageMaker built-in algorithms
<Card title="Text Generation" icon="aws">
- meta-textgeneration-llama-3-8b-instruct
- flan-t5-xxl
- gpt-neo-1_3b
</Card>

<Card title="Text Classification" icon="aws">
- bert-base-uncased
- distilbert-base-uncased-finetuned-sst-2
</Card>
</CardGroup>

<Note>
When deploying a JumpStart model, set `source: sagemaker` and use the exact JumpStart model ID (e.g. `meta-textgeneration-llama-3-8b-instruct`).
</Note>

### Future Support

We plan to add support for the following additional model sources:

<CardGroup>
<Card title="GCP Vertex AI" icon="google">
Models from Vertex AI Model Garden and Foundation Models
</Card>
@@ -43,6 +63,7 @@ We plan to add support for the following model sources:
Models from Azure ML Model Catalog and Azure OpenAI
</Card>
</CardGroup>

## Model Requirements

### Instance Type Recommendations by Cloud Provider
@@ -94,9 +115,7 @@

## Example Deployments

### Example Hugging Face Model Deployment

Deploy the same Hugging Face model to different cloud providers:
### Example Hugging Face Model Deployment (All Clouds)

AWS SageMaker:
```yaml
@@ -129,23 +148,23 @@ deployment: !Deployment
```

<Note>
The model ids for Azure are different from AWS and GCP. Make sure to use the one provided by Azure in the Azure Model Catalog.

To find the relevnt model id, follow the following steps
<Steps>
<Step title="Go to your workpsace studio">
Find the workpsace in the Azure portal and click on the studio url provided. Click on the `Model Catalog` on the left side bar
![Azure ML Creation](../Images/workspace-studio.png)
</Step>
The model IDs for Azure are different from AWS and GCP. Make sure to use the one provided by Azure in the Azure Model Catalog. See the [Quick Start](quick-start) for steps to locate the correct ID.
</Note>

<Step title="Select Hugging Face in the Collections List">
Select Hugging-Face from the collections list. The id of the model card is the id you need to use in the yaml file
![Azure ML Creation](../Images/hugging-face.png)
</Step>
### Example SageMaker JumpStart Deployment (AWS)

</Steps>
</Note>
```yaml
models:
- !Model
id: meta-textgeneration-llama-3-8b-instruct # JumpStart model ID
source: sagemaker

deployment: !Deployment
destination: aws
endpoint_name: llama3-jumpstart
instance_type: ml.g5.12xlarge
num_gpus: 4
```
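A config loader might validate the `id` and `source` fields along the lines below. This is a sketch under stated assumptions, not Magemaker's actual validation code; only the two supported source values come from the docs above.

```python
# Supported model sources per the docs; Vertex/Azure catalogs are not yet supported.
VALID_SOURCES = {"huggingface", "sagemaker"}

def validate_model(model: dict) -> dict:
    """Minimal validation sketch for a !Model entry (illustrative only)."""
    if not model.get("id"):
        raise ValueError("model id is required")
    if model.get("source") not in VALID_SOURCES:
        raise ValueError(f"unsupported source: {model.get('source')!r}")
    return model

print(validate_model({"id": "meta-textgeneration-llama-3-8b-instruct",
                      "source": "sagemaker"})["source"])  # sagemaker
```

Remember that JumpStart deployments need the exact JumpStart model ID together with `source: sagemaker`.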

## Model Configuration

@@ -155,8 +174,8 @@
models:
- !Model
id: your-model-id
source: huggingface|sagemaker # we don't support vertex and azure specific models yet
revision: latest # Optional: specify model version
source: huggingface | sagemaker # choose the appropriate source
revision: latest # Optional: specify model version
```

### Advanced Parameters
@@ -181,9 +200,9 @@ models:
- Consider data residency requirements
- Test latency from different regions

3. **Cost Management**
2. **Cost Management**
- Compare instance pricing
- Make sure you set up the relevant alerting
- Set up cost alerts and budgets

## Troubleshooting
