Closing discussion, pushed to prod on 24.10 -> https://awslabs.github.io/scale-out-computing-on-aws-documentation/documentation/architecture/node-bootstrap/customize-node-boostrap/
SOCA configuration is currently stored on AWS Secrets Manager as a single Secret, which makes configuration updates challenging, as any typo or syntax error has the potential to bring the entire environment down.
Our proposal is to continue to store application secrets (Redis, OpenLDAP ...) on AWS Secrets Manager and migrate the rest of the configuration tree to AWS Systems Manager Parameter Store:
- `/soca/<CLUSTER_ID>/configuration`: Stores SOCA environment-specific parameters such as `BaseOS`, `ClusterId`, `AuthProvider`, etc.
- `/soca/<CLUSTER_ID>/system`: Stores SOCA system-specific variables such as the list of packages to install and the download links for Python, OpenMPI, CloudWatch Log Agent, EFA, etc.

This new architecture will help us transition our new node setup logic to fully support Jinja2 templating, making bootstrap customization easier.
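To illustrate how the two hierarchies could feed a templating context, here is a minimal Python sketch (not the actual SOCA implementation) that turns flat Parameter Store-style paths into a nested dictionary. The cluster name `mycluster`, the sample values, and the `build_tree` helper are all hypothetical; in practice the flat names and values would come from Parameter Store rather than from a hard-coded dictionary.

```python
def build_tree(parameters, prefix):
    """Turn flat Parameter Store-style paths into a nested dict.

    Assumes no key is both a leaf and a parent of another key.
    """
    tree = {}
    for name, value in parameters.items():
        if not name.startswith(prefix):
            continue  # ignore keys outside the requested hierarchy
        parts = [p for p in name[len(prefix):].split("/") if p]
        if not parts:
            continue
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # create intermediate levels
        node[parts[-1]] = value
    return tree


# Hypothetical sample values, for illustration only.
sample = {
    "/soca/mycluster/configuration/BaseOS": "amazonlinux2",
    "/soca/mycluster/configuration/ClusterId": "mycluster",
    "/soca/mycluster/configuration/AuthProvider": "openldap",
    "/soca/mycluster/system/efa/url": "https://efa-installer.amazonaws.com/aws-efa-installer-1.31.0.tar.gz",
}

context = build_tree(sample, "/soca/mycluster/")
print(context["configuration"]["AuthProvider"])
```

A nested structure like `context` is the kind of object a template could read from, for example `context["system"]["efa"]["url"]`.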
Example: Configure OpenLDAP or Microsoft AD
Example: Mount /data based on the FileSystem configured:
Additionally, we will expose an ephemeral `/soca/<CLUSTER_ID>/job/*` hierarchy tree for job-specific information (JobID, JobOwner). This tree is not stored on AWS Systems Manager Parameter Store but is made available temporarily during the bootstrap sequence on all EC2 nodes provisioned for the given job.
Example: Add EFA configuration if `efa_support=True` is specified during job submission
We think this change will drastically simplify the way our end users customize the Compute and/or Visualization node setup logic.
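As a rough sketch of the EFA example above, the Python below merges an ephemeral job branch into a configuration tree and only adds an EFA step when `efa_support` is set. It is plain Python standing in for the conditional a Jinja2 template would express; the helper names, step names, and sample values are hypothetical and not taken from SOCA.

```python
def build_job_context(cluster_tree, job_params):
    """Return a copy of the cluster tree with an ephemeral 'job' branch."""
    context = dict(cluster_tree)       # shallow copy; the stored tree is untouched
    context["job"] = dict(job_params)  # exists only for this bootstrap run
    return context


def bootstrap_steps(context):
    """List the setup steps for a node, gating EFA on the job parameters."""
    steps = ["install_base_packages"]  # hypothetical step name
    # Job parameters may arrive as strings, so compare case-insensitively.
    if str(context["job"].get("efa_support", "")).lower() == "true":
        steps.append("install_efa " + context["system"]["efa"]["url"])
    return steps


# Hypothetical cluster tree and job parameters.
cluster_tree = {
    "system": {
        "efa": {"url": "https://efa-installer.amazonaws.com/aws-efa-installer-1.31.0.tar.gz"}
    }
}
with_efa = build_job_context(cluster_tree, {"JobID": "1234", "JobOwner": "alice", "efa_support": "True"})
without_efa = build_job_context(cluster_tree, {"JobID": "1235", "JobOwner": "bob"})

print(bootstrap_steps(with_efa))
print(bootstrap_steps(without_efa))
```

The stored tree is never modified, which mirrors the idea that the job branch is temporary and scoped to the nodes provisioned for a given job.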
Additionally, any configuration update will be easier. As an example, let's assume you are currently using EFA version `1.31.0` and you want to bump it to `1.32.0`. You could hotpatch your cluster by updating the value of the `/soca/<CLUSTER_ID>/system/efa/url` key to `https://efa-installer.amazonaws.com/aws-efa-installer-1.32.0.tar.gz`.
AWS Systems Manager Parameter Store includes native versioning, making version rollback easy in case a key was misconfigured or not updated correctly.
We are also developing a utility which will give you the ability to update these values automatically, without having to manually change the content via the AWS Systems Manager Parameter Store console.
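For the EFA version bump described above, a scripted update might look like the following sketch. It only builds the request arguments so it runs without AWS access; `efa_url_update` and the cluster name `mycluster` are hypothetical, and using the `String` parameter type is an assumption about how the key is stored. This is not the utility mentioned above.

```python
def efa_url_update(cluster_id, version):
    """Build put_parameter arguments that point the EFA URL key at a new version."""
    return {
        "Name": f"/soca/{cluster_id}/system/efa/url",
        "Value": f"https://efa-installer.amazonaws.com/aws-efa-installer-{version}.tar.gz",
        "Type": "String",   # assumption: the key is a plain String parameter
        "Overwrite": True,  # overwriting an existing key creates a new parameter version
    }


request = efa_url_update("mycluster", "1.32.0")
print(request["Name"], "->", request["Value"])

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("ssm").put_parameter(**request)
```

Because each overwrite produces a new parameter version, the previous value stays available in the parameter history if a rollback is needed.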