Add terraform and SSH to pypeline#598
| def get_region(self) -> str: | ||
| return self.region | ||
|
|
||
| def get_credential_type(self) -> str: |
There was a problem hiding this comment.
Is this a property of the Cloud or a choice of the pipeline?
There was a problem hiding this comment.
Property of the cloud. I will move it to the Cloud class and remove the getters, as suggested.
There was a problem hiding this comment.
What I meant was, how you sign in is not a property of a Cloud class, it's a user's choice. You can pass it in as an option of login or other functions that use it.
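A minimal sketch of that suggestion, assuming a simplified `Cloud` class whose `login` returns plain strings instead of pipeline steps. The `MANAGED_IDENTITY` member and the step text are hypothetical; only `SERVICE_CONNECTION` appears in the diff.

```python
from dataclasses import dataclass
from enum import Enum


class CredentialType(Enum):
    SERVICE_CONNECTION = "service_connection"
    # Hypothetical second member, added only to show that the caller has a choice.
    MANAGED_IDENTITY = "managed_identity"


@dataclass
class Cloud:
    # Only properties of the cloud itself live on the class.
    name: str
    region: str

    def login(self, credential_type: CredentialType) -> list[str]:
        # The caller decides how to sign in; the class does not store that choice.
        return [f"login to {self.name} using {credential_type.value}"]


azure = Cloud(name="azure", region="eastus")
steps = azure.login(CredentialType.SERVICE_CONNECTION)
```

With this shape there is no `get_credential_type` getter to maintain, and two pipelines can use the same `Cloud` with different sign-in methods.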
| script: str | ||
| env: Optional[dict[str, str]] = None | ||
| condition: Optional[str] = field(metadata={"yaml": "condition"}, default=None) | ||
| retryCountOnTaskFailure: Optional[int] = field( |
There was a problem hiding this comment.
Python variables should use snake_case.
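One way this could look, reusing the `metadata={"yaml": ...}` convention already visible in the diff: the Python attribute is snake_case and the metadata carries the camelCase key that Azure Pipelines expects. The `to_yaml_dict` helper is hypothetical and only illustrates how such metadata might be consumed.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Script:
    script: str
    condition: Optional[str] = field(metadata={"yaml": "condition"}, default=None)
    # snake_case in Python; the "yaml" metadata holds the camelCase pipeline key.
    retry_count_on_task_failure: Optional[int] = field(
        metadata={"yaml": "retryCountOnTaskFailure"}, default=None
    )


def to_yaml_dict(obj) -> dict:
    # Hypothetical serializer: prefer the "yaml" metadata key, fall back to the field name.
    out = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if value is not None:
            out[f.metadata.get("yaml", f.name)] = value
    return out


step = to_yaml_dict(Script(script="echo hi", retry_count_on_task_failure=3))
```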
| "INPUT_VARIABLES": input_variables, | ||
| "DEBUG": "$(System.Debug)", | ||
| }, | ||
| condition="ne(variables['SKIP_RESOURCE_MANAGEMENT'], 'true')", |
There was a problem hiding this comment.
Could we remove the env variable and make it an input to a resource? We should eliminate all env variables. It's really hard to track env var usage, and they cut through the call stack without any trace.
There was a problem hiding this comment.
Right, noted.
| from textwrap import dedent | ||
| from pipeline import Script | ||
|
|
||
| set_input_variables_aws = lambda cloud, regions, input_variables={}: Script( |
There was a problem hiding this comment.
Does it make sense to write aws, azure and gcp terraforms in 3 different classes?
There was a problem hiding this comment.
When I checked Terraform, the only place where we currently need separate commands for AWS, Azure, and GCP is in the set_input_variables function.
I suppose for now, it's best to keep them within Terraform for maintainability and to avoid high coupling, at least until it's extended further. What do you think?
There was a problem hiding this comment.
SGTM, let's see how they are used in an actual resource before deciding.
| regional_config_str=$(echo $regional_config | jq -c .) | ||
| echo "Final regional config: $regional_config_str" | ||
| echo "##vso[task.setvariable variable=TERRAFORM_REGIONAL_CONFIG]$regional_config_str" | ||
| """ |
There was a problem hiding this comment.
Should you align the """?
|
Addition :
Corrections:
|
| darwin-amd64/ | ||
| linux-amd64/ | ||
| # Exception: Do not ignore terraform.py | ||
| !terraform.py |
There was a problem hiding this comment.
It's better to make line 48 more specific and avoid this exception.
E.g.
.terraform/
This will not ignore terraform.py
|
|
||
| @dataclass | ||
| class Resource(ABC): | ||
| cloud: str |
There was a problem hiding this comment.
A resource can be a python binary, that's not related to Cloud at all.
| @dataclass | ||
| class Resource(ABC): | ||
| cloud: str | ||
| regions: list[str] |
|
|
||
| from pipeline import Step | ||
|
|
||
| # Keep components here to avoid circular imports |
There was a problem hiding this comment.
Could you elaborate more about this?
| # Factory class to create resources: | ||
| # To avoid repeatedly define shared properties for every resource. usage: @ job_scheduling.py | ||
| @dataclass | ||
| class ResourceFactory: |
There was a problem hiding this comment.
This is not a factory pattern. https://en.wikipedia.org/wiki/Factory_method_pattern
As mentioned before, not every resource is associated with a cloud or a region.
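A rough sketch of the point, assuming `setup` returns plain strings rather than `Step` objects. `CloudResource`, `PythonBinary` and `ResourceGroup` are hypothetical names used only for illustration: the base `Resource` carries no cloud or region, and only resources that actually live in a cloud add those fields.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Resource(ABC):
    # No cloud or regions here: a resource can be a Python binary.
    @abstractmethod
    def setup(self) -> list[str]: ...


@dataclass
class CloudResource(Resource):
    # Hypothetical intermediate class; still abstract because setup() is not implemented.
    cloud: str
    regions: list[str]


@dataclass
class PythonBinary(Resource):
    version: str

    def setup(self) -> list[str]:
        return [f"install python {self.version}"]


@dataclass
class ResourceGroup(CloudResource):
    name: str

    def setup(self) -> list[str]:
        return [f"create {self.name} in {self.cloud}/{region}" for region in self.regions]
```

Shared properties are then declared once on `CloudResource` instead of being injected by a separate `ResourceFactory`.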
|
|
||
| def set_run_id(run_id: str) -> Script: | ||
| if not run_id: | ||
| run_id = "$(Build.BuildId)-$(System.JobId)" |
There was a problem hiding this comment.
$(Build.BuildId) and $(System.JobId) are Azure pipeline system env vars. We should not extract them out. In addition, it doesn't work: we don't define $(Build.BuildId) in Python.
| @dataclass | ||
| class Setup(Resource): | ||
| run_id: str | ||
| engine: str |
| run_id: str | ||
| engine: str | ||
| test_module_dir: str = "" | ||
| engine_input: dict = field(default_factory=dict) |
There was a problem hiding this comment.
Where is engine_input used?
| script_modules_directory = f"$(Pipeline.Workspace)/s/{test_modules_dir}" | ||
|
|
||
| return Script( | ||
| display_name="Set Script Module Directory", |
There was a problem hiding this comment.
We still need to get $(Pipeline.Workspace) from the Azure pipeline env to get the full path of test_modules_directory. Then it sets a variable TEST_MODULES_DIR to record this value, which will be used in later steps. I don't think we can remove it.
There was a problem hiding this comment.
You need to distinguish 2 types of variables.
1. Variables that we set ourselves through env, e.g. env={"TEST_MODULES_DIR": test_modules_dir}. We set these, so we can eliminate them.
2. Azure pipeline variables, e.g. ##vso[task.setvariable variable=TEST_MODULES_DIR]. We can't remove these.
There was a problem hiding this comment.
to record this value, which will be used in later steps. I don't think we can remove it.
Yes you are right!
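The distinction in this thread could be sketched roughly as follows. The function name and the returned plain string are assumptions (the real code returns a `Script`); the `##vso[task.setvariable ...]` logging command and the `$(Pipeline.Workspace)` macro are the ones quoted above.

```python
from textwrap import dedent


def set_test_modules_dir(test_modules_dir: str) -> str:
    # Type 1: test_modules_dir is a plain Python argument, so no
    # env={"TEST_MODULES_DIR": ...} entry is needed to pass it in.
    # $(Pipeline.Workspace) stays a literal macro for Azure Pipelines to expand at run time.
    full_path = f"$(Pipeline.Workspace)/s/{test_modules_dir}"
    # Type 2: the logging command publishes a pipeline variable for later steps.
    return dedent(
        f"""\
        echo "Test modules directory: {full_path}"
        echo "##vso[task.setvariable variable=TEST_MODULES_DIR]{full_path}"
        """
    )


script = set_test_modules_dir("modules/python")
```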
| def validate(self) -> list[Step]: | ||
| return [validate_owner_info()] | ||
|
|
||
| def execute_tests(self) -> list[Step]: |
There was a problem hiding this comment.
Where is this method used?
There was a problem hiding this comment.
The validate_owner_info() function is called in the Layout during the resource validation step.
There was a problem hiding this comment.
The comment is on line 111. I was asking about execute_tests.
There was a problem hiding this comment.
Right, it should have been removed. I was testing something and didn't remove it.
| region: str = "eastus" | ||
| cloud: str = "azure" | ||
| regions: list[str] = field(default_factory=lambda: ["eastus2"]) | ||
| subscription: str = os.getenv("AZURE_SUBSCRIPTION_ID") |
There was a problem hiding this comment.
I checked some variables like AZURE_SUBSCRIPTION_ID, SERVICE_CONNECTION, etc. They appear to be set during the pipeline run, possibly for security reasons.
As for cloud, regions, and variables provided by the user, we should be able to remove them in almost all scenarios.
| subscription: str = os.getenv("AZURE_SUBSCRIPTION_ID") | ||
| credential_type: CredentialType = CredentialType.SERVICE_CONNECTION | ||
| azure_service_connection: str = os.getenv("AZURE_SERVICE_CONNECTION") | ||
| azure_mi_client_id: str = os.getenv("AZURE_MI_CLIENT_ID") |
There was a problem hiding this comment.
Now we get AZURE_MI_CLIENT_ID from the env in which we run the Pypeline script. Obviously, it doesn't have it. I'm not 100% sure where AZURE_MI_CLIENT_ID is set, but I suspect it's from the Azure pipeline, so we need to get the env var in the YAML file.
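To illustrate the problem: a dataclass default such as `os.getenv(...)` is evaluated once, when the class body runs in the process that generates the YAML. The sketch below shows that behaviour, then one possible alternative that keeps an Azure Pipelines macro in the generated output. It assumes AZURE_MI_CLIENT_ID is available as a pipeline variable, which this thread does not confirm; both class names are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Simulate the machine that generates the YAML: the variable is not set there.
os.environ.pop("AZURE_MI_CLIENT_ID", None)


@dataclass
class EagerAzure:
    # Evaluated once, when the class body runs, in the generating process.
    azure_mi_client_id: Optional[str] = os.getenv("AZURE_MI_CLIENT_ID")


@dataclass
class MacroAzure:
    # Assumption: emit the macro so the pipeline resolves it at run time.
    azure_mi_client_id: str = "$(AZURE_MI_CLIENT_ID)"


# Setting the variable later does not change the default captured above.
os.environ["AZURE_MI_CLIENT_ID"] = "set-later"
eager_value = EagerAzure().azure_mi_client_id
macro_value = MacroAzure().azure_mi_client_id
```

Here `eager_value` stays `None` even after the variable is set, while `macro_value` is the literal macro text that would be written into the YAML.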
|
|
||
|
|
||
| @dataclass | ||
| class Terraform: |
There was a problem hiding this comment.
Should it inherit from the Resource class?
There was a problem hiding this comment.
Actually, Terraform is not a resource, it's a tool. An AKS cluster is a resource; a resource group is a resource.
| steps=self.setup.setup() | ||
| + self.cloud.login() | ||
| + [step for r in self.resources for step in r.setup()] | ||
| + self.terraform.setup() |
There was a problem hiding this comment.
You should model Terraform as a resource and add it to self.resources.
| + self.terraform.setup() | ||
| + [ | ||
| self.terraform.create_resource_group(), | ||
| self.terraform.run_command(command="version"), |
There was a problem hiding this comment.
It's better to have version, init, apply as separate functions or at least make command an enum, so that it doesn't take arbitrary string.
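A minimal sketch of the enum option, assuming `run_command` returns the command line as a string instead of a `Script`. `TerraformCommand` is a hypothetical name.

```python
from enum import Enum


class TerraformCommand(Enum):
    VERSION = "version"
    INIT = "init"
    APPLY = "apply"


def run_command(command: TerraformCommand) -> str:
    # Reject arbitrary strings so only known subcommands reach the pipeline.
    if not isinstance(command, TerraformCommand):
        raise TypeError(f"expected TerraformCommand, got {type(command).__name__}")
    return f"terraform {command.value}"


version_step = run_command(TerraformCommand.VERSION)
```

A call such as `run_command("version")` now fails immediately with a `TypeError` instead of producing a step that only breaks once the pipeline runs.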
| regions=self.cloud.regions, | ||
| credential_type=self.cloud.credential_type, | ||
| ) | ||
| if self.setup is None: |
There was a problem hiding this comment.
Setup was modeled as a resource, it should not be transparent to Layout. Same goes for Terraform.
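One reading of this comment, sketched under simplifying assumptions (steps are plain strings, and every class shown is a stand-in for the real one): Layout holds a single list of resources and never refers to Setup or Terraform by name.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Resource(Protocol):
    def setup(self) -> list[str]: ...


@dataclass
class Setup:
    run_id: str

    def setup(self) -> list[str]:
        return [f"set run id {self.run_id}"]


@dataclass
class Terraform:
    def setup(self) -> list[str]:
        return ["terraform version", "terraform init"]


@dataclass
class Layout:
    resources: list[Resource] = field(default_factory=list)

    def steps(self) -> list[str]:
        # No special cases such as self.setup or self.terraform: order comes from the list.
        return [step for r in self.resources for step in r.setup()]


layout = Layout(resources=[Setup(run_id="42"), Terraform()])
all_steps = layout.steps()
```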
