Add terraform and SSH to pypeline by vittoriasalim · Pull Request #598 · Azure/telescope

vittoriasalim · 2025-04-17T01:55:41Z

Add terraform
Add SSH


wonderyl · 2025-04-17T11:46:46Z

+    def get_region(self) -> str:
+        return self.region
+
+    def get_credential_type(self) -> str:


Is this a property of the Cloud or a choice of the pipeline?

It is a property of the cloud. I will move it to the Cloud class and remove the getters, as suggested.

What I meant was, how you sign in is not a property of a Cloud class, it's a user's choice. You can pass it in as an option of login or other functions that use it.
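A minimal sketch of that suggestion, assuming a hypothetical `Azure` class whose `login` returns plain strings in place of the project's `Step` objects. The enum member names and the placeholder step text are illustrative only. The point is that the credential type arrives as an argument to `login` rather than being stored on the cloud:

```python
from dataclasses import dataclass
from enum import Enum


class CredentialType(Enum):
    # Hypothetical values; only the existence of such an enum comes from the PR.
    SERVICE_CONNECTION = "service_connection"
    MANAGED_IDENTITY = "managed_identity"


@dataclass
class Azure:
    # The cloud keeps only what actually describes the cloud.
    region: str = "eastus"

    def login(self, credential_type: CredentialType) -> list[str]:
        """Return the login step(s) for the sign-in method the caller picked."""
        if credential_type is CredentialType.MANAGED_IDENTITY:
            return ["az login --identity"]
        # Placeholder text standing in for a service-connection based login step.
        return ["# sign in through the pipeline's service connection"]
```

The pipeline author then chooses per call, for example `Azure().login(CredentialType.MANAGED_IDENTITY)`.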

wonderyl · 2025-04-17T12:00:59Z

    script: str
-    env: Optional[dict[str, str]] = None
+    condition: Optional[str] = field(metadata={"yaml": "condition"}, default=None)
+    retryCountOnTaskFailure: Optional[int] = field(


Python variables should use snake_case.
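One way to keep snake_case in Python while still emitting the camelCase key that Azure Pipelines expects is to carry the YAML name in the field metadata, as the diff already does for `condition`. The sketch below assumes a hypothetical `to_yaml_dict` helper; the project's real serializer may work differently:

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Script:
    script: str
    condition: Optional[str] = field(default=None, metadata={"yaml": "condition"})
    # snake_case in Python; the camelCase YAML key lives in the metadata.
    retry_count_on_task_failure: Optional[int] = field(
        default=None, metadata={"yaml": "retryCountOnTaskFailure"}
    )


def to_yaml_dict(step: Script) -> dict:
    """Hypothetical serializer: rename fields via "yaml" metadata, skip unset ones."""
    result = {}
    for f in fields(step):
        value = getattr(step, f.name)
        if value is not None:
            result[f.metadata.get("yaml", f.name)] = value
    return result
```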

wonderyl · 2025-04-17T12:06:01Z

+        "INPUT_VARIABLES": input_variables,
+        "DEBUG": "$(System.Debug)",
+    },
+    condition="ne(variables['SKIP_RESOURCE_MANAGEMENT'], 'true')",


Could we remove the env variable and make it an input to a resource? We should eliminate all env variables. It's really hard to track env var usage, and they cut through the call stack without any trace.

Right, noted.
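A rough sketch of what "an input to a resource" could look like, assuming the skip decision is already known when the pipeline is generated. `TerraformInputs`, `var_file`, and the returned command string are hypothetical names for illustration, and plain strings stand in for `Step` objects:

```python
from dataclasses import dataclass


@dataclass
class TerraformInputs:
    # Hypothetical resource: the flag is a typed, visible field, not an env var.
    var_file: str
    skip_resource_management: bool = False

    def setup(self) -> list[str]:
        """Return the script lines to run; nothing when management is skipped."""
        if self.skip_resource_management:
            return []
        return [f"terraform apply -var-file={self.var_file}"]
```

Because the flag is a constructor argument, every use of it shows up in a normal "find references" search, which is the traceability the comment asks for.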

wonderyl · 2025-04-17T12:15:58Z

+from textwrap import dedent
+from pipeline import Script
+
+set_input_variables_aws = lambda cloud, regions, input_variables={}: Script(


Does it make sense to write aws, azure and gcp terraforms in 3 different classes?

When I checked Terraform, the only place where we currently need separate commands for AWS, Azure, and GCP is in the set_input_variables function.

I suppose for now, it’s best to keep them within Terraform for maintainability and to avoid high coupling—until it’s extended further. What do you think?

SGTM, let's see how they are used in an actual resource before deciding.

wonderyl · 2025-04-17T13:20:10Z

+        regional_config_str=$(echo $regional_config | jq -c .)
+        echo "Final regional config: $regional_config_str"
+        echo "##vso[task.setvariable variable=TERRAFORM_REGIONAL_CONFIG]$regional_config_str"
+    """


Should you align the """?

vittoriasalim · 2025-05-06T06:18:15Z

@wonderyl

Additions:

Added SSH resource
Set up a resource factory to avoid reinitializing shared properties for every resource.
Moved all components to components.py to avoid circular imports and to keep all components in one place.

Corrections:

Removed environment variables and made them inputs to resources.
Refactored Terraform into its own class and refactored run_command and input_variables for readability.
Ensured proper typing and converted lambda expressions into functions.
Used an Enum for credential types.
Moved logic outside the script for better readability.
Moved properties into data fields within the cloud class to avoid exposing getters and setters.

wonderyl · 2025-05-07T03:05:54Z

 darwin-amd64/
 linux-amd64/
+# Exception: Do not ignore terraform.py
+!terraform.py


It's better to make the pattern on line 48 more specific and avoid this exception.
E.g.

.terraform/

This will not ignore terraform.py

wonderyl · 2025-05-07T03:08:38Z

+
+@dataclass
+class Resource(ABC):
+    cloud: str


A resource can be a Python binary, which is not related to a cloud at all.
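A minimal sketch of a base class that carries no cloud fields, with the cloud-specific data pushed down to the subclasses that need it. `PythonBinary` and `CloudResource` are hypothetical names, and `setup` returns strings in place of `Step` objects:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Resource(ABC):
    """Base class with no cloud-specific fields."""

    @abstractmethod
    def setup(self) -> list[str]:
        ...


@dataclass
class PythonBinary(Resource):
    # Hypothetical cloud-agnostic resource.
    version: str = "3.11"

    def setup(self) -> list[str]:
        return [f"install python {self.version}"]


@dataclass
class CloudResource(Resource):
    # Hypothetical intermediate class for resources that do need a cloud.
    cloud: str = "azure"
    regions: list[str] = field(default_factory=lambda: ["eastus2"])

    def setup(self) -> list[str]:
        return [f"provision in {self.cloud}: {', '.join(self.regions)}"]
```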

wonderyl · 2025-05-07T03:08:47Z

+@dataclass
+class Resource(ABC):
+    cloud: str
+    regions: list[str]


wonderyl · 2025-05-07T03:11:54Z

+
+from pipeline import Step
+
+# Keep components here to avoid circular imports


Could you elaborate more about this?

wonderyl · 2025-05-07T03:16:51Z

+# Factory class to create resources:
+# To avoid repeatedly define shared properties for every resource. usage: @ job_scheduling.py
+@dataclass
+class ResourceFactory:


This is not a factory pattern. https://en.wikipedia.org/wiki/Factory_method_pattern
As mentioned before, not every resource is associated with a cloud or a region.

wonderyl · 2025-05-07T03:23:40Z

+
+def set_run_id(run_id: str) -> Script:
+    if not run_id:
+        run_id = "$(Build.BuildId)-$(System.JobId)"


$(Build.BuildId) and $(System.JobId) are Azure Pipelines system variables. We should not extract them. In addition, it doesn't work: we don't define $(Build.BuildId) in Python.

wonderyl · 2025-05-07T03:24:55Z

 @dataclass
 class Setup(Resource):
    run_id: str
+    engine: str


Where is engine used?

wonderyl · 2025-05-07T03:25:23Z

    run_id: str
+    engine: str
    test_module_dir: str = ""
+    engine_input: dict = field(default_factory=dict)


Where is engine_input used?

wonderyl · 2025-05-07T04:10:15Z

+    script_modules_directory = f"$(Pipeline.Workspace)/s/{test_modules_dir}"
+
+    return Script(
+        display_name="Set Script Module Directory",


We still need to get $(Pipeline.Workspace) from the Azure pipeline env to get the full path of test_modules_directory. Then it sets a variable TEST_MODULES_DIR to record this value, which will be used in later steps. I don't think we can remove it.

You need to distinguish two types of variables.
1. Variables we set ourselves through env, e.g. env={"TEST_MODULES_DIR": test_modules_dir}. We set these, so we can eliminate them.
2. Variables set with ##vso[task.setvariable variable=TEST_MODULES_DIR]. These are Azure pipeline variables, and we can't remove them.

to record this value, which will be used in later steps. I don't think we can remove it.

Yes, you are right!
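The distinction discussed in this thread can be sketched as follows. This is an illustration, not the PR's final code: the function name is hypothetical and it returns the bash body as a string rather than a `Script`. The directory is an ordinary Python argument (so no `env={...}` entry is needed), while `$(Pipeline.Workspace)` and the `##vso[task.setvariable ...]` logging command stay as literal text for Azure Pipelines to interpret at run time:

```python
from textwrap import dedent


def set_test_modules_dir(test_modules_dir: str) -> str:
    """Build the bash body for a step that publishes TEST_MODULES_DIR."""
    # $(Pipeline.Workspace) is left untouched; Python never resolves it.
    full_path = f"$(Pipeline.Workspace)/s/{test_modules_dir}"
    return dedent(
        f"""\
        echo "Script modules directory: {full_path}"
        echo "##vso[task.setvariable variable=TEST_MODULES_DIR]{full_path}"
        """
    )
```

Later steps can then read the value as `$(TEST_MODULES_DIR)`.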

wonderyl · 2025-05-07T04:21:58Z

    def validate(self) -> list[Step]:
+        return [validate_owner_info()]
+
+    def execute_tests(self) -> list[Step]:


Where is this method used?

The validate_owner_info() function is called in the Layout during the resource validation step.

The comment is on line 111. I was asking about execute_tests.

Right, it should have been removed. I was testing something and didn't remove it.

vittoriasalim · 2025-05-07T05:03:43Z

-    region: str = "eastus"
+    cloud: str = "azure"
+    regions: list[str] = field(default_factory=lambda: ["eastus2"])
    subscription: str = os.getenv("AZURE_SUBSCRIPTION_ID")


I checked some variables like AZURE_SUBSCRIPTION_ID, SERVICE_CONNECTION, etc., they appear to be set during the pipeline run—possibly for security reasons.
As for cloud, regions, and variables provided by the user, we should be able to remove them in almost all scenarios.

wonderyl · 2025-05-07T06:16:39Z

    subscription: str = os.getenv("AZURE_SUBSCRIPTION_ID")
    credential_type: CredentialType = CredentialType.SERVICE_CONNECTION
    azure_service_connection: str = os.getenv("AZURE_SERVICE_CONNECTION")
    azure_mi_client_id: str = os.getenv("AZURE_MI_CLIENT_ID")


Now we get AZURE_MI_CLIENT_ID from the environment in which we run the Pypeline script. Obviously, that environment doesn't have it. I'm not 100% sure where AZURE_MI_CLIENT_ID is set, but I suspect it comes from the Azure pipeline, so we need to read the env var in the YAML file.
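To illustrate the timing issue: a dataclass default such as `os.getenv(...)` is evaluated once, when the class body runs on the machine that generates the YAML. A sketch of the alternative, assuming (as the comment suspects, not as a confirmed fact) that `AZURE_MI_CLIENT_ID` is a pipeline variable, is to emit a literal `$(...)` macro and let Azure Pipelines substitute it when the job runs. Both class names and the echo line are hypothetical:

```python
import os
from dataclasses import dataclass


@dataclass
class GenerationTimeAzure:
    # Evaluated when this class is defined, on the machine generating the YAML.
    azure_mi_client_id: str = os.getenv("AZURE_MI_CLIENT_ID", "")


@dataclass
class RunTimeAzure:
    # A literal macro reference; it is only resolved inside the running pipeline.
    azure_mi_client_id: str = "$(AZURE_MI_CLIENT_ID)"

    def login_script(self) -> str:
        return f'echo "Using managed identity {self.azure_mi_client_id}"'
```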

wonderyl · 2025-05-07T06:18:24Z

+
+
+@dataclass
+class Terraform:


Should it inherit from the Resource class?

Actually, Terraform is not a resource; it's a tool. An AKS cluster is a resource, and a resource group is a resource.

wonderyl · 2025-05-07T06:45:04Z

            steps=self.setup.setup()
            + self.cloud.login()
            + [step for r in self.resources for step in r.setup()]
+            + self.terraform.setup()


You should model Terraform as a resource and add it to self.resources.

wonderyl · 2025-05-07T06:49:22Z

+            + self.terraform.setup()
+            + [
+                self.terraform.create_resource_group(),
+                self.terraform.run_command(command="version"),


It's better to have version, init, apply as separate functions or at least make command an enum, so that it doesn't take arbitrary string.
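A small sketch of the enum variant, assuming a free function that returns the shell line as a string. The `TerraformCommand` name and its member list are illustrative; the PR's `run_command` is a method and may take other parameters:

```python
from enum import Enum


class TerraformCommand(Enum):
    VERSION = "version"
    INIT = "init"
    APPLY = "apply"
    DESTROY = "destroy"


def run_command(command: TerraformCommand, arguments: str = "") -> str:
    """Return the shell line for one Terraform subcommand."""
    # Reject plain strings so callers cannot pass arbitrary text.
    if not isinstance(command, TerraformCommand):
        raise TypeError(f"expected TerraformCommand, got {type(command).__name__}")
    return f"terraform {command.value} {arguments}".strip()
```

A call such as `run_command(TerraformCommand.APPLY, "-auto-approve")` stays readable, while `run_command("anything")` fails immediately.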

wonderyl · 2025-05-07T06:57:40Z

+                regions=self.cloud.regions,
+                credential_type=self.cloud.credential_type,
+            )
+        if self.setup is None:


Setup was modeled as a resource, so it should not be transparent to Layout. Same goes for Terraform.
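One possible reading of this feedback, sketched with simplified stand-ins for the PR's classes (strings instead of `Step` objects, and `setup_steps` as a hypothetical method name): `Layout` holds a single list of resources and never names `Setup` or `Terraform` directly.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class Resource(ABC):
    @abstractmethod
    def setup(self) -> list[str]:
        ...


@dataclass
class Setup(Resource):
    run_id: str

    def setup(self) -> list[str]:
        return [f"set run id {self.run_id}"]


@dataclass
class Terraform(Resource):
    def setup(self) -> list[str]:
        return ["terraform version", "terraform init"]


@dataclass
class Layout:
    resources: list[Resource] = field(default_factory=list)

    def setup_steps(self) -> list[str]:
        # One generic loop; the order of the list decides the order of the steps.
        return [step for r in self.resources for step in r.setup()]
```

With this shape, adding Terraform to a pipeline is a matter of appending it to `resources`, for example `Layout([Setup("42"), Terraform()])`.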

Lei Yao and others added 2 commits March 17, 2025 18:54

The initial commit of Pypeline, that uses python to generate benchmar…

16879fc

…k pipelines.

add terraform resources to pipeline

4089f43

vittoriasalim requested review from alyssa1303, anson627, rafael-mendes-pereira and sumanthreddy29 as code owners April 17, 2025 01:55

vittoriasalim added 5 commits April 17, 2025 12:05

fix merge conflict

54c6d33

fix merge conflict

ace9d5a

fix merge conflict

f4ce6e8

fix formatting

1e68cfb

add setup resources

c3fee85

vittoriasalim assigned vittoriasalim and unassigned vittoriasalim Apr 17, 2025

vittoriasalim requested a review from wonderyl April 17, 2025 03:21

vittoriasalim added 2 commits April 17, 2025 13:49

add terraform.py

7919180

make sure terraform.py is tracked in gitignore

86649ae