Skip to content

Add terraform and SSH to pypeline#598

Closed
vittoriasalim wants to merge 19 commits into
pypelinefrom
vitto/pypeline
Closed

Add terraform and SSH to pypeline#598
vittoriasalim wants to merge 19 commits into
pypelinefrom
vitto/pypeline

Conversation

@vittoriasalim
Copy link
Copy Markdown
Collaborator

@vittoriasalim vittoriasalim commented Apr 17, 2025

  • Add terraform
  • Add SSH

Comment thread pypeline/benchmark.py
Comment thread pypeline/benchmark.py Outdated
Comment thread pypeline/resource/terraform/terraform.py Outdated
Comment thread pypeline/resource/terraform/run_command.py Outdated
Comment thread pypeline/resource/terraform/run_command.py Outdated
Comment thread pypeline/cloud/azure.py Outdated
def get_region(self) -> str:
return self.region

def get_credential_type(self) -> str:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a property of the Cloud or a choice of the pipeline?

Copy link
Copy Markdown
Collaborator Author

@vittoriasalim vittoriasalim Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property of cloud, will move it to cloud class to remove getters as suggested

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant was, how you sign in is not a property of a Cloud class, it's a user's choice. You can pass it in as an option of login or other functions that use it.

Comment thread pypeline/pipeline.py Outdated
Comment thread pypeline/pipeline.py Outdated
script: str
env: Optional[dict[str, str]] = None
condition: Optional[str] = field(metadata={"yaml": "condition"}, default=None)
retryCountOnTaskFailure: Optional[int] = field(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python variable should use snake case.

"INPUT_VARIABLES": input_variables,
"DEBUG": "$(System.Debug)",
},
condition="ne(variables['SKIP_RESOURCE_MANAGEMENT'], 'true')",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove the env variable and make it an input to a resource? We should eliminate all env variables. It's really hard to track env var usage, and it cut though call stack without any trace.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, Noted

Comment thread pypeline/resource/terraform/input_variables.py Outdated
Comment thread pypeline/resource/terraform/input_variables.py Outdated
from textwrap import dedent
from pipeline import Script

set_input_variables_aws = lambda cloud, regions, input_variables={}: Script(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to write aws, azure and gcp terraforms in 3 different classes?

Copy link
Copy Markdown
Collaborator Author

@vittoriasalim vittoriasalim Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I checked Terraform, the only place where we currently need separate commands for AWS, Azure, and GCP is in the set_input_variables function.

I suppose for now, it’s best to keep them within Terraform for maintainability and to avoid high coupling—until it’s extended further. What do you think?

image

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, let's how they are used in an actual resource to decide.

regional_config_str=$(echo $regional_config | jq -c .)
echo "Final regional config: $regional_config_str"
echo "##vso[task.setvariable variable=TERRAFORM_REGIONAL_CONFIG]$regional_config_str"
"""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you align the """?

Comment thread pypeline/resource/terraform/input_variables.py Outdated
Comment thread pypeline/resource/terraform/terraform.py Outdated
@vittoriasalim
Copy link
Copy Markdown
Collaborator Author

vittoriasalim commented May 6, 2025

@wonderyl

Addition :

  • Added SSH resource
  • Set up a resource factory to avoid reinitializing shared properties for every resource.
  • Moved all components to components.py to avoid circular imports and to keep all components in one place.

Corrections:

  • Removed environment variables and made them inputs to resources.
  • Refactored Terraform into its own class, refactor run_command and input_variables for readability
  • Ensured proper typing and converted lambda expressions into functions.
  • Used an Enum for credential types.
  • Moved logic outside the script for better readability.
  • Moved properties into data fields within the cloud class to avoid exposing getters and setters.

@vittoriasalim vittoriasalim changed the title Add terraform resources to pypeline Add terraform and SSH to pypeline May 6, 2025
Comment thread .gitignore
darwin-amd64/
linux-amd64/
# Exception: Do not ignore terraform.py
!terraform.py
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to write line 48 more specific and avoid this exception.
E.g.

.terraform/

This will not ignore terraform.py

Comment thread pypeline/components.py

@dataclass
class Resource(ABC):
cloud: str
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A resource can be a python binary, that's not related to Cloud at all.

Comment thread pypeline/components.py
@dataclass
class Resource(ABC):
cloud: str
regions: list[str]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment thread pypeline/components.py

from pipeline import Step

# Keep components here to avoid circular imports
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more about this?

Comment thread pypeline/components.py
# Factory class to create resources:
# To avoid repeatedly define shared properties for every resource. usage: @ job_scheduling.py
@dataclass
class ResourceFactory:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a factory pattern. https://en.wikipedia.org/wiki/Factory_method_pattern
As mentioned before, not all resource is associate with a cloud or a region.


def set_run_id(run_id: str) -> Script:
if not run_id:
run_id = "$(Build.BuildId)-$(System.JobId)"
Copy link
Copy Markdown
Collaborator

@wonderyl wonderyl May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$(Build.BuildId) and $(System.JobId) are azure pipeline's system env vars. We should not extract them out. In addition, it doesn't work, we don't define $(Build.BuildId) in Python.

@dataclass
class Setup(Resource):
run_id: str
engine: str
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is engine used?

run_id: str
engine: str
test_module_dir: str = ""
engine_input: dict = field(default_factory=dict)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is engine_input used?

script_modules_directory = f"$(Pipeline.Workspace)/s/{test_modules_dir}"

return Script(
display_name="Set Script Module Directory",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to get $(Pipeline.Workspace) from the Azure pipeline env to get the full path of test_modules_directory. Then it sets a variable TEST_MODULES_DIR to record this value, which will be used in later steps. I don't think we can remove it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to distinguish 2 types of variables.
1 is used set in the env variable. e.g. env={"TEST_MODULES_DIR": test_modules_dir}, we set this, so we can eliminate them.
2 is ##vso[task.setvariable variable=TEST_MODULES_DIR] these are azure pipeline variables, we can't remove them.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to record this value, which will be used in later steps. I don't think we can remove it.

Yes you are right!

def validate(self) -> list[Step]:
return [validate_owner_info()]

def execute_tests(self) -> list[Step]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this method used?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For validate_owner_info() function is called in the Layout during the resource validation step.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commend is on line 111. I was asking for execute_tests

Copy link
Copy Markdown
Collaborator Author

@vittoriasalim vittoriasalim May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it should have been removed. I was testing smthg and didnt remove it.

Comment thread pypeline/cloud/azure.py
region: str = "eastus"
cloud: str = "azure"
regions: list[str] = field(default_factory=lambda: ["eastus2"])
subscription: str = os.getenv("AZURE_SUBSCRIPTION_ID")
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked some variables like AZURE_SUBSCRIPTION_ID, SERVICE_CONNECTION, etc., they appear to be set during the pipeline run—possibly for security reasons.
As for cloud, regions, and variables provided by user, we should be able to remove them in almost all scenarios.

Comment thread pypeline/cloud/azure.py
subscription: str = os.getenv("AZURE_SUBSCRIPTION_ID")
credential_type: CredentialType = CredentialType.SERVICE_CONNECTION
azure_service_connection: str = os.getenv("AZURE_SERVICE_CONNECTION")
azure_mi_client_id: str = os.getenv("AZURE_MI_CLIENT_ID")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we get the AZURE_MI_CLIENT_ID from the env we run Pypeline script. Obviously, it doesn't have it. I'm not 100% sure where is AZURE_MI_CLIENT_ID set, but I suspect it's from Azure pipeline, so we need to get the env var in the yaml file



@dataclass
class Terraform:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it inherit from the Resource class?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually Terraform is not a resource, it's a tool. a AKS cluster is a resource, a resource group is a resource.

Comment thread pypeline/benchmark.py
steps=self.setup.setup()
+ self.cloud.login()
+ [step for r in self.resources for step in r.setup()]
+ self.terraform.setup()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should model Terraform as a resource and add it to self.resources.

Comment thread pypeline/benchmark.py
+ self.terraform.setup()
+ [
self.terraform.create_resource_group(),
self.terraform.run_command(command="version"),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to have version, init, apply as separate functions or at least make command an enum, so that it doesn't take arbitrary string.

Comment thread pypeline/benchmark.py
regions=self.cloud.regions,
credential_type=self.cloud.credential_type,
)
if self.setup is None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setup was modeled as a resource, it should not be transparent to Layout. Same goes Terraform.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants