I was testing the flow described in TC242 to verify the reliability and stability of QSFS.
First, I started with a (16+4+4) setup:
locals {
  metas = ["meta1", "meta2", "meta3", "meta4"]
  datas = ["data1", "data2", "data3", "data4",
    "data5", "data6", "data7", "data8",
    "data9", "data10", "data11", "data12",
    "data13", "data14", "data15", "data16",
    "data17", "data18", "data19", "data20",
    "data21", "data22", "data23", "data24"]
}
minimal_shards = 16
expected_shards = 20
After deploying with this setup, I was able to SSH into the machine and write a 1 GB file. Then, as described in the test case, I changed the setup by removing 4 ZDBs.
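The write step can be sketched as below; the file name is a hypothetical example, and `/qsfs` is the mount point from the configuration at the end of this report. Keeping a checksum of the file makes it easy to verify it again after each resize:

```python
import hashlib
import os

def write_random_file(path, size_mib, chunk_mib=4):
    """Write size_mib MiB of random data to path; return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "wb") as f:
        for _ in range(size_mib // chunk_mib):
            chunk = os.urandom(chunk_mib * 1024 * 1024)
            digest.update(chunk)
            f.write(chunk)
    return digest.hexdigest()

# On the VM the QSFS is mounted at /qsfs, so the 1 GB write was roughly:
# checksum = write_random_file("/qsfs/test1g.bin", 1024)
```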
New setup (16+4):
locals {
  metas = ["meta1", "meta2", "meta3", "meta4"]
  datas = ["data1", "data2", "data3", "data4",
    "data5", "data6", "data7", "data8",
    "data9", "data10", "data11", "data12",
    "data13", "data14", "data15", "data16",
    "data17", "data18", "data19", "data20"]
}
After updating the deployment I was still able to SSH into the machine and access the old files. I then created a 300 MB file and changed the setup again by removing another 4 ZDBs.
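Checking that old files survive a resize can be sketched as follows. The idea is to record a digest right after writing a file and compare against it after each deployment update (the path here is a hypothetical example):

```python
import hashlib

def sha256_of(path, chunk_mib=4):
    """Stream a file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_mib * 1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

# After the update, sha256_of("/qsfs/test1g.bin") should equal the digest
# recorded before the ZDBs were removed.
```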
New setup (16):
locals {
  metas = ["meta1", "meta2", "meta3", "meta4"]
  datas = ["data1", "data2", "data3", "data4",
    "data5", "data6", "data7", "data8",
    "data9", "data10", "data11", "data12",
    "data13", "data14", "data15", "data16"]
}
But when I tried to update the deployment this time, I got the following error:
╷
│ Error: failed to deploy deployments: error waiting deployment: workload 1 failed within deployment 5211 with error failed to update qsfs mount: failed to restart zstor process: non-zero exit code: 1; failed to revert deployments: error waiting deployment: workload 0 failed within deployment 5211 with error failed to update qsfs mount: failed to restart zstor process: non-zero exit code: 1; try again
│
│ with grid_deployment.qsfs,
│ on main.tf line 51, in resource "grid_deployment" "qsfs":
│ 51: resource "grid_deployment" "qsfs" {
│
╵
terraform {
  required_providers {
    grid = {
      source = "threefoldtech/grid"
    }
  }
}

provider "grid" {
}

locals {
  metas = ["meta1", "meta2", "meta3", "meta4"]
  datas = ["data1", "data2", "data3", "data4",
    "data5", "data6", "data7", "data8",
    "data9", "data10", "data11", "data12",
    "data13", "data14", "data15", "data16",
    "data17", "data18", "data19", "data20",
    "data21", "data22", "data23", "data24"]
}

resource "grid_network" "net1" {
  nodes       = [7]
  ip_range    = "10.1.0.0/16"
  name        = "network"
  description = "newer network"
}

resource "grid_deployment" "d1" {
  node = 7
  dynamic "zdbs" {
    for_each = local.metas
    content {
      name        = zdbs.value
      description = "description"
      password    = "password"
      size        = 10
      mode        = "user"
    }
  }
  dynamic "zdbs" {
    for_each = local.datas
    content {
      name        = zdbs.value
      description = "description"
      password    = "password"
      size        = 10
      mode        = "seq"
    }
  }
}

resource "grid_deployment" "qsfs" {
  node         = 7
  network_name = grid_network.net1.name
  ip_range     = lookup(grid_network.net1.nodes_ip_range, 7, "")
  qsfs {
    name                  = "qsfs"
    description           = "description6"
    cache                 = 10240 # 10 GB
    minimal_shards        = 16
    expected_shards       = 20
    redundant_groups      = 0
    redundant_nodes       = 0
    max_zdb_data_dir_size = 512 # 512 MB
    encryption_algorithm  = "AES"
    encryption_key        = "4d778ba3216e4da4231540c92a55f06157cabba802f9b68fb0f78375d2e825af"
    compression_algorithm = "snappy"
    metadata {
      type                 = "zdb"
      prefix               = "hamada"
      encryption_algorithm = "AES"
      encryption_key       = "4d778ba3216e4da4231540c92a55f06157cabba802f9b68fb0f78375d2e825af"
      dynamic "backends" {
        for_each = [for zdb in grid_deployment.d1.zdbs : zdb if zdb.mode != "seq"]
        content {
          address   = format("[%s]:%d", backends.value.ips[1], backends.value.port)
          namespace = backends.value.namespace
          password  = backends.value.password
        }
      }
    }
    groups {
      dynamic "backends" {
        for_each = [for zdb in grid_deployment.d1.zdbs : zdb if zdb.mode == "seq"]
        content {
          address   = format("[%s]:%d", backends.value.ips[1], backends.value.port)
          namespace = backends.value.namespace
          password  = backends.value.password
        }
      }
    }
  }
  vms {
    name       = "vm"
    flist      = "https://hub.grid.tf/tf-official-apps/base:latest.flist"
    cpu        = 2
    memory     = 1024
    entrypoint = "/sbin/zinit init"
    planetary  = true
    env_vars = {
      SSH_KEY = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC533B35CELELtgg2d7Tsi5KelLxR0FYUlrcTmRRQuTNP9arP01JYD8iHKqh6naMbbzR8+M0gdPEeRK4oVqQtEcH1C47vLyRI/4DqahAE2nTW08wtJM5uiIvcQ9H2HMzZ3MXYWWlgyHMgW2QXQxzrRS0NXvsY+4wxe97MMZs9MDs+d+X15DfG6JffjMHydi+4tHB50WmHe5tFscBFxLbgDBUxNGiwi3BQc1nWIuYwMMV1GFwT3ndyLAp19KPkEa/dffiqLdzkgs2qpXtfBhTZ/lFeQRc60DHCMWExr9ySDbavIMuBFylf/ZQeJXm9dFXJN7bBTbflZIIuUMjmrI7cU5eSuZqAj5l+Yb1mLN8ljmKSIM3/tkKbzXNH5AUtRVKTn+aEPvJAEYtserAxAP5pjy6nmegn0UerEE3DWEV2kqDig3aPSNhi9WSCykvG2tz7DIr0UP6qEIWYMC/5OisnSGj8w8dAjyxS9B0Jlx7DEmqPDNBqp8UcwV75Cot8vtIac= root@mohamed-Inspiron-3576"
    }
    mounts {
      disk_name   = "qsfs"
      mount_point = "/qsfs"
    }
  }
}

output "metrics" {
  value = grid_deployment.qsfs.qsfs[0].metrics_endpoint
}

output "ygg_ip" {
  value = grid_deployment.qsfs.vms[0].ygg_ip
}