OPCT Test Failures on GCP #137
-
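Hi @sjswerdlow, thanks for submitting the question!
I will help you navigate through the results before showing the error.
When you run opct report --save-to ./results opct-archive.tar.gz, it suggests, through its check rules, what could be failing.
Checking the archive you've uploaded, I can see the following failures:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Validation checks / Results │
│ 🚨 🚨 IMMEDIATE ACTION: 9 Check(s) failed. Review it individually, fix and collect new results 🚨 🚨 │
├───────────┬────┬────────┬────────────────────────────────────────────────────────────────────────────────────────┬──────────────────────────────┬─────────────────┤
│ ID │ # │ RESULT │ CHECK NAME │ TARGET │ CURRENT │
├───────────┼────┼────────┼────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────┼─────────────────┤
│ OPCT-001 │ ❌ │ fail │ Kubernetes Conformance [10-openshift-kube-conformance] must pass 100% │ Priority==0|Total!=Failed │ Total==Failed │
│ OPCT-004 │ ❌ │ fail │ OpenShift Conformance [20-openshift-conformance-validated]: Pass ratio must be >=98.5% │ Pass>=98.5%(Fail>1.5%) │ Total==Failed │
│ OPCT-005 │ ❌ │ fail │ OpenShift Conformance Validation [20]: Filter Priority Requirement >= 99.5% │ W<=0.50%,F>0.50% │ Total==Failed │
│ OPCT-005B │ ❌ │ fail │ OpenShift Conformance Validation [20]: Required to Pass After Filtering │ Pass==100%(W<=0.50%,F>0.50%) │ Total==Failed │
│ OPCT-011 │ ❌ │ fail │ The test suite should generate fewer error reports in the logs │ Pass<=150(W>150,F>300) │ ERR !total │
│ OPCT-022 │ ❌ │ fail │ Detected one or more plugin(s) with potential invalid result │ passed │ Failed[10 20] │
│ OPCT-023A │ ❌ │ fail │ Sanity [10-openshift-kube-conformance]: potential missing tests in suite │ F:<300 │ Total==1 │
│ OPCT-023B │ ❌ │ fail │ Sanity [20-openshift-conformance-validated]: potential missing tests in suite │ F:<3000 │ Total==1 │
│ -- │ ❌ │ fail │ Platform Type must be supported by OPCT │ None|External|AWS|Azure │ GCP │
These two failures caught my attention and are causing the others:
│ OPCT-023A │ ❌ │ fail │ Sanity [10-openshift-kube-conformance]: potential missing tests in suite │ F:<300 │ Total==1 │
│ OPCT-023B │ ❌ │ fail │ Sanity [20-openshift-conformance-validated]: potential missing tests in suite │ F:<3000 │ Total==1 │
If you check the step summary of each conformance plugin, you can see that no tests were scheduled correctly:
┌───────────────────────────────────────────┐
│ 10-openshift-kube-conformance: ❌ │
├───────────────────────────┬───────────────┤
│ Total tests │ 1 │
│ Passed │ 0 │
│ Failed │ 1 │
│ Timeout │ 0 │
│ Skipped │ 0 │
│ Filter Failed Suite │ 0 (0.00%) │
│ Filter Failed KF │ 0 (0.00%) │
│ Filter Replay │ 0 (0.00%) │
│ Filter Failed Baseline │ 0 (0.00%) │
│ Filter Failed Priority │ 0 (0.00%) │
│ Filter Failed API │ 0 (0.00%) │
│ Failures (Priotity) │ 0 (0.00%) │
│ Result - Job │ failed │
│ Result - Processed │ failed │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 20-openshift-conformance-validated: ❌ │
├───────────────────────────┬───────────────┤
│ Total tests │ 1 │
│ Passed │ 0 │
│ Failed │ 1 │
│ Timeout │ 0 │
│ Skipped │ 0 │
│ Filter Failed Suite │ 0 (0.00%) │
│ Filter Failed KF │ 0 (0.00%) │
│ Filter Replay │ 0 (0.00%) │
│ Filter Failed Baseline │ 0 (0.00%) │
│ Filter Failed Priority │ 0 (0.00%) │
│ Filter Failed API │ 0 (0.00%) │
│ Failures (Priotity) │ 0 (0.00%) │
│ Result - Job │ failed │
│ Result - Processed │ failed │
└───────────────────────────┴───────────────┘
When checking the archive file, you can explore the logs of each pod/container under the directory podlogs/opct/<plugin/step_name>-job-<plugin_id>/logs/*.txt:
$ ls podlogs/opct/sonobuoy-10-openshift-kube-conformance-job-881fbc53eaf14ae3/logs/
plugin.txt sonobuoy-worker.txt tests.txt
As shared in this diagram (https://redhat-openshift-ecosystem.github.io/opct/diagrams/opct-sequence/), OPCT is backed by Sonobuoy to orchestrate the test environment, and by the openshift-tests utility to run the OpenShift e2e conformance tests.
The container responsible for executing the e2e tests is tests, and its log, tests.txt, reports the root cause:
[...]
/usr/bin/openshift-tests run kubernetes/conformance \
--junit-dir="/tmp/shared/junit" \
--max-parallel-tests="0" \
--monitor="etcd-log-analyzer" \
| tee -a /tmp/shared/fifo || true
[...]
E0110 19:15:00.880457 80 test_context.go:592] Failed to setup provider config for "gce": Error building GCE/GKE provider: timed out waiting for the condition
And confirming that the platform type you are using is GCP:
│ OPCT Summary │
│ > Archive: 20240122-opct_results.tar.gz │
├─────────────────────────────────────┬───────────────────────────────────────────────────────────────┤
│ │ PROVIDER │
├─────────────────────────────────────┼───────────────────────────────────────────────────────────────┤
│ Infrastructure: │ │
│ PlatformType │ GCP │
│ Name │ opct-test-tbmgu-kbgh4 │
This confirms what we've been discussing by email: opct is not testing* GCP, and the pre-configuration to set up the GCP SDK may be required by openshift-tests.
*not tested by the OPCT project. GCP is supported by openshift-tests, but this project, opct, is not testing it (that is, setting up the credentials for the backend openshift-tests) for the GCP cloud provider.
Your PR (redhat-openshift-ecosystem/provider-certification-plugins#71) may provide that additional setup to the conformance step. I left some comments, please take a look.
Let me know if you have any additional questions.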
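For anyone who wants to repeat the log check described above on their own archive, here is a minimal Python sketch. It assumes the archive has already been extracted to a local directory and that the layout matches the podlogs/opct/<...>/logs/tests.txt path shown in this comment. The function name find_provider_errors and the regular expression are illustrative only; they are not part of opct.

```python
import re
import tempfile
from pathlib import Path

# Pattern modeled on the tests.txt line quoted in this thread (an assumption:
# other provider failures may be worded differently and need their own pattern).
PROVIDER_ERROR = re.compile(
    r'Failed to setup provider config for "(?P<provider>[^"]+)": (?P<reason>.*)'
)


def find_provider_errors(results_dir):
    """Return (log path, provider, reason) for each matching line in any tests.txt."""
    root = Path(results_dir)
    hits = []
    for log in sorted(root.glob("podlogs/opct/*/logs/tests.txt")):
        for line in log.read_text(errors="replace").splitlines():
            match = PROVIDER_ERROR.search(line)
            if match:
                hits.append(
                    (
                        str(log.relative_to(root)),
                        match.group("provider"),
                        match.group("reason").strip(),
                    )
                )
    return hits


if __name__ == "__main__":
    # Demo on a throwaway directory that mimics the layout from this thread.
    with tempfile.TemporaryDirectory() as tmp:
        logs = (
            Path(tmp)
            / "podlogs/opct/sonobuoy-10-openshift-kube-conformance-job-881fbc53eaf14ae3/logs"
        )
        logs.mkdir(parents=True)
        (logs / "tests.txt").write_text(
            'E0110 19:15:00.880457 80 test_context.go:592] Failed to setup provider '
            'config for "gce": Error building GCE/GKE provider: timed out waiting '
            "for the condition\n"
        )
        for path, provider, reason in find_provider_errors(tmp):
            print(f"{path}: provider={provider}: {reason}")
```

An empty result only means this particular error line was not found; it does not show that the plugin ran correctly.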
-
+CC: Pavan
Marco,
To be clear, you are saying that we simply didn't have the setup completed,
because
https://github.com/redhat-openshift-ecosystem/provider-certification-plugins/pull/71/files
isn't in yet? That's a little bit weird to me as Pavan had this working
before. And you are making it sound like that's impossible.
Pavan,
Did you have to hack together something in the base code to get this to
work? I know it was a long while ago, but does this refresh your memory?
Sam,
-
Looks like there are two issues, [1] and [2], where [2] is an error fetching short-lived credentials from the compute metadata (link-local service).
My impression is that [1] is causing [2]: if explicit credentials weren't set, the SDK may fall back to the instance credentials, which are blocked[3] by the overlay network. So when the credentials are set explicitly, the program, which runs on top of the overlay network (no host network), can access them without raising those issues, which are what makes the plugin stop the execution.
I am not familiar with GCP, especially the SDK, so I am curious to see your findings. :)
The PR[4] has been merged, so why not try it out to check whether you can progress?
opct run --watch --plugins-image=quay.io/opct/plugin-openshift-tests:latest
[4] redhat-openshift-ecosystem/provider-certification-plugins#71
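To make the fallback described above concrete, here is a small Python sketch of the lookup order that Google's Application Default Credentials follow: the GOOGLE_APPLICATION_CREDENTIALS environment variable first, then the gcloud user credentials file, then the instance metadata server. It is a model only and makes no SDK or network call. Whether the GCE provider inside openshift-tests resolves credentials exactly this way is an assumption, and the function name credential_source and the example key path are hypothetical.

```python
import os

# Link-local address of the GCP instance metadata server.
METADATA_HOST = "169.254.169.254"


def credential_source(env, gcloud_adc_file_exists=False):
    """Model the Application Default Credentials lookup order (no network, no SDK)."""
    key_file = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_file:
        # Explicit credentials: nothing has to reach the link-local service.
        return "explicit key file: " + key_file
    if gcloud_adc_file_exists:
        return "gcloud user credentials file"
    # Last resort: the call that a pod on the overlay network may not be able to make.
    return "instance metadata server at " + METADATA_HOST


if __name__ == "__main__":
    print(credential_source({}))
    # Hypothetical path, for illustration only.
    print(credential_source({"GOOGLE_APPLICATION_CREDENTIALS": "/tmp/key.json"}))
    print(credential_source(dict(os.environ)))
```

Under this model, a pod without explicit credentials ends up at the metadata server, which matches the impression that the missing explicit setup is what triggers the credentials error.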
-
Marco,
Please see my latest test run:
https://drive.google.com/file/d/1WnD4PQqE0Ry2Z9gwz5zGond1ZuY2ya_0/view?usp=sharing&resourcekey=0-TPtiQ7qaresQklwIDP35Ow
I don't see the error pertaining to GCP, so I believe I ran it correctly, but the core issue is not resolved.
-
We have written tests that bring up an OpenShift cluster and run an OPCT test. We are currently seeing a number of failures; please see the attached opct_results.tar.gz. My understanding is that this archive is the only thing needed. Please let me know if I'm mistaken.
Drive Link:
https://drive.google.com/file/d/1xSF-GCAyky6JRlRpR3YNoUAd50xcghYh/view?usp=sharing&resourcekey=0-kPpjUM1y_DPY8079AZBYzw
By company policy I can only share this with specific emails so please request access and I should be able to grant it.
If you need more information I'd be happy to grab it for you.