fix: improper error status raise by akmalsoliev · Pull Request #661 · aws-samples/dbt-glue

akmalsoliev · 2026-02-13T08:29:44Z

'Status': 'error' is not being properly checked

Description

Fix case-sensitive comparison of error status in Glue statement output. The Status field from Statement.Output may return values with varying casing (e.g., 'error', 'Error', 'ERROR'). The previous check output.get('Status') == 'ERROR'` would miss non-uppercase, causing errors to silently pass as successful executions.

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt-glue next" section.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

'Status': 'error' is not being properly checked

yotahk · 2026-02-19T09:51:35Z

@akmalsoliev Thank you for the PR. Do you have a reproducible sample that can observe what's returned in Status field?

According to the doc, expected value for Status field are WAITING | RUNNING | AVAILABLE | CANCELLING | CANCELLED | ERROR.
https://docs.aws.amazon.com/ja_jp/glue/latest/webapi/API_StatementOutput.html
https://docs.aws.amazon.com/boto3/latest/reference/services/glue/client/get_statement.html

On the other hand, from my local experiment Status returns either ok or error. Looks like the documentation needs an update.

I want to understand what are the possible values here to make error handling explicit.

Sample code (tested with Glue 3.0, 4.0, 5.0 and 5.1):

pull-661-repro.py

# /// script
# dependencies = [
#   "boto3==1.28.16",
# ]
# ///
"""Repro script for dbt-glue PR #661: improper error status raise.

Exercises various statement states (RUNNING, AVAILABLE, ERROR, CANCELLED)
and prints state + output.Status on every poll iteration, mirroring the
logic in _wait_for_statement_completion from python_submissions.py.
"""
import time

import boto3

REGION = "ap-northeast-1"
SESSION_ID = "test-error-status-661"
ROLE_ARN = "arn:aws:iam::012345678901:role/AWSGlueServiceRoleDefault"


def cleanup_session(glue):
    try:
        glue.delete_session(Id=SESSION_ID)
        print("=== Cleaned up existing session ===")
        time.sleep(5)
    except glue.exceptions.EntityNotFoundException:
        pass


def wait_for_session_ready(glue):
    print("=== Waiting for session to be READY ===")
    while True:
        status = glue.get_session(Id=SESSION_ID)["Session"]["Status"]
        if status == "READY":
            break
        time.sleep(1)


def run_and_wait(glue, code: str, label: str):
    """Mirrors _wait_for_statement_completion from python_submissions.py."""
    print(f"\n=== Running {label} statement ===")
    stmt_id = glue.run_statement(SessionId=SESSION_ID, Code=code)["Id"]
    print(f"Statement ID: {stmt_id}")

    while True:
        response = glue.get_statement(SessionId=SESSION_ID, Id=stmt_id)
        state = response['Statement']['State']
        output = response['Statement'].get('Output', {})
        status = output.get('Status', '')
        print(f"  state={state}, output.Status={status}")

        if state in ('AVAILABLE', 'CANCELLED', 'ERROR'):
            print(f"  => State={state}, Output={output}")
            break
        time.sleep(1)


def run_cancel_test(glue):
    """Submit a long-running statement then cancel it to observe CANCELLED state."""
    print("\n=== Running cancel test ===")
    stmt_id = glue.run_statement(SessionId=SESSION_ID, Code="import time; time.sleep(300)")["Id"]
    print(f"Statement ID: {stmt_id}")
    time.sleep(2)
    print("Cancelling statement...")
    glue.cancel_statement(SessionId=SESSION_ID, Id=stmt_id)
    while True:
        response = glue.get_statement(SessionId=SESSION_ID, Id=stmt_id)
        state = response['Statement']['State']
        output = response['Statement'].get('Output', {})
        status = output.get('Status', '')
        print(f"  state={state}, output.Status={status}")
        if state in ('AVAILABLE', 'CANCELLED', 'ERROR'):
            print(f"  => State={state}, Output={output}")
            break
        time.sleep(1)


def main():
    glue = boto3.client("glue", region_name=REGION)

    cleanup_session(glue)

    print("=== Creating session ===")
    glue.create_session(
        Id=SESSION_ID,
        Role=ROLE_ARN,
        Command={"Name": "glueetl", "PythonVersion": "3"},
        GlueVersion="5.1",  # 3.0, 4.0, 5.0, 5.1
        WorkerType="G.1X",
        NumberOfWorkers=2,
        DefaultArguments={"--datalake-formats": "iceberg"},
    )

    wait_for_session_ready(glue)

    # 1) Heavy statement: observe RUNNING -> AVAILABLE transition
    heavy_code = """
import time
total = 0
for i in range(10_000_000):
    total += i
time.sleep(5)
print(f"result: {total}")
"""
    run_and_wait(glue, heavy_code, "heavy (RUNNING->AVAILABLE)")

    # 2) Syntax error: state stays AVAILABLE but output.Status is 'error'
    run_and_wait(glue, "def (invalid syntax here", "syntax-error")

    # 3) Cancel: observe CANCELLED state
    run_cancel_test(glue)

    # 4) Simple success: AVAILABLE with output.Status 'ok'
    run_and_wait(glue, "print('hello')", "simple-success")

    print("\n=== Deleting session ===")
    glue.delete_session(Id=SESSION_ID)
    print("Done")


if __name__ == "__main__":
    main()

$ pull-661-repro.py

=== Cleaned up existing session ===
=== Creating session ===
=== Waiting for session to be READY ===

=== Running heavy (RUNNING->AVAILABLE) statement ===
Statement ID: 0
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=AVAILABLE, output.Status=ok
  => State=AVAILABLE, Output={'Data': {'TextPlain': 'result: 49999995000000'}, 'ExecutionCount': 0, 'Status': 'ok'}

=== Running syntax-error statement ===
Statement ID: 1
  state=AVAILABLE, output.Status=error
  => State=AVAILABLE, Output={'Data': {'TextPlain': ''}, 'ExecutionCount': 1, 'Status': 'error', 'ErrorName': 'SyntaxError', 'ErrorValue': 'invalid syntax (<stdin>, line 1)', 'Traceback': ['  File "<stdin>", line 1\n', '    def (invalid syntax here\n', '        ^\n', 'SyntaxError: invalid syntax\n']}

=== Running cancel test ===
Statement ID: 2
Cancelling statement...
  state=CANCELLED, output.Status=error
  => State=CANCELLED, Output={'Data': {'TextPlain': ''}, 'ExecutionCount': 2, 'Status': 'error', 'ErrorName': 'StatementCancellationException', 'ErrorValue': '', 'Traceback': ['Traceback (most recent call last):\n', '  File "/tmp/6841545343434752041", line 53, in signal_handler\n    raise StatementCancellationException\n', 'StatementCancellationException\n']}

=== Running simple-success statement ===
Statement ID: 3
  state=AVAILABLE, output.Status=ok
  => State=AVAILABLE, Output={'Data': {'TextPlain': 'hello'}, 'ExecutionCount': 3, 'Status': 'ok'}

=== Deleting session ===
Done

akmalsoliev · 2026-02-19T10:06:02Z

@akmalsoliev Thank you for the PR. Do you have a reproducible sample that can observe what's returned in Status field?

According to the doc, expected value for Status field are WAITING | RUNNING | AVAILABLE | CANCELLING | CANCELLED | ERROR. https://docs.aws.amazon.com/ja_jp/glue/latest/webapi/API_StatementOutput.html https://docs.aws.amazon.com/boto3/latest/reference/services/glue/client/get_statement.html

On the other hand, from my local experiment Status returns either ok or error. Looks like the documentation needs an update.

I want to understand what are the possible values here to make error handling explicit.

Sample code (tested with Glue 3.0, 4.0, 5.0 and 5.1):

pull-661-repro.py

# /// script
# dependencies = [
#   "boto3==1.28.16",
# ]
# ///
"""Repro script for dbt-glue PR #661: improper error status raise.

Exercises various statement states (RUNNING, AVAILABLE, ERROR, CANCELLED)
and prints state + output.Status on every poll iteration, mirroring the
logic in _wait_for_statement_completion from python_submissions.py.
"""
import time

import boto3

REGION = "ap-northeast-1"
SESSION_ID = "test-error-status-661"
ROLE_ARN = "arn:aws:iam::012345678901:role/AWSGlueServiceRoleDefault"


def cleanup_session(glue):
    try:
        glue.delete_session(Id=SESSION_ID)
        print("=== Cleaned up existing session ===")
        time.sleep(5)
    except glue.exceptions.EntityNotFoundException:
        pass


def wait_for_session_ready(glue):
    print("=== Waiting for session to be READY ===")
    while True:
        status = glue.get_session(Id=SESSION_ID)["Session"]["Status"]
        if status == "READY":
            break
        time.sleep(1)


def run_and_wait(glue, code: str, label: str):
    """Mirrors _wait_for_statement_completion from python_submissions.py."""
    print(f"\n=== Running {label} statement ===")
    stmt_id = glue.run_statement(SessionId=SESSION_ID, Code=code)["Id"]
    print(f"Statement ID: {stmt_id}")

    while True:
        response = glue.get_statement(SessionId=SESSION_ID, Id=stmt_id)
        state = response['Statement']['State']
        output = response['Statement'].get('Output', {})
        status = output.get('Status', '')
        print(f"  state={state}, output.Status={status}")

        if state in ('AVAILABLE', 'CANCELLED', 'ERROR'):
            print(f"  => State={state}, Output={output}")
            break
        time.sleep(1)


def run_cancel_test(glue):
    """Submit a long-running statement then cancel it to observe CANCELLED state."""
    print("\n=== Running cancel test ===")
    stmt_id = glue.run_statement(SessionId=SESSION_ID, Code="import time; time.sleep(300)")["Id"]
    print(f"Statement ID: {stmt_id}")
    time.sleep(2)
    print("Cancelling statement...")
    glue.cancel_statement(SessionId=SESSION_ID, Id=stmt_id)
    while True:
        response = glue.get_statement(SessionId=SESSION_ID, Id=stmt_id)
        state = response['Statement']['State']
        output = response['Statement'].get('Output', {})
        status = output.get('Status', '')
        print(f"  state={state}, output.Status={status}")
        if state in ('AVAILABLE', 'CANCELLED', 'ERROR'):
            print(f"  => State={state}, Output={output}")
            break
        time.sleep(1)


def main():
    glue = boto3.client("glue", region_name=REGION)

    cleanup_session(glue)

    print("=== Creating session ===")
    glue.create_session(
        Id=SESSION_ID,
        Role=ROLE_ARN,
        Command={"Name": "glueetl", "PythonVersion": "3"},
        GlueVersion="5.1",  # 3.0, 4.0, 5.0, 5.1
        WorkerType="G.1X",
        NumberOfWorkers=2,
        DefaultArguments={"--datalake-formats": "iceberg"},
    )

    wait_for_session_ready(glue)

    # 1) Heavy statement: observe RUNNING -> AVAILABLE transition
    heavy_code = """
import time
total = 0
for i in range(10_000_000):
    total += i
time.sleep(5)
print(f"result: {total}")
"""
    run_and_wait(glue, heavy_code, "heavy (RUNNING->AVAILABLE)")

    # 2) Syntax error: state stays AVAILABLE but output.Status is 'error'
    run_and_wait(glue, "def (invalid syntax here", "syntax-error")

    # 3) Cancel: observe CANCELLED state
    run_cancel_test(glue)

    # 4) Simple success: AVAILABLE with output.Status 'ok'
    run_and_wait(glue, "print('hello')", "simple-success")

    print("\n=== Deleting session ===")
    glue.delete_session(Id=SESSION_ID)
    print("Done")


if __name__ == "__main__":
    main()

$ pull-661-repro.py

=== Cleaned up existing session ===
=== Creating session ===
=== Waiting for session to be READY ===

=== Running heavy (RUNNING->AVAILABLE) statement ===
Statement ID: 0
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=RUNNING, output.Status=
  state=AVAILABLE, output.Status=ok
  => State=AVAILABLE, Output={'Data': {'TextPlain': 'result: 49999995000000'}, 'ExecutionCount': 0, 'Status': 'ok'}

=== Running syntax-error statement ===
Statement ID: 1
  state=AVAILABLE, output.Status=error
  => State=AVAILABLE, Output={'Data': {'TextPlain': ''}, 'ExecutionCount': 1, 'Status': 'error', 'ErrorName': 'SyntaxError', 'ErrorValue': 'invalid syntax (<stdin>, line 1)', 'Traceback': ['  File "<stdin>", line 1\n', '    def (invalid syntax here\n', '        ^\n', 'SyntaxError: invalid syntax\n']}

=== Running cancel test ===
Statement ID: 2
Cancelling statement...
  state=CANCELLED, output.Status=error
  => State=CANCELLED, Output={'Data': {'TextPlain': ''}, 'ExecutionCount': 2, 'Status': 'error', 'ErrorName': 'StatementCancellationException', 'ErrorValue': '', 'Traceback': ['Traceback (most recent call last):\n', '  File "/tmp/6841545343434752041", line 53, in signal_handler\n    raise StatementCancellationException\n', 'StatementCancellationException\n']}

=== Running simple-success statement ===
Statement ID: 3
  state=AVAILABLE, output.Status=ok
  => State=AVAILABLE, Output={'Data': {'TextPlain': 'hello'}, 'ExecutionCount': 3, 'Status': 'ok'}

=== Deleting session ===
Done

Hey @yotahk, you can use your python examples, I've noticed that this error is only present with your python models, specifically in python_submissions.py. https://github.com/aws-samples/dbt-glue?tab=readme-ov-file#basic-usage

the error I'm getting is that model passed, when in reality it failed, the issue is the condition match, hence, just put everything .lower(), this I think will also future proof the issues.

yotahk · 2026-02-19T11:33:02Z

this error is only present with your python models, specifically in python_submissions.py

I couldn't get this point, would you be able to clarify? I want to understand what values you've actually observed in Status field. lower() can do the job but I want to avoid blindly matching by string manipulation if possible, since it expands the responsibility of the if-else statement.

My recommendation is to use explicit case matching based on observed patterns. For example:

if status = "ok":
   aaa
else if status = "error":
   bbb
else:
  raise XXXException

akmalsoliev · 2026-02-19T11:38:31Z

I did observe "error" and "ok", all in lower case. The reason why I wanted to lower is to future proof it, under the condition that a status code would be changed into OK or Ok.

yotahk · 2026-02-19T12:13:58Z

Thanks for clarifying, then let's just expect "error" and "ok" because we ain't gonna need it for most of the time.

If we want fail-safe on future api change, I think we should add something like this. I don't think we should do this now unless we have an actual data point.

if status = "ok":
   aaa
else if status = "error":
   bbb
else if status.lower() = 'error':  # fail-safe for future api response change
   logger.warn("unexpected value")
   ccc
else:
  raise XXXException

@sugichy @moomindani - let me know if you have any inputs.

yotahk · 2026-03-05T01:46:43Z

@akmalsoliev Would you be able to update the implementation with #661 (comment)? Please also update CHANGELOG.md. It might conflict with your other PRs depending on merge order but that can be resolved easily.

akmalsoliev · 2026-03-06T10:35:51Z

@akmalsoliev Would you be able to update the implementation with #661 (comment)? Please also update CHANGELOG.md. It might conflict with your other PRs depending on merge order but that can be resolved easily.

Hey Yota, added information to CHANGELOG.md as per request, will look into other PRs.

yotahk · 2026-03-07T04:45:26Z

@akmalsoliev Would you be able to update the implementation with #661 (comment)?

Thanks for the update. Would you also fix the if statements to explicitly handle 'ok' and 'error' only?

akmalsoliev · 2026-03-07T19:21:52Z

@akmalsoliev Would you be able to update the implementation with #661 (comment)?

Thanks for the update. Would you also fix the if statements to explicitly handle 'ok' and 'error' only?

Hey Yota,
implemented, also added DbtRuntimeError in case if unknown status code has been created.

akmalsoliev · 2026-03-13T12:49:42Z

Hey @yotahk,
What you think about these changes?

yotahk · 2026-03-13T13:34:33Z

@akmalsoliev My bad, forgot to submit after drafting. Suggesting to remove lower() - would you take a look?

akmalsoliev · 2026-03-13T13:38:06Z

@akmalsoliev My bad, forgot to submit after drafting. Suggesting to remove lower() - would you take a look?

The only issue I see is if the response has been changed, then this will raise an error. I believe that this minor change could disrupt a lot of pipelines, due to a small thing. What you think? Otherwise we can raise a soft warning, where the default is to pass.

yotahk · 2026-03-13T14:14:21Z

I understand the concern, but I'd prefer to keep explicit matching here. If there's any change in API response, there should be a notification from the service provider or at least update in the documentation.

If we start assuming future API change, we'll need to do the same thing for other part of this library - for example should we also do lower() for state? I don't think we should do that.

dbt-glue/dbt/adapters/glue/python_submissions.py

Line 69 in eb716ff

if state == 'AVAILABLE':

…raise

akmalsoliev · 2026-03-13T14:28:40Z

I understand the concern, but I'd prefer to keep explicit matching here. If there's any change in API response, there should be a notification from the service provider or at least update in the documentation.

If we start assuming future API change, we'll need to do the same thing for other part of this library - for example should we also do lower() for state? I don't think we should do that.

dbt-glue/dbt/adapters/glue/python_submissions.py

Line 69 in eb716ff

if state == 'AVAILABLE':

Makes sense, applied requested changes, commit history is a mess, you can just squash it.

yotahk · 2026-03-14T07:37:31Z

Merged. Thank you so much for your contribution and patience.

fix: improper error status raise

a251b60

'Status': 'error' is not being properly checked

github-actions Bot added the beginning-contributor label Feb 13, 2026

yotahk added the enable-functional-tests This label enable functional tests label Feb 17, 2026

yotahk mentioned this pull request Mar 5, 2026

feat: ability to set configurations in python model #663

Open

4 tasks

akmalsoliev added 2 commits March 6, 2026 11:29

Merge branch 'aws-samples:main' into fix_imporper_error_raise

61fafe3

chore: adding PR information to CHANGELOG.md

a8e0775

fix: the if statements to explicitly handle 'ok' and 'error' only

479bac2

yotahk requested changes Mar 13, 2026

View reviewed changes

Comment thread dbt/adapters/glue/python_submissions.py Outdated

Comment thread dbt/adapters/glue/python_submissions.py Outdated

Comment thread CHANGELOG.md Outdated

Comment thread CHANGELOG.md Outdated

akmalsoliev added 3 commits March 13, 2026 15:24

update: implementing requested changes

4139cb4

Merge remote-tracking branch 'upstream/main' into fix_imporper_error_…

f66ddcb

…raise

update: implementing CHANGELOG.md changes

0741e80

akmalsoliev force-pushed the fix_imporper_error_raise branch from 1578991 to 0741e80 Compare March 13, 2026 14:26

yotahk added enable-functional-tests This label enable functional tests and removed enable-functional-tests This label enable functional tests labels Mar 13, 2026

fix: matching new status code (OK -> ok)

db0e414

yotahk removed the enable-functional-tests This label enable functional tests label Mar 14, 2026

yotahk added the enable-functional-tests This label enable functional tests label Mar 14, 2026

yotahk approved these changes Mar 14, 2026

View reviewed changes

yotahk merged commit 877b195 into aws-samples:main Mar 14, 2026
26 checks passed

akmalsoliev deleted the fix_imporper_error_raise branch March 14, 2026 07:41

Conversation

akmalsoliev commented Feb 13, 2026 • edited by yotahk Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Uh oh!

yotahk commented Feb 19, 2026

Uh oh!

akmalsoliev commented Feb 19, 2026

Uh oh!

yotahk commented Feb 19, 2026

Uh oh!

akmalsoliev commented Feb 19, 2026

Uh oh!

yotahk commented Feb 19, 2026

Uh oh!

yotahk commented Mar 5, 2026

Uh oh!

akmalsoliev commented Mar 6, 2026

Uh oh!

yotahk commented Mar 7, 2026

Uh oh!

akmalsoliev commented Mar 7, 2026

Uh oh!

akmalsoliev commented Mar 13, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yotahk commented Mar 13, 2026

Uh oh!

akmalsoliev commented Mar 13, 2026

Uh oh!

yotahk commented Mar 13, 2026

Uh oh!

akmalsoliev commented Mar 13, 2026

Uh oh!

Uh oh!

yotahk commented Mar 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

akmalsoliev commented Feb 13, 2026 •

edited by yotahk

Loading