@sdebruyn
Python models fail with Service Principal authentication (two issues)
Environment
- dbt-core: 1.11.2
- dbt-fabric: 1.10.1
- Authentication:
ActiveDirectoryServicePrincipal
- Profile config: using
workspace_id + lakehouse_id (no host)
- OS: Windows 11
Summary
When running Python models with Service Principal (SP) authentication, two separate issues prevent successful execution. SQL models work perfectly with the same SP configuration.
Issue 1: GET /sessions returns HTTP 500 with SP token
Description
LivySession._get_existing_session() calls GET /livyapi/versions/2023-12-01/sessions to look for a reusable idle session. This endpoint returns 500 Internal Server Error when authenticated with a Service Principal token, even though the SP has all required Fabric API permissions and workspace roles.
The subsequent POST /sessions call works fine (returns 202) with the exact same token.
Steps to reproduce
from azure.identity import ClientSecretCredential
import requests
credential = ClientSecretCredential(tenant_id="...", client_id="...", client_secret="...")
token = credential.get_token("https://analysis.windows.net/powerbi/api/.default")
headers = {"Authorization": f"Bearer {token.token}", "Accept": "application/json"}
base = "https://api.fabric.microsoft.com/v1/workspaces//lakehouses//livyapi/versions/2023-12-01"
# This fails with 500
r1 = requests.get(base + "/sessions", headers=headers)
print("GET:", r1.status_code) # 500
# This works fine
r2 = requests.post(base + "/sessions", headers=headers, json={"name": "dbt-fabric"})
print("POST:", r2.status_code) # 202
Workaround
Wrap the GET call in _get_existing_session() in a try/except so the adapter falls through to creating a new session:
def _get_existing_session(self) -> str | None:
try:
response = requests.get(self._get_base_url() + "/sessions", headers=self._get_headers())
response.raise_for_status()
for session in response.json().get("items", []):
if session["name"] == "dbt-fabric" and session["livyState"] in (
"idle",
"starting",
"running",
):
return session["id"]
except Exception:
pass
return None
The only downside is that existing idle sessions won't be reused — a new session is created every time. This is acceptable for most dbt workflows.
Root cause
This appears to be a Fabric API bug — GET /sessions doesn't work with SP tokens while POST /sessions does. Worth reporting to Microsoft, but the adapter should handle this gracefully regardless.
Issue 2: synapsesql Spark connector does not support Service Principal authentication
Description
After applying the workaround above, the Livy session is created successfully and the Spark job starts. However, the job itself fails with InternalServerError because the generated Spark code uses spark.read.synapsesql() and df.write.synapsesql() to read from and write to the Fabric Data Warehouse.
Microsoft explicitly documents that the synapsesql connector only supports interactive user authentication — Service Principal is not supported:
"The Spark connector for Fabric Data Warehouse only supports interactive Microsoft Entra user authentication. Service principal authentication isn't supported."
— Microsoft Learn: Spark connector for Microsoft Fabric Data Warehouse
Generated Spark code (from target/run/)
import com.microsoft.spark.fabric
from com.microsoft.spark.fabric.Constants import Constants
spark.conf.set("spark.datawarehouse.DW_dbt_club_silver_dev.sqlendpoint", "...,1433")
dbt = dbtObj(spark.read.synapsesql)
df = model(dbt, spark)
df.write.mode("overwrite").synapsesql("DW_dbt_club_silver_dev.mikt.int_accounts_fixed__dbt_tmp")
The Spark job starts but fails because synapsesql cannot authenticate with the SP identity of the Livy session.
Error
The Spark monitoring page shows:
Status: Failed
Error Code: InternalServerError
Error Message: An internal server error occurred while processing the request.
And dbt reports:
('42000', "[42000] No item by the name of 'mikt.int_accounts_fixed__dbt_tmp'
could be found in the current database 'DW_dbt_club_silver_dev'")
This is because the tmp table was never created — the Spark write failed silently.
Possible solution
Microsoft provides an alternative Spark connector (mssql) that maybe does support Service Principal authentication via an explicit access token.
Reference: Microsoft Learn: Spark connector for SQL databases
The adapter could detect SP authentication and generate Spark code using the mssql connector with an explicit access token instead of synapsesql. This would require passing the SP credentials into the Spark session (e.g., via spark.conf.set or environment variables) so the Spark code can acquire its own token.
What next?
By any chance, have you come across these issues? Do you know any workaround?
Thank you for the great adapter, Sam! Happy to help test any fixes.
@sdebruyn
Python models fail with Service Principal authentication (two issues)
Environment
ActiveDirectoryServicePrincipalworkspace_id+lakehouse_id(nohost)Summary
When running Python models with Service Principal (SP) authentication, two separate issues prevent successful execution. SQL models work perfectly with the same SP configuration.
Issue 1:
GET /sessionsreturns HTTP 500 with SP tokenDescription
LivySession._get_existing_session()callsGET /livyapi/versions/2023-12-01/sessionsto look for a reusable idle session. This endpoint returns 500 Internal Server Error when authenticated with a Service Principal token, even though the SP has all required Fabric API permissions and workspace roles.The subsequent
POST /sessionscall works fine (returns 202) with the exact same token.Steps to reproduce
Workaround
Wrap the GET call in
_get_existing_session()in a try/except so the adapter falls through to creating a new session:The only downside is that existing idle sessions won't be reused — a new session is created every time. This is acceptable for most dbt workflows.
Root cause
This appears to be a Fabric API bug —
GET /sessionsdoesn't work with SP tokens whilePOST /sessionsdoes. Worth reporting to Microsoft, but the adapter should handle this gracefully regardless.Issue 2:
synapsesqlSpark connector does not support Service Principal authenticationDescription
After applying the workaround above, the Livy session is created successfully and the Spark job starts. However, the job itself fails with
InternalServerErrorbecause the generated Spark code usesspark.read.synapsesql()anddf.write.synapsesql()to read from and write to the Fabric Data Warehouse.Microsoft explicitly documents that the
synapsesqlconnector only supports interactive user authentication — Service Principal is not supported:Generated Spark code (from
target/run/)The Spark job starts but fails because
synapsesqlcannot authenticate with the SP identity of the Livy session.Error
The Spark monitoring page shows:
And dbt reports:
This is because the tmp table was never created — the Spark write failed silently.
Possible solution
Microsoft provides an alternative Spark connector (
mssql) that maybe does support Service Principal authentication via an explicit access token.Reference: Microsoft Learn: Spark connector for SQL databases
The adapter could detect SP authentication and generate Spark code using the
mssqlconnector with an explicit access token instead ofsynapsesql. This would require passing the SP credentials into the Spark session (e.g., viaspark.conf.setor environment variables) so the Spark code can acquire its own token.What next?
By any chance, have you come across these issues? Do you know any workaround?
Thank you for the great adapter, Sam! Happy to help test any fixes.