Python models cannot write to Data Warehouse with Service Principal auth (Fabric limitation)

@sdebruyn 
# Python models fail with Service Principal authentication (two issues)

## Environment

- **dbt-core**: 1.11.2
- **dbt-fabric**: 1.10.1
- **Authentication**: `ActiveDirectoryServicePrincipal`
- **Profile config**: using `workspace_id` + `lakehouse_id` (no `host`)
- **OS**: Windows 11

## Summary

When running Python models with Service Principal (SP) authentication, two separate issues prevent successful execution. SQL models work perfectly with the same SP configuration.

---

## Issue 1: `GET /sessions` returns HTTP 500 with SP token

### Description

`LivySession._get_existing_session()` calls `GET /livyapi/versions/2023-12-01/sessions` to look for a reusable idle session. This endpoint returns **500 Internal Server Error** when authenticated with a Service Principal token, even though the SP has all required Fabric API permissions and workspace roles.

The subsequent `POST /sessions` call works fine (returns 202) with the exact same token.

### Steps to reproduce

```python
from azure.identity import ClientSecretCredential
import requests

credential = ClientSecretCredential(tenant_id="...", client_id="...", client_secret="...")
token = credential.get_token("https://analysis.windows.net/powerbi/api/.default")
headers = {"Authorization": f"Bearer {token.token}", "Accept": "application/json"}
base = "https://api.fabric.microsoft.com/v1/workspaces//lakehouses//livyapi/versions/2023-12-01"

# This fails with 500
r1 = requests.get(base + "/sessions", headers=headers)
print("GET:", r1.status_code)  # 500

# This works fine
r2 = requests.post(base + "/sessions", headers=headers, json={"name": "dbt-fabric"})
print("POST:", r2.status_code)  # 202
```

### Workaround

Wrap the GET call in `_get_existing_session()` in a try/except so the adapter falls through to creating a new session:

```python
def _get_existing_session(self) -> str | None:
    try:
        response = requests.get(self._get_base_url() + "/sessions", headers=self._get_headers())
        response.raise_for_status()
        for session in response.json().get("items", []):
            if session["name"] == "dbt-fabric" and session["livyState"] in (
                "idle",
                "starting",
                "running",
            ):
                return session["id"]
    except Exception:
        pass
    return None
```

The only downside is that existing idle sessions won't be reused — a new session is created every time. This is acceptable for most dbt workflows.

### Root cause

This appears to be a Fabric API bug — `GET /sessions` doesn't work with SP tokens while `POST /sessions` does. Worth reporting to Microsoft, but the adapter should handle this gracefully regardless.

---

## Issue 2: `synapsesql` Spark connector does not support Service Principal authentication

### Description

After applying the workaround above, the Livy session is created successfully and the Spark job starts. However, the job itself fails with `InternalServerError` because the generated Spark code uses `spark.read.synapsesql()` and `df.write.synapsesql()` to read from and write to the Fabric Data Warehouse.

**Microsoft explicitly documents that the `synapsesql` connector only supports interactive user authentication — Service Principal is not supported:**

> *"The Spark connector for Fabric Data Warehouse only supports interactive Microsoft Entra user authentication. Service principal authentication isn't supported."*
>
> — [Microsoft Learn: Spark connector for Microsoft Fabric Data Warehouse](https://learn.microsoft.com/en-us/fabric/data-engineering/spark-data-warehouse-connector)

### Generated Spark code (from `target/run/`)

```python
import com.microsoft.spark.fabric
from com.microsoft.spark.fabric.Constants import Constants
spark.conf.set("spark.datawarehouse.DW_dbt_club_silver_dev.sqlendpoint", "...,1433")
dbt = dbtObj(spark.read.synapsesql)
df = model(dbt, spark)
df.write.mode("overwrite").synapsesql("DW_dbt_club_silver_dev.mikt.int_accounts_fixed__dbt_tmp")
```

The Spark job starts but fails because `synapsesql` cannot authenticate with the SP identity of the Livy session.

### Error

The Spark monitoring page shows:

```
Status: Failed
Error Code: InternalServerError
Error Message: An internal server error occurred while processing the request.
```

And dbt reports:

```
('42000', "[42000] No item by the name of 'mikt.int_accounts_fixed__dbt_tmp'
could be found in the current database 'DW_dbt_club_silver_dev'")
```

This is because the tmp table was never created — the Spark write failed silently.

### Possible solution

Microsoft provides an alternative Spark connector (`mssql`) that **maybe does** support Service Principal authentication via an explicit access token.

Reference: [Microsoft Learn: Spark connector for SQL databases](https://learn.microsoft.com/en-us/fabric/data-engineering/spark-sql-connector)

The adapter could detect SP authentication and generate Spark code using the `mssql` connector with an explicit access token instead of `synapsesql`. This would require passing the SP credentials into the Spark session (e.g., via `spark.conf.set` or environment variables) so the Spark code can acquire its own token.

---

## What next?

By any chance, have you come across these issues? Do you know any workaround?

Thank you for the great adapter, Sam! Happy to help test any fixes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Python models cannot write to Data Warehouse with Service Principal auth (Fabric limitation) #28

Python models fail with Service Principal authentication (two issues)

Environment

Summary

Issue 1: `GET /sessions` returns HTTP 500 with SP token

Description

Steps to reproduce

Workaround

Root cause

Issue 2: `synapsesql` Spark connector does not support Service Principal authentication

Description

Generated Spark code (from `target/run/`)

Error

Possible solution

What next?

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Python models cannot write to Data Warehouse with Service Principal auth (Fabric limitation) #28

Description

Python models fail with Service Principal authentication (two issues)

Environment

Summary

Issue 1: GET /sessions returns HTTP 500 with SP token

Description

Steps to reproduce

Workaround

Root cause

Issue 2: synapsesql Spark connector does not support Service Principal authentication

Description

Generated Spark code (from target/run/)

Error

Possible solution

What next?

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Issue 1: `GET /sessions` returns HTTP 500 with SP token

Issue 2: `synapsesql` Spark connector does not support Service Principal authentication

Generated Spark code (from `target/run/`)