SNOW-3230493: Explicitly query CURRENT_ACCOUNT in default_artifact_repository retrieval path (#4119)
Conversation
src/snowflake/snowpark/session.py
```python
account = self._conn._get_current_parameter("account", quoted=False)
database = self._conn._get_current_parameter("database", quoted=False)
schema = self._conn._get_current_parameter("schema", quoted=False)
```
Would the change break cases where we need quotes for the identifier (case-sensitive)?
Can we test identifiers with and without double quotes that contain special characters, as long as they are allowed by Snowflake?
you can follow up with @sfc-gh-bkogan on it
```python
account = quote_name_without_upper_casing(
    self._conn._get_string_datum("SELECT CURRENT_ACCOUNT()")
)
```
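A minimal sketch of the quoting step in the fix. The helper below is an illustration, not Snowpark's actual `quote_name_without_upper_casing`; it assumes the standard SQL convention of wrapping a name in double quotes (doubling any embedded quotes) without folding its case. `CURRENT_ACCOUNT()` returns the canonical uppercase account locator, so quoting that value preserves the correct case.

```python
def quote_without_upper_casing(name: str) -> str:
    """Illustrative double-quoting that leaves the name's case untouched."""
    # Escape embedded double quotes per SQL rules, then wrap.
    return '"' + name.replace('"', '""') + '"'

# CURRENT_ACCOUNT() yields an uppercase locator, so the quoted form matches.
print(quote_without_upper_casing("SFCTEST0"))  # "SFCTEST0"
```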
What's the latency of this? Since we are doing it before the cache check, it's incurred on every call. Can we optimize this so that, if both a database and a schema already exist, we skip calling it (since we won't need it)?
Why don't current_database and current_schema need the same change? Aren't they subject to the same issue as account?
If we are issuing a query anyway, we can get all the info in one round-trip: `SELECT CURRENT_ACCOUNT(), CURRENT_DATABASE(), CURRENT_SCHEMA()`
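The reviewer's single-round-trip suggestion could be sketched as follows. This is an illustration, not the code in the PR; the cursor API (`execute`/`fetchone`) mirrors the Snowflake Python connector, and the function name is hypothetical.

```python
def fetch_current_context(cursor):
    """Fetch account, database, and schema in one round-trip
    instead of three separate parameter lookups (sketch)."""
    row = cursor.execute(
        "SELECT CURRENT_ACCOUNT(), CURRENT_DATABASE(), CURRENT_SCHEMA()"
    ).fetchone()
    account, database, schema = row
    return account, database, schema
```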
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-3230493
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
A customer encountered this issue when creating a sproc: a query like `SYSTEM$GET_DEFAULT_PYTHON_ARTIFACT_REPOSITORY('3.13', 'account', '"sfctest0"')` was raising an error because the account name was double-quoted and lowercased. The resulting exception was caught, and the code fell back to the default conda repository path. Copying my analysis from the parent ticket:

If the user explicitly passes `artifact_repository="snowflake.snowpark.pypi_shared_repository"` to the sproc function, then `SYSTEM$GET_DEFAULT_PYTHON_ARTIFACT_REPOSITORY` should never get called.

Setting an active database or schema on the Snowpark session should be a viable workaround: these also get double-quoted by the Snowpark Python client, but the system function accepts them properly. This is because, when the `artifact_repository` parameter is not explicitly passed to sproc, the client checks the session's active schema for a schema-level default artifact repository. If there is no active schema, it falls back to the active database; if there is no active database, it falls back to the account (as seems to be the case here).

In addition to the double-quoting issue mentioned in the ticket, for some reason the Snowpark Python client quotes a lowercase version of the account name, while schema and database names are uppercased before they are quoted. Passing an uppercase double-quoted value (whether it's an account locator, schema, or database name) is fine, but making it lowercase causes the observed issue. Removing the double quotes would make the identifiers case-insensitive, which was the fix I initially had in progress.

The issue occurs because Snowpark reads the connector's underlying `_account` property, which is set from the user's configuration file and is not updated by queries (unlike database/schema). This field is treated as case-insensitive, but when Snowpark reads it in this path, quoting it makes it case-sensitive and breaks the expected behavior. To remedy this, we now issue an explicit `SELECT CURRENT_ACCOUNT()` as a workaround. Note that un-quoting the identifier is not an appropriate fix, because account names can contain hyphens, which are not allowed in unquoted identifiers and will raise a syntax error.
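The quoting semantics behind the bug can be shown with a minimal sketch. The resolver below is not Snowflake's implementation, just an illustration of the documented identifier rules: unquoted identifiers are case-insensitive (resolved as uppercase), while quoted identifiers are matched verbatim.

```python
def resolve_identifier(ident: str) -> str:
    """Toy model of Snowflake identifier resolution (illustrative only)."""
    if ident.startswith('"') and ident.endswith('"'):
        return ident[1:-1]   # quoted: case-sensitive, used exactly as written
    return ident.upper()     # unquoted: case-insensitive, folded to uppercase

# The connector's _account field holds a lowercase name; quoting it
# freezes the lowercase form, which no longer matches the real account.
print(resolve_identifier('"sfctest0"'))  # sfctest0  (broken: wrong case)
print(resolve_identifier('"SFCTEST0"'))  # SFCTEST0  (matches)
print(resolve_identifier("sfctest0"))    # SFCTEST0  (unquoted folds to upper,
                                         #  but hyphenated accounts can't be unquoted)
```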
Account identifier rules: https://docs.snowflake.com/en/user-guide/admin-account-identifier
Unquoted object identifier rules: https://docs.snowflake.com/en/sql-reference/identifiers-syntax#label-identifier-casing