Use dolt JSON encoding #2639
| Function: func(ctx *sql.Context, val any, targetType *pgtypes.DoltgresType) (any, error) { | ||
| return targetType.IoInput(ctx, val.(string)) | ||
| switch v := val.(type) { | ||
| case string: |
Instead of being a string, we might also encounter a type that implements sql.Wrapper[string], such as TextStorage.
| switch v := val.(type) { | ||
| case sql.JSONWrapper: | ||
| return v.ToInterface(ctx) | ||
| case string: |
Do we need to handle sql.Wrapper[string] too?
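A minimal sketch of what handling both cases could look like. `stringWrapper` is a local stand-in for `sql.Wrapper[string]` (only an `Unwrap` method is modeled, and its signature is an assumption), and `lazyText` is a hypothetical type playing the role of `TextStorage`; neither is taken from the PR.

```go
package main

import (
	"context"
	"fmt"
)

// stringWrapper is a local stand-in for sql.Wrapper[string]. Only Unwrap is
// modeled here, and its signature is an assumption.
type stringWrapper interface {
	Unwrap(ctx context.Context) (string, error)
}

// lazyText is a hypothetical out-of-band string, playing the role of TextStorage.
type lazyText struct{ stored string }

func (t lazyText) Unwrap(context.Context) (string, error) { return t.stored, nil }

// asString normalizes a value that may be a plain string or a wrapped string,
// and reports an error for anything else instead of panicking on a failed
// type assertion.
func asString(ctx context.Context, val any) (string, error) {
	switch v := val.(type) {
	case string:
		return v, nil
	case stringWrapper:
		return v.Unwrap(ctx)
	default:
		return "", fmt.Errorf("expected a string-like value, got %T", val)
	}
}

func main() {
	ctx := context.Background()
	for _, val := range []any{`{"a": 1}`, lazyText{stored: `[1, 2]`}, 42} {
		s, err := asString(ctx, val)
		fmt.Printf("value=%q err=%v\n", s, err)
	}
}
```

The cast function could then pass the unwrapped string to `IoInput` in both the plain and wrapped cases.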
| Value: pgtypes.JsonValueCopy(item.Value), | ||
| }) | ||
| } | ||
| wrapper1, ok1 := val1Interface.(sql.JSONWrapper) |
Not a blocker, but as written, this doesn't benefit from JSON optimizations and fully deserializes both inputs.
It's likely that for most inputs, there's not much performance to be gained. But concatenating two arrays would optimize very well and I imagine is a common use case. To get that benefit, we would want to add a method to types.MutableJSON and implement it in IndexedJsonDocument. Then we can check if the inputs implement MutableJSON and delegate to that method.
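A rough sketch of the delegation pattern described above. Everything here is a stand-in: `jsonWrapper` mimics `sql.JSONWrapper`, `arrayConcatter` is a hypothetical interface (the method that would be added to `types.MutableJSON` does not exist yet, so its name and signature are invented for illustration), and `indexedArray` only imitates a document that can concatenate without a full load.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// jsonWrapper is a local stand-in for sql.JSONWrapper.
type jsonWrapper interface {
	ToInterface(ctx context.Context) (any, error)
}

// arrayConcatter is HYPOTHETICAL: it models the method the review suggests
// adding. handled=false means the fast path does not apply to these inputs.
type arrayConcatter interface {
	ConcatArray(ctx context.Context, other jsonWrapper) (result jsonWrapper, handled bool, err error)
}

// plainDoc is a fully deserialized document with no optimizations.
type plainDoc struct{ val any }

func (d plainDoc) ToInterface(context.Context) (any, error) { return d.val, nil }

// indexedArray imitates an optimized array document.
type indexedArray struct{ elems []any }

func (a indexedArray) ToInterface(context.Context) (any, error) { return a.elems, nil }

func (a indexedArray) ConcatArray(_ context.Context, other jsonWrapper) (jsonWrapper, bool, error) {
	o, ok := other.(indexedArray)
	if !ok {
		return nil, false, nil // let the caller fall back
	}
	joined := append(append([]any{}, a.elems...), o.elems...)
	return indexedArray{elems: joined}, true, nil
}

// concatArrays tries the optimized path first and only deserializes both
// inputs when that path is unavailable.
func concatArrays(ctx context.Context, left, right jsonWrapper) (jsonWrapper, error) {
	if fast, ok := left.(arrayConcatter); ok {
		out, handled, err := fast.ConcatArray(ctx, right)
		if err != nil {
			return nil, err
		}
		if handled {
			return out, nil
		}
	}
	l, err := left.ToInterface(ctx)
	if err != nil {
		return nil, err
	}
	r, err := right.ToInterface(ctx)
	if err != nil {
		return nil, err
	}
	la, lok := l.([]any)
	ra, rok := r.([]any)
	if !lok || !rok {
		return nil, errors.New("this sketch only concatenates arrays")
	}
	return plainDoc{val: append(append([]any{}, la...), ra...)}, nil
}

func main() {
	ctx := context.Background()
	fast, _ := concatArrays(ctx, indexedArray{elems: []any{1, 2}}, indexedArray{elems: []any{3}})
	slow, _ := concatArrays(ctx, plainDoc{val: []any{1}}, indexedArray{elems: []any{2}})
	fmt.Printf("fast path result: %T, fallback result: %T\n", fast, slow)
}
```

The fallback branch is the behavior as written today; the type check in front of it is the only addition needed at the call site once such a method exists.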
| if !ok { | ||
| return nil, nil | ||
| } | ||
| v, err := wrapper.ToInterface(ctx) |
So in general, every time we call ToInterface, we defeat all JSON optimizations because we have to fully load the value.
If we want to benefit from JSON optimizations, we need to check if the input implements one of the interfaces that IndexedJsonDocument implements: ComparableJSON, MutableJSON, and SearchableJSON
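To illustrate the point, here is a small sketch that checks for an optimized interface before falling back to `ToInterface`. The `keyLookup` interface is an assumption loosely modeled on `SearchableJSON` (its method name and signature are not taken from the real interface), and the load counter exists only to make the difference visible.

```go
package main

import (
	"context"
	"fmt"
)

// jsonWrapper is a local stand-in for sql.JSONWrapper.
type jsonWrapper interface {
	ToInterface(ctx context.Context) (any, error)
}

// keyLookup is an ASSUMED interface, loosely modeled on SearchableJSON.
type keyLookup interface {
	Lookup(ctx context.Context, key string) (value any, found bool, err error)
}

// plainDoc counts how often the whole document is materialized.
type plainDoc struct {
	fields    map[string]any
	fullLoads *int
}

func (d plainDoc) ToInterface(context.Context) (any, error) {
	*d.fullLoads++
	return d.fields, nil
}

// indexedDoc imitates a document that can answer lookups without a full load.
type indexedDoc struct{ plainDoc }

func (d indexedDoc) Lookup(_ context.Context, key string) (any, bool, error) {
	v, ok := d.fields[key]
	return v, ok, nil
}

// getField prefers the optimized interface and deserializes only as a fallback.
func getField(ctx context.Context, doc jsonWrapper, key string) (any, bool, error) {
	if s, ok := doc.(keyLookup); ok {
		return s.Lookup(ctx, key)
	}
	full, err := doc.ToInterface(ctx)
	if err != nil {
		return nil, false, err
	}
	m, ok := full.(map[string]any)
	if !ok {
		return nil, false, nil // not an object, so there is no such key
	}
	v, found := m[key]
	return v, found, nil
}

func main() {
	ctx := context.Background()
	fields := map[string]any{"a": 1.0}
	var plainLoads, indexedLoads int
	getField(ctx, plainDoc{fields: fields, fullLoads: &plainLoads}, "a")
	getField(ctx, indexedDoc{plainDoc{fields: fields, fullLoads: &indexedLoads}}, "a")
	fmt.Println("full loads, plain vs indexed:", plainLoads, indexedLoads)
}
```

The same shape would apply to `ComparableJSON` and `MutableJSON`: type-assert first, and call `ToInterface` only when the assertion fails.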
Ito Test Report ❌

24 JSON/JSONB test cases ran: 22 passed, 2 failed.

The passing cases verified legacy/new migration interoperability, restart and rewrite parity with no semantic drift, large nested payload stability, deterministic ORDER BY and comparator/operator consistency, extraction edge cases (including negative indexes and controlled type errors), casting behavior (including overflow handling and concurrent workloads), merge/concurrency semantics, and targeted coverage reruns after environmental repair.

The key confirmed defects were:

High severity: a panic path for short invalid JSON in the shared json_in/jsonb_in handling (unguarded input[:10] slicing). A later rerun of short invalid inputs of length 0–9 in both routes completed without panic and preserved session usability.

Medium severity: a precision-loss issue where very large numeric literals collapse and compare equal.
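For the unguarded `input[:10]` slicing mentioned in the report, a bounds check before slicing is enough to avoid the panic. The sketch below uses hypothetical helper names (`preview`, `invalidJSONError`) and an assumed error-message format; it is not the code from the PR.

```go
package main

import "fmt"

// preview returns at most n leading bytes of input, so short inputs
// (including the empty string) never trigger an out-of-range slice.
func preview(input string, n int) string {
	if n < 0 {
		n = 0
	}
	if len(input) <= n {
		return input
	}
	return input[:n]
}

// invalidJSONError builds an error that quotes only the start of the input.
// The message format is an assumption for illustration.
func invalidJSONError(input string) error {
	return fmt.Errorf("invalid input syntax for type json: %q", preview(input, 10))
}

func main() {
	for _, in := range []string{"", "{", `{"a":`, "not json at all"} {
		fmt.Println(invalidJSONError(in))
	}
}
```

Note that slicing by bytes can split a multi-byte character; `%q` escapes the resulting invalid bytes rather than failing, which is usually acceptable in an error message.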
This gives us the storage, merge, and perf benefits for JSON in doltgres.
This change also results in a behavior change for ORDER BY on JSONB columns. Postgres has a particular b-tree order it uses to store JSON documents in order to efficiently perform its lookups. Dolt's method for efficient JSON retrieval is quite different, and has nothing to do with the order in which documents are stored in a primary index.