-
Notifications
You must be signed in to change notification settings - Fork 60
sync spec text with json examples #324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v0.5
Are you sure you want to change the base?
Conversation
will-moore
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thx. 👍
|
@will-moore is this the correct target branch for this fix? Just wondering about the failing checks. |
|
@jo-mueller I'm not up-to-date with the strategy for PRs, or even know if this should be opened against v0.5 (which would create a v0.5.n+1 version of the spec) or if we just go for v0.6? |
|
For existing v0.5 datasets that have version keys inside plate and well objects, will this change make them invalid OME-Zarr files? |
|
@ziw-liu THAT is a good question. I think it's safer to just commit this into a new 0.6-dev branch. For the 0.5 branch, it might be better to simply update the examples therein and make sure that everthing is properly synced up before 0.6 comes out. |
|
Generally I don't think there's language in the spec that prevents extra keys in the various JSON objects -- my understanding is that the spec generally describes JSON objects in positive terms (e.g., "object |
I agree with this observation. However in practice various implementations do prohibit extra keys. Maybe a separate clarification is needed, e.g. |
|
Yes, further clarification on the presence of extra keys would be helpful. This came up yesterday here zarrs/ome_zarr_metadata#9. |
|
I would recommend adding language like this at the top of the spec:
|
Co-Authored-By: Davis Bennett <davis.v.bennett@gmail.com>
|
@d-v-b I added the proposed statement under the document conventions section. 👍 |
Co-Authored-By: David Stansby <d.stansby@ucl.ac.uk>
|
It's worth noting that if arbitrary extra keys are allowed, then this removes any future possibility of backwards compatibility between versions of the specification in future. To give a concrete example, say someone writes data that contains a If instead arbitrary extra keys are prohibited in the "ome" metadadata, it allows the possibility of all valid metadata written with one version of the specification to also be valid with the next version. Given OME-Zarr has reserved the top-level "ome" key in the JSON attributes, I actually now think that extra keys should be prohibited to give the option of backwards compatibility. Instead, extra keys should live outside the "ome" field. This would also limit proliferation of "unofficial" metadata under the "ome" field, keeping the metadata there tightly controlled to what is allowed within the specifciation. |
|
This is a good point! In terms of what is right for the future of the spec, I don't have strong feelings about whether extra keys should be allowed or not. I can definitely see the appeal of keeping the But in terms of interpreting what the spec says today, I think disallowing extra keys would be potentially breaking for anyone who saved data with an extra key because they thought it was allowed. |
|
I agree with @dstansby here that extra keys should not be allowed for 1.x versions. I don't see much harm for 0.x versions, but the question is whether we want to promote it now, if we'll disallow it later. Perhaps, we should just carve out an exception for the "version" field within the 0.5 versions. |
|
see also: #209, maybe we could get a decision here and close that issue :) |
|
cc @thewtex since ngff-zarr saves an extra |
|
Thanks all for the insight and contributions to the discussion! From what I have gathered above, an agreeable solution would be the following:
That should get everyone plenty of time to adapt. Maybe it will also kickstart the discussion upon plug-in specifications (tables, spatialdata, etc) under protected keys outside of the "ome" key. |
|
Neither banning nor allowing extra keys solves backwards compatibility in the general case, they just have different trade-offs. Banning additional keys breaks backwards compatibility when a key is removed from the spec (e.g. Arguably, as the spec evolves we're more likely to add keys than remove them (i.e. we should ban extra keys). However, banning additional keys causes backwards compatibility problems right now, where the problem it solves would only occur if A) we removed a field from the spec in one version, and then added it back in a future version, or B) someone is already adding arbitrary keys to their OME metadata (i.e. using it incorrectly). IMO, the best solution is something like
and
and then spec designers try to avoid removing and then re-adding a field of the same name. Nobody can stop a JS or python user from doing something stupid if they really try, but this at least sets expectations and guides implementations with better type systems. |
This requires language permitting lower-level objects to relax this rule as needed, since the spec currently recommends(!) saving an unstructured dict in the
IMO JS (via typescript) and python both offer everything we need to completely model this stuff. Whether people use it is a separate question. |
|
Good point about the downside of banning keys - I agree though that the spec is more likely to add keys than remove them. If keys are removed, MUST contain could be converted to MAY contain statements, which would allow for backwards compatibility.
Hasn't every version of the spec so far been backwards incompatible with the previous version(s) anyway? (with few (any?) tools available to convert metadata between incompatible versions). In this pre-1.0 stage so far preserving backwards copatibility does not seem to have been a consideration. Having language that prohibits some a particular metadata structure, but then gives implementations a get out clause to allow data to violate that metadata structure seems like a bad idea to me, and defeats the point of having a specification in the first place. I like the suggestion in #324 (comment) to ban extra metadata from 1.0 onwards, to allow for backwards compatibility in the future, and in tandem with points in #324 (comment) that would suggest that before 1.0 we could say "additional metadata SHOULD NOT" be stored under the "ome" key" to discourage it, and then change SHOULD NOT to MUST NOT in version 1.0. Finally, defining or recommending what implementations do is quite a new topic, and so far has not been incorporated into the specification. It is a much more general topic than just the change proposed here, so perhaps discussion about including recommendations for implementations is done elsewhere to separate concerns (and keep it easy to discuss one thing at a time)? |
|
Maybe it makes sense to continue this discussion in a separate issue? Not to step on anyone's toes but I think the PR got a bit sidetracked from the original purpose, which was aligning spec and examples 😅 This should get some people on their way with regard to whether a version key should be present in plates/wells. How restrictive/permissive the spec should be with respect to extra keys in the json is a matter that requires a bit more careful discussion, I think. In order to allow this go through, I would remove the paragraph suggested by @d-v-b from the top:
|
I think there are 2 topics, both of which were covered in this thread:
|
|
Thanks for linking the issue! I'll write up a summary of this thread over there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, provided that everyone considers it obvious that extra keys are allowed for reading. Because this change removes required metadata keys, there will be datasets in the wild that were previously compliant and used these keys. They should continue to be compliant after these changes. That means their extra keys must be accepted by readers.
|
Is there a desire/need to have backwards compatibility between versions of the spec now? As far as I remember every version increment has been backwards incompatible up to now. |
|
Hold on a minute, this isn't another proposal to silently edit (ie., change without incrementing the version number) 0.5 specification is it? If it is, please please please can we stop doing this, and make changes to the next version instead. Silently editing the specification without incrementing the version defeats the point of having a versioned spec in the first place. |
|
good point, I wasn't even thinking in terms of versioning... how does that work again for this spec? |
Wouldn't say so. The change just doesn't demand the But I'm also fine with changing the target of this PR to a new |
I think you're saying that this change is forwards compatible (all data written before the change is valid after the change), but I'm saying that the change is not backwards compatible (not all data written after the changes is valid with the spec before the change) - in this case, if data is written without a version key, it is not valid 0.5 data (according to the first 'release' of the 0.5 specification) (maybe I get backwards and forwards compatible the wrong way above - it always confuses me which way round they are) |
|
As for versioning, I think the idea now would be to create a new branch ( Maybe @joshmoore could help us there? |
|
I think it should go in 0.5.3, because this bug originates in 0.5. |
|
Back from vacation - maybe we can all agree that while it should or should not go into a 0.5.3 version, it should definitely be part of an upcoming 0.6? Once there is an active development head here, I'll change the target of the PR to that so we can uncouple this from the question of whether or how older versions should be modified. |
Dear all,
This PR is intended as a fix for #322 and #309. Some confusion was raised in these issues as to whether different
versionkeys would be allowed in an ome-zarrs top-levelzarr.jsonand in a followingplate/well/labelsgroups therein.In the current examples on the 0.5 branch (e.g., plate example), the
version keyis only given at the top level and not in the group-level. This is in contradiction to the current text of the plate spec, which states:Same for wells and labels groups examples.
Context and consequences
For context, the
versionkey was originally moved up into the top-levelomegroup under the pretext of discouraging different metadata versions inside the same file according to RFC2:Removing the statements about the
versionkeys in the plate/well/labels groups essentially drops the idea differing versions inside a single ome.zarr file, which I think is a reasonable step from an implementation perspective. Also, it brings the spec back into sync with the examples, which I think is important for the integrity of this project.I modified the statement at the top-level
omegroup into the following to make this clear:Let me know what you think :)
cc @joshmoore @normanrz @dstansby @ziw-liu
PS: Hope I got this PR configured right - had a bit of trouble getting the knack about contributing with submodules (#323)