feat(ocs): define parsed OCS message types#16
Conversation
Most of the business logic for how RTR uses the OCS message counter can be found in OCS.Stream.SequenceMonitor. I believe the only consequence of a sequence being out of order is that we emit a log, but tagging @lemald here since he probably has more context and my understanding is just from grokking RTR code.
The
|
| }, | ||
| "heading": { | ||
| "description": "Directional information. Dependent on dataSource.", | ||
| "$comment": "TODO: Should we break this up into separate fields instead of a union type?", |
There was a problem hiding this comment.
I would definitely be in favor of splitting this up into separate fields, and potentially only including each field in a distinct message type. I believe this is a heading information when we're getting a GPS reading and AB information when we are getting an AVI read. The unit for heading information should be degrees offset from Magnetic North. I split these into separate fields in the light rail eventsschema already.
There was a problem hiding this comment.
Ah ok, that makes sense. Definitely on board with splitting it into two fields.
In terms of distinct message types, are you considering something like ocs.tmov_light_rail_gps.v1, ocs.tmov_light_rail_avi.v1, and ocs.tmov_light_rail_nvl.v1 for the different data sources? Since the light rail events schema has a single event type with both fields, I think I would be inclined to do the same, and just mark them as nullable in cases where they aren't provided, rather than introduce more types. But let me know if you have a strong opinion otherwise.
But that leads to another question: RTR codebase uses an explicit :unreported atom in cases where this (and speed) are missing. Is there a reason that couldn't just be null here? Maybe there's some known distinction between "this is expectedly not provided" vs "this is missing and that's a hole in the data error"?
7c8490e to
bee06de
Compare
| "orientation": { | ||
| "description": "Whether the 'rear' or 'front' of each train is facing forward for each train in the consist. B means backward, A means forward. Only provided with AVI data source.", | ||
| "type": ["string", "null"] | ||
| } |
There was a problem hiding this comment.
Took another pass at this and tried to make it align with the RTR lightrail messages. If I'm understanding it, that means each individual character in the string represents the orientation of a train car, as opposed to, say, having this be an array with "AB" vs "BA" values, like RTR has internally. E.g. "AAB" instead of [ "AB, "AB, "BA" ].
| "leadCar": { | ||
| "description": "Lead car number, as represented internally in OCS, which may differ from the number displayed on a physical train car. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 25xx.", | ||
| "$comment": "TODO: Should this just be an integer.", | ||
| "type": "string" | ||
| }, | ||
| "leadCarDisplay": { | ||
| "description": "Lead car number, as displayed on the physical vehicle. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 15xx.", | ||
| "$comment": "TODO: Should this just be an integer.", | ||
| "type": "string" | ||
| }, |
There was a problem hiding this comment.
Just put up a commit to add this. Wondering whether people like this idea. Essentially, this is an attempt (at the cost of being more verbose) to get rid of every downstream listener needing to perform the special case mapping of red line 25xx -> 15xx trains, and to potentially account for future scenarios where train cars may have special numbering internal to OCS.
There was a problem hiding this comment.
Torn on this. I get what you're after, but most TID systems except for Orbit probably want 15xx (I think RTR uses 15xx, right?). So I wonder if we should continue to base TID on 15xx and Orbit should shift back to 25xx in its presentation.
There was a problem hiding this comment.
I know we discussed this a little already, but how would you feel about including this, but reversing the naming scheme? Something like:
leadCar -> "Physical car number"
leadCarInternal -> "Car number as represented internally in OCS"
I could also make the internal field optional and only present if there's a difference between the two? Would save us from being overly chatty in the majority of cases where there is no difference.
There was a problem hiding this comment.
Pushed a version revised the way I proposed. Let me know if you think this is better, or worse, or if I should just drop this idea entirely.
40b4489 to
4af1a61
Compare
|
Just realized I missed a TSCH subtype - TSCH_TAG. I'll work on adding it. |
b6abb74 to
bb1bf4e
Compare
bb1bf4e to
e0bc7a3
Compare
| @@ -0,0 +1,116 @@ | |||
| { | |||
There was a problem hiding this comment.
I know we talked about this verbally earlier, but are there enough differences between light and heavy rail to warrant a separate event type, rather than using optional fields?
| "dispatchTags": { | ||
| "description": "A set of tags that dispatchers can assign to trains.", | ||
| "type": "array", | ||
| "items": { | ||
| "type": "string" | ||
| } | ||
| } |
There was a problem hiding this comment.
I noticed some TODOs elsewhere to consider enumerate strings. This is another case where a list of possible tags either as an enumerator or in the description could act as documentation of what's possible.
There was a problem hiding this comment.
From conversation in slack, it sounds like there isn't an official doc of known values here, so I'm actually leaning in the direction of not making it an enum, so as to allow for arbitrary values. Let me know if you feel strongly otherwise? How would you feel about allowing arbitrary strings, but documenting the set of commonly expected values, in the description and/or docs?
| "minLength": 20, | ||
| "$comment": "RFC3339 timestamp", | ||
| "format": "date-time", | ||
| "pattern": "^[0-9]{4}-[01][0-9]-[0-3][0-9]T[012][0-9]:[0-5][0-9]:[0-6][0-9](.[0-9]*)?(Z|[+-][012][0-9]:[0-5][0-9])$", |
There was a problem hiding this comment.
I've seen OCS timestamps in local time or RFC3339 without a timezone prefix (including Z), just checking this is right?
There was a problem hiding this comment.
At the moment, I'm basing this on RTR because in most cases they convert local times into full RFC3339 at the point of parsing, so I was thinking Trike could do the same. I will double check though.
| "leadCar": { | ||
| "description": "Lead car number, as represented internally in OCS, which may differ from the number displayed on a physical train car. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 25xx.", | ||
| "$comment": "TODO: Should this just be an integer.", | ||
| "type": "string" | ||
| }, | ||
| "leadCarDisplay": { | ||
| "description": "Lead car number, as displayed on the physical vehicle. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 15xx.", | ||
| "$comment": "TODO: Should this just be an integer.", | ||
| "type": "string" | ||
| }, |
There was a problem hiding this comment.
Torn on this. I get what you're after, but most TID systems except for Orbit probably want 15xx (I think RTR uses 15xx, right?). So I wonder if we should continue to base TID on 15xx and Orbit should shift back to 25xx in its presentation.
|
From recent conversation, it sounds like we may want to re-evaluate the decision as to whether Trike should emit parsed messages. So this schema may not be needed. I'm converting this PR back to a draft for now. (May end up just closing and re-opening later if needed.) |
|
Pretty sure this is no longer relevant. Closing for now. |
Asana task: Add com.mbta.ocs parsed types
To support upcoming features, Orbit will need to listen to certain OCS events to supplement GTFS. Rather than listen to the existing
ocs.raw_messageevents emitted by Trike, the team has proposed changing Trike to emit additional "parsed" events, which will have already done the work of parsing the OCS raw comma-separated format and converting to JSON.Trike will still emit the existing raw messages unaltered, so this should be backward-compatible (assuming all existing consumers already filter out unknown message types). But the thinking is that Orbit and future consumers can subscribe to these parsed events and not have to duplicate the raw data parsing.
This PR includes JSON schema, examples, and minimal docs for the
TMOVandTSCHmessages, which are the two that we believe Orbit will need soon. Additional types may need to be spec'd out later.I have a number of in-line "TODOs" as points to get specific feedback, but here are some general things I want to point out:
Like the raw events, these new events are intended to fit with the the CloudEvents spec. If I've violated that anywhere, please let me know.
I've represented each sub-type of
TSCH, (e.g.TSCH_NEW,TSCH_OFF) as a separate top-level event type (e.g.ocs.tsch_new,ocs.tsch_off), instead of say, having a singleocs.tschevent with asubtypefield. My instinct is that this is cleaner because there is a lot of variation in the data fields between the subtypes, but curious what others think.Perhaps more controversially, I also broke
TMOVinto three types, for heavy-rail, light-rail, and deletions. I did this after looking at how the RTR codebase represents these messages, and felt it might make sense to do the same because of the variations in data between the three.I went with camel case for property names, to mimic the glides event schemas and fall more generally in-line with most JSON standards. I had considered using underscores to better match how things are named in our elixir codebases, but that felt too implementation-dependent.
Some of my biggest open questions are about the
counterfield on OCS events.TMOVandTSCH) in order to keep the count?