Skip to content

feat(ocs): define parsed OCS message types#16

Closed
andrewdolce wants to merge 11 commits into
mainfrom
feat/ocs-parsed-message-types
Closed

feat(ocs): define parsed OCS message types#16
andrewdolce wants to merge 11 commits into
mainfrom
feat/ocs-parsed-message-types

Conversation

@andrewdolce

Copy link
Copy Markdown
Contributor

Asana task: Add com.mbta.ocs parsed types

To support upcoming features, Orbit will need to listen to certain OCS events to supplement GTFS. Rather than listen to the existing ocs.raw_message events emitted by Trike, the team has proposed changing Trike to emit additional "parsed" events, which will have already done the work of parsing the OCS raw comma-separated format and converting to JSON.

Trike will still emit the existing raw messages unaltered, so this should be backward-compatible (assuming all existing consumers already filter out unknown message types). But the thinking is that Orbit and future consumers can subscribe to these parsed events and not have to duplicate the raw data parsing.

This PR includes JSON schema, examples, and minimal docs for the TMOV and TSCH messages, which are the two that we believe Orbit will need soon. Additional types may need to be spec'd out later.

I have a number of in-line "TODOs" as points to get specific feedback, but here are some general things I want to point out:

  • Like the raw events, these new events are intended to fit with the the CloudEvents spec. If I've violated that anywhere, please let me know.

  • I've represented each sub-type of TSCH, (e.g. TSCH_NEW, TSCH_OFF) as a separate top-level event type (e.g. ocs.tsch_new, ocs.tsch_off), instead of say, having a single ocs.tsch event with a subtype field. My instinct is that this is cleaner because there is a lot of variation in the data fields between the subtypes, but curious what others think.

  • Perhaps more controversially, I also broke TMOV into three types, for heavy-rail, light-rail, and deletions. I did this after looking at how the RTR codebase represents these messages, and felt it might make sense to do the same because of the variations in data between the three.

  • I went with camel case for property names, to mimic the glides event schemas and fall more generally in-line with most JSON standards. I had considered using underscores to better match how things are named in our elixir codebases, but that felt too implementation-dependent.

  • Some of my biggest open questions are about the counter field on OCS events.

    • Does RTR or any consumer actually use this to detect dropped messages?
    • Are the counter rules well-understood? From looking at splunk, it seems like each transitline gets its own independent counter across all event types, but I do see occasional missing ones, or counters that look drastically out-of-sequence.
    • If the counter is important, then does that mean that Orbit will need to listen to all events (not just TMOV and TSCH) in order to keep the count?

@rudiejd

rudiejd commented Jun 9, 2025

Copy link
Copy Markdown
Member

Does RTR or any consumer actually use this to detect dropped messages?

Most of the business logic for how RTR uses the OCS message counter can be found in OCS.Stream.SequenceMonitor. I believe the only consequence of a sequence being out of order is that we emit a log, but tagging @lemald here since he probably has more context and my understanding is just from grokking RTR code.

Are the counter rules well-understood? From looking at splunk, it seems like each transitline gets its own independent counter across all event types, but I do see occasional missing ones, or counters that look drastically out-of-sequence.

The SequenceMonitor indicates that the counter should be per trike IP, so if the trike IP changes some out of sequence values could be permitted. You should get a ocs_sequence_monitor missing_sequence event when the counter skips numbers.

If the counter is important, then does that mean that Orbit will need to listen to all events (not just TMOV and TSCH) in order to keep the count?
This is probably above my pay grade, but the fact that a sequence mismatch is not actionable in RTR makes me think that you might be able to get away with ignoring them.

Comment thread docs/events/ocs/com.mbta.ocs.tmov_light_rail.v1.mdx Outdated
},
"heading": {
"description": "Directional information. Dependent on dataSource.",
"$comment": "TODO: Should we break this up into separate fields instead of a union type?",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would definitely be in favor of splitting this up into separate fields, and potentially only including each field in a distinct message type. I believe this is a heading information when we're getting a GPS reading and AB information when we are getting an AVI read. The unit for heading information should be degrees offset from Magnetic North. I split these into separate fields in the light rail eventsschema already.

@andrewdolce andrewdolce Jun 9, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, that makes sense. Definitely on board with splitting it into two fields.

In terms of distinct message types, are you considering something like ocs.tmov_light_rail_gps.v1, ocs.tmov_light_rail_avi.v1, and ocs.tmov_light_rail_nvl.v1 for the different data sources? Since the light rail events schema has a single event type with both fields, I think I would be inclined to do the same, and just mark them as nullable in cases where they aren't provided, rather than introduce more types. But let me know if you have a strong opinion otherwise.

But that leads to another question: RTR codebase uses an explicit :unreported atom in cases where this (and speed) are missing. Is there a reason that couldn't just be null here? Maybe there's some known distinction between "this is expectedly not provided" vs "this is missing and that's a hole in the data error"?

@andrewdolce andrewdolce force-pushed the feat/ocs-parsed-message-types branch from 7c8490e to bee06de Compare June 10, 2025 12:06
Comment on lines +99 to +102
"orientation": {
"description": "Whether the 'rear' or 'front' of each train is facing forward for each train in the consist. B means backward, A means forward. Only provided with AVI data source.",
"type": ["string", "null"]
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took another pass at this and tried to make it align with the RTR lightrail messages. If I'm understanding it, that means each individual character in the string represents the orientation of a train car, as opposed to, say, having this be an array with "AB" vs "BA" values, like RTR has internally. E.g. "AAB" instead of [ "AB, "AB, "BA" ].

Comment on lines +56 to +65
"leadCar": {
"description": "Lead car number, as represented internally in OCS, which may differ from the number displayed on a physical train car. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 25xx.",
"$comment": "TODO: Should this just be an integer.",
"type": "string"
},
"leadCarDisplay": {
"description": "Lead car number, as displayed on the physical vehicle. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 15xx.",
"$comment": "TODO: Should this just be an integer.",
"type": "string"
},

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just put up a commit to add this. Wondering whether people like this idea. Essentially, this is an attempt (at the cost of being more verbose) to get rid of every downstream listener needing to perform the special case mapping of red line 25xx -> 15xx trains, and to potentially account for future scenarios where train cars may have special numbering internal to OCS.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Torn on this. I get what you're after, but most TID systems except for Orbit probably want 15xx (I think RTR uses 15xx, right?). So I wonder if we should continue to base TID on 15xx and Orbit should shift back to 25xx in its presentation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we discussed this a little already, but how would you feel about including this, but reversing the naming scheme? Something like:

leadCar -> "Physical car number"
leadCarInternal -> "Car number as represented internally in OCS"

I could also make the internal field optional and only present if there's a difference between the two? Would save us from being overly chatty in the majority of cases where there is no difference.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a version revised the way I proposed. Let me know if you think this is better, or worse, or if I should just drop this idea entirely.

@andrewdolce andrewdolce force-pushed the feat/ocs-parsed-message-types branch from 40b4489 to 4af1a61 Compare June 10, 2025 13:13
@andrewdolce

Copy link
Copy Markdown
Contributor Author

Just realized I missed a TSCH subtype - TSCH_TAG. I'll work on adding it.

@andrewdolce andrewdolce force-pushed the feat/ocs-parsed-message-types branch from b6abb74 to bb1bf4e Compare June 11, 2025 17:51
@andrewdolce andrewdolce force-pushed the feat/ocs-parsed-message-types branch from bb1bf4e to e0bc7a3 Compare June 11, 2025 17:56
@@ -0,0 +1,116 @@
{

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we talked about this verbally earlier, but are there enough differences between light and heavy rail to warrant a separate event type, rather than using optional fields?

Comment on lines +66 to +72
"dispatchTags": {
"description": "A set of tags that dispatchers can assign to trains.",
"type": "array",
"items": {
"type": "string"
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed some TODOs elsewhere to consider enumerate strings. This is another case where a list of possible tags either as an enumerator or in the description could act as documentation of what's possible.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From conversation in slack, it sounds like there isn't an official doc of known values here, so I'm actually leaning in the direction of not making it an enum, so as to allow for arbitrary values. Let me know if you feel strongly otherwise? How would you feel about allowing arbitrary strings, but documenting the set of commonly expected values, in the description and/or docs?

"minLength": 20,
"$comment": "RFC3339 timestamp",
"format": "date-time",
"pattern": "^[0-9]{4}-[01][0-9]-[0-3][0-9]T[012][0-9]:[0-5][0-9]:[0-6][0-9](.[0-9]*)?(Z|[+-][012][0-9]:[0-5][0-9])$",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen OCS timestamps in local time or RFC3339 without a timezone prefix (including Z), just checking this is right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, I'm basing this on RTR because in most cases they convert local times into full RFC3339 at the point of parsing, so I was thinking Trike could do the same. I will double check though.

Comment thread schemas/ocs-events-common.json Outdated
Comment on lines +56 to +65
"leadCar": {
"description": "Lead car number, as represented internally in OCS, which may differ from the number displayed on a physical train car. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 25xx.",
"$comment": "TODO: Should this just be an integer.",
"type": "string"
},
"leadCarDisplay": {
"description": "Lead car number, as displayed on the physical vehicle. Example: For a red line car physically numbered 15xx, internally numbered 25xx, this field would contain 15xx.",
"$comment": "TODO: Should this just be an integer.",
"type": "string"
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Torn on this. I get what you're after, but most TID systems except for Orbit probably want 15xx (I think RTR uses 15xx, right?). So I wonder if we should continue to base TID on 15xx and Orbit should shift back to 25xx in its presentation.

@andrewdolce

Copy link
Copy Markdown
Contributor Author

From recent conversation, it sounds like we may want to re-evaluate the decision as to whether Trike should emit parsed messages. So this schema may not be needed.

I'm converting this PR back to a draft for now. (May end up just closing and re-opening later if needed.)

@andrewdolce

Copy link
Copy Markdown
Contributor Author

Pretty sure this is no longer relevant. Closing for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants