ingestion refactor changes#760
Open
huangh wants to merge 12 commits into
LCOV of commit
|
runkelcorey
reviewed
Jun 8, 2026
runkelcorey
reviewed
Jun 9, 2026
runkelcorey
left a comment
Collaborator
I like how clean this code is. Can you construct a test that covers convert? I think it would help with regression testing, and it would prove to me that it does what it should. You can take a look at test_convert in test_convert_gtfs_rt to see what this could look like.
…ee this is the issue - seen in local running
…_async, and remove hardcoded self.detail (bug)
4607981 to
51f2e09
Co-authored-by: Corey Runkel <39202587+runkelcorey@users.noreply.github.com>
…rd expected structure the same. deduped data will be written to the standard springboard location
runkelcorey
requested changes
Jun 22, 2026
if config_type not in converters:
    converters[config_type] = GtfsRtConverter(config_type, metadata_queue)
if config_type in (ConfigType.RT_ALERTS, ConfigType.VEHICLE_COUNT, ConfigType.SCHEDULE):
    converters[config_type] = GtfsConverter(config_type, metadata_queue)
Collaborator
@huangh bumping this because I don't see how RT_ALERTS gets ingested with a GtfsConverter
Asana Task: ticket
ticket2
What changes does this PR propose?
Writeup on the re-architecture here: https://app.notion.com/p/mbta-downtown-crossing/Ingestion-Redesign-Ingestion-Indegestion-Redegestion-334f5d8d11ea8020bffcdcffe9b6c1ee?source=copy_link
convert_gtfs_rt_fullset.py has been added; it implements this 15-minute chunking. The original converter is left mostly alone. For now, the new converter is applied only to RT_TRIP_UPDATES and RT_VEHICLE_POSITIONS, in both the PROD and DEV_GREEN variants.
How were these changes validated?
test_yield_check_periodic makes sure the files written out make sense. validate_data_across_environment.py compares data across dev, staging, and prod (which have the same inputs via delta) and ensures that they are identical. This shows that all of the records are present and none have been lost.
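For readers without access to the writeup, here is a minimal sketch of what 15-minute chunking plus a "no records lost" check could look like. It is not the code in convert_gtfs_rt_fullset.py or validate_data_across_environment.py; the function names (chunk_start, chunk_records, same_records) and the (timestamp, record id) record shape are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def chunk_start(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 15-minute window."""
    floored_minute = ts.minute - ts.minute % 15
    return ts.replace(minute=floored_minute, second=0, microsecond=0)


def chunk_records(records):
    """Group (timestamp, record_id) tuples by their 15-minute window."""
    chunks = defaultdict(list)
    for ts, record_id in records:
        chunks[chunk_start(ts)].append((ts, record_id))
    return dict(chunks)


def same_records(left, right) -> bool:
    """Order-insensitive comparison that also respects duplicate counts.

    Records must be hashable (tuples of timestamps and strings are).
    """
    return Counter(left) == Counter(right)


# Hypothetical sample input: four records spread over three windows.
utc = timezone.utc
sample = [
    (datetime(2026, 6, 8, 12, 0, 0, tzinfo=utc), "a"),
    (datetime(2026, 6, 8, 12, 14, 59, tzinfo=utc), "b"),
    (datetime(2026, 6, 8, 12, 15, 0, tzinfo=utc), "c"),
    (datetime(2026, 6, 8, 12, 44, 30, tzinfo=utc), "d"),
]

sample_chunks = chunk_records(sample)

# Flatten the chunks back out and confirm nothing was dropped or duplicated.
flattened = [record for chunk in sample_chunks.values() for record in chunk]
records_preserved = same_records(sample, flattened)
```

The same order-insensitive comparison could be applied to the outputs of two environments that consume identical inputs; whether the real validation script works this way is an assumption.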