Merged
25 commits
2e8c5b7
updates for BSM deduplication
Michael7371 Jul 3, 2025
cb2e3c6
comment out broken code to fix build issues
Michael7371 Jul 3, 2025
6c6bbf9
updates to fix build
Michael7371 Jul 4, 2025
fc0703a
Merge pull request #6 from CDOT-CV/ode-bsm-schema-updates
Michael7371 Jul 15, 2025
2124e6d
updates to ODE MAP deduplicator topologies
Michael7371 Jul 15, 2025
8eadea7
fixing map deduplication issues
Michael7371 Jul 15, 2025
7dddcd6
Update deduplication for all message types except ProcessedMap and Pr…
drewjj Aug 1, 2025
c3747ce
Remove all Pair model and Serdes since they are unused
drewjj Aug 1, 2025
ce73cbc
Update ProcessedMap and WKT dedup
drewjj Aug 1, 2025
c998ad5
Add additional serdes tests and fix issue for GeoUtils test being unseen
drewjj Aug 1, 2025
9cd31f7
Rework the ProcessedMap and ProcessedMapWkt examples
drewjj Aug 1, 2025
9df2ee9
Update release notes and application.yaml
drewjj Aug 1, 2025
98a7bcc
Add new deduplication documentation as well as modify duplication rul…
drewjj Aug 1, 2025
221b226
Address comments
drewjj Aug 1, 2025
319b117
Removed redundant logic for TIM dedup processing
drewjj Aug 5, 2025
5d70433
Remove redundant OdeMapJson dedup logic that was added
drewjj Aug 5, 2025
267aaf3
Remove the conflictmonitor.version from the pom
drewjj Aug 6, 2025
ad2ed6e
Address some of the pull request comments
drewjj Aug 6, 2025
de8e803
Address additional pull request comments
drewjj Aug 7, 2025
41d5057
Discard messages with 'unknown' keys for MessageFrame topologies
drewjj Aug 7, 2025
0493161
Add moy to the OdeMapJson dedup tests
drewjj Aug 8, 2025
de33f60
Handle ASN1 and update timestamp and odereceivedat in processed messa…
drewjj Aug 8, 2025
ed849bc
Update gitmodules to include the latest version of the jpo-utils
drewjj Aug 8, 2025
a1a5903
Update jpo-utils for the removal of unused mongo collections/topics
drewjj Aug 8, 2025
9321ba5
Remove env vars related to OdeRawEncodedTIMJson
drewjj Aug 8, 2025
2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,3 +1,3 @@
[submodule "jpo-utils"]
path = jpo-utils
url = https://github.com/usdot-jpo-ode/jpo-utils.git
url = https://github.com/CDOT-CV/jpo-utils.git
3 changes: 2 additions & 1 deletion .vscode/settings.json
@@ -1,3 +1,4 @@
{
"java.configuration.updateBuildConfiguration": "automatic"
"java.configuration.updateBuildConfiguration": "automatic",
"java.debug.settings.onBuildFailureProceed": true
}
4 changes: 1 addition & 3 deletions Dockerfile
@@ -2,10 +2,10 @@

WORKDIR /home

ARG MAVEN_GITHUB_TOKEN

Check warning on line 5 in Dockerfile (GitHub Actions / jpo-deduplicator / build): SecretsUsedInArgOrEnv: Do not use ARG or ENV instructions for sensitive data (ARG "MAVEN_GITHUB_TOKEN"). More info: https://docs.docker.com/go/dockerfile/rule/secrets-used-in-arg-or-env/
ARG MAVEN_GITHUB_ORG

ENV MAVEN_GITHUB_TOKEN=$MAVEN_GITHUB_TOKEN

Check warning on line 8 in Dockerfile (GitHub Actions / jpo-deduplicator / build): SecretsUsedInArgOrEnv: Do not use ARG or ENV instructions for sensitive data (ENV "MAVEN_GITHUB_TOKEN"). More info: https://docs.docker.com/go/dockerfile/rule/secrets-used-in-arg-or-env/
ENV MAVEN_GITHUB_ORG=$MAVEN_GITHUB_ORG

RUN test -n "$MAVEN_GITHUB_TOKEN" || (echo "Error: MAVEN_GITHUB_TOKEN cannot be empty" && exit 1)
@@ -27,19 +27,17 @@
WORKDIR /home

COPY --from=builder /home/jpo-deduplicator/src/main/resources/application.yaml /home
COPY --from=builder /home/jpo-deduplicator/src/main/resources/logback.xml /home
COPY --from=builder /home/jpo-deduplicator/target/jpo-deduplicator.jar /home

#COPY cert.crt /home/cert.crt
#RUN keytool -import -trustcacerts -keystore /usr/local/openjdk-11/lib/security/cacerts -storepass changeit -noprompt -alias mycert -file cert.crt

RUN amazon-linux-extras install -y epel && \
yum install -y jemalloc-devel
yum install -y jemalloc-devel
ENV LD_PRELOAD="/usr/lib64/libjemalloc.so"

ENTRYPOINT ["java", \
"-Djava.rmi.server.hostname=$DOCKER_HOST_IP", \
"-Dlogback.configurationFile=/home/logback.xml", \
"-Xmx1024M", \
"-Xms128M", \
"-XX:+UseG1GC", \
38 changes: 35 additions & 3 deletions README.md
@@ -7,14 +7,47 @@ The JPO-Deduplicator is a Kafka Java spring-boot application designed to reduce

The following topics currently support deduplication.

- topic.OdeMapJson -> topic.DeduplicatedOdeMapJson
- topic.ProcessedMap -> topic.DeduplicatedProcessedMap
- topic.ProcessedMapWKT -> topic.DeduplicatedProcessedMapWKT
- topic.OdeMapJson -> topic.DeduplicatedOdeMapJson
- topic.OdeTimJson -> topic.DeduplicatedOdeTimJson
- topic.OdeRawEncodedTIMJson -> topic.DeduplicatedOdeRawEncodedTIMJson
- topic.OdeBsmJson -> topic.DeduplicatedOdeBsmJson
- topic.ProcessedBsm -> topic.DeduplicatedProcessedBsm
- topic.ProcessedSpat -> topic.DeduplicatedProcessedSpat

## Topic Deduplication Rules

Each message type has its own rules for deciding when a message is a duplicate. The criteria for each type are listed below. A message is treated as a duplicate only when all of the criteria listed for its type are met.

### OdeMapJson
- Two messages within a 1 hour time window
- Two messages have the same intersection ID

### ProcessedMap and ProcessedMapWKT
- Two messages within a 1 hour time window
- Two messages have the same hash values after factoring out the odeReceivedAt and timeStamp fields
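A minimal sketch of the hash comparison described above, assuming the message has already been flattened into a `Map<String, Object>`. The class name, method names, and the use of a map are hypothetical and are not taken from the deduplicator source. The strict `<` on the 1 hour boundary is also an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the deduplicator's actual implementation.
class ProcessedMapHashSketch {
    private static final long ONE_HOUR_MILLIS = 60 * 60 * 1000L;

    // Hash the message content while ignoring the two volatile timestamp fields.
    static int stableHash(Map<String, Object> message) {
        Map<String, Object> copy = new TreeMap<>(message);
        copy.remove("odeReceivedAt");
        copy.remove("timeStamp");
        return copy.hashCode();
    }

    // A message is a duplicate only when both criteria hold:
    // it falls inside the time window and the stable hashes match.
    static boolean isDuplicate(Map<String, Object> previous, long previousMillis,
                               Map<String, Object> current, long currentMillis) {
        boolean withinWindow = (currentMillis - previousMillis) < ONE_HOUR_MILLIS;
        return withinWindow && stableHash(previous) == stableHash(current);
    }
}
```

Comparing hashes rather than full content means two different messages could collide in rare cases; the sketch accepts that trade-off for simplicity.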

### OdeTimJson
- Two messages within a 1 hour time window
- Two messages have the same packet ID
- Two messages have the same msgCnt value
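The three TIM criteria above can be sketched as a small in-memory filter. This is only an illustration under stated assumptions: the class `TimDedupSketch`, the `packetId:msgCnt` key format, and the `HashMap` store are all hypothetical, and the real application may track state differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the deduplicator's actual implementation.
class TimDedupSketch {
    private static final long WINDOW_MILLIS = 60 * 60 * 1000L; // 1 hour

    // Maps "packetId:msgCnt" to the timestamp of the last forwarded message.
    private final Map<String, Long> lastForwarded = new HashMap<>();

    // Returns true when the message should be forwarded,
    // false when it is dropped as a duplicate.
    boolean shouldForward(String packetId, int msgCnt, long timestampMillis) {
        String key = packetId + ":" + msgCnt;
        Long previous = lastForwarded.get(key);
        if (previous != null && timestampMillis - previous < WINDOW_MILLIS) {
            return false; // same packet ID, same msgCnt, inside the window
        }
        lastForwarded.put(key, timestampMillis);
        return true;
    }
}
```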

### OdeBsmJson
- Two messages arrive within the time threshold defined by the `odeBsmMaximumTimeDelta` value in application.yaml
- The new message's speed is identical to the previous message's speed
- The new message's speed is below the speed threshold defined by the `odeBsmAlwaysIncludeAtSpeed` value in application.yaml
- The position delta between the messages does not exceed the threshold defined by the `odeBsmMaximumPositionDelta` value in application.yaml, or position information is null
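As a rough sketch, the four OdeBsmJson criteria combine into a single predicate. The class and parameter names below are hypothetical stand-ins for the `odeBsmMaximumTimeDelta`, `odeBsmAlwaysIncludeAtSpeed`, and `odeBsmMaximumPositionDelta` settings; the units and the exact boundary comparisons (`<` versus `<=`) are assumptions, not taken from the source.

```java
// Hypothetical illustration; not the deduplicator's actual implementation.
class BsmDuplicateSketch {
    // positionDelta is null when either message lacks position information.
    static boolean isDuplicate(long timeDeltaMillis,
                               double previousSpeed,
                               double newSpeed,
                               Double positionDelta,
                               long maximumTimeDeltaMillis,
                               double alwaysIncludeAtSpeed,
                               double maximumPositionDelta) {
        boolean withinTime = timeDeltaMillis < maximumTimeDeltaMillis;
        boolean sameSpeed = Double.compare(previousSpeed, newSpeed) == 0;
        boolean belowSpeedThreshold = newSpeed < alwaysIncludeAtSpeed;
        boolean notMoved = positionDelta == null || positionDelta <= maximumPositionDelta;
        // All criteria must hold for the new message to count as a duplicate.
        return withinTime && sameSpeed && belowSpeedThreshold && notMoved;
    }
}
```

If any one condition fails, for example the vehicle is moving faster than the always-include speed, the message is forwarded.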

### ProcessedBsm
- Two messages arrive within the time threshold defined by the `odeBsmMaximumTimeDelta` value in application.yaml
- The new speed is below the speed threshold defined by the `odeBsmAlwaysIncludeAtSpeed` value in application.yaml
- The position delta between the messages does not exceed the threshold defined by the `odeBsmMaximumPositionDelta` value in application.yaml, or position information is null

### ProcessedSpat
- Two messages within a 1 minute time window
- Signal states are identical
- Signal light phases are identical

## Release Notes

The current version and release history of the JPO Deduplicator: [Release Notes](<docs/Release_notes.md>)
@@ -105,7 +138,6 @@ To manually configure deduplication for a topic, the following environment varia
| `ENABLE_PROCESSED_MAP_WKT_DEDUPLICATION` | `true` / `false` - Enable ProcessedMap WKT message Deduplication |
| `ENABLE_ODE_MAP_DEDUPLICATION` | `true` / `false` - Enable ODE MAP message Deduplication |
| `ENABLE_ODE_TIM_DEDUPLICATION` | `true` / `false` - Enable ODE TIM message Deduplication |
| `ENABLE_ODE_RAW_ENCODED_TIM_DEDUPLICATION` | `true` / `false` - Enable ODE Raw Encoded TIM Deduplication |
| `ENABLE_PROCESSED_SPAT_DEDUPLICATION` | `true` / `false` - Enable ProcessedSpat Deduplication |
| `ENABLE_ODE_BSM_DEDUPLICATION` | `true` / `false` - Enable ODE BSM Deduplication |
| `ENABLE_PROCESSED_BSM_DEDUPLICATION` | `true` / `false` - Enable Processed BSM Deduplication |
5 changes: 2 additions & 3 deletions docker-compose.yml
@@ -16,7 +16,7 @@ services:
privileged: false # Set true to allow writing to /proc/sys/vm/drop_caches
restart: ${RESTART_POLICY}
ports:
- "10091:10090" # JMX
- "10091:10090" # JMX
environment:
DOCKER_HOST_IP: ${DOCKER_HOST_IP}
KAFKA_BOOTSTRAP_SERVERS: ${KAFKA_BOOTSTRAP_SERVERS}
@@ -25,7 +25,6 @@ services:
ENABLE_PROCESSED_MAP_WKT_DEDUPLICATION: ${ENABLE_PROCESSED_MAP_WKT_DEDUPLICATION}
ENABLE_ODE_MAP_DEDUPLICATION: ${ENABLE_ODE_MAP_DEDUPLICATION}
ENABLE_ODE_TIM_DEDUPLICATION: ${ENABLE_ODE_TIM_DEDUPLICATION}
ENABLE_ODE_RAW_ENCODED_TIM_DEDUPLICATION: ${ENABLE_ODE_RAW_ENCODED_TIM_DEDUPLICATION}
ENABLE_PROCESSED_SPAT_DEDUPLICATION: ${ENABLE_PROCESSED_SPAT_DEDUPLICATION}
ENABLE_ODE_BSM_DEDUPLICATION: ${ENABLE_ODE_BSM_DEDUPLICATION}
ENABLE_PROCESSED_BSM_DEDUPLICATION: ${ENABLE_PROCESSED_BSM_DEDUPLICATION}
@@ -56,4 +55,4 @@ services:
depends_on:
kafka:
condition: service_healthy
required: false
required: false
6 changes: 3 additions & 3 deletions docs/Release_notes.md
@@ -15,9 +15,9 @@ The first release of the jpo-deduplicator package. This package focuses on remov
- ProcessedMapWKT
- ProcessedSpat
- OdeMapJson
- OdeBSMJson
- OdeTIMJson
- OdeRawEncodedTIm
- OdeBsmJson
- OdeTimJson
- OdeRawEncodedTim

This release also makes additional changes to the submodule components as follows

122 changes: 65 additions & 57 deletions jpo-deduplicator/pom.xml
@@ -7,7 +7,7 @@
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.1.3</version>
<relativePath /> <!-- lookup parent from repository -->
<relativePath />
</parent>
<groupId>usdot.jpo.ode</groupId>
<artifactId>jpo-deduplicator</artifactId>
@@ -32,6 +32,8 @@
<sonar.dynamicAnalysis>reuseReports</sonar.dynamicAnalysis>
<sonar.coverage.jacoco.xmReportPaths>${project.basedir}/target/site/jacoco/jacoco.xml</sonar.coverage.jacoco.xmReportPaths>
<sonar.language>java</sonar.language>
<jpo.ode.version>5.0.2</jpo.ode.version>
<jpo.geojsonconverter.version>3.0.2</jpo.geojsonconverter.version>
</properties>
<dependencies>
<!-- Spring Boot Dependencies -->
@@ -56,10 +58,6 @@
</exclusion>
</exclusions>
</dependency>
<!-- <dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency> -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
@@ -101,17 +99,27 @@
<dependency>
<groupId>usdot.jpo.ode</groupId>
<artifactId>jpo-ode-core</artifactId>
<version>4.1.1</version>
<version>${jpo.ode.version}</version>
</dependency>
<dependency>
<groupId>usdot.jpo.ode</groupId>
<artifactId>jpo-ode-plugins</artifactId>
<version>4.1.1</version>
<version>${jpo.ode.version}</version>
</dependency>
<dependency>
<groupId>usdot.jpo.ode</groupId>
<artifactId>jpo-geojsonconverter</artifactId>
<version>2.1.1</version>
<version>${jpo.geojsonconverter.version}</version>
<exclusions>
<exclusion>
<groupId>usdot.jpo.ode</groupId>
<artifactId>jpo-ode-core</artifactId>
</exclusion>
<exclusion>
<groupId>usdot.jpo.ode</groupId>
<artifactId>jpo-ode-plugins</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>junit</groupId>
@@ -146,19 +154,19 @@
<version>33.0.0-jre</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-math3 -->
<!-- <dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
<version>3.6.1</version>
</dependency> -->

<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.30</version>
</dependency>

<dependency>
<groupId>net.javacrumbs.json-unit</groupId>
<artifactId>json-unit</artifactId>
<version>4.1.0</version>
<scope>test</scope>
</dependency>


</dependencies>
<repositories>
@@ -290,45 +298,45 @@
</plugins>
</build>
<profiles>
<profile>
<id>package-jar</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.2.0</version>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<classpathPrefix>lib/</classpathPrefix>
<mainClass>
us.dot.its.jpo.deduplicator.DeduplicatorApplication</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<executions>
<execution>
<id>repackage</id>
<phase>none</phase>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>
<!-- GitHub Artifact publishing configuration -->
<distributionManagement>
<repository>
<id>github</id>
<name>GitHub Packages</name>
<url>https://maven.pkg.github.com/${github_organization}/jpo-deduplicator</url>
</repository>
</distributionManagement>
</project>
<profile>
<id>package-jar</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.2.0</version>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<classpathPrefix>lib/</classpathPrefix>
<mainClass>
us.dot.its.jpo.deduplicator.DeduplicatorApplication</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<executions>
<execution>
<id>repackage</id>
<phase>none</phase>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>
<!-- GitHub Artifact publishing configuration -->
<distributionManagement>
<repository>
<id>github</id>
<name>GitHub Packages</name>
<url>https://maven.pkg.github.com/${github_organization}/jpo-deduplicator</url>
</repository>
</distributionManagement>
</project>