Centralize runtime country code normalization

trifonovt 2026-06-18 11:43:07 +02:00
parent 4caadf1270
commit 9ca8c88f91
15 changed files with 1007 additions and 123 deletions


@ -0,0 +1,56 @@
# Country-code normalization patch
## Problem found in `response 202606171240 session D c home.json`
The response mixed three country identifier systems:
- tachograph numeric nation codes, for example `1`, `12`, `13`;
- tachograph alphabetic nation codes, for example `A`, `CZ`, `D`;
- ISO 3166-1 alpha-2 codes returned by Nominatim, for example `AT`, `CZ`, `DE`.
The projection contained 94 numeric `country` values, 33 numeric `countryFrom`
values, and 33 numeric `countryTo` values. The flat and nested trip segments also
contained numeric country identifiers.
This was not only a presentation problem. A tachograph value such as `13` and a
Nominatim value such as `DE` were considered different. The supplied response
contained 15 false reverse-geocoded country changes of the form `13 -> DE` or
`12 -> CZ`, creating unnecessary country segments.
## Canonical representation
All runtime country fields are now exposed and compared as ISO 3166-1 alpha-2.
Examples:
| Tachograph numeric | Tachograph alphabetic | Canonical ISO alpha-2 |
|---:|---|---|
| `1` | `A` | `AT` |
| `12` | `CZ` | `CZ` |
| `13` | `D` | `DE` |
Numeric and alphabetic tachograph values are resolved through the existing
`TachographNationRegistry` and then converted to ISO alpha-2. Nominatim results
are validated and normalized as ISO values.
## Processing changes
Normalization is applied before country comparison and at the main presentation
boundaries:
- normalized EventHub support evidence;
- source-neutral runtime support events;
- tachograph file-session support-event adapters;
- legacy tachograph Esper result conversion;
- `projection.supportGeoEvents`;
- flat country-trip segments;
- country segments nested under HOME-to-HOME trips;
- Nominatim country-code parsing.
The original source payload remains available in raw event attributes where the
numeric tachograph values are needed for diagnostics.
## Expected result
An explicit border crossing `1 -> 13` is represented as `AT -> DE`. A subsequent
Nominatim result `DE` no longer creates a false `13 -> DE` transition. The trip
continues in the same canonical `DE` country state until a genuine country change.
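
The expected result can be illustrated with a minimal sketch. It assumes two tiny hypothetical lookup tables (three entries each, taken from the table above) in place of the real `TachographNationRegistry`, whose API is not reproduced here. The class and method names are illustrative and are not part of this commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; not the project's CountryCodeNormalizer. */
final class CountryTransitionSketch {

    // Hypothetical excerpt: tachograph numeric -> tachograph alphabetic.
    private static final Map<String, String> NUMERIC_TO_ALPHA =
            Map.of("1", "A", "12", "CZ", "13", "D");

    // Hypothetical excerpt: tachograph alphabetic -> ISO 3166-1 alpha-2.
    private static final Map<String, String> ALPHA_TO_ISO2 =
            Map.of("A", "AT", "CZ", "CZ", "D", "DE");

    /** Maps a numeric, tachograph-alphabetic, or ISO value to ISO alpha-2. */
    static String toIso2(String raw) {
        if (raw == null || raw.isBlank()) {
            return null;
        }
        String value = raw.trim().toUpperCase(Locale.ROOT);
        String alpha = NUMERIC_TO_ALPHA.getOrDefault(value, value);
        // Assumption: values without a tachograph mapping are already ISO.
        return ALPHA_TO_ISO2.getOrDefault(alpha, alpha);
    }

    /** Returns "FROM -> TO" for every change between consecutive canonical codes. */
    static List<String> transitions(List<String> rawCodes) {
        List<String> result = new ArrayList<>();
        String current = null;
        for (String raw : rawCodes) {
            String iso = toIso2(raw);
            if (iso == null) {
                continue; // unknown values do not change the country state
            }
            if (current != null && !current.equals(iso)) {
                result.add(current + " -> " + iso);
            }
            current = iso;
        }
        return result;
    }
}
```

Feeding the sequence `1`, `13`, `DE`, `DE` into `transitions` yields the single transition `AT -> DE`, because `13` and `DE` are compared only after both have been mapped to the same canonical value.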


@ -0,0 +1,558 @@
# 1. Terminology and source data
The algorithm uses data from:
- `M_`: vehicle-unit tachograph data, especially GNSS positions.
- `C_`: driver-card data, especially activities and card insertion/removal events.
The main objects are:
## DI — Driving Interval
A continuous interval in which the driver is driving.
It contains:
- driver
- vehicle
- start and end time
- start and end positions
- optional GNSS trace points
## NDI — Non-Driving Interval
The gap between two consecutive driving intervals.
It contains:
- driver
- vehicle before the interval
- vehicle after the interval
- start and end time
- inferred position
- card-removal interval
- location cluster
- classification: `HOME` or `NOT_HOME`
In this specification, `HOME` does not necessarily mean the driver's private residence. It means that the interruption is treated as a home/base interruption. It can also represent:
- a company depot,
- a vehicle change,
- a long card-removal period,
- or simply a rest longer than 24 hours.
# 2. How an NDI is built
Driving intervals are grouped by driver and ordered chronologically.
For every pair of consecutive driving intervals:
```text
previous driving interval
non-driving interval
next driving interval
```
The NDI is created as follows:
```text
NDI.start = previous DI.end
NDI.end = next DI.start
```
The vehicles are:
```text
NDI.vehicleStart = previous DI.vehicleId
NDI.vehicleEnd = next DI.vehicleId
```
The position is selected using:
```text
previous DI end position
otherwise
next DI start position
otherwise
no position
```
So the exact position rule is:
```text
NDI.pos = previous.posEnd ?? next.posStart
```
The driver-card events between the two driving intervals are used to find the card-removal interval.
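
The construction rules above can be sketched as follows. This is a minimal illustration, not the reviewed implementation: the `Di` and `Ndi` records are hypothetical, positions are simplified to nullable strings, and the card-removal interval is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class NdiBuilderSketch {

    /** Hypothetical driving interval; positions are null when unknown. */
    record Di(String vehicleId, Instant start, Instant end, String posStart, String posEnd) {}

    /** Hypothetical non-driving interval between two consecutive DIs. */
    record Ndi(String vehicleStart, String vehicleEnd, Instant start, Instant end, String pos) {
        Duration duration() {
            return Duration.between(start, end);
        }
    }

    /** Expects the DIs of one driver, already sorted chronologically. */
    static List<Ndi> build(List<Di> sortedDis) {
        List<Ndi> ndis = new ArrayList<>();
        for (int i = 1; i < sortedDis.size(); i++) {
            Di previous = sortedDis.get(i - 1);
            Di next = sortedDis.get(i);
            // NDI.pos = previous.posEnd ?? next.posStart
            String pos = previous.posEnd() != null ? previous.posEnd() : next.posStart();
            ndis.add(new Ndi(previous.vehicleId(), next.vehicleId(),
                    previous.end(), next.start(), pos));
        }
        return ndis;
    }
}
```

Because the loop starts at the second interval, a driver with a single DI produces no NDI at all.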
## Important limitations
The algorithm creates NDIs only between consecutive driving intervals. It does not create:
- an NDI before the first driving interval,
- an NDI after the final driving interval,
- or an NDI when only card/activity data exists without surrounding driving intervals.
# 3. How home locations are learned
Only NDIs satisfying both conditions are used for location learning:
```text
duration > 7.5 hours
position is known
```
Exactly 7.5 hours does not qualify because the comparison is strictly `>`.
## 3.1 Location clustering
The positions are clustered globally with DBSCAN:
| Parameter | Value |
|---|---:|
| Maximum cluster distance | 150 metres |
| Minimum points | 3 |
| Distance calculation | Haversine/PostGIS |
| Unclustered points | `NOISE` |
All drivers' qualifying NDIs appear to be clustered together in one global clustering operation.
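
The distance test behind the 150-metre parameter can be sketched with the Haversine formula. The mean Earth radius of 6,371,000 m and the inclusive `<=` comparison are assumptions of this sketch; the reviewed code may delegate the calculation to PostGIS instead.

```java
final class HaversineSketch {

    // Assumption: mean Earth radius in metres.
    static final double EARTH_RADIUS_M = 6_371_000.0;

    // Maximum cluster distance from the parameter table.
    static final double EPS_M = 150.0;

    /** Great-circle distance in metres between two WGS84 coordinates. */
    static double distanceMetres(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** True when two NDI positions are close enough to be DBSCAN neighbours. */
    static boolean areNeighbours(double lat1, double lon1, double lat2, double lon2) {
        return distanceMetres(lat1, lon1, lat2, lon2) <= EPS_M;
    }
}
```

With a minimum of 3 points, an isolated position with fewer than two other qualifying positions within 150 metres cannot seed a cluster on its own; it is `NOISE` unless it lies within range of a denser group. This assumes the common DBSCAN convention in which a point counts toward its own neighbourhood.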
## 3.2 Company-home locations
A cluster becomes a company-home location, normally interpreted as a depot, when:
```text
visits to cluster / all long positioned NDIs > 25%
```
The denominator is all NDIs in the complete dataset that:
- are longer than 7.5 hours, and
- have a position.
Noise points are excluded as possible company-home clusters.
## 3.3 Driver-home locations
For each driver, a cluster becomes a private driver-home location when:
```text
driver's visits to cluster /
driver's long positioned NDIs > 25%
```
Additionally:
```text
cluster must not already be a company-home cluster
```
Therefore, a location used frequently by the whole company is classified as a company depot rather than a private driver home.
## Exact threshold behaviour
The comparisons are strict:
- exactly 25% is not enough;
- the share must be greater than 25%.
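
The two share rules and their strict thresholds can be sketched as follows. The `Visit` record and the method names are hypothetical; each visit stands for one NDI that is longer than 7.5 hours and has a position, with a `null` cluster id representing `NOISE`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class HomeClusterSketch {

    /** One long, positioned NDI; clusterId is null for NOISE. */
    record Visit(String driverId, Integer clusterId) {}

    /** Clusters visited in strictly more than 25% of all qualifying NDIs. */
    static Set<Integer> companyHomes(List<Visit> allVisits) {
        return clustersAboveQuarter(allVisits);
    }

    /** Clusters above 25% of one driver's qualifying NDIs, minus company homes. */
    static Set<Integer> driverHomes(List<Visit> allVisits, String driverId) {
        List<Visit> own = allVisits.stream()
                .filter(v -> v.driverId().equals(driverId))
                .toList();
        Set<Integer> homes = clustersAboveQuarter(own);
        homes.removeAll(companyHomes(allVisits));
        return homes;
    }

    private static Set<Integer> clustersAboveQuarter(List<Visit> visits) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Visit visit : visits) {
            if (visit.clusterId() != null) { // NOISE never becomes a home cluster
                counts.merge(visit.clusterId(), 1, Integer::sum);
            }
        }
        Set<Integer> result = new HashSet<>();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            // count / total > 25%, written with integers to avoid rounding.
            if (entry.getValue() * 4 > visits.size()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Noise points stay in the denominator because they are still long, positioned NDIs; they are only excluded as candidate clusters.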
# 4. Rules for determining whether an NDI is HOME or NOT_HOME
The rules are evaluated in a fixed priority order. The first matching rule wins.
## Decision table
| Priority | Condition | Result |
|---:|---|---|
| A | Vehicle before NDI differs from vehicle after NDI | `HOME` |
| B | Card is removed for more than 80% of the NDI | `HOME` |
| C | NDI duration is more than 24 hours | `HOME` |
| D1 | Position unknown and duration more than 7.5 hours | `HOME` |
| D2 | Position unknown and duration no more than 7.5 hours | `NOT_HOME` |
| E1 | Position known, duration more than 7.5 hours, and position belongs to company-home or driver-home cluster | `HOME` |
| E2 | Position known, duration more than 7.5 hours, but position is not a recognised home cluster | `NOT_HOME` |
| F | All remaining short NDIs | `NOT_HOME` |
## Rule A: Change of vehicle
```text
vehicleStart != vehicleEnd → HOME
```
When the next driving interval starts in another vehicle, the NDI is always treated as `HOME`.
This rule has the highest priority. It applies even when:
- the NDI is short,
- the position is known to be away from home,
- or the driver card remained inserted for much of the interval.
This is therefore a technical trip-separation rule rather than proof that the driver was physically at home.
## Rule B: Card removed for more than 80%
```text
cardOut duration > 80% of NDI duration → HOME
```
Exactly 80% is not sufficient.
Example:
```text
NDI duration: 10 hours
Card removed: 8 hours
Result: NOT enough for Rule B
Card removed: 8 hours 1 minute
Result: HOME
```
The data model contains only one `cardOut` interval. It is not defined how several card-removal periods inside one NDI should be combined.
## Rule C: NDI longer than 24 hours
```text
NDI duration > 24 hours → HOME
```
Exactly 24 hours is not sufficient.
This rule overrides the position logic. Even when the driver remains at an unrecognised remote location, an NDI longer than 24 hours is classified as `HOME`.
## Rule D: No known position
When the position cannot be determined:
```text
duration > 7.5 hours → HOME
duration <= 7.5 hours → NOT_HOME
```
This is a fallback assumption. A long rest without location evidence is assumed to be home.
## Rule E: Long NDI with a known position
For an NDI longer than 7.5 hours:
```text
position in company-home cluster → HOME
position in driver's home cluster → HOME
otherwise → NOT_HOME
```
The specification describes the final case as an overnight stay in the vehicle, although the data itself only establishes that the rest occurred away from a recognised home location.
## Rule F: Short NDI
A short NDI defaults to:
```text
NOT_HOME
```
However, a short NDI can still be classified as `HOME` by the earlier rules:
- vehicle changed, or
- card removed for more than 80% of the interval.
# 5. Compact HOME/NOT_HOME decision flow
```text
Did the vehicle change?
├─ Yes → HOME
└─ No
   ├─ Was the card removed for more than 80% of the NDI?
   │  ├─ Yes → HOME
   │  └─ No
   ├─ Is the NDI longer than 24 hours?
   │  ├─ Yes → HOME
   │  └─ No
   ├─ Is the position missing?
   │  ├─ Yes and duration > 7.5 h → HOME
   │  ├─ Yes and duration <= 7.5 h → NOT_HOME
   │  └─ No
   └─ Is the NDI longer than 7.5 hours?
      ├─ No → NOT_HOME
      └─ Yes
         ├─ Company-home cluster → HOME
         ├─ Driver-home cluster → HOME
         └─ Other location/noise → NOT_HOME
```
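
The priority order A to F can be sketched as one function in which the first matching rule returns. The flattened `Ndi` record is hypothetical: `cardOut` is the single card-removal duration (assumed to be `Duration.ZERO` when the card was never removed), and `inHomeCluster` covers both company-home and driver-home clusters.

```java
import java.time.Duration;
import java.util.Objects;

final class NdiClassifierSketch {

    enum Status { HOME, NOT_HOME }

    static final Duration LONG_REST = Duration.ofMinutes(450); // 7.5 hours
    static final Duration RESET_REST = Duration.ofHours(24);

    /** Hypothetical flattened view of one NDI. */
    record Ndi(String vehicleStart, String vehicleEnd, Duration duration,
               Duration cardOut, boolean positionKnown, boolean inHomeCluster) {}

    static Status classify(Ndi ndi) {
        // Rule A: a vehicle change always wins.
        if (!Objects.equals(ndi.vehicleStart(), ndi.vehicleEnd())) {
            return Status.HOME;
        }
        // Rule B: cardOut > 80% of the duration, strict, in integer arithmetic.
        if (ndi.cardOut().toMillis() * 5 > ndi.duration().toMillis() * 4) {
            return Status.HOME;
        }
        // Rule C: strictly longer than 24 hours.
        if (ndi.duration().compareTo(RESET_REST) > 0) {
            return Status.HOME;
        }
        boolean isLong = ndi.duration().compareTo(LONG_REST) > 0;
        // Rule D: without a position, only the duration decides.
        if (!ndi.positionKnown()) {
            return isLong ? Status.HOME : Status.NOT_HOME;
        }
        // Rules E and F: long rest in a recognised home cluster, else NOT_HOME.
        return isLong && ndi.inHomeCluster() ? Status.HOME : Status.NOT_HOME;
    }
}
```

For a 10-hour NDI with the card removed for exactly 8 hours, Rule B does not fire and the result depends on Rules C to E, which matches the worked example under Rule B.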
# 6. How trip segments are currently determined
The code does not create a trip object. It creates `TripSegment` records based on country changes.
Each segment contains:
- driver
- vehicle
- start and end time
- country before and after the boundary
- start and end positions
The algorithm works as follows:
1. Take all driving intervals belonging to one driver.
2. Sort them chronologically.
3. Start at the beginning of the first driving interval.
4. Determine the country of the first position.
5. Scan all GNSS trace points in all driving intervals.
6. Reverse-geocode every trace point to a country.
7. When the country changes:
   - close the current segment at that trace-point timestamp;
   - store the old and new country;
   - begin a new segment at the same trace point.
8. After all trace points, close the final segment at the end of the final driving interval.
Example:
```text
08:00 Driving starts in Austria
10:15 GNSS position changes from Austria to Germany
14:00 GNSS position changes from Germany to France
18:00 Final driving interval ends
```
Segments produced:
```text
Segment 1: 08:00–10:15, Austria → Germany
Segment 2: 10:15–14:00, Germany → France
Segment 3: 14:00–18:00, France → France
```
The final segment has the same `countryFrom` and `countryTo` because no further border was crossed.
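
The scan described above can be sketched as follows. The `TracePoint` and `Segment` records are hypothetical, the trace points are assumed to be reverse-geocoded already, and skipping points without a country is an assumption of this sketch rather than documented behaviour.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class CountrySegmentSketch {

    /** Hypothetical trace point that already carries its resolved country. */
    record TracePoint(Instant time, String country) {}

    record Segment(Instant start, Instant end, String countryFrom, String countryTo) {}

    /**
     * tracePoints: all GNSS points of one driver in chronological order.
     * firstStart and lastEnd: start of the first and end of the last DI.
     */
    static List<Segment> build(Instant firstStart, Instant lastEnd,
                               String firstCountry, List<TracePoint> tracePoints) {
        List<Segment> segments = new ArrayList<>();
        Instant segmentStart = firstStart;
        String current = firstCountry;
        for (TracePoint point : tracePoints) {
            // Assumption: points without a resolved country are ignored.
            if (point.country() != null && !point.country().equals(current)) {
                segments.add(new Segment(segmentStart, point.time(), current, point.country()));
                segmentStart = point.time();
                current = point.country();
            }
        }
        // No further crossing: the final segment keeps the same country on both sides.
        segments.add(new Segment(segmentStart, lastEnd, current, current));
        return segments;
    }
}
```

Applied to the example timeline, the sketch returns three segments, the last of which runs from 14:00 to 18:00 with France on both sides.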
# 7. What currently determines a “trip”
Strictly speaking, the specification does not determine individual trips.
For each driver, the segment-building function starts at:
```text
first driving interval start
```
and continues until:
```text
last driving interval end
```
It splits this complete period only at country changes.
This means that if the input covers an entire month, the algorithm may effectively process the whole month as one continuous sequence of country segments—even when the driver returned home several times.
The `HOME` and `NOT_HOME` classifications are not passed into `buildTripSegments()`. In fact, trip segments are built before the NDIs are classified:
```text
build NDIs
build trip segments
cluster NDIs
determine home locations
classify NDIs
```
Consequently:
```text
HOME NDI does not end a trip
NOT_HOME NDI does not explicitly continue a trip
```
The two parts of the algorithm are currently disconnected.
# 8. Likely intended trip definition
Based on the purpose of the HOME/NOT_HOME classification, the intended definition is most likely:
> A trip is a maximal chronological sequence of driving intervals separated only by `NOT_HOME` NDIs. A `HOME` NDI closes the current trip and separates it from the next trip.
That would produce the following rules:
## Trip start
A trip begins:
- at the first available driving interval, or
- at the first driving interval following a `HOME` NDI.
## Trip continuation
The same trip continues across an NDI when:
```text
NDI.status = NOT_HOME
```
This includes:
- short breaks,
- overnight rests away from recognised home locations,
- rest in the vehicle,
- long rests at remote locations up to 24 hours.
## Trip end
A trip ends at the end of the driving interval preceding an NDI when:
```text
NDI.status = HOME
```
The next driving interval begins a new trip.
## Country segmentation inside a trip
After trips are established, each trip is divided into country segments at:
- an explicit tachograph border-crossing event, or
- a reliable country change inferred from GNSS positions.
The logical hierarchy should therefore be:
```text
Driver timeline
└─ Trip
   ├─ Country segment 1
   ├─ Country segment 2
   └─ Country segment 3
```
Not:
```text
Driver timeline
└─ Country segments without trip boundaries
```
# 9. Recommended trip-building algorithm
A consistent implementation would be:
```text
1. Build and sort all driving intervals per driver.
2. Build the NDI between every two consecutive driving intervals.
3. Determine location clusters.
4. Classify every NDI as HOME or NOT_HOME.
5. Build trips:
   - start with the first DI;
   - append NOT_HOME NDI and the following DI to the current trip;
   - when a HOME NDI occurs, close the current trip;
   - start a new trip with the next DI.
6. Split every resulting trip at country-border crossings.
```
Pseudocode:
```text
currentTrip = new Trip(firstDI)
for every NDI between prevDI and nextDI:
    if NDI.status == NOT_HOME:
        currentTrip.add(NDI)
        currentTrip.add(nextDI)
    else:
        currentTrip.end = prevDI.end
        save(currentTrip)
        currentTrip = new Trip(nextDI)
save(currentTrip)
```
Then:
```text
for every trip:
    trip.segments = splitAtBorderCrossings(trip)
```
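
The trip-grouping step of the pseudocode can be sketched in Java as follows. To stay self-contained, the sketch works on indices only: driving intervals are numbered from 0, and `ndiStatuses.get(i)` is the classification of the NDI between DI `i` and DI `i + 1`. All names are hypothetical, and the border-crossing split is not included.

```java
import java.util.ArrayList;
import java.util.List;

final class TripBuilderSketch {

    enum Status { HOME, NOT_HOME }

    /** Returns each trip as the list of DI indices it contains. */
    static List<List<Integer>> buildTrips(int drivingIntervalCount, List<Status> ndiStatuses) {
        List<List<Integer>> trips = new ArrayList<>();
        if (drivingIntervalCount == 0) {
            return trips;
        }
        if (ndiStatuses.size() != drivingIntervalCount - 1) {
            throw new IllegalArgumentException("expected one NDI between every two DIs");
        }
        List<Integer> current = new ArrayList<>();
        current.add(0); // the first DI always opens the first trip
        for (int i = 0; i < ndiStatuses.size(); i++) {
            if (ndiStatuses.get(i) == Status.HOME) {
                // A HOME NDI closes the current trip; the next DI opens a new one.
                trips.add(current);
                current = new ArrayList<>();
            }
            current.add(i + 1);
        }
        trips.add(current); // save the trip that is still open at the end
        return trips;
    }
}
```

With four driving intervals and the statuses `NOT_HOME`, `HOME`, `NOT_HOME`, the result is two trips: DIs 0 and 1, then DIs 2 and 3.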
# 10. Issues and ambiguities in the current rules
## Explicit border-crossing events are mentioned but not used
The comment states that a border crossing can come from:
- an explicit Smart Tachograph v2 event, or
- a GNSS-derived country change.
However, the implementation scans only `gnssTrace`. There is no processing of explicit border-crossing events.
## Vehicle identity can be incorrect for a segment
A segment may span several driving intervals and possibly several vehicles. Nevertheless, the segment stores only one `vehicleId`:
- the vehicle active at the border crossing, or
- the vehicle of the final DI for the final segment.
If a vehicle changes without a country crossing, the segment can contain activity from multiple vehicles but retain only the last vehicle ID.
## HOME does not currently split segments or trips
A driver can:
1. drive,
2. return home,
3. remain home for two days,
4. begin a new journey,
and the current segment builder can still represent both journeys as one continuous segment if no country changes occur.
## Position selection may hide conflicting positions
The NDI position always prefers the previous DI's end position:
```text
previous.posEnd ?? next.posStart
```
When both positions exist but differ substantially, the inconsistency is ignored.
## Long unknown-location intervals are assumed HOME
An NDI longer than 7.5 hours without a position is automatically `HOME`. This can incorrectly classify an overnight stay abroad as home when GNSS data is missing.
## All rests longer than 24 hours are HOME
A driver can remain at a foreign parking place for more than 24 hours, but the rule still returns `HOME`. This may be intentional as a trip-reset rule, but it is not reliable as a physical-home determination.
## Global company-home calculation may be dominated by dataset composition
The company-home denominator includes all qualifying NDIs across all drivers. Results can depend on:
- the selected time period,
- drivers with many records,
- missing GNSS data,
- incomplete driver histories.
# Final interpretation
The document currently provides a valid algorithm for:
- constructing NDIs between driving intervals,
- learning frequently visited locations,
- classifying each NDI as `HOME` or `NOT_HOME`,
- and splitting driving history at detected country changes.
But it does **not yet provide a complete trip-building algorithm**.
The most consistent interpretation is:
```text
HOME NDI = boundary between two trips
NOT_HOME NDI = interruption inside the same trip
Border crossing = boundary between segments inside one trip
```
That relationship needs to be explicitly implemented because it is not present in the current `run()` or `buildTripSegments()` logic.


@ -3,6 +3,7 @@ package at.procon.eventhub.geocoding.service;
import at.procon.eventhub.config.EventHubProperties;
import at.procon.eventhub.geocoding.model.GeoCountryResolution;
import at.procon.eventhub.geocoding.model.GeoCountryResolutionStatus;
+import at.procon.eventhub.reference.CountryCodeNormalizer;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
@ -18,7 +19,6 @@ import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
-import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
@ -288,7 +288,6 @@ public class NominatimGeoCountryResolver implements GeoCountryResolver {
}
}
private HttpResponse<String> sendRequest(
URI uri,
EventHubProperties.Nominatim config,
@ -447,11 +446,7 @@ public class NominatimGeoCountryResolver implements GeoCountryResolver {
}
private String normalizeCountryCode(String value) {
-if (value == null || value.isBlank()) {
-return null;
-}
-String normalized = value.trim().toUpperCase(Locale.ROOT);
-return normalized.length() == 2 ? normalized : null;
+return CountryCodeNormalizer.normalizeIso(value);
}
private String text(JsonNode node, String field) {


@ -12,6 +12,7 @@ import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeSu
import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeVehicleUsageInterval;
import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeVuCardAbsentInterval;
import at.procon.eventhub.processing.eventprocessing.support.RuntimeSupportEvidenceEvent;
+import at.procon.eventhub.reference.CountryCodeNormalizer;
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
@ -306,10 +307,22 @@ public class DriverWorkingTimeProcessingCore {
event.lifecycle(),
event.registrationKey(),
event.vehicleKey(),
-event.countryCode(),
+CountryCodeNormalizer.normalizeSupportEvent(
+event.sourceFamily(),
+event.sourceKind(),
+event.countryCode()
+),
event.regionCode(),
-event.countryFrom(),
-event.countryTo(),
+CountryCodeNormalizer.normalizeSupportEvent(
+event.sourceFamily(),
+event.sourceKind(),
+event.countryFrom()
+),
+CountryCodeNormalizer.normalizeSupportEvent(
+event.sourceFamily(),
+event.sourceKind(),
+event.countryTo()
+),
event.operation(),
event.latitude(),
event.longitude(),


@ -1,105 +1,19 @@
package at.procon.eventhub.processing.driverworkingtime.tripsegmentation.service;
-import java.util.LinkedHashMap;
-import java.util.Locale;
-import java.util.Map;
+/**
+ * Package-local compatibility facade. Country-code normalization is shared by
+ * the complete runtime pipeline through the reference-layer implementation.
+ */
final class CountryCodeNormalizer {
-private static final Map<String, String> TACHOGRAPH_TO_ISO2 = buildTachographMap();
private CountryCodeNormalizer() {
}
static String normalizeTachograph(String value) {
-if (value == null || value.isBlank()) {
-return null;
-}
-String normalized = value.trim().toUpperCase(Locale.ROOT);
-String mapped = TACHOGRAPH_TO_ISO2.get(normalized);
-if (mapped != null) {
-return mapped;
-}
-return normalizeIso(normalized);
+return at.procon.eventhub.reference.CountryCodeNormalizer.normalizeTachograph(value);
}
static String normalizeIso(String value) {
-if (value == null || value.isBlank()) {
-return null;
-}
-String normalized = value.trim().toUpperCase(Locale.ROOT);
-if (normalized.length() == 2) {
-return normalized;
-}
-for (String iso2 : Locale.getISOCountries()) {
-Locale locale = Locale.of("", iso2);
-try {
-if (locale.getISO3Country().equalsIgnoreCase(normalized)) {
-return iso2;
-}
-} catch (java.util.MissingResourceException ignored) {
-}
-}
-return null;
-}
-private static Map<String, String> buildTachographMap() {
-Map<String, String> values = new LinkedHashMap<>();
-values.put("A", "AT");
-values.put("AL", "AL");
-values.put("AND", "AD");
-values.put("ARM", "AM");
-values.put("AZ", "AZ");
-values.put("B", "BE");
-values.put("BG", "BG");
-values.put("BIH", "BA");
-values.put("BY", "BY");
-values.put("CH", "CH");
-values.put("CY", "CY");
-values.put("CZ", "CZ");
-values.put("D", "DE");
-values.put("DK", "DK");
-values.put("E", "ES");
-values.put("EST", "EE");
-values.put("F", "FR");
-values.put("FIN", "FI");
-values.put("FL", "LI");
-values.put("FR", "FO");
-values.put("UK", "GB");
-values.put("GE", "GE");
-values.put("GR", "GR");
-values.put("H", "HU");
-values.put("HR", "HR");
-values.put("I", "IT");
-values.put("IRL", "IE");
-values.put("IS", "IS");
-values.put("KZ", "KZ");
-values.put("L", "LU");
-values.put("LT", "LT");
-values.put("LV", "LV");
-values.put("M", "MT");
-values.put("MC", "MC");
-values.put("MD", "MD");
-values.put("MK", "MK");
-values.put("N", "NO");
-values.put("NL", "NL");
-values.put("P", "PT");
-values.put("PL", "PL");
-values.put("RO", "RO");
-values.put("RSM", "SM");
-values.put("RUS", "RU");
-values.put("S", "SE");
-values.put("SK", "SK");
-values.put("SLO", "SI");
-values.put("TM", "TM");
-values.put("TR", "TR");
-values.put("UA", "UA");
-values.put("V", "VA");
-values.put("YU", "RS");
-values.put("MNE", "ME");
-values.put("SRB", "RS");
-values.put("UZ", "UZ");
-values.put("TJ", "TJ");
-return Map.copyOf(values);
+return at.procon.eventhub.reference.CountryCodeNormalizer.normalizeIso(value);
}
}


@ -9,6 +9,7 @@ import at.procon.eventhub.dto.EventType;
import at.procon.eventhub.dto.GeoPointDto;
import at.procon.eventhub.dto.VehicleRefDto;
import at.procon.eventhub.processing.support.RuntimeEntityReferenceResolver;
+import at.procon.eventhub.reference.CountryCodeNormalizer;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
@ -87,10 +88,12 @@ public class RuntimeSupportEvidenceNormalizer {
BigDecimal latitude = position == null ? decimal(raw, "latitude") : position.latitude();
BigDecimal longitude = position == null ? decimal(raw, "longitude") : position.longitude();
Long odometerKm = firstNonNull(longValue(raw, "odometerKm"), toKilometers(event.odometerM()));
+String sourceFamily = sourceFamily(event);
+String sourceKind = firstNonBlank(text(raw, "sourceKind"), sourceKind(event));
return new RuntimeSupportEvidenceEvent(
firstNonBlank(text(raw, "supportEventId"), text(raw, "sourceRowId"), event.externalSourceEventId()),
-sourceFamily(event),
-firstNonBlank(text(raw, "sourceKind"), sourceKind(event)),
+sourceFamily,
+sourceKind,
event.eventDomain() == null ? null : event.eventDomain().name(),
event.eventType() == null ? null : event.eventType().name(),
event.lifecycle() == null ? null : event.lifecycle().name(),
@ -101,10 +104,22 @@ public class RuntimeSupportEvidenceNormalizer {
event.occurredAt() == null ? null : event.occurredAt().toEpochSecond(),
latitude,
longitude,
-firstNonBlank(text(raw, "country"), detailText(event, "country")),
+CountryCodeNormalizer.normalizeSupportEvent(
+sourceFamily,
+sourceKind,
+firstNonBlank(text(raw, "country"), detailText(event, "country"))
+),
firstNonBlank(text(raw, "region"), detailText(event, "region")),
-firstNonBlank(text(raw, "countryFrom"), detailText(event, "countryFrom")),
-firstNonBlank(text(raw, "countryTo"), detailText(event, "countryTo")),
+CountryCodeNormalizer.normalizeSupportEvent(
+sourceFamily,
+sourceKind,
+firstNonBlank(text(raw, "countryFrom"), detailText(event, "countryFrom"))
+),
+CountryCodeNormalizer.normalizeSupportEvent(
+sourceFamily,
+sourceKind,
+firstNonBlank(text(raw, "countryTo"), detailText(event, "countryTo"))
+),
firstNonBlank(text(raw, "operation"), detailText(event, "operation")),
odometerKm,
decimal(raw, "avgSpeedKmh"),


@ -6,6 +6,7 @@ import at.procon.eventhub.processing.eventprocessing.support.RuntimeSupportEvide
import at.procon.eventhub.processing.model.RuntimeActivityInterval;
import at.procon.eventhub.processing.model.RuntimeSupportEvent;
import at.procon.eventhub.processing.model.RuntimeVehicleUsageInterval;
+import at.procon.eventhub.reference.CountryCodeNormalizer;
import java.util.List;
import java.util.UUID;
@ -89,10 +90,10 @@ public final class RuntimeDriverWorkingTimeAdapter {
supportEvent.occurredAt().toEpochSecond(),
supportEvent.latitude(),
supportEvent.longitude(),
-supportEvent.country(),
+CountryCodeNormalizer.normalizeSupportEvent(null, null, supportEvent.country()),
supportEvent.region(),
-supportEvent.countryFrom(),
-supportEvent.countryTo(),
+CountryCodeNormalizer.normalizeSupportEvent(null, null, supportEvent.countryFrom()),
+CountryCodeNormalizer.normalizeSupportEvent(null, null, supportEvent.countryTo()),
supportEvent.operation(),
supportEvent.odometerKm(),
supportEvent.avgSpeedKmh(),


@ -0,0 +1,210 @@
package at.procon.eventhub.reference;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;
/**
* Converts country identifiers used by tachograph sources and reverse-geocoding
* providers to one canonical representation: ISO 3166-1 alpha-2.
*
* <p>Tachograph nation identifiers are not ISO alpha-2 identifiers. They may be
* numeric (for example {@code 1}, {@code 12}, {@code 13}) or use tachograph
* alphabetic values such as {@code A}, {@code D}, {@code UK}, and {@code SLO}.
* Those values must be resolved through {@link TachographNationRegistry} before
* they are compared with Nominatim values such as {@code AT}, {@code CZ}, and
* {@code DE}.</p>
*/
public final class CountryCodeNormalizer {
private static final Map<String, String> TACHOGRAPH_ALPHA_TO_ISO2 = buildTachographAlphaMap();
private CountryCodeNormalizer() {
}
/**
* Normalizes a tachograph numeric or alphabetic country identifier to ISO alpha-2.
*/
public static String normalizeTachograph(String value) {
String normalized = normalizeInput(value);
if (normalized == null) {
return null;
}
TachographNationRegistry.NationResolution resolution =
TachographNationRegistry.resolve(normalized, null);
if (resolution.known()) {
String mapped = TACHOGRAPH_ALPHA_TO_ISO2.get(normalizeInput(resolution.legacyNation()));
if (mapped != null) {
return mapped;
}
}
String mapped = TACHOGRAPH_ALPHA_TO_ISO2.get(normalized);
if (mapped != null) {
return mapped;
}
// Some sources already expose ISO values even though they are part of a
// tachograph pipeline. Preserve those values when no tachograph mapping exists.
return normalizeIso(normalized);
}
/**
* Normalizes an ISO alpha-2 or alpha-3 country identifier to ISO alpha-2.
*/
public static String normalizeIso(String value) {
String normalized = normalizeInput(value);
if (normalized == null) {
return null;
}
if (isAlphabeticAlpha2(normalized)) {
return normalized;
}
for (String iso2 : Locale.getISOCountries()) {
Locale locale = Locale.of("", iso2);
try {
if (locale.getISO3Country().equalsIgnoreCase(normalized)) {
return iso2;
}
} catch (MissingResourceException ignored) {
// Ignore incomplete locale entries and continue with the remaining countries.
}
}
return null;
}
/**
* Normalizes a support-event country value while respecting its source type.
* Numeric values are always tachograph nation codes. Explicit tachograph source
* metadata also selects tachograph semantics. Other providers are interpreted as
* ISO first to avoid ambiguities such as {@code FR}, which is an ISO code for
* France but a legacy tachograph code for the Faroe Islands.
*/
public static String normalizeSupportEvent(
String sourceFamily,
String sourceKind,
String value
) {
String normalized = normalizeInput(value);
if (normalized == null) {
return null;
}
if (isInteger(normalized) || isTachographSource(sourceFamily, sourceKind)) {
return normalizeTachograph(normalized);
}
String iso = normalizeIso(normalized);
if (iso != null) {
return iso;
}
return normalizeTachograph(normalized);
}
private static boolean isTachographSource(String sourceFamily, String sourceKind) {
String family = normalizeInput(sourceFamily);
String kind = normalizeInput(sourceKind);
return containsTachographMarker(family) || containsTachographMarker(kind);
}
private static boolean containsTachographMarker(String value) {
if (value == null) {
return false;
}
return value.contains("TACHOGRAPH")
|| value.equals("DRIVER_CARD")
|| value.equals("VEHICLE_UNIT")
|| value.equals("CARD")
|| value.equals("VU")
|| value.startsWith("CARD_")
|| value.startsWith("VU_");
}
private static boolean isInteger(String value) {
if (value == null || value.isEmpty()) {
return false;
}
for (int index = 0; index < value.length(); index++) {
if (!Character.isDigit(value.charAt(index))) {
return false;
}
}
return true;
}
private static boolean isAlphabeticAlpha2(String value) {
return value != null
&& value.length() == 2
&& Character.isLetter(value.charAt(0))
&& Character.isLetter(value.charAt(1));
}
private static String normalizeInput(String value) {
if (value == null) {
return null;
}
String normalized = value.trim().toUpperCase(Locale.ROOT);
return normalized.isEmpty() ? null : normalized;
}
private static Map<String, String> buildTachographAlphaMap() {
Map<String, String> values = new LinkedHashMap<>();
values.put("A", "AT");
values.put("AL", "AL");
values.put("AND", "AD");
values.put("ARM", "AM");
values.put("AZ", "AZ");
values.put("B", "BE");
values.put("BG", "BG");
values.put("BIH", "BA");
values.put("BY", "BY");
values.put("CH", "CH");
values.put("CY", "CY");
values.put("CZ", "CZ");
values.put("D", "DE");
values.put("DK", "DK");
values.put("E", "ES");
values.put("EST", "EE");
values.put("F", "FR");
values.put("FIN", "FI");
values.put("FL", "LI");
values.put("FR", "FO");
values.put("UK", "GB");
values.put("GE", "GE");
values.put("GR", "GR");
values.put("H", "HU");
values.put("HR", "HR");
values.put("I", "IT");
values.put("IRL", "IE");
values.put("IS", "IS");
values.put("KZ", "KZ");
values.put("L", "LU");
values.put("LT", "LT");
values.put("LV", "LV");
values.put("M", "MT");
values.put("MC", "MC");
values.put("MD", "MD");
values.put("MK", "MK");
values.put("N", "NO");
values.put("NL", "NL");
values.put("P", "PT");
values.put("PL", "PL");
values.put("RO", "RO");
values.put("RSM", "SM");
values.put("RUS", "RU");
values.put("S", "SE");
values.put("SK", "SK");
values.put("SLO", "SI");
values.put("TM", "TM");
values.put("TR", "TR");
values.put("UA", "UA");
values.put("V", "VA");
values.put("YU", "RS");
values.put("MNE", "ME");
values.put("SRB", "RS");
values.put("UZ", "UZ");
values.put("TJ", "TJ");
return Map.copyOf(values);
}
}


@ -9,6 +9,7 @@ import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeRe
import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeSupportGeoEvent;
import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeVehicleUsageInterval;
import at.procon.eventhub.processing.driverworkingtime.model.DriverWorkingTimeVuCardAbsentInterval;
+import at.procon.eventhub.reference.CountryCodeNormalizer;
import at.procon.eventhub.tachographfilesession.model.TachographEsperActivityIntervalEvent;
import at.procon.eventhub.tachographfilesession.model.TachographEsperDailyWeeklyRestCandidateCoverageIntervalEvent;
import at.procon.eventhub.tachographfilesession.model.TachographEsperDrivingInterruptionIntervalEvent;
@ -730,10 +731,10 @@ public record TachographEsperDriverProcessingResultDto(
value.eventLifecycle(),
value.registrationKey(),
value.vehicleKey(),
-value.country(),
+CountryCodeNormalizer.normalizeTachograph(value.country()),
value.region(),
-value.countryFrom(),
-value.countryTo(),
+CountryCodeNormalizer.normalizeTachograph(value.countryFrom()),
+CountryCodeNormalizer.normalizeTachograph(value.countryTo()),
value.operation(),
value.latitude(),
value.longitude(),


@ -22,6 +22,7 @@ import at.procon.eventhub.tachographfilesession.model.TachographEsperPotentialHo
import at.procon.eventhub.tachographfilesession.model.TachographEsperPotentialInVehicleOvernightStayIntervalEvent;
import at.procon.eventhub.tachographfilesession.model.TachographEsperPotentialInVehicleTripIntervalEvent;
import at.procon.eventhub.tachographfilesession.model.TachographEsperVuCardAbsentIntervalEvent;
import at.procon.eventhub.reference.CountryCodeNormalizer;
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Comparator;
@ -129,8 +130,8 @@ public final class TachographDriverWorkingTimeAdapter {
}
return new RuntimeSupportEvidenceEvent(
supportEvent.eventId(),
null,
null,
"TACHOGRAPH_FILE_SESSION",
"TACHOGRAPH",
supportEvent.eventDomain(),
supportEvent.eventType(),
supportEvent.eventLifecycle(),
@ -141,10 +142,10 @@ public final class TachographDriverWorkingTimeAdapter {
supportEvent.occurredAt().toEpochSecond(),
supportEvent.latitude(),
supportEvent.longitude(),
supportEvent.country(),
CountryCodeNormalizer.normalizeTachograph(supportEvent.country()),
supportEvent.region(),
supportEvent.countryFrom(),
supportEvent.countryTo(),
CountryCodeNormalizer.normalizeTachograph(supportEvent.countryFrom()),
CountryCodeNormalizer.normalizeTachograph(supportEvent.countryTo()),
supportEvent.operation(),
supportEvent.odometerKm(),
supportEvent.avgSpeedKmh(),

View File

@ -10,6 +10,7 @@ import at.procon.eventhub.tachographfilesession.model.ExtractionWarning;
import at.procon.eventhub.tachographfilesession.model.ResolvedActivityInterval;
import at.procon.eventhub.tachographfilesession.model.ResolvedDriverTimeline;
import at.procon.eventhub.tachographfilesession.model.ResolvedVehicleUsageInterval;
import at.procon.eventhub.reference.CountryCodeNormalizer;
import java.util.List;
public final class RuntimeTimelineCompatibilityAdapter {
@ -163,10 +164,10 @@ public final class RuntimeTimelineCompatibilityAdapter {
supportEvent.slot(),
supportEvent.registrationKey(),
supportEvent.vehicleKey(),
supportEvent.country(),
CountryCodeNormalizer.normalizeTachograph(supportEvent.country()),
supportEvent.region(),
supportEvent.countryFrom(),
supportEvent.countryTo(),
CountryCodeNormalizer.normalizeTachograph(supportEvent.countryFrom()),
CountryCodeNormalizer.normalizeTachograph(supportEvent.countryTo()),
supportEvent.operation(),
supportEvent.latitude(),
supportEvent.longitude(),

View File

@ -60,7 +60,7 @@ eventhub:
# Deliberate opt-in is required for the donated public service. Prefer a self-hosted or contracted endpoint for production/bulk processing.
public-service-enabled: ${NOMINATIM_PUBLIC_SERVICE_ENABLED:false}
user-agent: ${NOMINATIM_USER_AGENT:eventhub-tachograph/0.1 (Nominatim reverse geocoding)}
email: ${NOMINATIM_EMAIL:martin.schweitzer@procon.co.at}
email: ${NOMINATIM_EMAIL:}
accept-language: ${NOMINATIM_ACCEPT_LANGUAGE:en}
connect-timeout: ${NOMINATIM_CONNECT_TIMEOUT:5s}
read-timeout: ${NOMINATIM_READ_TIMEOUT:30s}

View File

@ -67,6 +67,56 @@ class DriverCountryTripSegmentationServiceTest {
assertThat(result.segments().get(1).countryTo()).isEqualTo("DE");
}
@Test
void normalizesNumericTachographCountriesBeforeComparingWithIsoCountries() {
AtomicInteger resolverCalls = new AtomicInteger();
GeoCountryResolver resolver = (latitude, longitude, allowRemoteLookup) -> {
resolverCalls.incrementAndGet();
String code = longitude.compareTo(new BigDecimal("15")) > 0 ? "AT" : "DE";
return new GeoCountryResolution(
GeoCountryResolutionStatus.RESOLVED,
latitude,
longitude,
code,
null,
null,
"NOMINATIM",
null,
false,
true,
null
);
};
DriverCountryTripSegmentationService service = new DriverCountryTripSegmentationService(
resolver,
properties()
);
List<RuntimeSupportEvidenceEvent> supportEvents = List.of(
position("p-at", "2026-05-01T08:05:00Z", "48.2082", "16.3738", "1"),
border("b-at-de", "2026-05-01T10:00:00Z", "48.75", "13.84", "1", "13"),
position("p-de", "2026-05-01T11:00:00Z", "48.90", "13.40", null)
);
DriverCountryTripSegmentationResult result = service.segmentPreparedInputs(
Map.of(DRIVER, preparedInput(supportEvents))
).resultForDriver(DRIVER);
assertThat(resolverCalls).hasValue(1);
assertThat(result.segmentCount()).isEqualTo(2);
assertThat(result.segments().get(0).countryCode()).isEqualTo("AT");
assertThat(result.segments().get(0).countryFrom()).isEqualTo("AT");
assertThat(result.segments().get(0).countryTo()).isEqualTo("DE");
assertThat(result.segments().get(1).countryCode()).isEqualTo("DE");
assertThat(result.segments().get(1).countryFrom()).isEqualTo("DE");
assertThat(result.segments().get(1).countryTo()).isEqualTo("DE");
assertThat(result.segments())
.allSatisfy(segment -> {
assertThat(segment.countryCode()).doesNotMatch("\\d+");
assertThat(segment.countryFrom()).doesNotMatch("\\d+");
assertThat(segment.countryTo()).doesNotMatch("\\d+");
});
}
@Test
void usesNominatimWhenPositionCountryIsMissing() {
AtomicInteger resolverCalls = new AtomicInteger();

View File

@ -75,6 +75,42 @@ class RuntimeSupportEvidenceNormalizerTest {
assertThat(normalized.payload().path("raw").path("supportEventType").asText()).isEqualTo("BORDER_INBOUND");
}
@Test
void resolvesNumericTachographCountriesToCanonicalIsoCodes() {
ObjectNode payload = (ObjectNode) raw("DRIVER-1", "VIN-1", "1:W-1");
ObjectNode raw = (ObjectNode) payload.path("raw");
raw.put("sourceKind", "DRIVER_CARD");
raw.put("country", "13");
raw.put("countryFrom", "1");
raw.put("countryTo", "12");
EventHubEventDto border = new EventHubEventDto(
UUID.randomUUID(),
"border-numeric-1",
null,
vehicleRef("VIN-1", "1:W-1"),
OffsetDateTime.parse("2026-05-01T22:00:00Z"),
null,
OffsetDateTime.parse("2026-05-01T22:00:00Z"),
EventDomain.BORDER_CROSSING,
EventType.BORDER_OUTBOUND,
EventLifecycle.OUTBOUND,
null,
new GeoPointDto(new BigDecimal("48.5"), new BigDecimal("16.5")),
null,
null,
payload,
false,
null
);
RuntimeSupportEvidenceEvent support = normalizer.toSupportEvidenceEvent("DRIVER-1", border);
assertThat(support.countryCode()).isEqualTo("DE");
assertThat(support.countryFrom()).isEqualTo("AT");
assertThat(support.countryTo()).isEqualTo("CZ");
}
@Test
void doesNotNormalizeActivityOrCardUsageEvents() {
EventHubEventDto cardUsage = new EventHubEventDto(

View File

@ -0,0 +1,33 @@
package at.procon.eventhub.reference;
import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.Test;
class CountryCodeNormalizerTest {
@Test
void resolvesNumericTachographNationCodesToIsoAlpha2() {
assertThat(CountryCodeNormalizer.normalizeTachograph("1")).isEqualTo("AT");
assertThat(CountryCodeNormalizer.normalizeTachograph("12")).isEqualTo("CZ");
assertThat(CountryCodeNormalizer.normalizeTachograph("13")).isEqualTo("DE");
}
@Test
void resolvesAlphabeticTachographNationCodesToIsoAlpha2() {
assertThat(CountryCodeNormalizer.normalizeTachograph("A")).isEqualTo("AT");
assertThat(CountryCodeNormalizer.normalizeTachograph("CZ")).isEqualTo("CZ");
assertThat(CountryCodeNormalizer.normalizeTachograph("D")).isEqualTo("DE");
assertThat(CountryCodeNormalizer.normalizeTachograph("UK")).isEqualTo("GB");
}
@Test
void keepsIsoAndTachographSemanticsSourceAware() {
assertThat(CountryCodeNormalizer.normalizeIso("at")).isEqualTo("AT");
assertThat(CountryCodeNormalizer.normalizeIso("DEU")).isEqualTo("DE");
assertThat(CountryCodeNormalizer.normalizeSupportEvent("YELLOWFOX", "POSITION", "FR"))
.isEqualTo("FR");
assertThat(CountryCodeNormalizer.normalizeSupportEvent("TACHOGRAPH_FILE_SESSION", "DRIVER_CARD", "FR"))
.isEqualTo("FO");
}
}
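
The last test above asserts that the same input string can mean different countries depending on its source: `FR` from a non-tachograph source is ISO alpha-2 for France, while `FR` from a tachograph driver card is the tachograph nation code for the Faroe Islands (`FO`). The following is a minimal sketch of that source-aware dispatch, not the `CountryCodeNormalizer` added by this commit. The class name `CountryCodeSketch`, the boolean `tachographSource` parameter, the tiny lookup tables, and the choice to return `null` for unknown values are all assumptions made for illustration. ISO alpha-3 handling (for example `DEU` to `DE`, which the test also expects) is deliberately left out.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; names and behavior are assumptions,
// not the implementation introduced by this commit.
final class CountryCodeSketch {

    // Small excerpt of tachograph numeric nation codes mapped to ISO 3166-1 alpha-2.
    private static final Map<String, String> TACHOGRAPH_NUMERIC = Map.of(
            "1", "AT",
            "12", "CZ",
            "13", "DE");

    // Small excerpt of tachograph alphabetic nation codes mapped to ISO 3166-1 alpha-2.
    // Note that tachograph "FR" is the Faroe Islands, while "F" is France.
    private static final Map<String, String> TACHOGRAPH_ALPHA = Map.of(
            "A", "AT",
            "CZ", "CZ",
            "D", "DE",
            "F", "FR",
            "FR", "FO",
            "UK", "GB");

    // Resolves a tachograph numeric or alphabetic code; returns null when unknown (assumption).
    static String normalizeTachograph(String value) {
        if (value == null || value.isBlank()) {
            return null;
        }
        String key = value.trim().toUpperCase(Locale.ROOT);
        String fromNumeric = TACHOGRAPH_NUMERIC.get(key);
        return fromNumeric != null ? fromNumeric : TACHOGRAPH_ALPHA.get(key);
    }

    // Accepts only two-letter values and upper-cases them; returns null otherwise (assumption).
    static String normalizeIso(String value) {
        if (value == null || value.isBlank()) {
            return null;
        }
        String key = value.trim().toUpperCase(Locale.ROOT);
        return key.matches("[A-Z]{2}") ? key : null;
    }

    // Picks the interpretation based on where the value came from.
    static String normalize(boolean tachographSource, String value) {
        return tachographSource ? normalizeTachograph(value) : normalizeIso(value);
    }

    public static void main(String[] args) {
        System.out.println(normalize(true, "FR"));   // tachograph reading of "FR"
        System.out.println(normalize(false, "FR"));  // ISO reading of "FR"
        System.out.println(normalize(true, "13"));   // tachograph numeric code
    }
}
```

Keeping the two interpretations in separate methods and choosing between them at the call site means an ambiguous code such as `FR` is never guessed from its shape alone; the caller has to state which identifier system the value belongs to.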