# NDI HOME / NOT_HOME classification and country trip segmentation

This patch implements the HOME / NOT_HOME classification and the country-trip segmentation described in `docs/ndi_home_classification_en.md`. It reuses the existing driver-working-time pipeline and adds configurable Nominatim reverse geocoding only where source country evidence is missing.

## Public processing plan

Use:

```text
driver-home-classification-v1
```

The dedicated plan delegates to the shared `driver-working-time-v1` pipeline and explicitly inserts:

```text
support-evidence-normalization -> ndi-home-classification -> country-trip-segmentation -> driving-derived-projections
```

The normal `driver-working-time-v1` plan keeps both modules optional. They can also be requested explicitly as `ndi-home-classification` and `country-trip-segmentation`.

## Reused projection structures

`DriverWorkingTimeReusableProjectionBuilder.buildAllNonDrivingIntervalCoverage(...)` runs the existing Esper interruption/card-absence/GNSS enrichment pipeline with a zero rest-candidate threshold. It creates enriched evidence for every positive non-driving interruption without changing the legacy daily/weekly-rest threshold or outputs.

The implementation reuses `DriverWorkingTimeRestCoverageInterval` as the enriched NDI evidence model. It provides:

- previous and next driving/vehicle identities;
- NDI start, end, and duration;
- card-absence duration and percentage;
- begin/end boundary GNSS evidence;
- boundary odometer and movement evidence.

## HOME / NOT_HOME classification

The rules are evaluated in the document order:

1. previous and next vehicles differ -> `HOME`;
2. card absent for more than 80% -> `HOME`;
3. NDI longer than 24 hours -> `HOME`;
4. no position: NDI longer than 7.5 hours -> `HOME`, otherwise `NOT_HOME`;
5. positioned long NDI in a company or driver home cluster -> `HOME`;
6. positioned long NDI outside those clusters -> `NOT_HOME`;
7. remaining short NDI -> `NOT_HOME`.

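The rule order above can be illustrated as a first-match chain. This is only a minimal sketch under stated assumptions: the class `NdiRuleOrderSketch`, the `Ndi` record, and the `Reason` values are hypothetical names invented for illustration and are not the patch's actual types; "long" is taken to mean longer than 7.5 hours, as in rule 4.

```java
import java.time.Duration;
import java.util.Objects;

/** Hypothetical sketch of first-match rule evaluation; not the patch's real classes. */
final class NdiRuleOrderSketch {

    enum Label { HOME, NOT_HOME }

    /** Illustrative reason codes, one per branch of the rule list (assumed names). */
    enum Reason {
        VEHICLE_CHANGED, CARD_MOSTLY_ABSENT, LONGER_THAN_24H,
        NO_POSITION_LONG, NO_POSITION_SHORT,
        LONG_IN_HOME_CLUSTER, LONG_OUTSIDE_HOME_CLUSTER, SHORT_NDI
    }

    /** Simplified stand-in for the enriched NDI evidence (assumed shape). */
    record Ndi(String previousVehicle, String nextVehicle, Duration duration,
               double cardAbsencePercent, boolean hasPosition, boolean inHomeCluster) {}

    record Result(Label label, Reason reason) {}

    static final Duration LONG_NDI = Duration.ofMinutes(450); // 7.5 hours
    static final Duration ONE_DAY = Duration.ofHours(24);

    /** Returns the outcome of the first rule that matches, in list order. */
    static Result classify(Ndi ndi) {
        boolean isLong = ndi.duration().compareTo(LONG_NDI) > 0;
        if (!Objects.equals(ndi.previousVehicle(), ndi.nextVehicle())) {
            return new Result(Label.HOME, Reason.VEHICLE_CHANGED);          // rule 1
        }
        if (ndi.cardAbsencePercent() > 80.0) {
            return new Result(Label.HOME, Reason.CARD_MOSTLY_ABSENT);       // rule 2
        }
        if (ndi.duration().compareTo(ONE_DAY) > 0) {
            return new Result(Label.HOME, Reason.LONGER_THAN_24H);          // rule 3
        }
        if (!ndi.hasPosition()) {                                           // rule 4
            return isLong
                    ? new Result(Label.HOME, Reason.NO_POSITION_LONG)
                    : new Result(Label.NOT_HOME, Reason.NO_POSITION_SHORT);
        }
        if (isLong && ndi.inHomeCluster()) {
            return new Result(Label.HOME, Reason.LONG_IN_HOME_CLUSTER);     // rule 5
        }
        if (isLong) {
            return new Result(Label.NOT_HOME, Reason.LONG_OUTSIDE_HOME_CLUSTER); // rule 6
        }
        return new Result(Label.NOT_HOME, Reason.SHORT_NDI);                // rule 7
    }

    public static void main(String[] args) {
        // A 9-hour positioned NDI with the same vehicle, outside any home cluster.
        Ndi ndi = new Ndi("V1", "V1", Duration.ofHours(9), 10.0, true, false);
        System.out.println(classify(ndi));
    }
}
```

Because each rule returns immediately, an NDI that satisfies several rules (for example a vehicle change that also lasts more than 24 hours) reports only the earliest one.
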
Every classification contains a `DriverNdiHomeClassificationReason`, so the first matching rule remains visible in the API response.

## Location learning and clustering

Only NDIs longer than 7.5 hours with a position are added to the corpus. Position selection uses the existing resolved begin-boundary evidence and falls back to resolved end-boundary evidence.

The in-memory cache:

- accumulates observations across one or more file-session executions;
- deduplicates the same NDI across repeated/overlapping sessions;
- retains source-session provenance;
- stores the driver key on every observation;
- calculates actual-driver and other-driver views per request.

Clustering uses Java DBSCAN with Haversine distance. Defaults are 150 metres and three points. Noise observations remain in the denominator for visit-share calculations but are never home clusters.

## Country trip segmentation

`DriverCountryTripSegmentationService` builds country segments over driving intervals. Evidence precedence is:

1. explicit tachograph border-crossing event (`countryFrom` / `countryTo`);
2. country code already present on a positioned support event;
3. Nominatim reverse lookup for a positioned event without a usable country code.

Country values are normalized to ISO 3166-1 alpha-2 where a mapping is known. Segment boundaries retain their evidence source:

```text
EXPLICIT_BORDER_CROSSING
GNSS_SOURCE_COUNTRY_CHANGE
NOMINATIM_COUNTRY_CHANGE
VEHICLE_CHANGE
FINAL
```

The result includes segment counts, explicit-border counts, remote lookup counts, cache-hit counts, unresolved-coordinate counts, warnings, and OpenStreetMap attribution.

## Nominatim integration

The client uses the reverse endpoint with:

```text
format=jsonv2
zoom=3
addressdetails=1
layer=address
```

Only `address.country_code` is required by the classification/segmentation logic. Failures do not fail the whole processing plan; the coordinate remains unresolved and a diagnostic warning is returned.

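To make the request shape concrete, the following sketch builds a reverse-lookup URI from the parameters listed above. It is an illustration, not the patch's client: the class `NominatimReverseUriSketch`, the method `reverseUri`, and the host `nominatim.example.internal` are hypothetical, and the six-decimal coordinate formatting is an assumption. The `lat` and `lon` query parameters are the standard Nominatim reverse-endpoint inputs.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical helper that assembles a Nominatim reverse-lookup URI. */
final class NominatimReverseUriSketch {

    /**
     * Builds {base}/reverse with the fixed query parameters from this document.
     * Locale.ROOT keeps the decimal separator a dot regardless of the JVM default locale.
     */
    static URI reverseUri(String baseUrl, double latitude, double longitude) {
        // Drop one trailing slash so the path is not doubled.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String query = String.format(Locale.ROOT,
                "format=jsonv2&lat=%.6f&lon=%.6f&zoom=3&addressdetails=1&layer=address",
                latitude, longitude);
        return URI.create(base + "/reverse?" + query);
    }

    public static void main(String[] args) {
        // Example against an assumed self-hosted endpoint.
        System.out.println(reverseUri("https://nominatim.example.internal", 52.520008, 13.404954));
    }
}
```

A real client would additionally send the identifying `User-Agent` header and read `address.country_code` from the JSON response; both are left out here to keep the sketch free of third-party dependencies.
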
Safeguards:

- identifying configurable `User-Agent`;
- optional identifying email;
- shared coordinate cache with TTL and maximum size;
- coordinate quantization for cache reuse;
- one execution-level remote lookup budget;
- fully serialized remote calls;
- configurable minimum interval;
- enforced minimum one-second interval for `nominatim.openstreetmap.org`;
- public OSM endpoint disabled unless deliberately opted in;
- configurable endpoint so a self-hosted or contracted Nominatim service can be substituted without code changes.

### Configuration

```yaml
eventhub:
  reverse-geocoding:
    enabled: true
    provider: NOMINATIM
    nominatim:
      base-url: https://nominatim.openstreetmap.org
      public-service-enabled: false
      user-agent: eventhub-tachograph/0.1 (Nominatim reverse geocoding)
      email: ""
      accept-language: en
      connect-timeout: 10s
      read-timeout: 20s
      minimum-request-interval: 1s
      cache-ttl: 30d
      cache-max-entries: 100000
      coordinate-decimal-places: 4
      max-remote-lookups-per-execution: 25
```

Environment variables use the `NOMINATIM_*` names shown in `application.yml`.

For a self-hosted endpoint, set `NOMINATIM_BASE_URL`; `public-service-enabled` is not needed. For deliberately selected, policy-compliant, low-volume use of the donated public endpoint, additionally set:

```text
NOMINATIM_PUBLIC_SERVICE_ENABLED=true
NOMINATIM_USER_AGENT=
NOMINATIM_EMAIL=
```

Production or recurring tachograph batch processing should use a self-hosted instance or a provider whose terms cover the expected workload. Coordinates may reveal vehicle or driver movements; do not send confidential or personal-location data to a public endpoint without an appropriate legal and privacy basis.

## File-session learning scope

The dedicated plan defaults `ndiLearnAllFileSessionDrivers` to `true`. For a request with explicit canonical driver keys, it internally loads all drivers from selected file sessions for location learning and filters the response back to the originally requested drivers.

The scope is not broadened when the source is mixed/database-only, the option is disabled, or the result cannot safely be filtered by canonical driver key.

## Response extensions

Each driver partition can contain:

```text
ndiHomeClassification
countryTripSegmentation
```

The fields are omitted when their optional modules were not executed, preserving the existing JSON shape for normal `driver-working-time-v1` calls.