# Wave 2 — Extended TED structured search in NEW runtime

## What was added

This extension completes the missing parts from the earlier Wave 2 proposal:

1. **Projection-aware TED structured search in NEW runtime**
   - endpoint: `GET /v1/documents/search`
   - endpoint: `POST /v1/documents/search`
   - active only in `dip.runtime.mode=NEW`
2. **Repository-level joins across NEW projection model**
   - `DOC.doc_document`
   - `TED.ted_notice_projection`
   - `TED.ted_notice_lot`
   - `TED.ted_notice_organization`
3. **Extended TED structured filters**
   - `countryCode`, `countryCodes`
   - `noticeType`
   - `contractNature`
   - `procedureType`
   - `cpvPrefix`, `cpvCodes`
   - `nutsCode`, `nutsCodes`
   - `publicationDateFrom`, `publicationDateTo`
   - `submissionDeadlineAfter`
   - `euFunded`
   - `buyerNameContains`
   - `projectTitleContains`
4. **Hybrid ranking path**
   - structured filters first narrow the candidate `document_id` set
   - generic NEW lexical/trigram/semantic search ranks only inside that candidate set
   - request parameter `q` is used as the hybrid query text
   - `similarityThreshold` is forwarded as a per-request semantic threshold override
5. **Facets**
   - countries
   - notice types
   - procedure types
   - buyers
   - publication months (`YYYY-MM`)
   - CPV families (first 2 digits)
6. **Parity coverage**
   - NEW structured-only parity test against legacy `SearchService` for shared filters
   - NEW endpoint integration test for structured results + facets

## Main classes

- `TedStructuredSearchRepository`
- `TedStructuredSearchService`
- `TedStructuredSearchController`
- `TedStructuredSearchFilter`
- `TedStructuredSearchFacets`

## How hybrid search works

For requests with `q`:

1. apply TED structured filters on projection tables
2. collect matching `document_id`s
3. pass those ids into NEW generic search scope as `candidateDocumentIds`
4. let NEW search engines rank those TED documents
5. map ranked hits back to TED summaries

This gives structured filtering plus lexical/trigram/semantic relevance ranking.

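The five hybrid steps can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual `TedStructuredSearchService` code: the `Notice` and `Hit` records, the method names `filterCandidates`, `rank`, and `hybridSearch`, and the toy term-count scorer standing in for the NEW lexical/trigram/semantic engines are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the hybrid path; these types do not come from the real code base.
class HybridSearchSketch {

    // Assumed stand-in for a row of the TED projection.
    record Notice(long documentId, String countryCode, String title) {}

    // Assumed stand-in for a ranked hit from the generic NEW search.
    record Hit(long documentId, double score) {}

    // Steps 1-2: apply a structured filter and collect the matching document ids.
    static Set<Long> filterCandidates(List<Notice> projection, String countryCode) {
        Set<Long> ids = new LinkedHashSet<>();
        for (Notice n : projection) {
            if (n.countryCode().equals(countryCode)) ids.add(n.documentId());
        }
        return ids;
    }

    // Steps 3-4: rank only documents inside the candidate set.
    // Toy scorer (assumption): counts occurrences of q in the title.
    static List<Hit> rank(List<Notice> corpus, Set<Long> candidateDocumentIds, String q) {
        String needle = q.toLowerCase(Locale.ROOT);
        List<Hit> hits = new ArrayList<>();
        if (needle.isEmpty()) return hits; // nothing to rank against
        for (Notice n : corpus) {
            if (!candidateDocumentIds.contains(n.documentId())) continue;
            String title = n.title().toLowerCase(Locale.ROOT);
            int count = 0;
            for (int i = title.indexOf(needle); i >= 0; i = title.indexOf(needle, i + 1)) count++;
            if (count > 0) hits.add(new Hit(n.documentId(), count));
        }
        hits.sort((a, b) -> Double.compare(b.score(), a.score())); // best score first
        return hits;
    }

    // Step 5: map ranked hits back to TED summaries (here simply "id: title").
    static List<String> hybridSearch(List<Notice> projection, String countryCode, String q) {
        Set<Long> candidates = filterCandidates(projection, countryCode);
        Map<Long, Notice> byId = new LinkedHashMap<>();
        for (Notice n : projection) byId.put(n.documentId(), n);
        List<String> summaries = new ArrayList<>();
        for (Hit h : rank(projection, candidates, q)) {
            summaries.add(h.documentId() + ": " + byId.get(h.documentId()).title());
        }
        return summaries;
    }
}
```

The point of the design is that ranking never sees documents outside the structured candidate set, so a notice that matches `q` strongly but fails a filter (for example, the wrong country) cannot appear in the results.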
## New configuration

```yaml
dip:
  ted:
    projection:
      structured-search-hybrid-candidate-limit: 5000
      structured-search-facet-bucket-limit: 12
```

## Current behavior notes

- Structured-only requests work without `q`
- Hybrid requests use `q` and NEW generic ranking
- When `q` is present, returned `similarity` contains the fused NEW search score
- Facets are computed from the structured candidate set before pagination
- `includeFacets=false` disables facet calculation
- `facetBucketLimit` overrides the default bucket size per request

## Compatibility notes

- The NEW endpoint reuses the legacy `DocumentDtos.SearchRequest` and `SearchResponse`
- The response was extended with optional `facets`
- Existing legacy clients remain compatible because extra JSON fields are additive

## Parity scope

Parity is implemented for **shared structured filters** between legacy and NEW runtime.

Good parity candidates:

- country
- notice type
- contract nature
- procedure type
- publication date range
- submission deadline after
- eu funded
- buyer name contains
- project title contains

Legacy structured parity is **not exact** for filters that legacy `SearchService` does not implement in structured mode, especially:

- lot/organization-expanded `cpvPrefix`
- `cpvCodes`
- `nutsCode`
- `nutsCodes`
- lot-level EU funded semantics

Those are NEW-runtime improvements on top of legacy behavior.
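
As a closing illustration of the facet behavior described earlier (publication months as `YYYY-MM`, CPV families as the first 2 digits, a per-request bucket limit), here is a minimal sketch. The class `FacetSketch`, its method names, and the ordering rule (descending count, ties in key order) are assumptions; the source does not state how buckets are ordered or truncated.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical facet helper; not taken from TedStructuredSearchFacets.
class FacetSketch {

    // Facet key for publication months, rendered as YYYY-MM.
    static String publicationMonth(LocalDate publicationDate) {
        return YearMonth.from(publicationDate).toString();
    }

    // Facet key for CPV families: the first 2 digits of the CPV code.
    static String cpvFamily(String cpvCode) {
        return cpvCode.substring(0, 2);
    }

    // Counts values per facet key and keeps at most bucketLimit buckets.
    // The input is meant to be the full structured candidate set, not a single page.
    static <T> Map<String, Long> buckets(List<T> values, Function<T, String> keyOf, int bucketLimit) {
        Map<String, Long> counts = new TreeMap<>();
        for (T v : values) counts.merge(keyOf.apply(v), 1L, Long::sum);

        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        // Assumed ordering: highest count first; the stable sort keeps ties in key order.
        sorted.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));

        Map<String, Long> top = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : sorted) {
            if (top.size() == bucketLimit) break;
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }
}
```

In this sketch, a caller would pass the request's `facetBucketLimit` when present and otherwise fall back to the configured `structured-search-facet-bucket-limit`; that fallback wiring is likewise an assumption about how the two settings relate.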