Wave 2 — Extended TED structured search in NEW runtime
What was added
This extension completes the missing parts from the earlier Wave 2 proposal:
- Projection-aware TED structured search in NEW runtime
  - endpoint: GET /v1/documents/search
  - endpoint: POST /v1/documents/search
  - active only in dip.runtime.mode=NEW
- Repository-level joins across NEW projection model
  - DOC.doc_document
  - TED.ted_notice_projection
  - TED.ted_notice_lot
  - TED.ted_notice_organization
- Extended TED structured filters
  - countryCode, countryCodes
  - noticeType
  - contractNature
  - procedureType
  - cpvPrefix, cpvCodes
  - nutsCode, nutsCodes
  - publicationDateFrom, publicationDateTo
  - submissionDeadlineAfter
  - euFunded
  - buyerNameContains
  - projectTitleContains
- Hybrid ranking path
  - structured filters first narrow the candidate document_id set
  - generic NEW lexical/trigram/semantic search ranks only inside that candidate set
  - request parameter q is used as the hybrid query text
  - similarityThreshold is forwarded as a per-request semantic threshold override
- Facets
  - countries
  - notice types
  - procedure types
  - buyers
  - publication months (YYYY-MM)
  - CPV families (first 2 digits)
- Parity coverage
  - NEW structured-only parity test against legacy SearchService for shared filters
  - NEW endpoint integration test for structured results + facets
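As an illustration of how the endpoint and filters listed above might be combined, here is a minimal sketch that builds a GET request URL. Only the path and the parameter names come from this note. The host, the date format, and the encoding of the boolean and threshold values are assumptions made for the example.

```python
from urllib.parse import urlencode

# Hypothetical host; only the path and parameter names come from the notes above.
BASE_URL = "http://localhost:8080"


def build_search_url(**filters):
    """Build a GET /v1/documents/search URL, skipping filters that are unset."""
    params = {name: value for name, value in filters.items() if value is not None}
    return f"{BASE_URL}/v1/documents/search?{urlencode(params)}"


url = build_search_url(
    countryCode="DE",
    cpvPrefix="45",
    publicationDateFrom="2024-01-01",  # assumed ISO date format
    euFunded="true",                   # assumed boolean encoding
    q="road maintenance",              # presence of q selects the hybrid path
    similarityThreshold=0.35,          # assumed value range
    noticeType=None,                   # unset filters are omitted
)
print(url)
```

Leaving out q would turn the same request into a structured-only search.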
Main classes
- TedStructuredSearchRepository
- TedStructuredSearchService
- TedStructuredSearchController
- TedStructuredSearchFilter
- TedStructuredSearchFacets
How hybrid search works
For requests with q:
- apply TED structured filters on projection tables
- collect matching document_ids
- pass those ids into NEW generic search scope as candidateDocumentIds
- let NEW search engines rank those TED documents
- map ranked hits back to TED summaries
This gives structured filtering plus lexical/trigram/semantic relevance ranking.
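The steps above can be sketched in a few lines of Python. This is a toy model, not the actual service code: the Notice type, the single country filter, and the term-overlap score are stand-ins invented for the example, and truncating the candidate list at the configured limit is an assumption about how the candidate limit is applied.

```python
from dataclasses import dataclass


@dataclass
class Notice:
    # Hypothetical stand-in for a row of the TED projection tables.
    document_id: int
    country_code: str
    title: str


def structured_candidates(notices, country_code, limit=5000):
    """Apply the structured filter and collect matching document ids."""
    ids = [n.document_id for n in notices if n.country_code == country_code]
    return ids[:limit]  # assumed handling of the candidate limit


def rank_within(notices, candidate_ids, q):
    """Rank only inside the candidate set (toy score: share of query terms in the title)."""
    allowed = set(candidate_ids)
    terms = q.lower().split()
    if not terms:
        return []
    hits = []
    for n in notices:
        if n.document_id not in allowed:
            continue  # documents outside the candidate set are never ranked
        words = set(n.title.lower().split())
        score = sum(t in words for t in terms) / len(terms)
        if score > 0:
            hits.append((n.document_id, score))
    return sorted(hits, key=lambda h: (-h[1], h[0]))


def hybrid_search(notices, country_code, q):
    """Filter first, rank second, then map ranked hits back to summaries."""
    candidate_ids = structured_candidates(notices, country_code)
    ranked = rank_within(notices, candidate_ids, q)
    by_id = {n.document_id: n for n in notices}
    return [
        {"documentId": doc_id, "title": by_id[doc_id].title, "similarity": score}
        for doc_id, score in ranked
    ]


NOTICES = [
    Notice(1, "DE", "Road maintenance services"),
    Notice(2, "FR", "Road maintenance framework"),
    Notice(3, "DE", "School catering"),
    Notice(4, "DE", "Bridge and road works"),
]

results = hybrid_search(NOTICES, "DE", "road maintenance")
print([r["documentId"] for r in results])
```

Notice 2 matches the query text but is never ranked, because the structured filter removed it from the candidate set before the ranking step ran.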
New configuration
dip:
  ted:
    projection:
      structured-search-hybrid-candidate-limit: 5000
      structured-search-facet-bucket-limit: 12
Current behavior notes
- Structured-only requests work without q
- Hybrid requests use q and NEW generic ranking
- When q is present, returned similarity contains the fused NEW search score
- Facets are computed from the structured candidate set before pagination
- includeFacets=false disables facet calculation
- facetBucketLimit overrides the default bucket size per request
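The facet-related notes above can be illustrated with a small sketch. It assumes hypothetical candidate field names (countryCode, publicationDate, cpvCode), a hypothetical value/count bucket shape, and ordering by descending count. Only the facet dimensions, the default limit of 12, and the rule that facets are computed before pagination come from this note.

```python
from collections import Counter


def top_buckets(values, bucket_limit=12):
    """Count values and keep the most frequent ones, up to bucket_limit."""
    counts = Counter(v for v in values if v)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"value": v, "count": c} for v, c in ordered[:bucket_limit]]


def search_page(candidates, page=0, size=2, include_facets=True, facet_bucket_limit=12):
    """Compute facets over the whole candidate set, then cut out one page."""
    facets = None
    if include_facets:
        facets = {
            "countries": top_buckets(
                [c["countryCode"] for c in candidates], facet_bucket_limit),
            # Publication month is the YYYY-MM prefix of the date.
            "publicationMonths": top_buckets(
                [c["publicationDate"][:7] for c in candidates], facet_bucket_limit),
            # CPV family is the first 2 digits of the CPV code.
            "cpvFamilies": top_buckets(
                [c["cpvCode"][:2] for c in candidates], facet_bucket_limit),
        }
    start = page * size
    return {"results": candidates[start:start + size], "facets": facets}


CANDIDATES = [
    {"documentId": 1, "countryCode": "DE", "publicationDate": "2024-03-05", "cpvCode": "45233141"},
    {"documentId": 2, "countryCode": "DE", "publicationDate": "2024-03-19", "cpvCode": "45000000"},
    {"documentId": 3, "countryCode": "FR", "publicationDate": "2024-04-02", "cpvCode": "55520000"},
]

first_page = search_page(CANDIDATES, page=0, size=2)
print(len(first_page["results"]), first_page["facets"]["countries"])
```

The first page holds two results, while the country facet still counts all three candidates, which is the effect of computing facets before pagination.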
Compatibility notes
- The NEW endpoint reuses the legacy DocumentDtos.SearchRequest and SearchResponse
- The response was extended with optional facets
- Existing legacy clients remain compatible because extra JSON fields are additive
Parity scope
Parity is implemented for shared structured filters between legacy and NEW runtime.
Good parity candidates:
- country
- notice type
- contract nature
- procedure type
- publication date range
- submission deadline after
- EU funded
- buyer name contains
- project title contains
Legacy structured parity is not exact for filters that legacy SearchService does not implement in structured mode, especially:
- lot/organization-expanded cpvPrefix
- cpvCodes
- nutsCode
- nutsCodes
- lot-level EU funded semantics
Those are NEW-runtime improvements on top of legacy behavior.