DIP/docs/WAVE2_TED_STRUCTURED_SEARCH.md

Wave 2 — NEW TED Structured Search

Purpose

Wave 2 adds a NEW-runtime TED search endpoint that keeps the legacy request and response shape of /v1/documents/search, but executes the search against TED.ted_notice_projection instead of the legacy search path.

The goal is twofold:

  1. provide NEW-runtime structured TED search functionality
  2. make cutover measurable through parity checks against the legacy search implementation

Runtime scope

This functionality is active only in RuntimeMode.NEW.

Controller:

  • at.procon.dip.domain.ted.web.TedStructuredSearchController

Service:

  • at.procon.dip.domain.ted.service.TedStructuredSearchService

Repository:

  • at.procon.dip.domain.ted.search.TedStructuredSearchRepository

Endpoint

GET

GET /v1/documents/search

POST

POST /v1/documents/search

The POST body uses the existing legacy-compatible DTO:

  • at.procon.ted.model.dto.DocumentDtos.SearchRequest

The response uses:

  • at.procon.ted.model.dto.DocumentDtos.SearchResponse

Implemented structured filters

The Wave 2 implementation supports these filters:

  • countryCode
  • countryCodes
  • noticeType
  • contractNature
  • procedureType
  • cpvPrefix
  • cpvCodes
  • nutsCode
  • nutsCodes
  • publicationDateFrom
  • publicationDateTo
  • submissionDeadlineAfter
  • euFunded
  • buyerNameContains
  • projectTitleContains

Sorting and pagination

Supported sorting:

  • publicationDate
  • submissionDeadline
  • buyerName
  • projectTitle

Supported directions:

  • asc
  • desc
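A minimal sketch of how the supported sort values could be validated before they reach a query. The class name `SortSpec`, its `parse` method, the case-insensitive handling of the direction, and the choice to reject unsupported values with an exception are all assumptions for illustration; the actual service may fall back to a default instead.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper; not part of the DIP code base. */
final class SortSpec {
    // Sort fields and directions listed in this document.
    static final Set<String> FIELDS =
            Set.of("publicationDate", "submissionDeadline", "buyerName", "projectTitle");
    static final Set<String> DIRECTIONS = Set.of("asc", "desc");

    final String field;
    final String direction;

    private SortSpec(String field, String direction) {
        this.field = field;
        this.direction = direction;
    }

    /** Accepts only whitelisted values (assumption: anything else is rejected). */
    static SortSpec parse(String sortBy, String sortDirection) {
        if (sortBy == null || !FIELDS.contains(sortBy)) {
            throw new IllegalArgumentException("Unsupported sortBy: " + sortBy);
        }
        // Assumption: the direction is matched case-insensitively.
        String dir = sortDirection == null ? null : sortDirection.toLowerCase(Locale.ROOT);
        if (dir == null || !DIRECTIONS.contains(dir)) {
            throw new IllegalArgumentException("Unsupported sortDirection: " + sortDirection);
        }
        return new SortSpec(sortBy, dir);
    }
}
```

Checking against a fixed whitelist keeps arbitrary request input out of any ORDER BY clause that is built from these values.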

Pagination behavior:

  • page defaults to 0
  • size defaults to DipSearchProperties.defaultPageSize
  • size is capped by DipSearchProperties.maxPageSize
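The pagination rules above can be sketched as a small resolver. The class `PageWindow` and its `resolve` method are hypothetical, the two integer parameters stand in for `DipSearchProperties.defaultPageSize` and `DipSearchProperties.maxPageSize`, and treating negative pages or non-positive sizes as "not provided" is an assumption.

```java
/** Hypothetical helper; not part of the DIP code base. */
final class PageWindow {
    final int page;
    final int size;

    private PageWindow(int page, int size) {
        this.page = page;
        this.size = size;
    }

    /**
     * @param requestedPage   page from the request, or null if absent
     * @param requestedSize   size from the request, or null if absent
     * @param defaultPageSize stand-in for DipSearchProperties.defaultPageSize
     * @param maxPageSize     stand-in for DipSearchProperties.maxPageSize
     */
    static PageWindow resolve(Integer requestedPage, Integer requestedSize,
                              int defaultPageSize, int maxPageSize) {
        // page defaults to 0 (assumption: negative pages are also treated as 0)
        int page = (requestedPage == null || requestedPage < 0) ? 0 : requestedPage;
        // size defaults to the configured default (assumption: non-positive sizes fall back too)
        int size = (requestedSize == null || requestedSize <= 0) ? defaultPageSize : requestedSize;
        // size is capped by the configured maximum
        size = Math.min(size, maxPageSize);
        return new PageWindow(page, size);
    }
}
```

With an illustrative default of 20 and maximum of 100, `PageWindow.resolve(null, null, 20, 100)` yields page 0 and size 20, while `PageWindow.resolve(3, 500, 20, 100)` yields page 3 and size 100.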

Data source

The endpoint reads from:

  • TED.ted_notice_projection

This means the quality and completeness of the search results depend on how complete the Wave 1 migration and the projection backfill are.

Functional behavior

The Wave 2 implementation is intentionally structured-search-first.

Although the request DTO still contains:

  • semanticQuery
  • similarityThreshold

these fields are currently accepted only for request compatibility and future extension. The current repository implementation does not apply semantic ranking or semantic filtering.

That is deliberate for Wave 2, because the main objective is:

  • structured search on the NEW model
  • parity verification against legacy behavior for common structured filters

Parity strategy

Wave 2 adds parity-focused tests that compare NEW structured search behavior against the legacy TED search for a common subset of structured filters.

Recommended parity focus:

  • country filters
  • notice type
  • procedure type
  • publication date range
  • EU-funded filter
  • deterministic sort order

Parity should be evaluated on:

  • total result count
  • ordered publication IDs / notice IDs for cases with a deterministic sort order
  • key metadata fields in DocumentSummary
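The three evaluation criteria above could be checked with a comparison along the following lines. This is a sketch under stated assumptions: `ParityCheck`, `Summary`, and `Result` are hypothetical stand-ins, and the three fields on `Summary` are only an illustrative subset of what `DocumentSummary` may carry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical parity helper; not part of the DIP code base. */
final class ParityCheck {
    /** Reduced, hypothetical stand-in for a DocumentSummary row. */
    record Summary(String noticeId, String countryCode, String noticeType) {}

    /** Hypothetical stand-in for a search response: total count plus one page of items. */
    record Result(long totalCount, List<Summary> items) {}

    /** Returns a list of human-readable differences; an empty list means parity. */
    static List<String> compare(Result legacy, Result candidate) {
        List<String> diffs = new ArrayList<>();

        // 1. total result count
        if (legacy.totalCount() != candidate.totalCount()) {
            diffs.add("totalCount: legacy=" + legacy.totalCount()
                    + " new=" + candidate.totalCount());
        }

        // 2. ordered notice IDs (only meaningful when the sort order is deterministic)
        List<String> legacyIds = legacy.items().stream().map(Summary::noticeId).toList();
        List<String> newIds = candidate.items().stream().map(Summary::noticeId).toList();
        if (!legacyIds.equals(newIds)) {
            diffs.add("ordered ids: legacy=" + legacyIds + " new=" + newIds);
        }

        // 3. key metadata, compared only where the same ID sits at the same position
        int n = Math.min(legacy.items().size(), candidate.items().size());
        for (int i = 0; i < n; i++) {
            Summary l = legacy.items().get(i);
            Summary c = candidate.items().get(i);
            if (Objects.equals(l.noticeId(), c.noticeId()) && !l.equals(c)) {
                diffs.add("metadata differs for " + l.noticeId());
            }
        }
        return diffs;
    }
}
```

Returning a list of differences rather than a boolean makes it easier to tell a stale projection row (metadata mismatch) apart from a filter discrepancy (count or ID mismatch).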

Current limitations

  1. No semantic scoring is applied in the NEW structured TED search path yet.
  2. No TED facets/aggregations are included yet.
  3. Search is projection-based, so missing or stale ted_notice_projection rows can cause parity differences.
  4. The Wave 2 scope is TED-specific structured retrieval, not the full generic hybrid search fusion pipeline.

Example GET request

GET /v1/documents/search?countryCode=AT&noticeType=CN_STANDARD&publicationDateFrom=2025-01-01&publicationDateTo=2025-12-31&page=0&size=20&sortBy=publicationDate&sortDirection=desc
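For callers that assemble this URL programmatically, a small builder such as the following could be used. The class `TedSearchUri`, its methods, and the base URL passed in by the caller are assumptions; only the path and the parameter names come from this document.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical client-side helper; not part of the DIP code base. */
final class TedSearchUri {
    /** Builds the GET search URI; baseUrl (scheme, host, port) is supplied by the caller. */
    static URI build(String baseUrl, Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
        return URI.create(baseUrl + "/v1/documents/search?" + query);
    }

    // Form-style encoding: spaces become '+', reserved characters are percent-encoded.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** The parameters of the example request above, in the same order. */
    static Map<String, String> exampleParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("countryCode", "AT");
        p.put("noticeType", "CN_STANDARD");
        p.put("publicationDateFrom", "2025-01-01");
        p.put("publicationDateTo", "2025-12-31");
        p.put("page", "0");
        p.put("size", "20");
        p.put("sortBy", "publicationDate");
        p.put("sortDirection", "desc");
        return p;
    }
}
```

Encoding each value matters mainly for free-text filters such as buyerNameContains and projectTitleContains, which can contain spaces or reserved characters.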

Example POST request

{
  "countryCodes": ["AT", "DE"],
  "noticeType": "CN_STANDARD",
  "contractNature": "SERVICES",
  "procedureType": "OPEN",
  "cpvPrefix": "79000000",
  "cpvCodes": ["79341000"],
  "nutsCodes": ["AT130", "DE300"],
  "publicationDateFrom": "2025-01-01",
  "publicationDateTo": "2025-12-31",
  "submissionDeadlineAfter": "2025-06-01T00:00:00Z",
  "euFunded": true,
  "buyerNameContains": "city",
  "projectTitleContains": "digital",
  "semanticQuery": "framework agreement for digital transformation services",
  "similarityThreshold": 0.7,
  "page": 0,
  "size": 20,
  "sortBy": "publicationDate",
  "sortDirection": "desc"
}
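A request like the one above could be sent with the JDK's built-in HTTP client. This is a sketch only: the class `TedSearchPost`, the base URL, the shortened body, and the JSON `Content-Type` and `Accept` headers are assumptions, not confirmed details of the endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical client-side helper; not part of the DIP code base. */
final class TedSearchPost {
    // Shortened version of the example body above, for illustration.
    static final String BODY = """
            {
              "countryCodes": ["AT", "DE"],
              "noticeType": "CN_STANDARD",
              "page": 0,
              "size": 20,
              "sortBy": "publicationDate",
              "sortDirection": "desc"
            }
            """;

    /** Builds the POST request; baseUrl (scheme, host, port) is supplied by the caller. */
    static HttpRequest build(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/documents/search"))
                // Assumption: the endpoint consumes and produces JSON.
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The request can then be executed with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.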

Postman collection

Use the companion file:

  • WAVE2_TED_STRUCTURED_SEARCH.postman_collection.json

It contains:

  • basic GET search
  • CPV/NUTS/buyer GET example
  • full POST structured request
  • a parity-oriented GET request for manual comparison against legacy search

After parity is accepted, the next logical enhancements are:

  1. add TED facets and richer structural filters
  2. merge structured TED narrowing with lexical/semantic ranking
  3. expose a documented parity validation checklist for cutover approval