# Wave 2 — NEW TED Structured Search

## Purpose

Wave 2 adds a NEW-runtime TED search endpoint that keeps the legacy request and response shape of `/v1/documents/search`, but executes the search against `TED.ted_notice_projection` instead of the legacy search path.

The goal is twofold:

1. provide NEW-runtime structured TED search functionality
2. make cutover measurable through parity checks against the legacy search implementation

## Runtime scope

This functionality is active only in `RuntimeMode.NEW`.

Controller:
- `at.procon.dip.domain.ted.web.TedStructuredSearchController`

Service:
- `at.procon.dip.domain.ted.service.TedStructuredSearchService`

Repository:
- `at.procon.dip.domain.ted.search.TedStructuredSearchRepository`

## Endpoint

### GET
`GET /v1/documents/search`

### POST
`POST /v1/documents/search`

The POST body uses the existing legacy-compatible DTO:
- `at.procon.ted.model.dto.DocumentDtos.SearchRequest`

The response uses:
- `at.procon.ted.model.dto.DocumentDtos.SearchResponse`

## Implemented structured filters

The Wave 2 implementation supports these filters:

- `countryCode`
- `countryCodes`
- `noticeType`
- `contractNature`
- `procedureType`
- `cpvPrefix`
- `cpvCodes`
- `nutsCode`
- `nutsCodes`
- `publicationDateFrom`
- `publicationDateTo`
- `submissionDeadlineAfter`
- `euFunded`
- `buyerNameContains`
- `projectTitleContains`

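
As a rough illustration of how a subset of these filters could narrow rows in `TED.ted_notice_projection`, the Python sketch below builds a parameterized `WHERE` clause. The column names (`country_code`, `notice_type`, `publication_date`, and so on), the inclusive date bounds, the case-insensitive substring match, and the `build_where` helper are all assumptions made for illustration; the actual repository query is not shown in this document.

```python
from typing import Any

# Hypothetical mapping from request filter names to projection columns.
# The real column names in TED.ted_notice_projection may differ.
EQUALS_COLUMNS = {
    "countryCode": "country_code",
    "noticeType": "notice_type",
    "contractNature": "contract_nature",
    "procedureType": "procedure_type",
    "euFunded": "eu_funded",
}


def build_where(filters: dict[str, Any]) -> tuple[str, list[Any]]:
    """Return a WHERE clause with '?' placeholders and its parameter list."""
    clauses: list[str] = []
    params: list[Any] = []

    # Simple equality filters; filters that are absent or None are skipped.
    for name, column in EQUALS_COLUMNS.items():
        if filters.get(name) is not None:
            clauses.append(f"{column} = ?")
            params.append(filters[name])

    # Multi-value filter rendered as an IN list with one placeholder per value.
    if filters.get("countryCodes"):
        marks = ", ".join("?" for _ in filters["countryCodes"])
        clauses.append(f"country_code IN ({marks})")
        params.extend(filters["countryCodes"])

    # Date range; treating both bounds as inclusive is an assumption.
    if filters.get("publicationDateFrom") is not None:
        clauses.append("publication_date >= ?")
        params.append(filters["publicationDateFrom"])
    if filters.get("publicationDateTo") is not None:
        clauses.append("publication_date <= ?")
        params.append(filters["publicationDateTo"])

    # Substring filter; case-insensitive matching is an assumption.
    if filters.get("buyerNameContains"):
        clauses.append("LOWER(buyer_name) LIKE ?")
        params.append(f"%{filters['buyerNameContains'].lower()}%")

    where = " AND ".join(clauses) if clauses else "1 = 1"
    return where, params


where, params = build_where(
    {"countryCodes": ["AT", "DE"], "noticeType": "CN_STANDARD", "euFunded": True}
)
print(where)
print(params)
```

Keeping values in a separate parameter list, rather than formatting them into the SQL string, is the usual way to avoid injection when filters come straight from a request.
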
## Sorting and pagination

Supported sorting:

- `publicationDate`
- `submissionDeadline`
- `buyerName`
- `projectTitle`

Supported directions:

- `asc`
- `desc`

Pagination behavior:

- page defaults to `0`
- size defaults to `DipSearchProperties.defaultPageSize`
- size is capped by `DipSearchProperties.maxPageSize`

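
The defaulting and capping rules above can be sketched in a few lines of Python. The concrete values `20` and `100` are placeholders, since this document does not state what `DipSearchProperties.defaultPageSize` and `DipSearchProperties.maxPageSize` are set to, and the helper names are hypothetical. How the service treats negative or zero values is not described here, so the sketch leaves them untouched.

```python
# Assumed values; the real ones come from DipSearchProperties configuration.
DEFAULT_PAGE_SIZE = 20
MAX_PAGE_SIZE = 100

SORT_FIELDS = {"publicationDate", "submissionDeadline", "buyerName", "projectTitle"}
SORT_DIRECTIONS = {"asc", "desc"}


def normalize_paging(page=None, size=None):
    """Apply the documented defaults and the size cap."""
    page = 0 if page is None else page
    size = DEFAULT_PAGE_SIZE if size is None else size
    size = min(size, MAX_PAGE_SIZE)  # size never exceeds the configured maximum
    return page, size


def is_supported_sort(sort_by, sort_direction):
    """Check a sort request against the documented fields and directions."""
    return sort_by in SORT_FIELDS and sort_direction in SORT_DIRECTIONS


print(normalize_paging())            # defaults applied
print(normalize_paging(2, 500))      # oversized page request is capped
print(is_supported_sort("buyerName", "asc"))
```
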
## Data source

The endpoint reads from:
- `TED.ted_notice_projection`

The quality and completeness of the search results therefore depend on how completely the Wave 1 migration and the projection backfill have run.

## Functional behavior

The Wave 2 implementation is intentionally **structured-search-first**.

Although the request DTO still contains:
- `semanticQuery`
- `similarityThreshold`

these fields are currently accepted only for request compatibility and future extension. The current repository implementation does **not** apply semantic ranking or semantic filtering.

That is deliberate for Wave 2, because the main objective is:
- structured search on the NEW model
- parity verification against legacy behavior for common structured filters

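
One way to picture this behavior is to split an incoming request into the fields that drive the search and the fields that are carried only for compatibility. The Python sketch below is illustrative; `split_request` and `COMPAT_ONLY_FIELDS` are hypothetical names and do not reflect how the service is actually structured.

```python
# Fields accepted for request compatibility but not applied in Wave 2.
COMPAT_ONLY_FIELDS = ("semanticQuery", "similarityThreshold")


def split_request(request: dict) -> tuple[dict, dict]:
    """Separate fields that drive the search from compatibility-only fields."""
    applied = {k: v for k, v in request.items() if k not in COMPAT_ONLY_FIELDS}
    ignored = {k: v for k, v in request.items() if k in COMPAT_ONLY_FIELDS}
    return applied, ignored


with_semantic = {
    "countryCode": "AT",
    "semanticQuery": "digital transformation services",
    "similarityThreshold": 0.7,
}
without_semantic = {"countryCode": "AT"}

# Both requests reduce to the same applied filters, so under the behavior
# described above they should select the same rows in the same order.
print(split_request(with_semantic)[0] == split_request(without_semantic)[0])
```
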
## Parity strategy

Wave 2 adds parity-focused tests that compare NEW structured search behavior against the legacy TED search for a common subset of structured filters.

Recommended parity focus:

- country filters
- notice type
- procedure type
- publication date range
- EU-funded filter
- deterministic sort order

Parity should be evaluated on:

- total result count
- ordered publication ids / notice ids for stable cases
- key metadata fields in `DocumentSummary`

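
The three evaluation criteria can be combined into a small comparison helper for manual or scripted checks. This Python sketch assumes both responses have been parsed into dictionaries with a `totalCount` number and a `documents` list whose entries carry an `id`; those key names, the compared fields, and `compare_parity` itself are assumptions, because the exact `SearchResponse` and `DocumentSummary` layout is not given in this document.

```python
def compare_parity(legacy, new, id_key="id", fields=("noticeType", "publicationDate")):
    """Return a list of human-readable differences; empty means parity."""
    diffs = []

    # 1. Total result count.
    if legacy["totalCount"] != new["totalCount"]:
        diffs.append(f"total count: legacy={legacy['totalCount']} new={new['totalCount']}")

    # 2. Ordered ids (only meaningful when the sort order is deterministic).
    legacy_ids = [doc[id_key] for doc in legacy["documents"]]
    new_ids = [doc[id_key] for doc in new["documents"]]
    if legacy_ids != new_ids:
        diffs.append("ordered ids differ")

    # 3. Key metadata fields for documents present on both sides.
    new_by_id = {doc[id_key]: doc for doc in new["documents"]}
    for doc in legacy["documents"]:
        other = new_by_id.get(doc[id_key])
        if other is None:
            continue  # a missing document is already covered by the id check
        for field in fields:
            if doc.get(field) != other.get(field):
                diffs.append(f"{doc[id_key]}: {field} differs")

    return diffs


legacy = {"totalCount": 2, "documents": [
    {"id": "a", "noticeType": "CN_STANDARD"},
    {"id": "b", "noticeType": "CN_STANDARD"},
]}
new = {"totalCount": 2, "documents": [
    {"id": "b", "noticeType": "CN_STANDARD"},
    {"id": "a", "noticeType": "OTHER"},
]}
print(compare_parity(legacy, new))
```

Returning a list of differences instead of a single boolean makes it easier to tell a projection backfill gap (count mismatch) from a sorting difference (same ids, different order).
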
## Current limitations

1. No semantic scoring is applied in the NEW structured TED search path yet.
2. No TED facets/aggregations are included yet.
3. Search is projection-based, so missing or stale `ted_notice_projection` rows can cause parity differences.
4. The Wave 2 scope is TED-specific structured retrieval, not the full generic hybrid search fusion pipeline.

## Example GET request

```http
GET /v1/documents/search?countryCode=AT&noticeType=CN_STANDARD&publicationDateFrom=2025-01-01&publicationDateTo=2025-12-31&page=0&size=20&sortBy=publicationDate&sortDirection=desc
```

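
When a GET request like this is assembled in a script, building the query string with a URL-encoding helper avoids mistakes in the separators and escapes values that need it. The Python sketch below uses the standard library's `urllib.parse.urlencode`; the base URL `http://localhost:8080` and the `build_search_url` helper are assumptions for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # assumed host and port


def build_search_url(params: dict, base_url: str = BASE_URL) -> str:
    """Build a GET URL for /v1/documents/search, skipping unset parameters."""
    present = {name: value for name, value in params.items() if value is not None}
    return f"{base_url}/v1/documents/search?{urlencode(present)}"


url = build_search_url({
    "countryCode": "AT",
    "noticeType": "CN_STANDARD",
    "publicationDateFrom": "2025-01-01",
    "publicationDateTo": "2025-12-31",
    "page": 0,
    "size": 20,
    "sortBy": "publicationDate",
    "sortDirection": "desc",
})
print(url)
```
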
## Example POST request

```json
{
  "countryCodes": ["AT", "DE"],
  "noticeType": "CN_STANDARD",
  "contractNature": "SERVICES",
  "procedureType": "OPEN",
  "cpvPrefix": "79000000",
  "cpvCodes": ["79341000"],
  "nutsCodes": ["AT130", "DE300"],
  "publicationDateFrom": "2025-01-01",
  "publicationDateTo": "2025-12-31",
  "submissionDeadlineAfter": "2025-06-01T00:00:00Z",
  "euFunded": true,
  "buyerNameContains": "city",
  "projectTitleContains": "digital",
  "semanticQuery": "framework agreement for digital transformation services",
  "similarityThreshold": 0.7,
  "page": 0,
  "size": 20,
  "sortBy": "publicationDate",
  "sortDirection": "desc"
}
```

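
A body like the one above can be sent from a script with the Python standard library. The sketch below only constructs the request object so that it runs without a server; the base URL and the `build_search_request` helper are assumptions, and whether the endpoint requires authentication headers is not covered by this document.

```python
import json
from urllib.request import Request


def build_search_request(body: dict, base_url: str = "http://localhost:8080") -> Request:
    """Build a POST request for /v1/documents/search with a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        f"{base_url}/v1/documents/search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = {
    "countryCodes": ["AT", "DE"],
    "noticeType": "CN_STANDARD",
    "euFunded": True,
    "page": 0,
    "size": 20,
    "sortBy": "publicationDate",
    "sortDirection": "desc",
}
request = build_search_request(body)
print(request.get_method(), request.full_url)

# To actually send it against a running instance:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       result = json.load(response)
```
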
## Postman collection

Use the companion file:
- `WAVE2_TED_STRUCTURED_SEARCH.postman_collection.json`

It contains:
- basic GET search
- CPV/NUTS/buyer GET example
- full POST structured request
- a parity-oriented GET request for manual comparison against legacy search

## Recommended next step after Wave 2 validation

After parity is accepted, the next logical enhancements are:

1. add TED facets and richer structural filters
2. merge structured TED narrowing with lexical/semantic ranking
3. expose a documented parity validation checklist for cutover approval