Compare commits

No commits in common. '3284205a9e45242e0620b1b0384de9c9d958a7e3' and '0ce5f5138208de2c8dae205c1c06ab02d52dd3f1' have entirely different histories.

@ -1,175 +0,0 @@
# Wave 2 — NEW TED Structured Search
## Purpose
Wave 2 adds a NEW-runtime TED search endpoint that keeps the legacy request and response shape of `/v1/documents/search`, but executes the search against `TED.ted_notice_projection` instead of the legacy search path.
The goal is twofold:
1. provide NEW-runtime structured TED search functionality
2. make cutover measurable through parity checks against the legacy search implementation
## Runtime scope
This functionality is active only in `RuntimeMode.NEW`.
Controller:
- `at.procon.dip.domain.ted.web.TedStructuredSearchController`
Service:
- `at.procon.dip.domain.ted.service.TedStructuredSearchService`
Repository:
- `at.procon.dip.domain.ted.search.TedStructuredSearchRepository`
## Endpoint
### GET
`GET /v1/documents/search`
### POST
`POST /v1/documents/search`
The POST body uses the existing legacy-compatible DTO:
- `at.procon.ted.model.dto.DocumentDtos.SearchRequest`
The response uses:
- `at.procon.ted.model.dto.DocumentDtos.SearchResponse`
## Implemented structured filters
The Wave 2 implementation supports these filters:
- `countryCode`
- `countryCodes`
- `noticeType`
- `contractNature`
- `procedureType`
- `cpvPrefix`
- `cpvCodes`
- `nutsCode`
- `nutsCodes`
- `publicationDateFrom`
- `publicationDateTo`
- `submissionDeadlineAfter`
- `euFunded`
- `buyerNameContains`
- `projectTitleContains`
## Sorting and pagination
Supported sorting:
- `publicationDate`
- `submissionDeadline`
- `buyerName`
- `projectTitle`
Supported directions:
- `asc`
- `desc`
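A minimal sketch of how the sort whitelist above could be resolved. It assumes that an unknown or missing `sortBy` falls back to `publicationDate` and that anything other than `asc` is treated as descending. The class name `SortResolver` and the bare column names are illustrative only.

```java
import java.util.Map;

final class SortResolver {
    // Whitelist of API sort keys mapped to (assumed) projection column names.
    private static final Map<String, String> COLUMNS = Map.of(
            "publicationDate", "publication_date",
            "submissionDeadline", "submission_deadline",
            "buyerName", "buyer_name",
            "projectTitle", "project_title");

    // Returns an ORDER BY fragment; never concatenates raw user input into SQL.
    static String resolve(String sortBy, String sortDirection) {
        // Map.of rejects null keys, so handle a missing sortBy explicitly.
        String key = sortBy == null ? "publicationDate" : sortBy;
        String column = COLUMNS.getOrDefault(key, "publication_date");
        String direction = "asc".equalsIgnoreCase(sortDirection) ? "ASC" : "DESC";
        return column + " " + direction + " NULLS LAST";
    }
}
```

Resolving through a fixed map keeps the sort field out of the caller's control, which matters because the ORDER BY clause cannot be bound as a query parameter.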
Pagination behavior:
- page defaults to `0`
- size defaults to `DipSearchProperties.defaultPageSize`
- size is capped by `DipSearchProperties.maxPageSize`
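The pagination rules can be sketched as follows. This is an illustration under assumptions: `PageBounds` is a hypothetical helper, the two limits stand in for the `DipSearchProperties` values, and treating a missing or non-positive `size` as "use the default" is a guess rather than confirmed behavior.

```java
// Hypothetical helper; not part of the Wave 2 code base.
record PageBounds(int page, int size) {

    // defaultPageSize and maxPageSize stand in for the DipSearchProperties values.
    static PageBounds of(Integer page, Integer size, int defaultPageSize, int maxPageSize) {
        int safePage = page == null ? 0 : Math.max(0, page);
        // Assumption: a missing or non-positive size means "use the default".
        int requested = (size == null || size <= 0) ? defaultPageSize : size;
        return new PageBounds(safePage, Math.min(requested, maxPageSize));
    }

    // Row offset for LIMIT/OFFSET queries, computed as long to avoid int overflow.
    long offset() {
        return (long) page * size;
    }
}
```

With illustrative limits of 20 and 100, `PageBounds.of(3, 500, 20, 100)` yields page 3 with size 100, so the query would skip 300 rows.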
## Data source
The endpoint reads from:
- `TED.ted_notice_projection`
Result quality and completeness therefore depend on how complete the Wave 1 migration and the projection backfill are.
## Functional behavior
The Wave 2 implementation is intentionally **structured-search-first**.
Although the request DTO still contains:
- `semanticQuery`
- `similarityThreshold`
these fields are currently accepted only for request compatibility and future extension. The current repository implementation does **not** apply semantic ranking or semantic filtering.
That is deliberate for Wave 2, because the main objective is:
- structured search on the NEW model
- parity verification against legacy behavior for common structured filters
## Parity strategy
Wave 2 adds parity-focused tests that compare NEW structured search behavior against the legacy TED search for a common subset of structured filters.
Recommended parity focus:
- country filters
- notice type
- procedure type
- publication date range
- EU-funded filter
- deterministic sort order
Parity should be evaluated on:
- total result count
- ordered publication IDs / notice IDs for cases with a deterministic sort order
- key metadata fields in `DocumentSummary`
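A rough sketch of the first two parity criteria, comparing a legacy page with a NEW page. `ParityCheck` and its `Page` record are hypothetical stand-ins for the real response types, and the metadata comparison on `DocumentSummary` is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class ParityCheck {
    // Minimal stand-in for a search response: total count plus ordered ids.
    record Page(long totalCount, List<String> orderedIds) {}

    // Returns human-readable differences; an empty list means parity holds.
    static List<String> differences(Page legacy, Page candidate) {
        List<String> diffs = new ArrayList<>();
        if (legacy.totalCount() != candidate.totalCount()) {
            diffs.add("totalCount: legacy=" + legacy.totalCount()
                    + " new=" + candidate.totalCount());
        }
        List<String> a = legacy.orderedIds();
        List<String> b = candidate.orderedIds();
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                diffs.add("orderedIds: first mismatch at index " + i);
                return diffs;
            }
        }
        if (a.size() != b.size()) {
            diffs.add("orderedIds: length legacy=" + a.size() + " new=" + b.size());
        }
        return diffs;
    }
}
```

Reporting the first mismatching index rather than a plain boolean makes it easier to tell a sort tie-break difference from a missing projection row.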
## Current limitations
1. No semantic scoring is applied in the NEW structured TED search path yet.
2. No TED facets/aggregations are included yet.
3. Search is projection-based, so missing or stale `ted_notice_projection` rows can cause parity differences.
4. The Wave 2 scope is TED-specific structured retrieval, not the full generic hybrid search fusion pipeline.
## Example GET request
```http
GET /v1/documents/search?countryCode=AT&noticeType=CN_STANDARD&publicationDateFrom=2025-01-01&publicationDateTo=2025-12-31&page=0&size=20&sortBy=publicationDate&sortDirection=desc
```
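For callers that assemble this request programmatically, the following sketch builds the same kind of URL with proper query encoding. `SearchUrlBuilder` is a hypothetical helper and the base URL passed in is an assumption about the local deployment.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

final class SearchUrlBuilder {
    // Builds the search URI; pass an insertion-ordered map to keep parameter order stable.
    static URI build(String baseUrl, Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
        return URI.create(baseUrl + "/v1/documents/search?" + query);
    }

    // Form-style encoding: spaces become '+', reserved characters are percent-encoded.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Passing a `LinkedHashMap` with `countryCode=AT` and `publicationDateFrom=2025-01-01` reproduces the leading part of the example request above.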
## Example POST request
```json
{
  "countryCodes": ["AT", "DE"],
  "noticeType": "CN_STANDARD",
  "contractNature": "SERVICES",
  "procedureType": "OPEN",
  "cpvPrefix": "79000000",
  "cpvCodes": ["79341000"],
  "nutsCodes": ["AT130", "DE300"],
  "publicationDateFrom": "2025-01-01",
  "publicationDateTo": "2025-12-31",
  "submissionDeadlineAfter": "2025-06-01T00:00:00Z",
  "euFunded": true,
  "buyerNameContains": "city",
  "projectTitleContains": "digital",
  "semanticQuery": "framework agreement for digital transformation services",
  "similarityThreshold": 0.7,
  "page": 0,
  "size": 20,
  "sortBy": "publicationDate",
  "sortDirection": "desc"
}
```
## Postman collection
Use the companion file:
- `WAVE2_TED_STRUCTURED_SEARCH.postman_collection.json`
It contains:
- basic GET search
- CPV/NUTS/buyer GET example
- full POST structured request
- a parity-oriented GET request for manual comparison against legacy search
## Recommended next step after Wave 2 validation
After parity is accepted, the next logical enhancements are:
1. add TED facets and richer structural filters
2. merge structured TED narrowing with lexical/semantic ranking
3. expose a documented parity validation checklist for cutover approval

@ -1,117 +0,0 @@
# Wave 2 — Extended TED structured search in NEW runtime
## What was added
This extension completes the missing parts from the earlier Wave 2 proposal:
1. **Projection-aware TED structured search in NEW runtime**
- endpoint: `GET /v1/documents/search`
- endpoint: `POST /v1/documents/search`
- active only in `dip.runtime.mode=NEW`
2. **Repository-level joins across NEW projection model**
- `DOC.doc_document`
- `TED.ted_notice_projection`
- `TED.ted_notice_lot`
- `TED.ted_notice_organization`
3. **Extended TED structured filters**
- `countryCode`, `countryCodes`
- `noticeType`
- `contractNature`
- `procedureType`
- `cpvPrefix`, `cpvCodes`
- `nutsCode`, `nutsCodes`
- `publicationDateFrom`, `publicationDateTo`
- `submissionDeadlineAfter`
- `euFunded`
- `buyerNameContains`
- `projectTitleContains`
4. **Hybrid ranking path**
- structured filters first narrow the candidate `document_id` set
- generic NEW lexical/trigram/semantic search ranks only inside that candidate set
- request parameter `q` is used as the hybrid query text
- `similarityThreshold` is forwarded as a per-request semantic threshold override
5. **Facets**
- countries
- notice types
- procedure types
- buyers
- publication months (`YYYY-MM`)
- CPV families (first 2 digits)
6. **Parity coverage**
- NEW structured-only parity test against legacy `SearchService` for shared filters
- NEW endpoint integration test for structured results + facets
## Main classes
- `TedStructuredSearchRepository`
- `TedStructuredSearchService`
- `TedStructuredSearchController`
- `TedStructuredSearchFilter`
- `TedStructuredSearchFacets`
## How hybrid search works
For requests with `q`:
1. apply TED structured filters on projection tables
2. collect matching `document_id`s
3. pass those ids into NEW generic search scope as `candidateDocumentIds`
4. let NEW search engines rank those TED documents
5. map ranked hits back to TED summaries
This gives structured filtering plus lexical/trigram/semantic relevance ranking.
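The flow above can be sketched with in-memory stand-ins. Everything here is illustrative: `HybridSearchSketch`, its records, and the toy term-overlap scorer are assumptions, and the real NEW search engines (lexical, trigram, semantic) are not modeled.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class HybridSearchSketch {
    record Notice(String documentId, String countryCode, String title) {}
    record Hit(String documentId, double score) {}

    static List<Hit> search(List<Notice> projection, Predicate<Notice> structuredFilter,
                            String q, int candidateLimit) {
        // Steps 1-2: apply structured filters and collect candidate document ids.
        Set<String> candidateIds = projection.stream()
                .filter(structuredFilter)
                .limit(candidateLimit)
                .map(Notice::documentId)
                .collect(Collectors.toCollection(LinkedHashSet::new));
        // Steps 3-4: rank only documents inside the candidate set.
        // Step 5 (mapping hits back to TED summaries) is omitted here.
        return projection.stream()
                .filter(n -> candidateIds.contains(n.documentId()))
                .map(n -> new Hit(n.documentId(), score(n.title(), q)))
                .filter(h -> h.score() > 0)
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .collect(Collectors.toList());
    }

    // Toy lexical score: fraction of query terms contained in the title.
    static double score(String title, String q) {
        String[] terms = q.toLowerCase().trim().split("\\s+");
        String haystack = title.toLowerCase();
        long matched = Arrays.stream(terms).filter(haystack::contains).count();
        return (double) matched / terms.length;
    }
}
```

The point of the two-phase shape is that ranking cost is bounded by the candidate limit rather than by the size of the whole corpus.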
## New configuration
```yaml
dip:
  ted:
    projection:
      structured-search-hybrid-candidate-limit: 5000
      structured-search-facet-bucket-limit: 12
```
## Current behavior notes
- Structured-only requests work without `q`
- Hybrid requests use `q` and NEW generic ranking
- When `q` is present, returned `similarity` contains the fused NEW search score
- Facets are computed from the structured candidate set before pagination
- `includeFacets=false` disables facet calculation
- `facetBucketLimit` overrides the default bucket size per request
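A small sketch of facet bucketing over the full candidate set, before pagination is applied. `FacetSketch` is a hypothetical in-memory stand-in for the SQL aggregation; it assumes buckets are ordered by count descending, then key ascending, with blank keys skipped.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class FacetSketch {
    record Bucket(String key, long count) {}

    // values holds one facet value per candidate document (not per page).
    static List<Bucket> topBuckets(List<String> values, int bucketLimit) {
        Map<String, Long> counts = values.stream()
                .filter(v -> v != null && !v.isEmpty())
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()));
        return counts.entrySet().stream()
                .map(e -> new Bucket(e.getKey(), e.getValue()))
                // Highest count first; ties broken alphabetically for stable output.
                .sorted(Comparator.comparingLong(Bucket::count).reversed()
                        .thenComparing(Bucket::key))
                .limit(Math.max(1, bucketLimit))
                .collect(Collectors.toList());
    }
}
```

Because the input is the whole candidate set, the bucket counts stay the same regardless of which page the client requests.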
## Compatibility notes
- The NEW endpoint reuses the legacy `DocumentDtos.SearchRequest` and `SearchResponse`
- The response was extended with optional `facets`
- Existing legacy clients remain compatible because extra JSON fields are additive
## Parity scope
Parity is implemented for **shared structured filters** between legacy and NEW runtime.
Good parity candidates:
- country
- notice type
- contract nature
- procedure type
- publication date range
- submission deadline after
- EU funded
- buyer name contains
- project title contains
Legacy structured parity is **not exact** for filters that legacy `SearchService` does not implement in structured mode, especially:
- lot/organization-expanded `cpvPrefix`
- `cpvCodes`
- `nutsCode`
- `nutsCodes`
- lot-level EU funded semantics
Those are NEW-runtime improvements on top of legacy behavior.

@ -1,178 +0,0 @@
{
"info": {
"_postman_id": "9f9b7a8a-b96b-4f3a-a377-0ce5b54d0a01",
"name": "DIP Semantic Search - e5-default",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
"description": "Sample semantic and hybrid search queries against the DIP generic search endpoint using semanticModelKey=e5-default (intfloat/multilingual-e5-large)."
},
"item": [
{
"name": "Search / Semantic / English",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"queryText\": \"framework agreement for district heating optimization in municipal energy systems\",\n \"modes\": [\n \"SEMANTIC\"\n ],\n \"semanticModelKey\": \"e5-default\",\n \"collapseByDocument\": true,\n \"representationSelectionMode\": \"PRIMARY_AND_CHUNKS\",\n \"page\": 0,\n \"size\": 10\n}"
},
"url": {
"raw": "{{baseUrl}}/search",
"host": [
"{{baseUrl}}"
],
"path": [
"search"
]
}
}
},
{
"name": "Search / Semantic / German",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"queryText\": \"Rahmenvertrag für die Optimierung von Fernwärmesystemen in kommunalen Energienetzen\",\n \"modes\": [\n \"SEMANTIC\"\n ],\n \"semanticModelKey\": \"e5-default\",\n \"collapseByDocument\": true,\n \"representationSelectionMode\": \"PRIMARY_AND_CHUNKS\",\n \"page\": 0,\n \"size\": 10\n}"
},
"url": {
"raw": "{{baseUrl}}/search",
"host": [
"{{baseUrl}}"
],
"path": [
"search"
]
}
}
},
{
"name": "Search / Semantic / Bulgarian",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"queryText\": \"рамково споразумение за оптимизация на системи за централно отопление в общински енергийни мрежи\",\n \"modes\": [\n \"SEMANTIC\"\n ],\n \"semanticModelKey\": \"e5-default\",\n \"collapseByDocument\": true,\n \"representationSelectionMode\": \"PRIMARY_AND_CHUNKS\",\n \"page\": 0,\n \"size\": 10\n}"
},
"url": {
"raw": "{{baseUrl}}/search",
"host": [
"{{baseUrl}}"
],
"path": [
"search"
]
}
}
},
{
"name": "Search / Hybrid / English",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"queryText\": \"district heating optimization framework agreement\",\n \"modes\": [\n \"HYBRID\"\n ],\n \"semanticModelKey\": \"e5-default\",\n \"collapseByDocument\": true,\n \"representationSelectionMode\": \"PRIMARY_AND_CHUNKS\",\n \"page\": 0,\n \"size\": 10\n}"
},
"url": {
"raw": "{{baseUrl}}/search",
"host": [
"{{baseUrl}}"
],
"path": [
"search"
]
}
}
},
{
"name": "Search / Semantic / Generic Filters",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"queryText\": \"municipal energy efficiency strategy\",\n \"modes\": [\n \"SEMANTIC\"\n ],\n \"semanticModelKey\": \"e5-default\",\n \"documentTypes\": [\n \"TEXT\",\n \"HTML\",\n \"PDF\"\n ],\n \"documentFamilies\": [\n \"GENERIC\"\n ],\n \"representationTypes\": [\n \"SEMANTIC_TEXT\",\n \"CHUNK\"\n ],\n \"languageCodes\": [\n \"en\",\n \"de\",\n \"bg\"\n ],\n \"collapseByDocument\": true,\n \"representationSelectionMode\": \"PRIMARY_AND_CHUNKS\",\n \"page\": 0,\n \"size\": 10\n}"
},
"url": {
"raw": "{{baseUrl}}/search",
"host": [
"{{baseUrl}}"
],
"path": [
"search"
]
}
}
},
{
"name": "Search / Debug / Semantic",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"queryText\": \"district heating optimization\",\n \"modes\": [\n \"SEMANTIC\"\n ],\n \"semanticModelKey\": \"e5-default\",\n \"collapseByDocument\": true,\n \"representationSelectionMode\": \"PRIMARY_AND_CHUNKS\",\n \"page\": 0,\n \"size\": 10\n}"
},
"url": {
"raw": "{{baseUrl}}/search/debug",
"host": [
"{{baseUrl}}"
],
"path": [
"search",
"debug"
]
}
}
},
{
"name": "Search / Metrics",
"request": {
"method": "GET",
"header": [],
"url": {
"raw": "{{baseUrl}}/search/metrics",
"host": [
"{{baseUrl}}"
],
"path": [
"search",
"metrics"
]
}
}
}
]
}

@ -1,15 +0,0 @@
{
"id": "f2cf3c4b-e0f7-45ff-a9c2-32f4d3d23770",
"name": "DIP Semantic Search Local",
"values": [
{
"key": "baseUrl",
"value": "http://localhost:8080/api",
"type": "default",
"enabled": true
}
],
"_postman_variable_scope": "environment",
"_postman_exported_at": "2026-03-23T13:00:00Z",
"_postman_exported_using": "OpenAI ChatGPT"
}

@ -1,103 +0,0 @@
{
"info": {
"name": "Wave 2 TED Structured Search Extended",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
"description": "NEW runtime TED structured search with projection-aware filters, hybrid ranking, and facets."
},
"variable": [
{ "key": "baseUrl", "value": "http://localhost:8080/api" }
],
"item": [
{
"name": "Structured only - GET",
"request": {
"method": "GET",
"url": {
"raw": "{{baseUrl}}/v1/documents/search?countryCode=AUT&noticeType=CONTRACT_NOTICE&includeFacets=true&page=0&size=20&sortBy=publicationDate&sortDirection=desc",
"host": ["{{baseUrl}}"],
"path": ["v1", "documents", "search"],
"query": [
{ "key": "countryCode", "value": "AUT" },
{ "key": "noticeType", "value": "CONTRACT_NOTICE" },
{ "key": "includeFacets", "value": "true" },
{ "key": "page", "value": "0" },
{ "key": "size", "value": "20" },
{ "key": "sortBy", "value": "publicationDate" },
{ "key": "sortDirection", "value": "desc" }
]
}
},
"event": [{
"listen": "test",
"script": {
"exec": [
"pm.test('status 200', function () { pm.response.to.have.status(200); });",
"const json = pm.response.json();",
"pm.test('documents array exists', function () { pm.expect(json.documents).to.be.an('array'); });",
"pm.test('facets object exists', function () { pm.expect(json.facets).to.be.an('object'); });"
]
}
}]
},
{
"name": "Hybrid ranked TED search - GET",
"request": {
"method": "GET",
"url": {
"raw": "{{baseUrl}}/v1/documents/search?countryCode=DEU&cpvPrefix=33&q=medical imaging systems&similarityThreshold=0.65&includeFacets=true",
"host": ["{{baseUrl}}"],
"path": ["v1", "documents", "search"],
"query": [
{ "key": "countryCode", "value": "DEU" },
{ "key": "cpvPrefix", "value": "33" },
{ "key": "q", "value": "medical imaging systems" },
{ "key": "similarityThreshold", "value": "0.65" },
{ "key": "includeFacets", "value": "true" }
]
}
},
"event": [{
"listen": "test",
"script": {
"exec": [
"pm.test('status 200', function () { pm.response.to.have.status(200); });",
"const json = pm.response.json();",
"pm.test('documents array exists', function () { pm.expect(json.documents).to.be.an('array'); });"
]
}
}]
},
{
"name": "Structured only - POST with facets",
"request": {
"method": "POST",
"header": [{ "key": "Content-Type", "value": "application/json" }],
"body": {
"mode": "raw",
"raw": "{\n \"countryCodes\": [\"AUT\", \"DEU\"],\n \"noticeType\": \"CONTRACT_NOTICE\",\n \"contractNature\": \"SUPPLIES\",\n \"procedureType\": \"OPEN\",\n \"publicationDateFrom\": \"2026-01-01\",\n \"publicationDateTo\": \"2026-12-31\",\n \"includeFacets\": true,\n \"facetBucketLimit\": 10,\n \"page\": 0,\n \"size\": 20,\n \"sortBy\": \"publicationDate\",\n \"sortDirection\": \"desc\"\n}"
},
"url": {
"raw": "{{baseUrl}}/v1/documents/search",
"host": ["{{baseUrl}}"],
"path": ["v1", "documents", "search"]
}
}
},
{
"name": "Parity-style request for shared legacy filters",
"request": {
"method": "POST",
"header": [{ "key": "Content-Type", "value": "application/json" }],
"body": {
"mode": "raw",
"raw": "{\n \"countryCode\": \"AUT\",\n \"noticeType\": \"CONTRACT_NOTICE\",\n \"contractNature\": \"SERVICES\",\n \"procedureType\": \"OPEN\",\n \"projectTitleContains\": \"maintenance\",\n \"publicationDateFrom\": \"2026-04-01\",\n \"publicationDateTo\": \"2026-04-30\",\n \"page\": 0,\n \"size\": 20,\n \"sortBy\": \"publicationDate\",\n \"sortDirection\": \"desc\"\n}"
},
"url": {
"raw": "{{baseUrl}}/v1/documents/search",
"host": ["{{baseUrl}}"],
"path": ["v1", "documents", "search"]
}
}
}
]
}

@ -13,8 +13,4 @@ public class TedProjectionProperties {
private boolean startupBackfillEnabled = false;
@Positive
private int startupBackfillLimit = 250;
@Positive
private int structuredSearchHybridCandidateLimit = 5000;
@Positive
private int structuredSearchFacetBucketLimit = 12;
}

@ -1,395 +0,0 @@
package at.procon.dip.domain.ted.search;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchFacetEntry;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchFacets;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchFilter;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchSummaryRow;
import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import java.math.BigDecimal;
import java.sql.Array;
import java.sql.SQLException;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;
import lombok.RequiredArgsConstructor;
import org.springframework.jdbc.core.RowMapper;
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.stereotype.Repository;
import org.springframework.util.StringUtils;
@Repository
@RequiredArgsConstructor
public class TedStructuredSearchRepository {
private final NamedParameterJdbcTemplate jdbcTemplate;
public List<UUID> findCandidateDocumentIds(TedStructuredSearchFilter filter, int limit) {
StringBuilder sql = new StringBuilder(baseFromWhere(filter, false));
sql.insert(0, "SELECT p.document_id ");
sql.append(" GROUP BY p.document_id, p.publication_date, p.created_at");
sql.append(" ORDER BY p.publication_date DESC NULLS LAST, p.created_at DESC LIMIT :limit");
MapSqlParameterSource params = params(filter);
params.addValue("limit", limit);
return jdbcTemplate.query(sql.toString(), params, (rs, rowNum) -> rs.getObject(1, UUID.class));
}
public long countDistinctDocuments(TedStructuredSearchFilter filter) {
StringBuilder sql = new StringBuilder("SELECT COUNT(DISTINCT p.document_id) ");
sql.append(baseFromWhere(filter, false));
return jdbcTemplate.queryForObject(sql.toString(), params(filter), Long.class);
}
public List<TedStructuredSearchSummaryRow> searchStructured(TedStructuredSearchFilter filter,
int page,
int size,
String sortBy,
String sortDirection) {
StringBuilder sql = new StringBuilder("""
SELECT
p.document_id,
p.publication_id,
p.notice_id,
p.notice_type,
p.project_title,
p.buyer_name,
p.buyer_country_code,
p.buyer_city,
p.contract_nature,
p.procedure_type,
p.publication_date,
p.submission_deadline,
p.cpv_codes,
p.total_lots,
p.estimated_value,
p.estimated_value_currency
""");
sql.append(baseFromWhere(filter, false));
sql.append(" GROUP BY p.document_id, p.publication_id, p.notice_id, p.notice_type, p.project_title, p.buyer_name, p.buyer_country_code, p.buyer_city, p.contract_nature, p.procedure_type, p.publication_date, p.submission_deadline, p.cpv_codes, p.total_lots, p.estimated_value, p.estimated_value_currency");
sql.append(" ORDER BY ").append(resolveSort(sortBy, sortDirection));
sql.append(" LIMIT :limit OFFSET :offset");
MapSqlParameterSource params = params(filter);
params.addValue("limit", size);
params.addValue("offset", (long) Math.max(0, page) * size);
return jdbcTemplate.query(sql.toString(), params, SUMMARY_ROW_MAPPER);
}
public List<TedStructuredSearchSummaryRow> findSummariesByDocumentIds(List<UUID> documentIds) {
if (documentIds == null || documentIds.isEmpty()) {
return List.of();
}
String sql = """
SELECT
p.document_id,
p.publication_id,
p.notice_id,
p.notice_type,
p.project_title,
p.buyer_name,
p.buyer_country_code,
p.buyer_city,
p.contract_nature,
p.procedure_type,
p.publication_date,
p.submission_deadline,
p.cpv_codes,
p.total_lots,
p.estimated_value,
p.estimated_value_currency
FROM TED.ted_notice_projection p
WHERE p.document_id IN (:documentIds)
""";
List<TedStructuredSearchSummaryRow> rows = jdbcTemplate.query(sql, new MapSqlParameterSource("documentIds", documentIds), SUMMARY_ROW_MAPPER);
Map<UUID, TedStructuredSearchSummaryRow> byId = rows.stream().collect(Collectors.toMap(TedStructuredSearchSummaryRow::documentId, r -> r));
List<TedStructuredSearchSummaryRow> ordered = new ArrayList<>();
for (UUID id : documentIds) {
TedStructuredSearchSummaryRow row = byId.get(id);
if (row != null) {
ordered.add(row);
}
}
return ordered;
}
public TedStructuredSearchFacets computeFacets(TedStructuredSearchFilter filter, int bucketLimit) {
int safeLimit = Math.max(1, bucketLimit);
return TedStructuredSearchFacets.builder()
.countries(runFacet(filter, "COALESCE(p.buyer_country_code, '')", "COALESCE(p.buyer_country_code, '')", safeLimit))
.noticeTypes(runFacet(filter, "CAST(p.notice_type AS text)", "CAST(p.notice_type AS text)", safeLimit))
.procedureTypes(runFacet(filter, "CAST(p.procedure_type AS text)", "CAST(p.procedure_type AS text)", safeLimit))
.buyers(runFacet(filter, "COALESCE(p.buyer_name, '')", "COALESCE(p.buyer_name, '')", safeLimit))
.publicationMonths(runFacet(filter, "to_char(p.publication_date, 'YYYY-MM')", "to_char(p.publication_date, 'YYYY-MM')", safeLimit))
.cpvFamilies(runCpvFamilyFacet(filter, safeLimit))
.build();
}
private List<TedStructuredSearchFacetEntry> runFacet(TedStructuredSearchFilter filter,
String keyExpr,
String labelExpr,
int limit) {
StringBuilder sql = new StringBuilder("SELECT ")
.append(keyExpr).append(" AS key, ")
.append(labelExpr).append(" AS label, COUNT(DISTINCT p.document_id) AS cnt ");
sql.append(baseFromWhere(filter, false));
sql.append(" GROUP BY ").append(keyExpr).append(", ").append(labelExpr)
.append(" HAVING ").append(keyExpr).append(" IS NOT NULL AND ").append(keyExpr).append(" <> ''")
.append(" ORDER BY cnt DESC, label ASC LIMIT :facetLimit");
MapSqlParameterSource params = params(filter);
params.addValue("facetLimit", limit);
return jdbcTemplate.query(sql.toString(), params, (rs, rowNum) -> TedStructuredSearchFacetEntry.builder()
.key(rs.getString("key"))
.label(rs.getString("label"))
.count(rs.getLong("cnt"))
.build());
}
private List<TedStructuredSearchFacetEntry> runCpvFamilyFacet(TedStructuredSearchFilter filter, int limit) {
StringBuilder sql = new StringBuilder("""
SELECT LEFT(code, 2) AS key, LEFT(code, 2) AS label, COUNT(DISTINCT document_id) AS cnt
FROM (
SELECT p.document_id, unnest(COALESCE(p.cpv_codes, ARRAY[]::varchar[])) AS code
""");
sql.append(baseFromWhere(filter, true));
sql.append(" UNION ALL SELECT p.document_id, unnest(COALESCE(l.cpv_codes, ARRAY[]::varchar[])) AS code ");
sql.append(baseFromWhere(filter, false));
sql.append(" ) cpv WHERE code IS NOT NULL AND code <> '' GROUP BY LEFT(code, 2) ORDER BY cnt DESC, label ASC LIMIT :facetLimit");
MapSqlParameterSource params = params(filter);
params.addValue("facetLimit", limit);
return jdbcTemplate.query(sql.toString(), params, (rs, rowNum) -> TedStructuredSearchFacetEntry.builder()
.key(rs.getString("key"))
.label(rs.getString("label"))
.count(rs.getLong("cnt"))
.build());
}
private String baseFromWhere(TedStructuredSearchFilter filter, boolean projectionOnly) {
StringBuilder sql = new StringBuilder("""
FROM TED.ted_notice_projection p
JOIN DOC.doc_document d ON d.id = p.document_id
""");
if (!projectionOnly) {
sql.append(" LEFT JOIN TED.ted_notice_lot l ON l.notice_projection_id = p.id");
sql.append(" LEFT JOIN TED.ted_notice_organization o ON o.notice_projection_id = p.id");
}
sql.append(" WHERE 1=1");
appendFilters(sql, filter, projectionOnly);
return sql.toString();
}
private MapSqlParameterSource params(TedStructuredSearchFilter filter) {
MapSqlParameterSource params = new MapSqlParameterSource();
if (filter == null) {
return params;
}
if (StringUtils.hasText(filter.getCountryCode())) {
params.addValue("countryCode", filter.getCountryCode().trim());
}
if (filter.getCountryCodes() != null && !filter.getCountryCodes().isEmpty()) {
params.addValue("countryCodes", filter.getCountryCodes());
}
if (filter.getNoticeType() != null) {
params.addValue("noticeType", filter.getNoticeType().name());
}
if (filter.getContractNature() != null) {
params.addValue("contractNature", filter.getContractNature().name());
}
if (filter.getProcedureType() != null) {
params.addValue("procedureType", filter.getProcedureType().name());
}
if (StringUtils.hasText(filter.getCpvPrefix())) {
params.addValue("cpvPrefixLike", filter.getCpvPrefix().trim() + "%");
}
if (filter.getCpvCodes() != null && !filter.getCpvCodes().isEmpty()) {
params.addValue("cpvCodes", filter.getCpvCodes());
}
if (StringUtils.hasText(filter.getNutsCode())) {
params.addValue("nutsCodeLike", filter.getNutsCode().trim() + "%");
}
if (filter.getNutsCodes() != null && !filter.getNutsCodes().isEmpty()) {
params.addValue("nutsCodes", filter.getNutsCodes());
}
if (filter.getPublicationDateFrom() != null) {
params.addValue("publicationDateFrom", filter.getPublicationDateFrom());
}
if (filter.getPublicationDateTo() != null) {
params.addValue("publicationDateTo", filter.getPublicationDateTo());
}
if (filter.getSubmissionDeadlineAfter() != null) {
params.addValue("submissionDeadlineAfter", filter.getSubmissionDeadlineAfter());
}
if (filter.getEuFunded() != null) {
params.addValue("euFunded", filter.getEuFunded());
}
if (StringUtils.hasText(filter.getBuyerNameContains())) {
params.addValue("buyerNameLike", "%" + filter.getBuyerNameContains().trim().toLowerCase() + "%");
}
if (StringUtils.hasText(filter.getProjectTitleContains())) {
params.addValue("projectTitleLike", "%" + filter.getProjectTitleContains().trim().toLowerCase() + "%");
}
return params;
}
private void appendFilters(StringBuilder sql, TedStructuredSearchFilter filter, boolean projectionOnly) {
if (filter == null) {
return;
}
if (StringUtils.hasText(filter.getCountryCode())) {
sql.append(" AND p.buyer_country_code = :countryCode");
}
if (filter.getCountryCodes() != null && !filter.getCountryCodes().isEmpty()) {
sql.append(" AND p.buyer_country_code IN (:countryCodes)");
}
if (filter.getNoticeType() != null) {
sql.append(" AND CAST(p.notice_type AS text) = :noticeType");
}
if (filter.getContractNature() != null) {
sql.append(" AND CAST(p.contract_nature AS text) = :contractNature");
}
if (filter.getProcedureType() != null) {
sql.append(" AND CAST(p.procedure_type AS text) = :procedureType");
}
if (StringUtils.hasText(filter.getCpvPrefix())) {
sql.append(" AND (")
.append(" EXISTS (SELECT 1 FROM unnest(COALESCE(p.cpv_codes, ARRAY[]::varchar[])) cpv WHERE cpv LIKE :cpvPrefixLike)");
if (!projectionOnly) {
sql.append(" OR EXISTS (SELECT 1 FROM unnest(COALESCE(l.cpv_codes, ARRAY[]::varchar[])) cpv WHERE cpv LIKE :cpvPrefixLike)");
}
sql.append(")");
}
if (filter.getCpvCodes() != null && !filter.getCpvCodes().isEmpty()) {
sql.append(" AND (")
.append(" EXISTS (SELECT 1 FROM unnest(COALESCE(p.cpv_codes, ARRAY[]::varchar[])) cpv WHERE cpv IN (:cpvCodes))");
if (!projectionOnly) {
sql.append(" OR EXISTS (SELECT 1 FROM unnest(COALESCE(l.cpv_codes, ARRAY[]::varchar[])) cpv WHERE cpv IN (:cpvCodes))");
}
sql.append(")");
}
if (StringUtils.hasText(filter.getNutsCode())) {
sql.append(" AND (")
.append(" p.buyer_nuts_code LIKE :nutsCodeLike")
.append(" OR EXISTS (SELECT 1 FROM unnest(COALESCE(p.nuts_codes, ARRAY[]::varchar[])) nuts WHERE nuts LIKE :nutsCodeLike)");
if (!projectionOnly) {
sql.append(" OR EXISTS (SELECT 1 FROM unnest(COALESCE(l.nuts_codes, ARRAY[]::varchar[])) nuts WHERE nuts LIKE :nutsCodeLike)")
.append(" OR COALESCE(o.nuts_code, '') LIKE :nutsCodeLike");
}
sql.append(")");
}
if (filter.getNutsCodes() != null && !filter.getNutsCodes().isEmpty()) {
sql.append(" AND (")
.append(" p.buyer_nuts_code IN (:nutsCodes)")
.append(" OR EXISTS (SELECT 1 FROM unnest(COALESCE(p.nuts_codes, ARRAY[]::varchar[])) nuts WHERE nuts IN (:nutsCodes))");
if (!projectionOnly) {
sql.append(" OR EXISTS (SELECT 1 FROM unnest(COALESCE(l.nuts_codes, ARRAY[]::varchar[])) nuts WHERE nuts IN (:nutsCodes))")
.append(" OR COALESCE(o.nuts_code, '') IN (:nutsCodes)");
}
sql.append(")");
}
if (filter.getPublicationDateFrom() != null) {
sql.append(" AND p.publication_date >= :publicationDateFrom");
}
if (filter.getPublicationDateTo() != null) {
sql.append(" AND p.publication_date <= :publicationDateTo");
}
if (filter.getSubmissionDeadlineAfter() != null) {
sql.append(" AND (p.submission_deadline > :submissionDeadlineAfter");
if (!projectionOnly) {
sql.append(" OR l.submission_deadline > :submissionDeadlineAfter");
}
sql.append(")");
}
if (filter.getEuFunded() != null) {
if (filter.getEuFunded()) {
sql.append(" AND (COALESCE(p.eu_funded, false) = true");
if (!projectionOnly) {
sql.append(" OR COALESCE(l.eu_funded, false) = true");
}
sql.append(")");
} else {
sql.append(" AND COALESCE(p.eu_funded, false) = false");
if (!projectionOnly) {
sql.append(" AND NOT EXISTS (SELECT 1 FROM TED.ted_notice_lot lx WHERE lx.notice_projection_id = p.id AND COALESCE(lx.eu_funded, false) = true)");
}
}
}
if (StringUtils.hasText(filter.getBuyerNameContains())) {
sql.append(" AND (")
.append(" LOWER(COALESCE(p.buyer_name, '')) LIKE :buyerNameLike");
if (!projectionOnly) {
sql.append(" OR LOWER(COALESCE(o.name, '')) LIKE :buyerNameLike");
}
sql.append(")");
}
if (StringUtils.hasText(filter.getProjectTitleContains())) {
sql.append(" AND (")
.append(" LOWER(COALESCE(p.project_title, '')) LIKE :projectTitleLike");
if (!projectionOnly) {
sql.append(" OR LOWER(COALESCE(l.title, '')) LIKE :projectTitleLike");
}
sql.append(")");
}
}
private String resolveSort(String sortBy, String sortDirection) {
boolean asc = "asc".equalsIgnoreCase(sortDirection);
String dir = asc ? "ASC" : "DESC";
String field = sortBy == null ? "publicationDate" : sortBy;
return switch (field) {
case "submissionDeadline" -> "p.submission_deadline " + dir + " NULLS LAST, p.publication_date DESC NULLS LAST";
case "buyerName" -> "p.buyer_name " + dir + " NULLS LAST, p.publication_date DESC NULLS LAST";
case "projectTitle" -> "p.project_title " + dir + " NULLS LAST, p.publication_date DESC NULLS LAST";
default -> "p.publication_date " + dir + " NULLS LAST";
};
}
private static final RowMapper<TedStructuredSearchSummaryRow> SUMMARY_ROW_MAPPER = (rs, rowNum) -> new TedStructuredSearchSummaryRow(
rs.getObject("document_id", UUID.class),
rs.getString("publication_id"),
rs.getString("notice_id"),
parseNoticeType(rs.getString("notice_type")),
rs.getString("project_title"),
rs.getString("buyer_name"),
rs.getString("buyer_country_code"),
rs.getString("buyer_city"),
parseContractNature(rs.getString("contract_nature")),
parseProcedureType(rs.getString("procedure_type")),
rs.getObject("publication_date", LocalDate.class),
rs.getObject("submission_deadline", OffsetDateTime.class),
stringArray(rs.getArray("cpv_codes")),
rs.getObject("total_lots") != null ? rs.getInt("total_lots") : null,
rs.getBigDecimal("estimated_value"),
rs.getString("estimated_value_currency")
);
private static NoticeType parseNoticeType(String value) {
return value == null ? null : NoticeType.valueOf(value);
}
private static ContractNature parseContractNature(String value) {
return value == null ? null : ContractNature.valueOf(value);
}
private static ProcedureType parseProcedureType(String value) {
return value == null ? null : ProcedureType.valueOf(value);
}
private static List<String> stringArray(Array array) throws SQLException {
if (array == null) {
return List.of();
}
Object raw = array.getArray();
if (raw instanceof String[] strings) {
return Arrays.asList(strings);
}
return List.of();
}
}

@ -1,16 +0,0 @@
package at.procon.dip.domain.ted.search.dto;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class TedStructuredSearchFacetEntry {
private String key;
private String label;
private long count;
}

@ -1,20 +0,0 @@
package at.procon.dip.domain.ted.search.dto;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class TedStructuredSearchFacets {
private List<TedStructuredSearchFacetEntry> countries;
private List<TedStructuredSearchFacetEntry> noticeTypes;
private List<TedStructuredSearchFacetEntry> procedureTypes;
private List<TedStructuredSearchFacetEntry> buyers;
private List<TedStructuredSearchFacetEntry> publicationMonths;
private List<TedStructuredSearchFacetEntry> cpvFamilies;
}

@ -1,34 +0,0 @@
package at.procon.dip.domain.ted.search.dto;
import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class TedStructuredSearchFilter {
private String countryCode;
private List<String> countryCodes;
private NoticeType noticeType;
private ContractNature contractNature;
private ProcedureType procedureType;
private String cpvPrefix;
private List<String> cpvCodes;
private String nutsCode;
private List<String> nutsCodes;
private LocalDate publicationDateFrom;
private LocalDate publicationDateTo;
private OffsetDateTime submissionDeadlineAfter;
private Boolean euFunded;
private String buyerNameContains;
private String projectTitleContains;
}

@ -1,30 +0,0 @@
package at.procon.dip.domain.ted.search.dto;
import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.List;
import java.util.UUID;
public record TedStructuredSearchSummaryRow(
UUID documentId,
String publicationId,
String noticeId,
NoticeType noticeType,
String projectTitle,
String buyerName,
String buyerCountryCode,
String buyerCity,
ContractNature contractNature,
ProcedureType procedureType,
LocalDate publicationDate,
OffsetDateTime submissionDeadline,
List<String> cpvCodes,
Integer totalLots,
BigDecimal estimatedValue,
String estimatedValueCurrency
) {
}

@ -14,14 +14,13 @@ import at.procon.dip.runtime.config.RuntimeMode;
import at.procon.ted.model.entity.Organization;
import at.procon.ted.model.entity.ProcurementDocument;
import at.procon.ted.model.entity.ProcurementLot;
import java.util.*;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.util.StringUtils;
/**
* Phase 3 service that materializes TED-specific structured projections on top of the generic DOC document root.
@ -147,59 +146,14 @@ public class TedNoticeProjectionService {
lotRepository.saveAll(projectedLots);
}
private Organization mergeOrganization(Organization left, Organization right) {
if (completenessScore(right) > completenessScore(left)) {
return right;
}
return left;
}
private int completenessScore(Organization organization) {
int score = 0;
score += textScore(organization.getOrgReference());
score += textScore(organization.getRole());
score += textScore(organization.getName());
score += textScore(organization.getCompanyId());
score += textScore(organization.getCountryCode());
score += textScore(organization.getCity());
score += textScore(organization.getPostalCode());
score += textScore(organization.getStreetName());
score += textScore(organization.getNutsCode());
score += textScore(organization.getWebsiteUri());
score += textScore(organization.getEmail());
score += textScore(organization.getPhone());
return score;
}
private int textScore(String value) {
return StringUtils.hasText(value) ? 1 : 0;
}
private void replaceOrganizations(TedNoticeProjection projection, List<Organization> legacyOrganizations) {
organizationRepository.deleteByNoticeProjection_Id(projection.getId());
if (legacyOrganizations == null || legacyOrganizations.isEmpty()) {
return;
}
Map<String, Organization> byReference = new LinkedHashMap<>();
int duplicateCount = 0;
for (Organization organization : legacyOrganizations) {
String key = organizationKey(organization, byReference.size());
Organization existing = byReference.get(key);
if (existing == null) {
byReference.put(key, organization);
} else {
duplicateCount++;
byReference.put(key, mergeOrganization(existing, organization));
}
}
if (duplicateCount > 0) {
log.warn("Collapsing {} duplicate TED organization rows for projection {} before insert", duplicateCount, projection.getId());
}
List<TedNoticeOrganization> projectedOrganizations = new ArrayList<>();
for (Organization organization : byReference.values()) {
for (Organization organization : legacyOrganizations) {
projectedOrganizations.add(TedNoticeOrganization.builder()
.noticeProjection(projection)
.orgReference(organization.getOrgReference())
@ -222,21 +176,4 @@ public class TedNoticeProjectionService {
private String[] copyArray(String[] source) {
return source == null ? null : source.clone();
}
private int arrayScore(String[] values) {
return values != null && values.length > 0 ? 1 : 0;
}
private String organizationKey(Organization organization, int ordinal) {
if (StringUtils.hasText(organization.getOrgReference())) {
return organization.getOrgReference().trim();
}
if (StringUtils.hasText(organization.getCompanyId())) {
return "company:" + organization.getCompanyId().trim();
}
if (StringUtils.hasText(organization.getName())) {
return "name:" + organization.getName().trim();
}
return "__row__" + ordinal;
}
}

@ -1,186 +0,0 @@
package at.procon.dip.domain.ted.service;
import at.procon.dip.domain.ted.config.TedProjectionProperties;
import at.procon.dip.domain.ted.search.TedStructuredSearchRepository;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchFacets;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchFilter;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchSummaryRow;
import at.procon.dip.runtime.condition.ConditionalOnRuntimeMode;
import at.procon.dip.runtime.config.RuntimeMode;
import at.procon.dip.search.dto.SearchMode;
import at.procon.dip.search.dto.SearchSortMode;
import at.procon.dip.search.spi.SearchDocumentScope;
import at.procon.dip.search.service.SearchOrchestrator;
import at.procon.ted.model.dto.DocumentDtos.DocumentSummary;
import at.procon.ted.model.dto.DocumentDtos.SearchRequest;
import at.procon.ted.model.dto.DocumentDtos.SearchResponse;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.util.StringUtils;
@Service
@RequiredArgsConstructor
@ConditionalOnRuntimeMode(RuntimeMode.NEW)
@Transactional(readOnly = true)
public class TedStructuredSearchService {
private final TedStructuredSearchRepository repository;
private final SearchOrchestrator searchOrchestrator;
private final TedProjectionProperties tedProjectionProperties;
public SearchResponse search(SearchRequest request) {
int page = request.getPage() != null && request.getPage() >= 0 ? request.getPage() : 0;
int size = request.getSize() != null && request.getSize() > 0 ? request.getSize() : 20;
TedStructuredSearchFilter filter = toFilter(request);
int facetLimit = request.getFacetBucketLimit() != null && request.getFacetBucketLimit() > 0
? request.getFacetBucketLimit()
: tedProjectionProperties.getStructuredSearchFacetBucketLimit();
TedStructuredSearchFacets facets = Boolean.FALSE.equals(request.getIncludeFacets())
? null
: repository.computeFacets(filter, facetLimit);
SearchResponse response = hasQuery(request)
? searchHybrid(request, filter, page, size)
: searchStructuredOnly(request, filter, page, size);
response.setFacets(facets);
return response;
}
private SearchResponse searchStructuredOnly(SearchRequest request,
TedStructuredSearchFilter filter,
int page,
int size) {
long total = repository.countDistinctDocuments(filter);
List<TedStructuredSearchSummaryRow> rows = repository.searchStructured(filter, page, size, request.getSortBy(), request.getSortDirection());
return SearchResponse.builder()
.documents(rows.stream().map(this::toSummary).toList())
.page(page)
.size(size)
.totalElements(total)
.totalPages((int) Math.ceil(total / (double) size))
.hasNext((page + 1L) * size < total)
.hasPrevious(page > 0)
.build();
}
private SearchResponse searchHybrid(SearchRequest request,
TedStructuredSearchFilter filter,
int page,
int size) {
List<UUID> candidateIds = repository.findCandidateDocumentIds(filter, tedProjectionProperties.getStructuredSearchHybridCandidateLimit());
if (candidateIds.isEmpty()) {
return SearchResponse.builder()
.documents(List.of())
.page(page)
.size(size)
.totalElements(0)
.totalPages(0)
.hasNext(false)
.hasPrevious(page > 0)
.build();
}
at.procon.dip.search.dto.SearchRequest genericRequest = at.procon.dip.search.dto.SearchRequest.builder()
.queryText(request.getSemanticQuery())
.modes(Set.of(SearchMode.HYBRID))
.page(page)
.size(size)
.sortMode(resolveSortMode(request.getSortBy(), request.getSortDirection()))
.semanticSimilarityThreshold(request.getSimilarityThreshold())
.build();
var genericResponse = searchOrchestrator.search(
genericRequest,
new SearchDocumentScope(Set.of(), null, null, null, null, Set.copyOf(candidateIds))
);
List<UUID> orderedIds = genericResponse.getHits().stream().map(hit -> hit.getDocumentId()).toList();
Map<UUID, TedStructuredSearchSummaryRow> summaryById = repository.findSummariesByDocumentIds(orderedIds).stream()
.collect(Collectors.toMap(TedStructuredSearchSummaryRow::documentId, row -> row, (a, b) -> a, LinkedHashMap::new));
List<DocumentSummary> docs = genericResponse.getHits().stream()
.map(hit -> {
TedStructuredSearchSummaryRow row = summaryById.get(hit.getDocumentId());
if (row == null) {
return null;
}
DocumentSummary summary = toSummary(row);
summary.setSimilarity(hit.getFinalScore());
return summary;
})
.filter(java.util.Objects::nonNull)
.toList();
return SearchResponse.builder()
.documents(docs)
.page(page)
.size(size)
.totalElements(genericResponse.getTotalHits())
.totalPages((int) Math.ceil(genericResponse.getTotalHits() / (double) size))
.hasNext((page + 1L) * size < genericResponse.getTotalHits())
.hasPrevious(page > 0)
.build();
}
private boolean hasQuery(SearchRequest request) {
return StringUtils.hasText(request.getSemanticQuery());
}
private TedStructuredSearchFilter toFilter(SearchRequest request) {
return TedStructuredSearchFilter.builder()
.countryCode(request.getCountryCode())
.countryCodes(request.getCountryCodes())
.noticeType(request.getNoticeType())
.contractNature(request.getContractNature())
.procedureType(request.getProcedureType())
.cpvPrefix(request.getCpvPrefix())
.cpvCodes(request.getCpvCodes())
.nutsCode(request.getNutsCode())
.nutsCodes(request.getNutsCodes())
.publicationDateFrom(request.getPublicationDateFrom())
.publicationDateTo(request.getPublicationDateTo())
.submissionDeadlineAfter(request.getSubmissionDeadlineAfter())
.euFunded(request.getEuFunded())
.buyerNameContains(request.getBuyerNameContains())
.projectTitleContains(request.getProjectTitleContains())
.build();
}
private SearchSortMode resolveSortMode(String sortBy, String sortDirection) {
if ("projectTitle".equalsIgnoreCase(sortBy) && "asc".equalsIgnoreCase(sortDirection)) {
return SearchSortMode.TITLE_ASC;
}
if ("publicationDate".equalsIgnoreCase(sortBy) || "submissionDeadline".equalsIgnoreCase(sortBy)) {
return SearchSortMode.CREATED_AT_DESC;
}
return SearchSortMode.SCORE_DESC;
}
private DocumentSummary toSummary(TedStructuredSearchSummaryRow row) {
return DocumentSummary.builder()
.id(row.documentId())
.publicationId(row.publicationId())
.noticeId(row.noticeId())
.noticeType(row.noticeType())
.projectTitle(row.projectTitle())
.buyerName(row.buyerName())
.buyerCountryCode(row.buyerCountryCode())
.buyerCity(row.buyerCity())
.contractNature(row.contractNature())
.procedureType(row.procedureType())
.publicationDate(row.publicationDate())
.submissionDeadline(row.submissionDeadline())
.cpvCodes(row.cpvCodes())
.totalLots(row.totalLots())
.estimatedValue(row.estimatedValue())
.estimatedValueCurrency(row.estimatedValueCurrency())
.build();
}
}

@ -1,91 +0,0 @@
package at.procon.dip.domain.ted.web;
import at.procon.dip.runtime.condition.ConditionalOnRuntimeMode;
import at.procon.dip.runtime.config.RuntimeMode;
import at.procon.dip.domain.ted.service.TedStructuredSearchService;
import at.procon.ted.model.dto.DocumentDtos.SearchRequest;
import at.procon.ted.model.dto.DocumentDtos.SearchResponse;
import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import io.swagger.v3.oas.annotations.Parameter;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.List;
import lombok.RequiredArgsConstructor;
import org.springframework.format.annotation.DateTimeFormat;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
@RestController
@RequestMapping("/v1/documents")
@RequiredArgsConstructor
@ConditionalOnRuntimeMode(RuntimeMode.NEW)
public class TedStructuredSearchController {
private final TedStructuredSearchService searchService;
@GetMapping("/search")
public ResponseEntity<SearchResponse> searchDocuments(
@RequestParam(required = false) String countryCode,
@RequestParam(required = false) List<String> countryCodes,
@RequestParam(required = false) NoticeType noticeType,
@RequestParam(required = false) ContractNature contractNature,
@RequestParam(required = false) ProcedureType procedureType,
@RequestParam(required = false) String cpvPrefix,
@RequestParam(required = false) List<String> cpvCodes,
@RequestParam(required = false) String nutsCode,
@RequestParam(required = false) List<String> nutsCodes,
@RequestParam(required = false) @DateTimeFormat(iso = DateTimeFormat.ISO.DATE) LocalDate publicationDateFrom,
@RequestParam(required = false) @DateTimeFormat(iso = DateTimeFormat.ISO.DATE) LocalDate publicationDateTo,
@RequestParam(required = false) @DateTimeFormat(iso = DateTimeFormat.ISO.DATE_TIME) OffsetDateTime submissionDeadlineAfter,
@RequestParam(required = false) Boolean euFunded,
@RequestParam(required = false) String buyerNameContains,
@RequestParam(required = false) String projectTitleContains,
@RequestParam(required = false, name = "q") String q,
@RequestParam(required = false) Double similarityThreshold,
@RequestParam(required = false) Boolean includeFacets,
@RequestParam(required = false) Integer facetBucketLimit,
@RequestParam(required = false, defaultValue = "0") Integer page,
@RequestParam(required = false, defaultValue = "20") Integer size,
@RequestParam(required = false, defaultValue = "publicationDate") String sortBy,
@RequestParam(required = false, defaultValue = "desc") String sortDirection
) {
SearchRequest request = SearchRequest.builder()
.countryCode(countryCode)
.countryCodes(countryCodes)
.noticeType(noticeType)
.contractNature(contractNature)
.procedureType(procedureType)
.cpvPrefix(cpvPrefix)
.cpvCodes(cpvCodes)
.nutsCode(nutsCode)
.nutsCodes(nutsCodes)
.publicationDateFrom(publicationDateFrom)
.publicationDateTo(publicationDateTo)
.submissionDeadlineAfter(submissionDeadlineAfter)
.euFunded(euFunded)
.buyerNameContains(buyerNameContains)
.projectTitleContains(projectTitleContains)
.semanticQuery(q)
.similarityThreshold(similarityThreshold)
.includeFacets(includeFacets)
.facetBucketLimit(facetBucketLimit)
.page(page)
.size(size)
.sortBy(sortBy)
.sortDirection(sortDirection)
.build();
return ResponseEntity.ok(searchService.search(request));
}
@PostMapping("/search")
public ResponseEntity<SearchResponse> searchDocumentsPost(@RequestBody SearchRequest request) {
return ResponseEntity.ok(searchService.search(request));
}
}

@ -50,9 +50,4 @@ public class SearchRequest {
* When omitted, the new embedding subsystem default query model is used.
*/
private String semanticModelKey;
/**
* Optional per-request similarity threshold override for semantic search.
*/
private Double semanticSimilarityThreshold;
}

@ -47,10 +47,6 @@ public class PgVectorSemanticSearchEngine implements SearchEngine {
EmbeddingModelDescriptor model = resolveModel(requestedModelKey);
validateModel(model);
double threshold = context.getRequest().getSemanticSimilarityThreshold() != null
? context.getRequest().getSemanticSimilarityThreshold()
: properties.getSimilarityThreshold();
return queryEmbeddingService.buildQueryEmbedding(
context.getRequest().getQueryText(),
model.modelKey())
@ -61,7 +57,7 @@ public class PgVectorSemanticSearchEngine implements SearchEngine {
model.distanceMetric(),
query.vectorString(),
properties.getSemanticCandidateLimit(),
threshold))
properties.getSimilarityThreshold()))
.orElseGet(() -> {
log.debug("Semantic search skipped because query embedding could not be generated for model {}", model.modelKey());
return List.of();

@ -85,11 +85,6 @@ final class SearchSqlFilterSupport {
sql.append(" AND dt.tenant_key IN (:ownerTenantKeys)");
params.addValue("ownerTenantKeys", context.getScope().ownerTenantKeys());
}
if (context.getScope() != null && !CollectionUtils.isEmpty(context.getScope().candidateDocumentIds())) {
sql.append(" AND ").append(documentAlias).append(".id IN (:candidateDocumentIds)");
params.addValue("candidateDocumentIds", context.getScope().candidateDocumentIds());
}
}
private static <T> Set<T> firstNonEmpty(Set<T> primary, Set<T> fallback) {

@ -4,7 +4,6 @@ import at.procon.dip.domain.access.DocumentVisibility;
import at.procon.dip.domain.document.DocumentFamily;
import at.procon.dip.domain.document.DocumentType;
import java.util.Set;
import java.util.UUID;
/**
* Minimal generic search scope for future hybrid/semantic search services.
@ -14,7 +13,6 @@ public record SearchDocumentScope(
Set<DocumentType> documentTypes,
Set<DocumentFamily> documentFamilies,
Set<DocumentVisibility> visibilities,
String languageCode,
Set<UUID> candidateDocumentIds
String languageCode
) {
}

@ -49,8 +49,7 @@ public class GenericSearchController {
request.getDocumentTypes(),
request.getDocumentFamilies(),
request.getVisibilities(),
scopeLanguage,
null
scopeLanguage
);
}
}

@ -4,7 +4,6 @@ import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import at.procon.ted.model.entity.VectorizationStatus;
import at.procon.dip.domain.ted.search.dto.TedStructuredSearchFacets;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
@ -200,10 +199,6 @@ public class DocumentDtos {
// Semantic search
private String semanticQuery;
private Double similarityThreshold;
// Additional options
private Boolean includeFacets;
private Integer facetBucketLimit;
// Pagination
private Integer page;
@ -227,7 +222,6 @@ public class DocumentDtos {
private int totalPages;
private boolean hasNext;
private boolean hasPrevious;
private TedStructuredSearchFacets facets;
}
/**

@ -1,6 +1,5 @@
package at.procon.ted.model.entity;
import at.procon.dip.architecture.SchemaNames;
import jakarta.persistence.*;
import lombok.*;
@ -14,7 +13,7 @@ import java.util.UUID;
* @author Martin.Schweitzer@procon.co.at and claude.ai
*/
@Entity
@Table(schema = SchemaNames.TED, name = "organization", indexes = {
@Table(name = "organization", indexes = {
@Index(name = "idx_org_document", columnList = "document_id"),
@Index(name = "idx_org_country", columnList = "country_code")
}, uniqueConstraints = {

@ -1,6 +1,5 @@
package at.procon.ted.model.entity;
import at.procon.dip.architecture.SchemaNames;
import jakarta.persistence.*;
import lombok.*;
import org.hibernate.annotations.JdbcTypeCode;
@ -23,7 +22,7 @@ import java.util.UUID;
* @author Martin.Schweitzer@procon.co.at and claude.ai
*/
@Entity
@Table(schema = SchemaNames.TED, name = "procurement_document", indexes = {
@Table(name = "procurement_document", indexes = {
@Index(name = "idx_doc_hash", columnList = "documentHash"),
@Index(name = "idx_doc_publication_id", columnList = "publicationId"),
@Index(name = "idx_doc_buyer_country", columnList = "buyerCountryCode"),

@ -1,6 +1,5 @@
package at.procon.ted.model.entity;
import at.procon.dip.architecture.SchemaNames;
import jakarta.persistence.*;
import lombok.*;
import org.hibernate.annotations.JdbcTypeCode;
@ -17,7 +16,7 @@ import java.util.UUID;
* @author Martin.Schweitzer@procon.co.at and claude.ai
*/
@Entity
@Table(schema = SchemaNames.TED, name = "procurement_lot", indexes = {
@Table(name = "procurement_lot", indexes = {
@Index(name = "idx_lot_document", columnList = "document_id")
}, uniqueConstraints = {
@UniqueConstraint(columnNames = {"document_id", "lot_id"})

@ -294,4 +294,4 @@ dip:
batch-size: 500
max-documents-per-run: 0
skip-when-primary-representation-missing: true
queue-missing-embeddings: true
queue-missing-embeddings: false

@ -28,9 +28,6 @@ class NewRuntimeMustNotDependOnTedProcessorPropertiesTest {
at.procon.dip.ingestion.service.TedPackageChildImportProcessor.class,
at.procon.dip.domain.ted.service.TedNoticeProjectionService.class,
at.procon.dip.domain.ted.startup.TedProjectionStartupRunner.class,
at.procon.dip.domain.ted.search.TedStructuredSearchRepository.class,
at.procon.dip.domain.ted.service.TedStructuredSearchService.class,
at.procon.dip.domain.ted.web.TedStructuredSearchController.class,
at.procon.dip.search.engine.fulltext.PostgresFullTextSearchEngine.class,
at.procon.dip.search.engine.trigram.PostgresTrigramSearchEngine.class,
at.procon.dip.search.engine.semantic.PgVectorSemanticSearchEngine.class,

@ -1,61 +0,0 @@
package at.procon.dip.domain.ted.search.integration;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.jsonPath;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;
import at.procon.dip.domain.document.DocumentFamily;
import at.procon.dip.domain.document.DocumentType;
import at.procon.dip.domain.document.RepresentationType;
import at.procon.dip.domain.ted.entity.TedNoticeProjection;
import at.procon.dip.testsupport.AbstractTedStructuredSearchIntegrationTest;
import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import org.junit.jupiter.api.Test;
class TedStructuredSearchEndpointIntegrationTest extends AbstractTedStructuredSearchIntegrationTest {
@Test
void getSearch_should_return_structured_results_and_facets() throws Exception {
var created = dataFactory.createDocumentWithPrimaryRepresentation(
"Medical imaging systems for Vienna hospital",
"Procurement summary",
"Imaging systems and maintenance.",
DocumentType.TED_NOTICE,
DocumentFamily.PROCUREMENT,
"en",
RepresentationType.SEMANTIC_TEXT
);
tedNoticeProjectionRepository.save(TedNoticeProjection.builder()
.document(created.document())
.publicationId("100000-2026")
.noticeId("notice-100000-2026")
.noticeType(NoticeType.CONTRACT_NOTICE)
.buyerName("Vienna General Hospital")
.buyerCountryCode("AUT")
.buyerCity("Vienna")
.projectTitle("Medical imaging systems")
.contractNature(ContractNature.SUPPLIES)
.procedureType(ProcedureType.OPEN)
.publicationDate(LocalDate.of(2026, 4, 10))
.submissionDeadline(OffsetDateTime.parse("2026-05-01T10:00:00+02:00"))
.cpvCodes(new String[]{"33110000", "33120000"})
.totalLots(2)
.euFunded(true)
.build());
mockMvc.perform(get("/api/v1/documents/search")
.param("countryCode", "AUT")
.param("noticeType", "CONTRACT_NOTICE")
.param("includeFacets", "true"))
.andExpect(status().isOk())
.andExpect(jsonPath("$.documents[0].publicationId").value("100000-2026"))
.andExpect(jsonPath("$.documents[0].buyerName").value("Vienna General Hospital"))
.andExpect(jsonPath("$.facets.countries[0].key").value("AUT"))
.andExpect(jsonPath("$.facets.noticeTypes[0].key").value("CONTRACT_NOTICE"));
}
}

@ -1,95 +0,0 @@
package at.procon.dip.domain.ted.search.integration;
import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.Mockito.mock;
import at.procon.dip.domain.document.DocumentFamily;
import at.procon.dip.domain.document.DocumentType;
import at.procon.dip.domain.document.RepresentationType;
import at.procon.dip.domain.ted.entity.TedNoticeProjection;
import at.procon.dip.domain.ted.service.TedStructuredSearchService;
import at.procon.dip.testsupport.AbstractTedStructuredSearchIntegrationTest;
import at.procon.ted.config.TedProcessorProperties;
import at.procon.ted.model.dto.DocumentDtos.SearchRequest;
import at.procon.ted.model.entity.ContractNature;
import at.procon.ted.model.entity.NoticeType;
import at.procon.ted.model.entity.ProcedureType;
import at.procon.ted.model.entity.ProcurementDocument;
import at.procon.ted.service.SearchService;
import at.procon.ted.service.VectorizationService;
import java.time.LocalDate;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
class TedStructuredSearchParityIntegrationTest extends AbstractTedStructuredSearchIntegrationTest {
@Autowired
private TedStructuredSearchService newSearchService;
@Test
void structuredSearch_should_match_legacy_for_shared_filters() {
var created = dataFactory.createDocumentWithPrimaryRepresentation(
"Road maintenance services in Graz",
"Procurement summary",
"Road maintenance and winter service.",
DocumentType.TED_NOTICE,
DocumentFamily.PROCUREMENT,
"en",
RepresentationType.SEMANTIC_TEXT
);
tedNoticeProjectionRepository.save(TedNoticeProjection.builder()
.document(created.document())
.publicationId("200000-2026")
.noticeId("notice-200000-2026")
.noticeType(NoticeType.CONTRACT_NOTICE)
.buyerName("City of Graz")
.buyerCountryCode("AUT")
.buyerCity("Graz")
.projectTitle("Road maintenance services")
.contractNature(ContractNature.SERVICES)
.procedureType(ProcedureType.OPEN)
.publicationDate(LocalDate.of(2026, 4, 12))
.euFunded(false)
.build());
procurementDocumentRepository.save(ProcurementDocument.builder()
.documentHash("legacy-200000-2026")
.publicationId("200000-2026")
.noticeId("notice-200000-2026")
.noticeType(NoticeType.CONTRACT_NOTICE)
.buyerName("City of Graz")
.buyerCountryCode("AUT")
.buyerCity("Graz")
.projectTitle("Road maintenance services")
.contractNature(ContractNature.SERVICES)
.procedureType(ProcedureType.OPEN)
.publicationDate(LocalDate.of(2026, 4, 12))
.euFunded(false)
.build());
SearchRequest request = SearchRequest.builder()
.countryCode("AUT")
.noticeType(NoticeType.CONTRACT_NOTICE)
.contractNature(ContractNature.SERVICES)
.procedureType(ProcedureType.OPEN)
.projectTitleContains("maintenance")
.publicationDateFrom(LocalDate.of(2026, 4, 1))
.publicationDateTo(LocalDate.of(2026, 4, 30))
.page(0)
.size(20)
.sortBy("publicationDate")
.sortDirection("desc")
.build();
TedProcessorProperties props = new TedProcessorProperties();
SearchService legacySearchService = new SearchService(procurementDocumentRepository, mock(VectorizationService.class), props);
var legacy = legacySearchService.search(request);
var current = newSearchService.search(request);
assertThat(current.getTotalElements()).isEqualTo(legacy.getTotalElements());
assertThat(current.getDocuments()).extracting("publicationId")
.containsExactlyElementsOf(legacy.getDocuments().stream().map(d -> d.getPublicationId()).toList());
}
}

@ -50,7 +50,7 @@ class GenericSearchOrchestratorIntegrationTest extends AbstractSearchIntegration
SearchResponse response = searchOrchestrator.search(
request,
new SearchDocumentScope(Set.of(), null, null, null, null, null));
new SearchDocumentScope(Set.of(), null, null, null, null));
assertThat(response.getHits()).hasSize(1);
assertThat(response.getHits().getFirst().getTitle()).isEqualTo("Maintenance manual");
@ -84,11 +84,11 @@ class GenericSearchOrchestratorIntegrationTest extends AbstractSearchIntegration
SearchResponse primaryOnlyResponse = searchOrchestrator.search(
primaryOnly,
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null, null));
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null));
SearchResponse primaryAndChunksResponse = searchOrchestrator.search(
primaryAndChunks,
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null, null));
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null));
assertThat(primaryOnlyResponse.getHits()).isEmpty();
assertThat(primaryAndChunksResponse.getHits()).hasSize(1);
@ -121,7 +121,6 @@ class GenericSearchOrchestratorIntegrationTest extends AbstractSearchIntegration
Set.of(DocumentType.TEXT),
Set.of(DocumentFamily.GENERIC),
null,
null,
null
)
);
@ -160,7 +159,6 @@ class GenericSearchOrchestratorIntegrationTest extends AbstractSearchIntegration
Set.of(DocumentType.TEXT),
Set.of(DocumentFamily.GENERIC),
null,
null,
null
))
.page(0)

@ -67,7 +67,7 @@ class GenericSearchRepositoryIntegrationTest extends AbstractSearchIntegrationTe
.modes(Set.of(SearchMode.FULLTEXT))
.representationSelectionMode(SearchRepresentationSelectionMode.PRIMARY_ONLY)
.build())
.scope(new SearchDocumentScope(Set.of(), null, null, null, null, null))
.scope(new SearchDocumentScope(Set.of(), null, null, null, null))
.page(0)
.size(10)
.build();
@ -96,7 +96,7 @@ class GenericSearchRepositoryIntegrationTest extends AbstractSearchIntegrationTe
.modes(Set.of(SearchMode.TRIGRAM))
.representationSelectionMode(SearchRepresentationSelectionMode.PRIMARY_ONLY)
.build())
.scope(new SearchDocumentScope(Set.of(), null, null, null, null, null))
.scope(new SearchDocumentScope(Set.of(), null, null, null, null))
.page(0)
.size(10)
.build();

@ -52,7 +52,7 @@ class GenericSemanticSearchOrchestratorIntegrationTest extends AbstractSemanticS
SearchResponse response = searchOrchestrator.search(
request,
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null, null)
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null)
);
assertThat(response.getHits()).isNotEmpty();
@ -83,7 +83,7 @@ class GenericSemanticSearchOrchestratorIntegrationTest extends AbstractSemanticS
SearchResponse response = searchOrchestrator.search(
request,
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null, null)
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null)
);
assertThat(response.getHits()).isNotEmpty();

@ -57,12 +57,12 @@ class SemanticModelSelectionIntegrationTest extends AbstractSemanticSearchIntegr
SearchResponse defaultModelResponse = searchOrchestrator.search(
defaultModelRequest,
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null, null)
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null)
);
SearchResponse alternateModelResponse = searchOrchestrator.search(
alternateModelRequest,
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null, null)
new SearchDocumentScope(Set.of(), Set.of(DocumentType.TEXT), Set.of(DocumentFamily.GENERIC), null, null)
);
assertThat(defaultModelResponse.getHits()).isEmpty();

@ -1,97 +0,0 @@
package at.procon.dip.testsupport;
import at.procon.dip.FixedPortPostgreSQLContainer;
import at.procon.dip.domain.document.repository.DocumentRepository;
import at.procon.dip.domain.document.repository.DocumentTextRepresentationRepository;
import at.procon.dip.domain.ted.repository.TedNoticeProjectionRepository;
import at.procon.ted.repository.ProcurementDocumentRepository;
import javax.sql.DataSource;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.TestInstance;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.springframework.test.context.TestPropertySource;
import org.springframework.test.web.servlet.MockMvc;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
@SpringBootTest(classes = TedStructuredSearchTestApplication.class, webEnvironment = SpringBootTest.WebEnvironment.MOCK)
@Testcontainers
@TestInstance(TestInstance.Lifecycle.PER_CLASS)
@TestPropertySource(properties = {
"spring.jpa.hibernate.ddl-auto=create-drop",
"spring.jpa.show-sql=false",
"spring.jpa.open-in-view=false",
"spring.jpa.properties.hibernate.default_schema=DOC",
"spring.main.lazy-initialization=true",
"dip.runtime.mode=NEW",
"dip.search.default-page-size=20",
"dip.search.max-page-size=100",
"dip.search.fulltext-weight=0.60",
"dip.search.trigram-weight=0.40",
"dip.search.semantic-weight=0.45",
"dip.search.recency-boost-weight=0.05",
"dip.search.trigram-similarity-threshold=0.10",
"server.servlet.context-path=/api"
})
public abstract class AbstractTedStructuredSearchIntegrationTest {
private static final int HOST_PORT = 15434;
private static final String DB_NAME = "dip_ted_search_test";
private static final String DB_USER = "test";
private static final String DB_PASSWORD = "test";
private static final String JDBC_URL = "jdbc:postgresql://localhost:" + HOST_PORT + "/" + DB_NAME;
@Container
static PostgreSQLContainer<?> postgres = new FixedPortPostgreSQLContainer<>("postgres:16-alpine", HOST_PORT)
.withDatabaseName(DB_NAME)
.withUsername(DB_USER)
.withPassword(DB_PASSWORD)
.withInitScript("sql/create-doc-search-test-schemas.sql");
@DynamicPropertySource
static void registerProperties(DynamicPropertyRegistry registry) {
if (!postgres.isRunning()) {
postgres.start();
}
registry.add("spring.datasource.url", () -> JDBC_URL);
registry.add("spring.datasource.username", () -> DB_USER);
registry.add("spring.datasource.password", () -> DB_PASSWORD);
registry.add("spring.datasource.driver-class-name", () -> "org.postgresql.Driver");
}
@Autowired protected JdbcTemplate jdbcTemplate;
@Autowired protected DataSource dataSource;
@Autowired protected MockMvc mockMvc;
@Autowired protected SearchTestDataFactory dataFactory;
@Autowired protected DocumentRepository documentRepository;
@Autowired protected DocumentTextRepresentationRepository representationRepository;
@Autowired protected TedNoticeProjectionRepository tedNoticeProjectionRepository;
@Autowired protected ProcurementDocumentRepository procurementDocumentRepository;
@BeforeEach
void resetDatabase() {
ensureSearchColumnsAndIndexes();
cleanupDatabase();
}
protected void ensureSearchColumnsAndIndexes() {
jdbcTemplate.execute("CREATE SCHEMA IF NOT EXISTS doc");
jdbcTemplate.execute("CREATE SCHEMA IF NOT EXISTS ted");
jdbcTemplate.execute("CREATE EXTENSION IF NOT EXISTS pg_trgm with schema doc");
jdbcTemplate.execute("ALTER TABLE doc.doc_text_representation ADD COLUMN IF NOT EXISTS search_config VARCHAR(64)");
jdbcTemplate.execute("ALTER TABLE doc.doc_text_representation ADD COLUMN IF NOT EXISTS search_vector tsvector");
jdbcTemplate.execute("CREATE INDEX IF NOT EXISTS idx_doc_text_repr_search_vector_test ON doc.doc_text_representation USING GIN (search_vector)");
jdbcTemplate.execute("CREATE INDEX IF NOT EXISTS idx_doc_document_title_trgm_test ON doc.doc_document USING GIN (title doc.gin_trgm_ops)");
jdbcTemplate.execute("CREATE INDEX IF NOT EXISTS idx_doc_document_summary_trgm_test ON doc.doc_document USING GIN (summary doc.gin_trgm_ops)");
jdbcTemplate.execute("CREATE INDEX IF NOT EXISTS idx_doc_text_repr_text_trgm_test ON doc.doc_text_representation USING GIN (text_body doc.gin_trgm_ops)");
}
protected void cleanupDatabase() {
jdbcTemplate.execute("TRUNCATE TABLE ted.ted_notice_organization, ted.ted_notice_lot, ted.ted_notice_projection, doc.doc_text_representation, doc.doc_document, doc.doc_tenant, doc.procurement_lot, doc.organization, doc.procurement_document RESTART IDENTITY CASCADE");
}
}

@ -1,78 +0,0 @@
package at.procon.dip.testsupport;
import at.procon.dip.config.JacksonConfig;
import at.procon.dip.domain.document.service.DocumentContentService;
import at.procon.dip.domain.document.service.DocumentRepresentationService;
import at.procon.dip.domain.document.service.DocumentService;
import at.procon.dip.domain.ted.config.TedProjectionProperties;
import at.procon.dip.domain.ted.search.TedStructuredSearchRepository;
import at.procon.dip.domain.ted.service.TedStructuredSearchService;
import at.procon.dip.domain.ted.web.TedStructuredSearchController;
import at.procon.dip.ingestion.config.DipIngestionProperties;
import at.procon.dip.search.config.DipSearchProperties;
import at.procon.dip.search.engine.fulltext.PostgresFullTextSearchEngine;
import at.procon.dip.search.engine.trigram.PostgresTrigramSearchEngine;
import at.procon.dip.search.plan.DefaultSearchPlanner;
import at.procon.dip.search.rank.DefaultSearchResultFusionService;
import at.procon.dip.search.rank.DefaultSearchScoreNormalizer;
import at.procon.dip.search.repository.DocumentFullTextSearchRepositoryImpl;
import at.procon.dip.search.repository.DocumentTrigramSearchRepositoryImpl;
import at.procon.dip.search.service.DefaultSearchOrchestrator;
import at.procon.dip.search.service.DocumentLexicalIndexService;
import at.procon.dip.search.service.SearchMetricsService;
import org.springframework.boot.SpringBootConfiguration;
import org.springframework.boot.autoconfigure.AutoConfigureOrder;
import org.springframework.boot.autoconfigure.ImportAutoConfiguration;
import org.springframework.boot.autoconfigure.domain.EntityScan;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;
import org.springframework.boot.autoconfigure.jdbc.JdbcTemplateAutoConfiguration;
import org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration;
import org.springframework.boot.autoconfigure.transaction.TransactionAutoConfiguration;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.context.annotation.Import;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
@SpringBootConfiguration
@AutoConfigureMockMvc
@ImportAutoConfiguration({
DataSourceAutoConfiguration.class,
HibernateJpaAutoConfiguration.class,
TransactionAutoConfiguration.class,
JdbcTemplateAutoConfiguration.class
})
@EnableConfigurationProperties({DipIngestionProperties.class, DipSearchProperties.class, TedProjectionProperties.class})
@EntityScan(basePackages = {
"at.procon.dip.domain.document.entity",
"at.procon.dip.domain.tenant.entity",
"at.procon.dip.domain.ted.entity",
"at.procon.ted.model.entity"
})
@EnableJpaRepositories(basePackages = {
"at.procon.dip.domain.document.repository",
"at.procon.dip.domain.tenant.repository",
"at.procon.dip.domain.ted.repository",
"at.procon.ted.repository"
})
@Import({
JacksonConfig.class,
DocumentService.class,
DocumentContentService.class,
DocumentRepresentationService.class,
DocumentLexicalIndexService.class,
SearchTestDataFactory.class,
DefaultSearchPlanner.class,
DocumentFullTextSearchRepositoryImpl.class,
DocumentTrigramSearchRepositoryImpl.class,
PostgresFullTextSearchEngine.class,
PostgresTrigramSearchEngine.class,
DefaultSearchScoreNormalizer.class,
DefaultSearchResultFusionService.class,
SearchMetricsService.class,
DefaultSearchOrchestrator.class,
TedStructuredSearchRepository.class,
TedStructuredSearchService.class,
TedStructuredSearchController.class
})
public class TedStructuredSearchTestApplication {
}