# NV3 — Generic semantic search on the new embedding subsystem
This phase keeps the new embedding subsystem parallel to the legacy flow and plugs semantic retrieval
into the generic search architecture.
## Scope
- query embeddings are generated through `at.procon.dip.embedding.service.QueryEmbeddingService`
- semantic search uses `DOC.doc_embedding`
- retrieval joins `DOC.doc_text_representation` and `DOC.doc_document`
- chunk-aware document collapse remains in the generic search fusion layer
- no structured TED/mail search in this phase
- no legacy cutover in this phase
## Main classes
- `at.procon.dip.search.service.SemanticQueryEmbeddingService`
- `at.procon.dip.search.engine.semantic.PgVectorSemanticSearchEngine`
- `at.procon.dip.search.repository.DocumentSemanticSearchRepository`
## Query model selection
Order of precedence:
1. `SearchRequest.semanticModelKey`
2. `dip.embedding.default-query-model`

Before the query runs, `EmbeddingModelCatalogService` ensures that the selected model has an entry
in `DOC.doc_embedding_model`.
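
The precedence rule above can be sketched as a small helper. This is only an illustration: `QueryModelSelection` and `resolveModelKey` are hypothetical names, and treating blank values as "not set" and failing when neither source provides a key are assumptions, not confirmed behaviour of the DIP code.

```java
// Hypothetical helper illustrating the precedence rule; not the actual DIP implementation.
final class QueryModelSelection {

    private QueryModelSelection() {
    }

    /**
     * Picks the model key for a query.
     *
     * @param requestModelKey   stands in for SearchRequest.semanticModelKey (may be null)
     * @param configuredDefault stands in for the dip.embedding.default-query-model property
     */
    static String resolveModelKey(String requestModelKey, String configuredDefault) {
        // 1. An explicit key on the request wins.
        if (requestModelKey != null && !requestModelKey.isBlank()) {
            return requestModelKey.trim();
        }
        // 2. Otherwise fall back to the configured default.
        if (configuredDefault != null && !configuredDefault.isBlank()) {
            return configuredDefault.trim();
        }
        // Assumption: a missing model is a configuration error rather than a silent no-op.
        throw new IllegalStateException("No semantic query model configured");
    }
}
```

With this shape, a request that leaves the key unset resolves to the configured default, which is what keeps `SearchRequest.semanticModelKey` optional for callers.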
## Search flow
1. planner includes `PGVECTOR_SEMANTIC` for `SEMANTIC` or `HYBRID`
2. `SemanticQueryEmbeddingService` builds a query vector
3. `DocumentSemanticSearchRepository` searches `DOC.doc_embedding`
4. generic fusion/collapse merges semantic hits with lexical hits
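
Steps 1 and 4 of the flow can be sketched roughly as follows. Apart from `PGVECTOR_SEMANTIC`, `SEMANTIC` and `HYBRID`, every name here (`SemanticFlowSketch`, the `LEXICAL` constants, `Hit`, `planEngines`, `collapseByDocument`) is hypothetical. The sketch also assumes that a higher score is better and that collapse keeps the best-scoring chunk per document; the real fusion layer may combine scores differently.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and scoring rules are assumptions.
final class SemanticFlowSketch {

    enum SearchMode { LEXICAL, SEMANTIC, HYBRID }

    enum EngineType { LEXICAL, PGVECTOR_SEMANTIC }

    /** One chunk-level hit; the fields are placeholders for the real hit model. */
    record Hit(long documentId, int chunkIndex, double score) { }

    private SemanticFlowSketch() {
    }

    /** Step 1: the planner adds PGVECTOR_SEMANTIC for SEMANTIC and HYBRID requests. */
    static Set<EngineType> planEngines(SearchMode mode) {
        Set<EngineType> engines = EnumSet.noneOf(EngineType.class);
        if (mode == SearchMode.LEXICAL || mode == SearchMode.HYBRID) {
            engines.add(EngineType.LEXICAL);
        }
        if (mode == SearchMode.SEMANTIC || mode == SearchMode.HYBRID) {
            engines.add(EngineType.PGVECTOR_SEMANTIC);
        }
        return engines;
    }

    /** Step 4 (collapse part): keep one hit per document, the best-scoring chunk. */
    static List<Hit> collapseByDocument(List<Hit> chunkHits) {
        Map<Long, Hit> bestPerDocument = new LinkedHashMap<>();
        for (Hit hit : chunkHits) {
            bestPerDocument.merge(hit.documentId(), hit,
                    (current, candidate) -> candidate.score() > current.score() ? candidate : current);
        }
        List<Hit> collapsed = new ArrayList<>(bestPerDocument.values());
        // Highest score first.
        collapsed.sort((a, b) -> Double.compare(b.score(), a.score()));
        return collapsed;
    }
}
```

Keeping the collapse outside the semantic engine, as the notes describe, means the same step can run over chunk hits from any engine before they are merged.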
## Notes
- the SQL casts to the schema-qualified type `public.vector` so that the query does not depend on the session's `search_path`
- the repository returns representation metadata (`representation_type`, chunk offsets, etc.)
- `SearchRequest.semanticModelKey` is optional and keeps the API model-aware without forcing users
to choose a model for every request
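
To make the first two notes concrete, here is a rough sketch of what such a query could look like. Only the table names, `representation_type` and the `public.vector` cast come from these notes. The column names (`id`, `embedding`, `representation_id`, `document_id`, `model_id`, `chunk_start`, `chunk_end`), the use of pgvector's `<=>` cosine-distance operator, and the helper class itself are assumptions for illustration.

```java
// Illustrative sketch; column names and the distance operator are assumptions.
final class SemanticSearchSqlSketch {

    /** Parameters: 1 = query vector as text, 2 = model id, 3 = result limit. */
    static final String SEARCH_SQL =
            "SELECT d.id AS document_id, "
            + "r.representation_type, r.chunk_start, r.chunk_end, "
            // Schema-qualified cast, so the type resolves regardless of search_path.
            + "e.embedding <=> CAST(? AS public.vector) AS distance "
            + "FROM DOC.doc_embedding e "
            + "JOIN DOC.doc_text_representation r ON r.id = e.representation_id "
            + "JOIN DOC.doc_document d ON d.id = r.document_id "
            + "WHERE e.model_id = ? "
            + "ORDER BY distance "
            + "LIMIT ?";

    private SemanticSearchSqlSketch() {
    }

    /** Formats a float array in pgvector's text form, e.g. [0.5,1.0,-2.0]. */
    static String toVectorLiteral(float[] vector) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(vector[i]);
        }
        return sb.append(']').toString();
    }
}
```

The vector literal would be bound as the first parameter of a prepared statement, so the query vector never has to be concatenated into the SQL text.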