# Parallel embedding subsystem plan (NV1-NV3)
This plan assumes the old vectorization subsystem remains in place temporarily, while a new generic embedding subsystem is built in parallel.
## Principles
- Build the new subsystem under `at.procon.dip.embedding.*`.
- Do not shape it around old TED-specific or legacy vectorization services.
- Operate on `DocumentTextRepresentation` and `DocumentEmbedding` as the core abstractions.
- Keep the new subsystem configurable and provider-based.
- Migrate and cut over later.
## NV1 — provider/model/query foundation
### Goal
Create a standalone embedding foundation that can:
- resolve configured providers
- resolve configured models
- embed arbitrary text lists
- embed search queries
- support deterministic testing
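
To make the "embed arbitrary text lists" and "support deterministic testing" goals concrete, here is a minimal sketch of a provider contract with a hash-based mock. The type names follow the deliverables of this plan, but the `embed(List<String>)` signature, the `dimensions` constructor argument, and the SHA-256 vector derivation are assumptions for illustration, not the planned API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal contract; the real interface may also take a model or use case. */
interface EmbeddingProvider {
    List<float[]> embed(List<String> texts);
}

/** Deterministic mock: the same text always yields the same vector, with no network call. */
final class MockEmbeddingProvider implements EmbeddingProvider {
    private final int dimensions;

    MockEmbeddingProvider(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.dimensions = dimensions;
    }

    @Override
    public List<float[]> embed(List<String> texts) {
        List<float[]> vectors = new ArrayList<>(texts.size());
        for (String text : texts) {
            vectors.add(embedOne(text));
        }
        return vectors;
    }

    private float[] embedOne(String text) {
        byte[] hash = sha256(text);
        float[] vector = new float[dimensions];
        double sumOfSquares = 0.0;
        for (int i = 0; i < dimensions; i++) {
            // Cycle through the hash bytes and map each one into roughly [-1, 1].
            vector[i] = hash[i % hash.length] / 128f;
            sumOfSquares += vector[i] * vector[i];
        }
        // Normalize to unit length so cosine-style comparisons behave sensibly.
        double norm = Math.sqrt(sumOfSquares);
        if (norm > 0.0) {
            for (int i = 0; i < dimensions; i++) {
                vector[i] = (float) (vector[i] / norm);
            }
        }
        return vector;
    }

    private static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

Because the vectors depend only on the input text, tests can assert on exact equality between runs without mocking an HTTP endpoint.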
### Deliverables
- `EmbeddingProperties`
- `EmbeddingUseCase`
- `EmbeddingRequest`
- `EmbeddingProviderResult`
- `EmbeddingModelDescriptor`
- `ResolvedEmbeddingProviderConfig`
- `EmbeddingProvider`
- `ExternalHttpEmbeddingProvider`
- `MockEmbeddingProvider`
- `EmbeddingProviderRegistry`
- `EmbeddingModelRegistry`
- `EmbeddingProviderConfigResolver`
- `EmbeddingExecutionService`
- `QueryEmbeddingService`
- startup validation of provider/model wiring
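
The startup validation item could look roughly like the sketch below. It assumes, purely for illustration, that each configured model names the provider that serves it and that the subsystem has a single `enabled` flag; `EmbeddingWiringValidator` and its parameters are hypothetical names, not part of the deliverables list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical validator: checks that every model points at a configured provider. */
final class EmbeddingWiringValidator {

    private EmbeddingWiringValidator() {
    }

    /**
     * @param enabled         whether the new subsystem is switched on (assumed flag)
     * @param providerIds     ids of all configured providers
     * @param modelToProvider model id mapped to the provider id it references
     * @return human-readable wiring errors; empty when the wiring is consistent
     */
    static List<String> validate(boolean enabled,
                                 Set<String> providerIds,
                                 Map<String, String> modelToProvider) {
        List<String> errors = new ArrayList<>();
        if (!enabled) {
            // A disabled subsystem must never block application startup.
            return errors;
        }
        for (Map.Entry<String, String> entry : modelToProvider.entrySet()) {
            if (!providerIds.contains(entry.getValue())) {
                errors.add("model '" + entry.getKey()
                        + "' references unknown provider '" + entry.getValue() + "'");
            }
        }
        return errors;
    }
}
```

A caller would typically fail startup when the returned list is non-empty. Returning all errors at once, instead of throwing on the first, lets a misconfigured deployment be fixed in one pass.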
### Notes
- No cutover from the old vectorization path yet.
- No persistence/job orchestration yet.
- The new subsystem should be safe to include in the app while disabled by default.
## NV2 — persistence and job orchestration
### Goal
Enable the new subsystem to create and process embedding jobs against `DocumentTextRepresentation`.
### Deliverables
- `EmbeddingJob` entity/repository/service
- retry / backoff policy
- default `EmbeddingSelectionPolicy`
- representation-level embedding execution
- `DocumentEmbedding` persistence updates through the new subsystem
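
The plan does not specify the retry / backoff policy. One common choice is capped exponential backoff, sketched below. The class name `ExponentialBackoffPolicy`, the attempt limit, and the delay values are assumptions for illustration only.

```java
import java.time.Duration;

/** Hypothetical capped exponential backoff: the delay doubles per retry up to a maximum. */
final class ExponentialBackoffPolicy {
    private final int maxAttempts;
    private final Duration baseDelay;
    private final Duration maxDelay;

    ExponentialBackoffPolicy(int maxAttempts, Duration baseDelay, Duration maxDelay) {
        if (maxAttempts < 1 || baseDelay.isNegative() || maxDelay.compareTo(baseDelay) < 0) {
            throw new IllegalArgumentException("invalid backoff configuration");
        }
        this.maxAttempts = maxAttempts;
        this.baseDelay = baseDelay;
        this.maxDelay = maxDelay;
    }

    /** True while a job that has already failed {@code attemptsSoFar} times may run again. */
    boolean shouldRetry(int attemptsSoFar) {
        return attemptsSoFar < maxAttempts;
    }

    /** Delay before the given retry; retryNumber is 1 for the first retry. */
    Duration delayBefore(int retryNumber) {
        if (retryNumber < 1) {
            throw new IllegalArgumentException("retryNumber must be >= 1");
        }
        // Cap the shift so the multiplier itself cannot overflow a long.
        long factor = 1L << Math.min(retryNumber - 1, 30);
        long baseMillis = baseDelay.toMillis();
        long capMillis = maxDelay.toMillis();
        // Compare by division first to avoid overflow in baseMillis * factor.
        long millis = baseMillis > capMillis / factor ? capMillis : baseMillis * factor;
        return Duration.ofMillis(millis);
    }
}
```

With a base delay of 1 second and a cap of 60 seconds, successive retries wait 1 s, 2 s, 4 s, and so on until the cap is reached. A job service could store the computed next-attempt time on the `EmbeddingJob` row so that retries survive restarts; that storage detail is likewise an assumption.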
## NV3 — generic semantic search engine
### Goal
Add semantic search to the generic search platform using only the new subsystem.
### Deliverables
- `PgVectorSemanticSearchEngine`
- `DocumentSemanticSearchRepository`
- query embedding through `QueryEmbeddingService`
- chunk-aware retrieval and collapse
- fusion with lexical search
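
The last two deliverables can be sketched independently of the database. The plan does not name a collapse rule or a fusion method, so the sketch below assumes "keep the best-scoring chunk per document" for collapse and reciprocal rank fusion (RRF) for combining semantic and lexical rankings. `SearchFusionSketch`, `ChunkHit`, and the constant `k` are hypothetical, and the code requires Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SearchFusionSketch {

    /** Hypothetical chunk-level hit; a higher score means a closer match. */
    record ChunkHit(String documentId, int chunkIndex, double score) {
    }

    private SearchFusionSketch() {
    }

    /** Collapse chunk hits to one hit per document, keeping the best-scoring chunk. */
    static List<ChunkHit> collapseByDocument(List<ChunkHit> hits) {
        Map<String, ChunkHit> best = new LinkedHashMap<>();
        for (ChunkHit hit : hits) {
            best.merge(hit.documentId(), hit, (a, b) -> a.score() >= b.score() ? a : b);
        }
        List<ChunkHit> collapsed = new ArrayList<>(best.values());
        collapsed.sort(Comparator.comparingDouble(ChunkHit::score).reversed());
        return collapsed;
    }

    /**
     * Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per document,
     * with rank starting at 1. Returns document ids ordered by fused score, best first.
     */
    static List<String> reciprocalRankFusion(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return ids;
    }
}
```

RRF is convenient here because it only needs ranks, so cosine distances from the vector side and lexical relevance scores never have to be put on a common scale. A weighted score blend would be an equally valid alternative if the two score distributions are well understood.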
## Migration philosophy
Because the app is still in development, prefer:
1. migrate documents and text representations first
2. re-embed through the new subsystem
3. only preserve old raw vector data if there is a strong operational reason
## Recommended implementation order
1. NV1 foundation
2. NV1 tests with mock provider
3. NV2 jobs and selection policy
4. NV3 semantic search
5. migration/backfill
6. cutover