# Parallel embedding subsystem plan (NV1-NV3)
This plan assumes the old vectorization subsystem remains in place temporarily, while a new generic embedding subsystem is built in parallel.
## Principles
- Build the new subsystem under `at.procon.dip.embedding.*`.
- Do not shape it around old TED-specific or legacy vectorization services.
- Operate on `DocumentTextRepresentation` and `DocumentEmbedding` as the core abstractions.
- Keep the new subsystem configurable and provider-based.
- Migrate and cut over later.
## NV1 — provider/model/query foundation
### Goal
Create a standalone embedding foundation that can:
- resolve configured providers
- resolve configured models
- embed arbitrary text lists
- embed search queries
- support deterministic testing
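
To illustrate the "deterministic testing" goal, here is a minimal sketch of a mock provider that derives vectors from a hash of the input text, so the same text always yields the same vector without any network call. The `embed` method signature, the constructor parameter, and the SHA-256 based mapping are assumptions for illustration; the plan names `EmbeddingProvider` and `MockEmbeddingProvider` but does not define their contracts.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract: the plan names this type but not its methods.
interface EmbeddingProvider {
    List<float[]> embed(List<String> texts);
}

// Deterministic stand-in provider: identical text always produces an identical vector.
final class MockEmbeddingProvider implements EmbeddingProvider {
    private final int dimensions;

    MockEmbeddingProvider(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.dimensions = dimensions;
    }

    @Override
    public List<float[]> embed(List<String> texts) {
        List<float[]> vectors = new ArrayList<>(texts.size());
        for (String text : texts) {
            vectors.add(embedOne(text));
        }
        return vectors;
    }

    private float[] embedOne(String text) {
        byte[] digest = sha256(text);
        float[] vector = new float[dimensions];
        for (int i = 0; i < dimensions; i++) {
            // Reuse digest bytes cyclically and scale each signed byte into [-1, 1).
            vector[i] = digest[i % digest.length] / 128.0f;
        }
        return vector;
    }

    private static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

Vectors produced this way carry no semantic meaning; they are only useful for asserting that wiring, batching, and persistence behave reproducibly in tests.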
### Deliverables
- `EmbeddingProperties`
- `EmbeddingUseCase`
- `EmbeddingRequest`
- `EmbeddingProviderResult`
- `EmbeddingModelDescriptor`
- `ResolvedEmbeddingProviderConfig`
- `EmbeddingProvider`
- `ExternalHttpEmbeddingProvider`
- `MockEmbeddingProvider`
- `EmbeddingProviderRegistry`
- `EmbeddingModelRegistry`
- `EmbeddingProviderConfigResolver`
- `EmbeddingExecutionService`
- `QueryEmbeddingService`
- startup validation of provider/model wiring
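
The last deliverable, startup validation of provider/model wiring, could be approached roughly as follows. This is a sketch under stated assumptions: the fields of `EmbeddingModelDescriptor` (`modelKey`, `providerKey`, `dimensions`), the `EmbeddingWiringValidator` helper, and the specific checks are hypothetical and not taken from the plan.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical fields: the plan names this type but does not define its contents.
record EmbeddingModelDescriptor(String modelKey, String providerKey, int dimensions) {}

// Hypothetical helper that checks configured models against the known provider keys.
final class EmbeddingWiringValidator {
    private EmbeddingWiringValidator() {}

    // Collects every problem instead of stopping at the first, so one startup
    // failure can report the whole misconfiguration.
    static List<String> validate(Set<String> providerKeys, List<EmbeddingModelDescriptor> models) {
        List<String> problems = new ArrayList<>();
        Set<String> seenModelKeys = new HashSet<>();
        for (EmbeddingModelDescriptor model : models) {
            if (!seenModelKeys.add(model.modelKey())) {
                problems.add("duplicate model key: " + model.modelKey());
            }
            if (!providerKeys.contains(model.providerKey())) {
                problems.add("model " + model.modelKey()
                        + " references unknown provider: " + model.providerKey());
            }
            if (model.dimensions() <= 0) {
                problems.add("model " + model.modelKey() + " has non-positive dimensions");
            }
        }
        return problems;
    }

    // Intended to be called once during application startup.
    static void validateOrThrow(Set<String> providerKeys, List<EmbeddingModelDescriptor> models) {
        List<String> problems = validate(providerKeys, models);
        if (!problems.isEmpty()) {
            throw new IllegalStateException(
                    "Invalid embedding configuration: " + String.join("; ", problems));
        }
    }
}
```

Failing fast at startup keeps a misconfigured model from surfacing later as a runtime error in the middle of an embedding request.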
### Notes
- No cutover from the old vectorization path yet.
- No persistence/job orchestration yet.
- The new subsystem should be safe to include in the app while disabled by default.
## NV2 — persistence and job orchestration
### Goal
Make the new subsystem able to create and process embedding jobs against `DocumentTextRepresentation`.
### Deliverables
- `EmbeddingJob` entity/repository/service
- retry / backoff policy
- default `EmbeddingSelectionPolicy`
- representation-level embedding execution
- `DocumentEmbedding` persistence updates through the new subsystem
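
As one possible shape for the retry / backoff policy listed above, the sketch below uses capped exponential backoff. The plan does not specify a backoff strategy, so the class name `ExponentialBackoffPolicy`, its parameters, and the doubling rule are all assumptions.

```java
import java.time.Duration;

// Hypothetical policy: the delay doubles after each failed attempt, up to a fixed cap.
final class ExponentialBackoffPolicy {
    private final int maxAttempts;
    private final Duration initialDelay;
    private final Duration maxDelay;

    ExponentialBackoffPolicy(int maxAttempts, Duration initialDelay, Duration maxDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        if (initialDelay.isNegative() || maxDelay.compareTo(initialDelay) < 0) {
            throw new IllegalArgumentException("require 0 <= initialDelay <= maxDelay");
        }
        this.maxAttempts = maxAttempts;
        this.initialDelay = initialDelay;
        this.maxDelay = maxDelay;
    }

    // True while the job has not yet used up its attempts.
    boolean shouldRetry(int failedAttempts) {
        return failedAttempts < maxAttempts;
    }

    // Delay to wait after the given number of failed attempts (1-based).
    Duration delayAfter(int failedAttempts) {
        if (failedAttempts < 1) {
            throw new IllegalArgumentException("failedAttempts must be at least 1");
        }
        Duration delay = initialDelay;
        for (int i = 1; i < failedAttempts; i++) {
            if (delay.compareTo(maxDelay) >= 0) {
                return maxDelay; // Stop doubling once the cap is reached.
            }
            delay = delay.multipliedBy(2);
        }
        return delay.compareTo(maxDelay) > 0 ? maxDelay : delay;
    }
}
```

A job service could store the failed-attempt count on the `EmbeddingJob` row and use `delayAfter` to compute the next eligible run time; that usage is likewise an assumption, not something the plan prescribes.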
## NV3 — generic semantic search engine
### Goal
Add semantic search into the generic search platform using only the new subsystem.
### Deliverables
- `PgVectorSemanticSearchEngine`
- `DocumentSemanticSearchRepository`
- query embedding through `QueryEmbeddingService`
- chunk-aware retrieval and collapse
- fusion with lexical search
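
The last two deliverables, chunk collapse and fusion with lexical search, can be sketched in a few lines. The plan does not say how either should work, so this example makes two assumptions: collapse keeps the best-scoring chunk per document, and fusion uses reciprocal rank fusion (RRF) as a stand-in technique. The `ChunkHit` record and `SearchFusion` helper are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical chunk-level hit returned by a vector search.
record ChunkHit(String documentId, int chunkIndex, double score) {}

final class SearchFusion {
    private SearchFusion() {}

    // Collapse chunk hits to one entry per document, keeping the best chunk score.
    // Returns document ids ordered by that score, highest first.
    static List<String> collapseToDocuments(List<ChunkHit> hits) {
        Map<String, Double> best = new HashMap<>();
        for (ChunkHit hit : hits) {
            best.merge(hit.documentId(), hit.score(), Double::max);
        }
        return sortByScoreDescending(best);
    }

    // Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per document,
    // with rank starting at 1. The constant k is a tuning parameter.
    static List<String> reciprocalRankFusion(List<List<String>> rankings, int k) {
        Map<String, Double> fused = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                fused.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return sortByScoreDescending(fused);
    }

    // Orders ids by score descending; ties are broken by id for stable output.
    private static List<String> sortByScoreDescending(Map<String, Double> scores) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return ids;
    }
}
```

RRF works on ranks rather than raw scores, which sidesteps the problem that vector distances and lexical relevance scores are not on a comparable scale. Other choices, such as weighted score normalization, would fit the same slot.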
## Migration philosophy
Because the app is still in development, prefer:
1. migrate documents and text representations first
2. re-embed through the new subsystem
3. only preserve old raw vector data if there is a strong operational reason
## Recommended implementation order
1. NV1 foundation
2. NV1 tests with mock provider
3. NV2 jobs and selection policy
4. NV3 semantic search
5. migration/backfill
6. cutover