
Parallel embedding subsystem plan (NV1-NV3)

This plan assumes the old vectorization subsystem remains in place temporarily, while a new generic embedding subsystem is built in parallel.

Principles

  • Build the new subsystem under at.procon.dip.embedding.*.
  • Do not shape it around old TED-specific or legacy vectorization services.
  • Operate on DocumentTextRepresentation and DocumentEmbedding as the core abstractions.
  • Keep the new subsystem configurable and provider-based.
  • Migrate and cut over later.

NV1 — provider/model/query foundation

Goal

Create a standalone embedding foundation that can:

  • resolve configured providers
  • resolve configured models
  • embed arbitrary text lists
  • embed search queries
  • support deterministic testing
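
As a minimal sketch of the "embed arbitrary text lists" and "support deterministic testing" goals, a mock provider could derive each vector from a hash of the input text. The plan names EmbeddingProvider and MockEmbeddingProvider but does not define their methods, so the `embed` signature, the constructor argument, and the hashing scheme below are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface shape: one vector per input text, in input order.
interface EmbeddingProvider {
    List<float[]> embed(List<String> texts);
}

// Deterministic mock: the same text always yields the same vector,
// and no network call is made.
final class MockEmbeddingProvider implements EmbeddingProvider {
    private final int dimensions;

    MockEmbeddingProvider(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.dimensions = dimensions;
    }

    @Override
    public List<float[]> embed(List<String> texts) {
        List<float[]> vectors = new ArrayList<>(texts.size());
        for (String text : texts) {
            vectors.add(embedOne(text));
        }
        return vectors;
    }

    private float[] embedOne(String text) {
        byte[] digest = sha256(text);
        float[] vector = new float[dimensions];
        double squaredNorm = 0.0;
        for (int i = 0; i < dimensions; i++) {
            // Cycle through the 32 digest bytes and map each to [-1, 1).
            vector[i] = digest[i % digest.length] / 128.0f;
            squaredNorm += vector[i] * vector[i];
        }
        double norm = Math.sqrt(squaredNorm);
        if (norm > 0) {
            // L2-normalize so cosine and dot-product comparisons agree.
            for (int i = 0; i < dimensions; i++) {
                vector[i] = (float) (vector[i] / norm);
            }
        }
        return vector;
    }

    private static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, a search query is just a one-element list, for example `provider.embed(List.of(query)).get(0)`, which is one way a QueryEmbeddingService could delegate to the same provider.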

Deliverables

  • EmbeddingProperties
  • EmbeddingUseCase
  • EmbeddingRequest
  • EmbeddingProviderResult
  • EmbeddingModelDescriptor
  • ResolvedEmbeddingProviderConfig
  • EmbeddingProvider
  • ExternalHttpEmbeddingProvider
  • MockEmbeddingProvider
  • EmbeddingProviderRegistry
  • EmbeddingModelRegistry
  • EmbeddingProviderConfigResolver
  • EmbeddingExecutionService
  • QueryEmbeddingService
  • startup validation of provider/model wiring
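
The last deliverable, startup validation of provider/model wiring, could look roughly like the sketch below. It assumes Java 17 and that each model descriptor carries a model key, a provider key, and a vector dimension count. Those fields, the `EmbeddingWiringValidator` class, and the idea of a configured default model are hypothetical and not taken from the plan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical fields; the plan names EmbeddingModelDescriptor without defining it.
record EmbeddingModelDescriptor(String modelKey, String providerKey, int dimensions) {}

// Hypothetical helper, not a deliverable named in the plan.
final class EmbeddingWiringValidator {

    // Collects every wiring problem rather than stopping at the first,
    // so a misconfigured deployment reports everything in one pass.
    static List<String> validate(Set<String> providerKeys,
                                 Map<String, EmbeddingModelDescriptor> models,
                                 String defaultModelKey) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, EmbeddingModelDescriptor> entry : models.entrySet()) {
            EmbeddingModelDescriptor model = entry.getValue();
            if (!providerKeys.contains(model.providerKey())) {
                problems.add("model '" + entry.getKey()
                        + "' references unknown provider '" + model.providerKey() + "'");
            }
            if (model.dimensions() <= 0) {
                problems.add("model '" + entry.getKey() + "' has non-positive dimensions");
            }
        }
        if (defaultModelKey != null && !models.containsKey(defaultModelKey)) {
            problems.add("default model '" + defaultModelKey + "' is not configured");
        }
        return problems;
    }

    // Intended to run once at startup, and only when the subsystem is enabled.
    static void validateOrThrow(Set<String> providerKeys,
                                Map<String, EmbeddingModelDescriptor> models,
                                String defaultModelKey) {
        List<String> problems = validate(providerKeys, models, defaultModelKey);
        if (!problems.isEmpty()) {
            throw new IllegalStateException(
                    "Invalid embedding configuration: " + String.join("; ", problems));
        }
    }
}
```

Failing fast at startup keeps a broken provider or model reference from surfacing later as a runtime error on the first embedding call.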

Notes

  • No cutover from the old vectorization path yet.
  • No persistence/job orchestration yet.
  • The new subsystem should be safe to include in the app while disabled by default.

NV2 — persistence and job orchestration

Goal

Make the new subsystem able to create and process embedding jobs against DocumentTextRepresentation.
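
One way to picture a job created against a DocumentTextRepresentation is as a small state machine. The sketch below is illustrative only: the status names, the allowed transitions, and the `representationId` and `modelKey` fields are assumptions, since the plan does not specify the EmbeddingJob entity.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical statuses; the plan does not list them.
enum EmbeddingJobStatus { PENDING, IN_PROGRESS, RETRY_SCHEDULED, DONE, FAILED }

final class EmbeddingJob {
    // Allowed transitions; DONE and FAILED are terminal.
    private static final Map<EmbeddingJobStatus, Set<EmbeddingJobStatus>> ALLOWED = Map.of(
            EmbeddingJobStatus.PENDING, Set.of(EmbeddingJobStatus.IN_PROGRESS),
            EmbeddingJobStatus.IN_PROGRESS, Set.of(EmbeddingJobStatus.DONE,
                    EmbeddingJobStatus.RETRY_SCHEDULED, EmbeddingJobStatus.FAILED),
            EmbeddingJobStatus.RETRY_SCHEDULED, Set.of(EmbeddingJobStatus.IN_PROGRESS),
            EmbeddingJobStatus.DONE, Set.of(),
            EmbeddingJobStatus.FAILED, Set.of());

    private final UUID representationId; // the text representation to embed (assumed key type)
    private final String modelKey;       // which configured model to use
    private EmbeddingJobStatus status = EmbeddingJobStatus.PENDING;
    private int attempts;

    EmbeddingJob(UUID representationId, String modelKey) {
        this.representationId = representationId;
        this.modelKey = modelKey;
    }

    void transitionTo(EmbeddingJobStatus next) {
        if (!ALLOWED.get(status).contains(next)) {
            throw new IllegalStateException("illegal transition " + status + " -> " + next);
        }
        if (next == EmbeddingJobStatus.IN_PROGRESS) {
            attempts++; // each start of processing counts as one attempt
        }
        status = next;
    }

    UUID representationId() { return representationId; }
    String modelKey() { return modelKey; }
    EmbeddingJobStatus status() { return status; }
    int attempts() { return attempts; }
}
```

Keeping the transition table in one place makes illegal moves, such as restarting a finished job, fail loudly instead of silently overwriting state.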

Deliverables

  • EmbeddingJob entity/repository/service
  • retry / backoff policy
  • default EmbeddingSelectionPolicy
  • representation-level embedding execution
  • DocumentEmbedding persistence updates through the new subsystem
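
The retry / backoff policy is not specified further in this plan. A common choice is capped exponential backoff, sketched below under that assumption. The class name `ExponentialBackoffPolicy` and its three parameters are hypothetical.

```java
import java.time.Duration;

// Capped exponential backoff: delays grow as base, 2*base, 4*base, ... up to maxDelay.
final class ExponentialBackoffPolicy {
    private final int maxAttempts;
    private final Duration baseDelay;
    private final Duration maxDelay;

    ExponentialBackoffPolicy(int maxAttempts, Duration baseDelay, Duration maxDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        if (baseDelay.isNegative() || maxDelay.compareTo(baseDelay) < 0) {
            throw new IllegalArgumentException("require 0 <= baseDelay <= maxDelay");
        }
        this.maxAttempts = maxAttempts;
        this.baseDelay = baseDelay;
        this.maxDelay = maxDelay;
    }

    // True while the job has attempts left.
    boolean shouldRetry(int attemptsSoFar) {
        return attemptsSoFar < maxAttempts;
    }

    // Delay to wait after the given number of failed attempts (1-based).
    Duration delayAfter(int attemptsSoFar) {
        if (attemptsSoFar < 1) {
            throw new IllegalArgumentException("attemptsSoFar must be at least 1");
        }
        Duration delay = baseDelay;
        // Double once per additional attempt, stopping as soon as the cap is reached.
        for (int i = 1; i < attemptsSoFar && delay.compareTo(maxDelay) < 0; i++) {
            delay = delay.multipliedBy(2);
        }
        return delay.compareTo(maxDelay) > 0 ? maxDelay : delay;
    }
}
```

With `new ExponentialBackoffPolicy(5, Duration.ofSeconds(1), Duration.ofSeconds(30))`, the delays after attempts 1 to 4 are 1, 2, 4, and 8 seconds, and no retry is scheduled once 5 attempts have been made. Adding random jitter is a frequent refinement when many jobs fail at once, but it is left out here to keep the sketch deterministic.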

NV3 — generic semantic search engine

Goal

Add semantic search into the generic search platform using only the new subsystem.

Deliverables

  • PgVectorSemanticSearchEngine
  • DocumentSemanticSearchRepository
  • query embedding through QueryEmbeddingService
  • chunk-aware retrieval and collapse
  • fusion with lexical search
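
The last two deliverables can be illustrated in isolation from pgvector. The sketch below assumes that "collapse" means keeping the best-scoring chunk per document, and it uses reciprocal rank fusion (RRF) as one possible fusion method. The plan does not say which fusion method is intended, and the `ChunkHit` record and `SearchFusion` class are hypothetical names. Java 17 is assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hit shape: one row per matching chunk of a document.
record ChunkHit(String documentId, int chunkIndex, double similarity) {}

final class SearchFusion {

    // Collapse: keep the best-scoring chunk per document,
    // then order documents by that best score (highest first).
    static List<ChunkHit> collapseByDocument(List<ChunkHit> chunkHits) {
        Map<String, ChunkHit> best = new LinkedHashMap<>();
        for (ChunkHit hit : chunkHits) {
            best.merge(hit.documentId(), hit,
                    (a, b) -> a.similarity() >= b.similarity() ? a : b);
        }
        List<ChunkHit> collapsed = new ArrayList<>(best.values());
        collapsed.sort(Comparator.comparingDouble(ChunkHit::similarity).reversed());
        return collapsed;
    }

    // Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank),
    // with rank starting at 1. Only ranks are used, so semantic and lexical
    // scores do not need to be on the same scale.
    static List<String> reciprocalRankFusion(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        fused.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return fused;
    }
}
```

For example, fusing a semantic ranking `[B, A]` with a lexical ranking `[A, C]` at `k = 60` puts A first, because it appears in both lists, followed by B and then C.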

Migration philosophy

Because the app is still in development, prefer:

  1. migrate documents and text representations first
  2. re-embed through the new subsystem
  3. only preserve old raw vector data if there is a strong operational reason

Implementation order

  1. NV1 foundation
  2. NV1 tests with mock provider
  3. NV2 jobs and selection policy
  4. NV3 semantic search
  5. migration/backfill
  6. cutover