From 87fdae9f21e56b78ddda39a1f7955ab19a44c731 Mon Sep 17 00:00:00 2001
From: trifonovt <87468028+TihomirTrifonov@users.noreply.github.com>
Date: Fri, 20 Mar 2026 17:51:10 +0100
Subject: [PATCH] embedding nv2

---
 docs/embedding/NV2_IMPLEMENTATION_NOTES.md | 39 ++++++++++++++++++++++
 1 file changed, 39 insertions(+)
 create mode 100644 docs/embedding/NV2_IMPLEMENTATION_NOTES.md

diff --git a/docs/embedding/NV2_IMPLEMENTATION_NOTES.md b/docs/embedding/NV2_IMPLEMENTATION_NOTES.md
new file mode 100644
index 0000000..57e2b76
--- /dev/null
+++ b/docs/embedding/NV2_IMPLEMENTATION_NOTES.md
@@ -0,0 +1,39 @@
+# NV2 - Embedding persistence and job orchestration
+
+This patch extends the `at.procon.dip.embedding.*` subsystem introduced in NV1, which runs in parallel to the old vectorization route.
+
+## Scope
+
+NV2 adds:
+
+- representation-driven selection policy
+- `DOC.doc_embedding_job` queue table
+- job lifecycle service with retry scheduling
+- model-catalog sync into `DOC.doc_embedding_model`
+- persistence of vectors into `DOC.doc_embedding`
+- orchestrator for enqueueing and processing jobs
+- unit tests for the new orchestration layer
+
+## Still intentionally missing
+
+- no cutover of the old vectorization route
+- no scheduler / background polling enabled by default
+- no semantic search engine yet
+- no migration / backfill yet
+
+## Intended usage
+
+New code can now:
+
+1. enqueue a document or representation for embedding with a configured model key
+2. process the pending jobs through the new provider-based subsystem
+3. store the resulting vectors in the generic DOC embedding tables
+
+## Next step after NV2
+
+NV3 should add:
+
+- `PgVectorSemanticSearchEngine`
+- semantic repository
+- query embedding integration into the generic search engine
+- hybrid lexical + semantic fusion