Phase 1 Generic Persistence Backbone

Phase 1 introduces the additive DOC schema and the first concrete persistence layer for the new generalized platform model.

What is implemented

  • DOC.doc_tenant
  • DOC.doc_document
  • DOC.doc_source
  • DOC.doc_content
  • DOC.doc_text_representation
  • DOC.doc_embedding_model
  • DOC.doc_embedding
  • DOC.doc_relation

Intent

The generic model now exists as real JPA entities, repositories, a Flyway migration, and thin transactional services. Existing TED runtime behavior is intentionally unchanged.
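The entity, repository, and thin-service layering described above can be sketched in plain Java. This is a minimal illustration, not the project's code: every type and field name below (DocDocumentSketch, tenantKey, register, and so on) is a hypothetical stand-in, the JPA annotations and transaction handling are deliberately omitted so the sketch runs without any framework on the classpath, and an in-memory map takes the place of the DOC.doc_document table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a row in DOC.doc_document; the fields are assumptions.
record DocDocumentSketch(String id, String tenantKey, String title) {}

// Repository boundary. The real repositories are JPA-backed; this interface
// only mirrors the shape of such a boundary.
interface DocDocumentRepositorySketch {
    DocDocumentSketch save(DocDocumentSketch doc);
    Optional<DocDocumentSketch> findById(String id);
}

// In-memory implementation used here instead of a database.
final class InMemoryDocDocumentRepository implements DocDocumentRepositorySketch {
    private final Map<String, DocDocumentSketch> rows = new HashMap<>();

    @Override
    public DocDocumentSketch save(DocDocumentSketch doc) {
        rows.put(doc.id(), doc);
        return doc;
    }

    @Override
    public Optional<DocDocumentSketch> findById(String id) {
        return Optional.ofNullable(rows.get(id));
    }
}

// Thin service: validates input and delegates to the repository.
// A real service would also mark the transaction boundary.
final class DocDocumentServiceSketch {
    private final DocDocumentRepositorySketch repository;

    DocDocumentServiceSketch(DocDocumentRepositorySketch repository) {
        this.repository = repository;
    }

    DocDocumentSketch register(String id, String tenantKey, String title) {
        if (id == null || id.isBlank()) {
            throw new IllegalArgumentException("id is required");
        }
        if (tenantKey == null || tenantKey.isBlank()) {
            throw new IllegalArgumentException("tenantKey is required");
        }
        return repository.save(new DocDocumentSketch(id, tenantKey, title));
    }

    Optional<DocDocumentSketch> find(String id) {
        return repository.findById(id);
    }
}
```

Keeping the service this thin means the legacy TED code path has nothing to call into yet, which is consistent with the runtime behavior staying unchanged in this phase.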

Important limitation

The actual TED processing pipeline still writes only to the legacy TED-specific model. Dual-write and migration come in later phases.

Vector storage note

doc_embedding already separates the vectorization lifecycle from the document root. The transient embeddingVector field is intentionally not wired into Hibernate yet. Writing native pgvector data and moving the vectorization pipeline to the new table is part of Phase 2.
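To make the transient-field idea concrete, here is a small sketch under stated assumptions. The class name, the documentId, modelKey, and status fields, and the toPgVectorLiteral helper are all hypothetical. Java's transient modifier stands in for whatever mechanism the real entity uses to keep embeddingVector unmapped. The helper shows one way a later phase might render a vector in pgvector's bracketed text input form; it is not the project's Phase 2 implementation.

```java
// Hypothetical sketch: the class, field, and method names are illustrative assumptions.
final class DocEmbeddingSketch {
    // Lifecycle metadata kept on the embedding row, separate from the document root.
    final String documentId;
    final String modelKey;
    String status = "PENDING";

    // Held in memory only. The `transient` modifier is a stand-in for the
    // real entity's way of excluding this field from persistence.
    transient float[] embeddingVector;

    DocEmbeddingSketch(String documentId, String modelKey) {
        this.documentId = documentId;
        this.modelKey = modelKey;
    }

    // Formats a vector in pgvector's text input form, e.g. [0.5,1.0,-2.0].
    static String toPgVectorLiteral(float[] vector) {
        if (vector == null || vector.length == 0) {
            throw new IllegalArgumentException("vector must not be empty");
        }
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            if (!Float.isFinite(vector[i])) {
                throw new IllegalArgumentException("vector contains a non-finite value");
            }
            if (i > 0) {
                sb.append(',');
            }
            sb.append(vector[i]);
        }
        return sb.append(']').toString();
    }
}
```

Because the vector lives only in memory in this sketch, the lifecycle fields can be stored and queried without any pgvector type mapping being in place.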