Refactor CollectionModel/builder and introduce read-only property interfaces#13699
Refactor CollectionModel/builder and introduce read-only property interfaces#13699roji wants to merge 3 commits intomicrosoft:mainfrom
Conversation
a49d45d to
9f5b23c
Compare
9f5b23c to
3ed77b9
Compare
3ed77b9 to
abcdb5a
Compare
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
abcdb5a to
9bfbb38
Compare
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 86%
✓ Correctness
This is a well-structured refactoring that introduces read-only interfaces (IPropertyModel, IKeyPropertyModel, IDataPropertyModel, IVectorPropertyModel) over the concrete property model classes, with all connectors updated to consume the interfaces. It also replaces the 'TemporaryStorageName' workaround with a proper 'SerializedKeyName' on IKeyPropertyModel, switches from virtual method-based property accessors to delegate-based ones configured during model building, and extracts duplicated embedding configuration logic into ConfigureVectorPropertyEmbedding. The changes are mechanically consistent, type-safe (leveraging IReadOnlyList covariance), and preserve existing behavior. No blocking correctness issues were found.
✗ Security Reliability
This PR introduces read-only interfaces (IPropertyModel, IKeyPropertyModel, IDataPropertyModel, IVectorPropertyModel) over the existing concrete property model classes and refactors accessors from virtual methods with inline null-guarding into delegate-based accessors configured during model building. The main reliability concern is that the new GetValueAsObject/SetValueAsObject methods guard the _getter/_setter delegates with Debug.Assert (stripped in Release builds), meaning a misconfigured property will produce an opaque NullReferenceException in production instead of a descriptive error. Additionally, the Activator.CreateInstance(recordType) call uses null-suppression without an explicit null check, though this is low-risk since model building validates the parameterless constructor exists. The rest of the changes are mechanical type signature updates from concrete to interface types, which look correct and pose no security or reliability risks.
✓ Test Coverage
This PR introduces read-only interfaces (IPropertyModel, IKeyPropertyModel, IDataPropertyModel, IVectorPropertyModel) for property models, refactors property accessors from inline virtual methods to configured delegates (ConfigurePocoAccessors/ConfigureDynamicAccessors), renames TemporaryStorageName to SerializedKeyName on KeyPropertyModel, replaces IRecordCreator with a simple Func, and extracts ConfigureVectorPropertyEmbedding. Existing tests cover the embedding generation refactoring and round-trip mapper behavior well. However, there are no direct unit tests for the new accessor delegate configuration pattern, which is a meaningful behavioral change: if accessors fail to be configured during model building, GetValueAsObject/SetValueAsObject will silently fail via Debug.Assert (no-op in release). The conformance and integration tests provide indirect coverage, but given the centrality of property access to the entire VectorData stack, targeted unit tests for the new accessor setup would reduce regression risk.
✗ Design Approach
The PR introduces
IPropertyModel/IKeyPropertyModel/IDataPropertyModel/IVectorPropertyModelinterfaces as read-only consumer-facing views of the concrete property model classes, updating all connectors to depend on interfaces rather than concrete types. The direction is sound and correctly separates the builder-facing mutable API from the connector-facing read API. Two design issues stand out: (1)IPropertyModel.ProviderAnnotationsis typed as the mutableDictionary<string, object?>?on an interface described as a 'read-only view', which lets any code holding only anIPropertyModelreference corrupt model state by modifying annotations; it should beIReadOnlyDictionary<string, object?>?. (2) TheDebug.Assertmessage inSqlitePropertyMapping.cswas changed to"property is VectorStoreRecordIKeyPropertyModel"which is not a real type name and is misleading. Everything else — the interface surface, theFunc<object>factory replacement, theSerializedKeyNamerename, and theConfigurePocoAccessors/ConfigureDynamicAccessorssplit — is well-structured.Flagged Issues
- In PropertyModel.GetValueAsObject and SetValueAsObject, Debug.Assert is used to check that _getter/_setter are non-null, but Debug.Assert is compiled out in Release builds. If accessors are not configured (e.g. due to a bug in a custom model builder subclass or future refactoring), production code will throw an unhelpful NullReferenceException instead of a descriptive error. Replace with a runtime guard (e.g. throw InvalidOperationException).
- IPropertyModel.ProviderAnnotations is declared as mutable Dictionary<string, object?>? on an interface described as a 'read-only view'. Any connector holding only an IPropertyModel reference can freely mutate the annotations dictionary, corrupting shared model state. This should be IReadOnlyDictionary<string, object?>? on the interface; the concrete PropertyModel class can keep the mutable Dictionary for builder/extension-method use.
Suggestions
- Add unit tests for ConfigurePocoAccessors and ConfigureDynamicAccessors verifying that GetValueAsObject and SetValueAsObject work correctly after Build() configures the delegates, including the InvalidCastException path for type mismatches. This is the most significant behavioral refactoring in the PR and currently has no direct test coverage.
- The test class rename from PropertyModelTests to IPropertyModelTests is a naming oddity since the tests still exercise concrete DataPropertyModel/KeyPropertyModel types. Consider keeping the original name or expanding the tests to also cover accessor methods (GetValueAsObject/SetValueAsObject) with both POCO and dynamic accessors configured.
- The Debug.Assert message in SqlitePropertyMapping.cs was changed to "property is VectorStoreRecordIKeyPropertyModel" which is not a real type name. Use the actual interface name IKeyPropertyModel or a plain description like "Expected a key property".
- In CollectionModelBuilder.cs, the extracted ConfigureVectorPropertyEmbedding now uses vectorProperty.Type instead of definitionVectorProperty.Type, which is an improvement for CLR-backed properties. Consider adding a unit test covering the scenario where a VectorStoreCollectionDefinition omits Type for a CLR-backed property to confirm this fix.
- The factory lambda () => Activator.CreateInstance(recordType)! uses null-suppression. While the parameterless constructor is validated before this point, a defensive null check would be more robust.
- Consider adding a test in CollectionModelBuilderTests that asserts SerializedKeyName is populated correctly after building a model with a reserved key storage name, since the TemporaryStorageName → SerializedKeyName rename is otherwise only tested indirectly.
Automated review by roji's agents
dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/PropertyModel.cs
Show resolved
Hide resolved
dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/IPropertyModel.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
This PR refactors the VectorData provider model surface to expose immutable, read only property model interfaces after build, while simplifying model construction and deduplicating embedding resolution logic across builder paths.
Changes:
- Introduces
IPropertyModelplus specialized read only interfaces (IKeyPropertyModel,IDataPropertyModel,IVectorPropertyModel) and updates connectors and tests to consume them. - Refactors
PropertyModelaccessors to use configured delegates and updatesCollectionModelrecord creation to use a factory delegate instead ofIRecordCreator. - Extracts
ConfigureVectorPropertyEmbedding()to centralize embedding generator and embedding type resolution during model building.
Reviewed changes
Copilot reviewed 57 out of 57 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| dotnet/test/VectorData/VectorData.UnitTests/PropertyModelTests.cs | Renames test class to reflect the new interface based surface. |
| dotnet/test/VectorData/SqliteVec.UnitTests/SqlitePropertyMappingTests.cs | Updates tests to use IPropertyModel collections. |
| dotnet/test/VectorData/Redis.UnitTests/RedisCollectionCreateMappingTests.cs | Updates tests to use IPropertyModel[]. |
| dotnet/test/VectorData/PgVector.UnitTests/PostgresPropertyMappingTests.cs | Updates tests to use List<IPropertyModel>. |
| dotnet/src/VectorData/Weaviate/WeaviateQueryBuilder.cs | Switches helper signatures to IVectorPropertyModel and IDataPropertyModel. |
| dotnet/src/VectorData/Weaviate/WeaviateMapper.cs | Switches vector population helper to IVectorPropertyModel. |
| dotnet/src/VectorData/Weaviate/WeaviateCollection.cs | Switches search vector generation signature to IVectorPropertyModel. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/VectorPropertyModel.cs | Implements IVectorPropertyModel and clarifies EmbeddingType contract. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/PropertyModel.cs | Implements IPropertyModel and moves to delegate based accessors. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/KeyPropertyModel.cs | Implements IKeyPropertyModel and adds SerializedKeyName. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/IVectorPropertyModel.cs | Adds read only interface for vector property models. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/IRecordCreator.cs | Removes the record creator abstraction. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/IPropertyModel.cs | Adds read only interface for common property model behavior. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/IKeyPropertyModel.cs | Adds read only interface for key property models. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/IDataPropertyModel.cs | Adds read only interface for data property models. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/Filter/FilterTranslatorBase.cs | Updates binding API to output IPropertyModel. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/DataPropertyModel.cs | Implements IDataPropertyModel. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/CollectionModelBuilder.cs | Deduplicates embedding configuration and switches to factory delegate creation. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/CollectionModel.cs | Exposes interface typed property lists and uses a record factory. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/CollectionJsonModelBuilder.cs | Uses SerializedKeyName for reserved key serializer name remapping. |
| dotnet/src/VectorData/SqliteVec/SqlitePropertyMapping.cs | Updates mapping logic to consume interface based property models. |
| dotnet/src/VectorData/SqliteVec/SqliteFilterTranslator.cs | Updates filter translator signature to IPropertyModel. |
| dotnet/src/VectorData/SqliteVec/SqliteCommandBuilder.cs | Updates command builder to use IVectorPropertyModel dictionary keys and interface lists. |
| dotnet/src/VectorData/SqliteVec/SqliteCollection.cs | Updates embedding generation bookkeeping to interface keyed dictionaries. |
| dotnet/src/VectorData/SqlServer/SqlServerMapper.cs | Updates mapper helper to take IPropertyModel. |
| dotnet/src/VectorData/SqlServer/SqlServerFilterTranslator.cs | Updates translator APIs to take IPropertyModel. |
| dotnet/src/VectorData/SqlServer/SqlServerCommandBuilder.cs | Updates command builder signatures and pattern matching to interface models. |
| dotnet/src/VectorData/SqlServer/SqlServerCollection.cs | Updates embedding generation bookkeeping to interface keyed dictionaries. |
| dotnet/src/VectorData/Redis/RedisJsonMapper.cs | Updates property concatenation to use IPropertyModel. |
| dotnet/src/VectorData/Redis/RedisCollectionSearchMapping.cs | Updates query build APIs to use IVectorPropertyModel. |
| dotnet/src/VectorData/Redis/RedisCollectionCreateMapping.cs | Updates schema mapping to consume IPropertyModel and derived interfaces. |
| dotnet/src/VectorData/Qdrant/QdrantMapper.cs | Updates mapping helpers to use IPropertyModel and IVectorPropertyModel. |
| dotnet/src/VectorData/Qdrant/QdrantCollectionCreateMapping.cs | Updates mapping APIs to consume IVectorPropertyModel. |
| dotnet/src/VectorData/Qdrant/QdrantCollection.cs | Updates search vector conversion signature to IVectorPropertyModel. |
| dotnet/src/VectorData/Pinecone/PineconeFilterTranslator.cs | Updates translation helper to take IPropertyModel. |
| dotnet/src/VectorData/Pinecone/PineconeCollection.cs | Updates distance function mapping to consume IVectorPropertyModel. |
| dotnet/src/VectorData/PgVector/PostgresSqlBuilder.cs | Updates SQL builder to use interface models and interface keyed embeddings map. |
| dotnet/src/VectorData/PgVector/PostgresPropertyMapping.cs | Updates type mapping and index info to consume interface models. |
| dotnet/src/VectorData/PgVector/PostgresPropertyExtensions.cs | Updates extensions to target interface models. |
| dotnet/src/VectorData/PgVector/PostgresMapper.cs | Updates property population helper to take IPropertyModel. |
| dotnet/src/VectorData/PgVector/PostgresFilterTranslator.cs | Updates translator signature to take IPropertyModel. |
| dotnet/src/VectorData/PgVector/PostgresCollection.cs | Updates embedding generation bookkeeping and search conversion to interface models. |
| dotnet/src/VectorData/MongoDB/MongoFilterTranslator.cs | Updates translation helper to take IPropertyModel. |
| dotnet/src/VectorData/MongoDB/MongoCollectionCreateMapping.cs | Updates index mapping APIs to consume interface typed property lists. |
| dotnet/src/VectorData/MongoDB/MongoCollection.cs | Updates search vector conversion signature to IVectorPropertyModel. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlMapper.cs | Updates key rename behavior to use SerializedKeyName. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlFilterTranslator.cs | Updates property access generation to take IPropertyModel. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlDynamicMapper.cs | Updates dynamic mapping switch patterns to interface models. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlCollectionQueryBuilder.cs | Updates projection building to filter on IVectorPropertyModel. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlCollection.cs | Updates partition key property storage to IPropertyModel list. |
| dotnet/src/VectorData/CosmosMongoDB/CosmosMongoFilterTranslator.cs | Updates translation helper to take IPropertyModel. |
| dotnet/src/VectorData/CosmosMongoDB/CosmosMongoCollectionCreateMapping.cs | Updates index mapping APIs to consume interface typed property lists. |
| dotnet/src/VectorData/Common/SqlFilterTranslator.cs | Updates base SQL translation column generation and array binding signature to IPropertyModel. |
| dotnet/src/VectorData/AzureAISearch/AzureAISearchDynamicMapper.cs | Updates dynamic mapper switch patterns to interface models. |
| dotnet/src/VectorData/AzureAISearch/AzureAISearchCollectionCreateMapping.cs | Updates field mapping APIs to consume key, data, and vector interfaces. |
| dotnet/src/VectorData/AzureAISearch/AzureAISearchCollection.cs | Updates mapping loops and helpers to use interface models. |
| dotnet/src/InternalUtilities/connectors/Memory/MongoDB/MongoDynamicMapper.cs | Updates internal dynamic mapper to switch on interface models. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Changes
ConfigureVectorPropertyEmbedding(): Deduplicated embedding resolution logic that was repeated acrossProcessTypePropertiesandProcessRecordDefinition.IRecordCreatorwithFunc<object>: RemovedIRecordCreatorinterface and two implementing classes (ActivatorBasedRecordCreator,DynamicRecordCreator), replacing them with simple lambdas.TemporaryStorageName→SerializedKeyName: Moved fromPropertyModelbase toKeyPropertyModelwhere it belongs (only used by CosmosNoSql for key property JSON serializer name remapping).GetValueAsObject/SetValueAsObjectoverrides with delegate fields (_getter/_setter), configured viaConfigurePocoAccessors()/ConfigureDynamicAccessors(). Converted null-coalescing throws toDebug.Assert.VectorPropertyModel.EmbeddingTypeexplaining the[AllowNull]invariant.IPropertyModel,IKeyPropertyModel,IDataPropertyModel,IVectorPropertyModel):CollectionModelnow exposesIReadOnlyList<IVectorPropertyModel>etc., giving providers an immutable view post-build. All provider code updated to consume interface types.Note that the last is a non-trivial breaking change for providers (not users), which I think is fine (note that the provider-facing APIs are flagged as [Experimental]).