Hi all,
While testing the search functionality of our application, we noticed that data from our transient fields is searchable, which we did not expect. Does anyone know why A12 indexes transient fields? Is there an official way to keep a field out of the index, or to exclude transient fields from search?
We are using:
- The Lucene search service
- The ADD_DOCUMENT operation through RestRpcOperationsClient for creating the A12 documents
- The LIST_DOCUMENTS operation through RestRpcOperationsClient for searching data.
You can explicitly remove fields before indexing in a DocumentBeforeIndexEvent handler.
Here is an example from our project. We remove all fields except a few retained ones and attachment_id (which is used to sync attachments after the document index event).
private val retainedFieldPaths = listOf(
    "/metadata/user/idp",
    "/metadata/user/bund_id/bPK2",
    "/metadata/payment/gcMerchantTxId",
    "/metadata/document/status",
    "/metadata/document/unique_key",
    "/metadata/document/draft_modified_at",
    "/metadata/document/transferred_at"
)

@EventListener
fun changeIndexDocumentContent(documentBeforeIndexEvent: DocumentBeforeIndexEvent) {
    documentBeforeIndexEvent.dataServicesDocument.apply {
        kernelDocument = DocumentImpl(modelName).apply {
            kernelDocument.entityInstances.filter { entityInstance ->
                retainedFieldPaths.any { it.contains(entityInstance.path) } || entityInstance.path.endsWith("/attachment_id")
            }.forEach { addEntityInstance(it) }
        }
    }
}
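The filter predicate in the handler above can be tried on its own. The following is only a sketch, not A12 code: shouldRetain, shouldRetainStrict and the sample paths are hypothetical, and it assumes entity instance paths are plain slash-separated strings. It shows that it.contains(path) keeps ancestors such as /metadata/user (probably intended, so that parent groups survive), but also any other path that happens to be a substring of a retained path.

```kotlin
// Hypothetical stand-alone version of the predicate used in the handler above.
val retainedFieldPaths = listOf(
    "/metadata/user/idp",
    "/metadata/document/status"
)

// Same check as in the handler: keep a path if it is a substring of a retained
// path, or if it ends with /attachment_id.
fun shouldRetain(path: String): Boolean =
    retainedFieldPaths.any { it.contains(path) } || path.endsWith("/attachment_id")

// Stricter variant (an assumption, not part of the original handler): keep only
// exact matches and true ancestors, compared on segment boundaries.
fun shouldRetainStrict(path: String): Boolean =
    retainedFieldPaths.any { it == path || it.startsWith("$path/") } ||
        path.endsWith("/attachment_id")

fun main() {
    val samples = listOf(
        "/metadata/user/idp",        // retained field
        "/metadata/user",            // ancestor of a retained field
        "/metadata/user/id",         // different field, but a substring of ".../idp"
        "/form/files/attachment_id", // kept because of the suffix rule
        "/form/secret"               // not retained by either variant
    )
    for (path in samples) {
        println("$path -> contains=${shouldRetain(path)}, strict=${shouldRetainStrict(path)}")
    }
}
```

If sibling fields with similar names exist in your model, the stricter variant may be the safer choice. Whether real A12 paths carry extra parts such as repetition indices is not covered by this sketch.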
Hi @quynh-still-birch,
Transient fields are removed from documents during serialization, not during deserialization. So when DS receives a document, we first deserialize the JSON to an IDocument (transient fields are still present). We use this instance for indexing (transient fields are still present), and only then is the document serialized to XML (transient fields are removed). I can see now how this can be confusing, and it is not well documented.
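The order of steps described above can be illustrated with a small sketch. It is not the A12 kernel API: Field, Doc, deserialize, index and serialize are hypothetical stand-ins, chosen only to show why the index sees transient values while the stored form does not.

```kotlin
// Hypothetical model, not the A12 kernel API.
data class Field(val path: String, val value: String, val transient: Boolean = false)
data class Doc(val fields: List<Field>)

// Deserialization keeps every field, transient or not.
fun deserialize(incoming: List<Field>): Doc = Doc(incoming)

// Indexing works on the deserialized instance, so it sees transient fields too.
fun index(doc: Doc): Map<String, String> = doc.fields.associate { it.path to it.value }

// Serialization is the only step that drops transient fields.
fun serialize(doc: Doc): List<Field> = doc.fields.filterNot { it.transient }

fun main() {
    val doc = deserialize(
        listOf(
            Field("/person/name", "Ada"),
            Field("/person/sessionNote", "temp", transient = true)
        )
    )
    val indexed = index(doc)
    val stored = serialize(doc)
    println("indexed paths: ${indexed.keys}")
    println("stored paths:  ${stored.map { it.path }}")
}
```

In this sketch the transient path appears in the indexed map but not in the stored list, which mirrors the behavior reported in the question.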
Please create a ticket. We will add a new configuration option that removes transient data from the index without the need to implement extension points.
This answer is valid for Data services 32.2.0 - 36.0.0
Thank you, @tomas-thin-gale
I created bug ticket A12-14795. I consider this a bug because the indexed transient fields differ before and after restarting the application.
@tomas-thin-gale How does this relate to the following configuration option?
mgmtp.a12.dataservices.documents.validation.persistTransientFields = false: boolean
This property allows overriding documents persistence to persist transient fields. The default state in v34.0.0 will be not to persist transient fields.
Document related configuration
Are you stating that this configuration option has no influence on the index?
Hi @andreas-fresh-mesa,
Yes, exactly. This is a kernel configuration that controls how documents are serialized to XML/JSON, not how they are deserialized from XML/JSON. For this reason we will introduce a new key in the newest version and afterwards, in the breaking release, unify this behavior.
Hello,
the fix for the above-mentioned ticket is part of the 2023.06-ext1 release.