This module is currently under design. It is a data aggregator plugin for the External Entities module. The goal is to use the External Entities group data aggregator as an import/export pivot: one group of storage clients can be used to import data (read) while other groups can be used to export data (write). The module will provide an internal mapping between storage groups, based on a data model that the user defines through the UI.

How will it work?
The Data Model aggregator will provide a UI where one can define a data model (i.e. a data structure). For instance, let's consider a "Gene" external entity using the Data Model aggregator. A (very) simplified "gene" data model could be defined through the UI:

- gene_name
- gene_accession
- protein_accession
- dna_sequence
- rna_sequence
- protein_sequence
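
As an illustration only, the simplified model above could be pictured as the following structure. This is a sketch in Python, not the module's implementation (the module is still under design); the `GeneModel` class and the `missing_fields` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class GeneModel:
    """Hypothetical in-memory form of the simplified "gene" data model."""
    gene_name: Optional[str] = None
    gene_accession: Optional[str] = None
    protein_accession: Optional[str] = None
    dna_sequence: Optional[str] = None
    rna_sequence: Optional[str] = None
    protein_sequence: Optional[str] = None


def missing_fields(record: GeneModel) -> List[str]:
    """Return the names of the model fields that have no value yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# A record filled from a DNA-only source still lacks the other fields.
partial = GeneModel(gene_accession="GENE0001", dna_sequence="ATGGCCTAA")
still_missing = missing_fields(partial)
```

Knowing which model fields are still empty is what lets an aggregator decide which ones must come from another group or be inferred.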

Then, the "Gene" external entity could have 4 groups of storage clients: a first group called "dna_fasta", a second called "rna_fasta", a third called "protein_fasta" and a last one called "gene_nexus".
The Data Model aggregator UI would allow mapping the raw data provided by each of those 4 groups to the data model (using JSON Path). The UI would also allow selecting which groups are "input" groups and which are "output" groups, and ordering them.
For instance, suppose we set the "dna_fasta" group as an input group and the others as output groups (rna_fasta being first and gene_nexus last). Viewing a "Gene" would then only use what is available from the "dna_fasta" group. However, since the RNA sequence and the protein sequence can be inferred from a DNA sequence (using custom data processors), the dna_fasta group alone would be enough to fill all the sequence fields of the model. A save operation would then write the data to the other groups and fill them.
Now, if we did not have any FASTA sequence but a gene nexus file instead, we could use the gene_nexus group as input and the others as output. This example shows that the Data Model aggregator can be used to generate missing data or to convert data into different formats, depending on what is available.
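
The import/export flow described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the module's code: the group mappings, the raw record layout (`header.id`, `sequence`) and all function names are hypothetical, a tiny dotted-path resolver stands in for JSON Path, and the DNA is assumed to be the coding strand so that transcription is a simple T to U substitution.

```python
from typing import Any, Dict, Optional

# Standard genetic code, built from the usual base order U, C, A, G.
_BASES = "UCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO))


def resolve(raw: Any, path: str) -> Optional[Any]:
    """Follow a dotted path such as "header.id" through nested dicts."""
    current = raw
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def to_model(raw: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Import: read a group's raw record into data model fields."""
    model = {}
    for field, path in mapping.items():
        value = resolve(raw, path)
        if value is not None:
            model[field] = value
    return model


def from_model(model: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Export: build a group's raw record from data model fields."""
    raw: Dict[str, Any] = {}
    for field, path in mapping.items():
        if field not in model:
            continue
        *parents, leaf = path.split(".")
        target = raw
        for key in parents:
            target = target.setdefault(key, {})
        target[leaf] = model[field]
    return raw


def transcribe(dna: str) -> str:
    """DNA (coding strand assumed) to RNA."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """RNA to protein, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE.get(rna[i:i + 3], "X")  # "X" = unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def infer_missing(model: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for custom data processors that fill missing model fields."""
    filled = dict(model)
    if "rna_sequence" not in filled and "dna_sequence" in filled:
        filled["rna_sequence"] = transcribe(filled["dna_sequence"])
    if "protein_sequence" not in filled and "rna_sequence" in filled:
        filled["protein_sequence"] = translate(filled["rna_sequence"])
    return filled


# Hypothetical per-group mappings: data model field -> path in the raw record.
MAPPINGS = {
    "dna_fasta": {"gene_accession": "header.id", "dna_sequence": "sequence"},
    "rna_fasta": {"gene_accession": "header.id", "rna_sequence": "sequence"},
    "protein_fasta": {"gene_accession": "header.id", "protein_sequence": "sequence"},
}

# dna_fasta is the input group; the two others are output groups.
raw_dna = {"header": {"id": "GENE0001"}, "sequence": "ATGGCCTAA"}
model = infer_missing(to_model(raw_dna, MAPPINGS["dna_fasta"]))
exports = {group: from_model(model, MAPPINGS[group])
           for group in ("rna_fasta", "protein_fasta")}
```

Because every group is mapped to the same model rather than to the other groups, adding a group only requires one new mapping, not one per existing group.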
Now, from the Drupal entity perspective, we can have Drupal fields corresponding to the data model fields as well as extra fields. For instance, we could have the following fields:

- field_gene_name
- field_gene_accession
- field_dna_sequence
- field_dna_sequence_length

Those fields will not be mapped to the raw data of groups/storage clients but to fields of the data model instead. Therefore, changing which group is used as input or output, or changing their order, would have no impact on the mapping (only on the data displayed). The Data Model aggregator can thus be used to decouple raw data mapping from Drupal field mapping. One advantage is that if a raw source field name changes, the Drupal field mapping does not need to be updated: only the data model field mapping for the given group does. In our example, field_dna_sequence and field_dna_sequence_length are both mapped to the same data model field "dna_sequence", and their mapping won't change if the raw source field used for dna_sequence changes.
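
The decoupling described above can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the raw key names (`name`, `id`, `sequence`, `seq`), the `FIELD_MAPPING` layout and the helper functions are hypothetical, and the length is computed with a plain `len` as a stand-in for a data processor.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Drupal field -> (data model field, optional processor). Hypothetical layout.
FIELD_MAPPING: Dict[str, Tuple[str, Optional[Callable[[Any], Any]]]] = {
    "field_gene_name": ("gene_name", None),
    "field_gene_accession": ("gene_accession", None),
    "field_dna_sequence": ("dna_sequence", None),
    "field_dna_sequence_length": ("dna_sequence", len),
}


def group_to_model(raw: Dict[str, Any], group_mapping: Dict[str, str]) -> Dict[str, Any]:
    """Map a group's raw record to data model fields (flat keys only here)."""
    return {model_field: raw[key]
            for model_field, key in group_mapping.items() if key in raw}


def drupal_values(model: Dict[str, Any]) -> Dict[str, Any]:
    """Map data model fields to Drupal field values."""
    values = {}
    for drupal_field, (model_field, processor) in FIELD_MAPPING.items():
        if model_field not in model:
            continue
        value = model[model_field]
        values[drupal_field] = processor(value) if processor else value
    return values


# The raw source renames "sequence" to "seq": only the group mapping changes.
old_mapping = {"gene_name": "name", "gene_accession": "id", "dna_sequence": "sequence"}
new_mapping = {"gene_name": "name", "gene_accession": "id", "dna_sequence": "seq"}
old_raw = {"name": "demo", "id": "GENE0001", "sequence": "ATGGCCTAA"}
new_raw = {"name": "demo", "id": "GENE0001", "seq": "ATGGCCTAA"}

before = drupal_values(group_to_model(old_raw, old_mapping))
after = drupal_values(group_to_model(new_raw, new_mapping))
```

`FIELD_MAPPING` is untouched by the rename, and `before` and `after` hold the same Drupal field values.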

Features

The module will be able to manage any number of groups of storage clients. The user will be able to select which groups are enabled as input, enabled as output, or disabled.
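
One possible way to picture this per-group setting is an ordered list of groups, each with a role. The sketch below is in Python and purely illustrative: the `GroupRole` enum, the list layout and the `groups_with_role` helper are hypothetical and do not reflect the module's actual configuration format.

```python
from enum import Enum
from typing import List, Tuple


class GroupRole(Enum):
    INPUT = "input"
    OUTPUT = "output"
    DISABLED = "disabled"


# List order stands for the order chosen by the user.
GROUPS: List[Tuple[str, GroupRole]] = [
    ("dna_fasta", GroupRole.INPUT),
    ("rna_fasta", GroupRole.OUTPUT),
    ("protein_fasta", GroupRole.OUTPUT),
    ("gene_nexus", GroupRole.DISABLED),
]


def groups_with_role(groups: List[Tuple[str, GroupRole]], role: GroupRole) -> List[str]:
    """Return the names of the groups having the given role, in configured order."""
    return [name for name, group_role in groups if group_role is role]


inputs = groups_with_role(GROUPS, GroupRole.INPUT)
outputs = groups_with_role(GROUPS, GroupRole.OUTPUT)
```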

Version	Type	Release date
1.0.0-alpha1	Pre-release	Apr 16, 2025
1.0.x-dev	Dev	Feb 24, 2025

External Entities Data Model Aggregator
