Imagine this scenario: you are a new Anaplan customer looking to start building models in your Anaplan Ecosystem. You want to plan and report on transactional data. What should you focus on first when laying the foundation of your Anaplan Ecosystem?

Solution: Create a Data Hub! The Data Hub will serve as your data's entry point into the Anaplan Ecosystem. Here all of your data and metadata will be stored for use across the Ecosystem. All future spoke models will point to the Data Hub for data loads. Having a Data Hub ensures consistency of data across every model and serves as a checkpoint for what comes in and goes out.

The Data Hub is where you connect with external systems and manage the scheduling of data integrations from them. Any data transformations or taxonomy management will occur here, so that your flat lists, hierarchies, and system modules pull from a single source of truth. Let us look at how to optimize transactional data loads with the power of Data Hubs.

Data Source Considerations: Where is your data coming from? How will it reach Anaplan? Who manages this data? Does your data need any text string math (such as concatenation, FINDITEM lookups, or parsing text strings)? These are important questions to examine for transactional data.

Data Format Considerations: When loading data into your Data Hub, it is important to minimize the number of text-formatted line items. Text-formatted line items that take part in calculations reduce performance. The underlying technical reason is that text-formatted cells generally take up more memory in the backend and thus require more processing power. There are valid uses for text-formatted line items; however, you should aim to minimize them for the sake of model performance. See the visual below for a breakdown of memory usage across line item formats.

Caption: size of each line item format, with 26 million cells in a module. Credit to David Smith for the data on different line item format sizes. See link for breakdown in module setup.

Here are a few guiding principles you should keep in mind:

  1. Choose your data integration tool wisely, based on what makes the most sense for your data. If your data comes in clean from your source system and requires minimal transformation, consider Anaplan Connect with a scheduler. If your data requires heavy cleanup and transformation, consider a more advanced ETL tool. Doing as much data cleanup, transformation, and string math as possible ahead of time will make your Ecosystem that much more efficient.
  2. Deploy a data governance team. This team governs what data goes in and out of Anaplan.
  3. If text string math is absolutely necessary in Anaplan, perform it in the Data Hub.
  4. Keep the Data Hub in a separate workspace from your spoke models. This will help separate resources between the data-heavy Data Hub and your spoke models.
  5. Create saved views in the Data Hub and import only the data you need into spoke models. Make sure your unique IDs, hierarchies, and flat lists have been prepared in a data staging module within the Data Hub.
  6. Minimize the use of text-formatted line items in spoke models. List-formatted line items take up less space and perform better. Consider creating lists first in spoke models, then using those lists to drive module dimensionality and line item formats.
  7. Use System modules. System modules are the "glue" that holds everything together. System modules are an effective way to store attributes and metadata of a dimension.
  8. Follow DISCO (Data, Inputs, System, Calculations, Outputs) and apply naming conventions consistently across lists and modules. Use prefixes and suffixes when labeling lists and modules.
  9. Minimize sparsity by thinking critically about how you dimensionalize line items. Which dimensions does each line item truly require? Where is the best place to reference them? Keep PLANS (Performance, Logical, Auditable, Necessary, Sustainable) in mind when building models.
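
To make principles 1 and 5 concrete, here is a minimal Python sketch of doing the string math and column trimming before a file ever reaches the Data Hub. It is an illustration under assumptions, not a prescribed integration: the column names (gl_code, doc_no, period, amount), the "CC100-6000" code layout, and the function names are all hypothetical, and the step that actually uploads the file (Anaplan Connect, an ETL tool, or anything else) is deliberately left out.

```python
import csv
import io

# Hypothetical output columns; substitute the ones your spoke models need.
KEEP = ["txn_id", "cost_center", "account", "period", "amount"]

def prepare_rows(raw_rows):
    """Clean rows and do the string math upstream of the Data Hub."""
    for row in raw_rows:
        # Trim stray whitespace that could otherwise create near-duplicate items.
        row = {k: (v or "").strip() for k, v in row.items()}
        # Parse a combined code such as "CC100-6000" into its two parts.
        cost_center, _, account = row["gl_code"].partition("-")
        # Concatenate the unique ID here instead of in a text line item.
        txn_id = f'{cost_center}_{account}_{row["period"]}_{row["doc_no"]}'
        # Emit only the columns the models need; everything else is dropped.
        yield {
            "txn_id": txn_id,
            "cost_center": cost_center,
            "account": account,
            "period": row["period"],
            "amount": f'{float(row["amount"]):.2f}',
        }

def to_csv(rows):
    """Serialize the prepared rows as CSV text ready for an import."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=KEEP, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

With the unique ID and the parsed codes arriving as ready-made columns, the Data Hub can load them straight into lists and list-formatted line items rather than rebuilding them with text formulas.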
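
The sparsity point in principle 9 comes down to simple arithmetic: the number of cells in a line item is the product of the sizes of the dimensions applied to it. The short Python sketch below shows that calculation. The list names and sizes are made up for illustration, and cell_count is a hypothetical helper, not an Anaplan function.

```python
from math import prod

def cell_count(dimension_sizes):
    """Cells in one line item = product of the sizes of its dimensions."""
    return prod(dimension_sizes.values())

# Hypothetical list sizes, for illustration only.
full = {"Products": 5_000, "Customers": 2_000, "Months": 24}
trimmed = {"Products": 5_000, "Months": 24}

print(cell_count(full))     # 240000000
print(cell_count(trimmed))  # 120000
```

In this made-up example, a line item that does not truly vary by customer shrinks by a factor of 2,000 once that dimension is removed, which is why it pays to question every dimension on every line item.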