Migrating content from one system to another, whether the target is AEM or another content management system (CMS), is challenging for many reasons. Just deciding what is or is not relevant is tricky and time-consuming. But with the right processes in place, including saving your legacy data, migration can be a smooth transition.
Categories of Data
When migrating data, it’s typical to see anywhere from 100-400 different fields, each with its own significance to the new system. All of this information, regardless of which XML is used, can be broken down into two categories:
- Author generated: content that is added or edited by the authors within a CMS
- System generated: content that is created by the CMS
Author Generated Content
During migration, all author generated content needs to be put into the new system because it makes up the majority of the page content on a site. Some author generated content might need to go through a cleanup and/or transformation process before it is migrated to AEM. All of this content will need to be made accessible to authors, either as part of the page properties or as part of a template that was also migrated over.
Some examples of author generated content include:
- Article date
- Text area and any other relevant content
System Generated Content
System generated content is content added by the system, either as a result of an action taken by an author or due to the nature of the system in place.
Some examples of system generated content as a result of an author’s action are:
- Template value, based on the template selected by the author
- SystemPublishedDate, based on when the author published a specific page
- Serverurl and Pageurl, based on where the page was authored
Some examples of system generated content that is created without an author’s actions include:
- Metainfo, which is just an amalgamation of various content within a page and has no significance within the target system
- Level, which identifies which level the page is on
- Encrypted values within a field that have no significance in the target system
Always migrate fields from the legacy system into the JCR so that they are available if needed for any future improvements or reporting. Ensure all legacy reference fields start with a standard prefix such as “old” or “legacy” (for example, oldPageUrl or oldTemplate). The screenshot of a sample JCR node shows all system generated values with a prefix of “old”.
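As a rough illustration of the prefixing convention, here is a minimal Python sketch. It assumes the legacy page has already been parsed into a flat dictionary. The set of system generated field names is taken from the examples above, while the function names, the sample record, and its values are hypothetical. It is not an AEM or JCR API, only a way to show the renaming step before properties are written to the new page node.

```python
# Hypothetical list of legacy fields treated as system generated,
# based on the examples in this article.
SYSTEM_GENERATED = {
    "Template", "SystemPublishedDate", "Serverurl",
    "Pageurl", "Metainfo", "Level",
}
LEGACY_PREFIX = "old"  # assumed standard prefix; "legacy" works the same way


def legacy_name(field: str, prefix: str = LEGACY_PREFIX) -> str:
    """Return the prefixed property name, e.g. Template -> oldTemplate."""
    return prefix + field[:1].upper() + field[1:]


def to_page_properties(record: dict) -> dict:
    """Keep author generated fields as-is and prefix system generated ones."""
    properties = {}
    for field, value in record.items():
        if field in SYSTEM_GENERATED:
            properties[legacy_name(field)] = value
        else:
            properties[field] = value
    return properties


# Sample record with made-up values, for illustration only.
record = {
    "articleDate": "2016-03-01",
    "text": "<p>Hello</p>",
    "Template": "article",
    "Pageurl": "/news/hello",
}
migrated = to_page_properties(record)
```

In this sketch only the first letter after the prefix is capitalized, so a legacy field named Pageurl becomes oldPageurl. If you prefer a name such as oldPageUrl, an explicit mapping table from legacy names to target names is probably the simpler choice.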
During migration, clearly identify these as non-authorable fields, meaning they should not be available to authors in any page content or page properties. This prevents authors from accidentally modifying them.
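One way to guard against exposing these fields is a simple check run over the property names that each authoring dialog edits. The Python sketch below assumes you can extract those names as a plain list. The function names and the sample list are hypothetical, and the prefixes follow the “old” and “legacy” convention described above.

```python
LEGACY_PREFIXES = ("old", "legacy")  # assumed standard prefixes


def is_legacy_property(name: str) -> bool:
    """True for names such as oldPageUrl or legacyTemplate.

    The prefix must be followed by an uppercase letter, so ordinary
    names like "older" are not flagged.
    """
    for prefix in LEGACY_PREFIXES:
        rest = name[len(prefix):]
        if name.startswith(prefix) and rest[:1].isupper():
            return True
    return False


def exposed_legacy_fields(dialog_fields: list) -> list:
    """Return legacy properties that a dialog would wrongly let authors edit."""
    return [name for name in dialog_fields if is_legacy_property(name)]


# Hypothetical property names taken from a page properties dialog.
dialog_fields = ["title", "articleDate", "oldTemplate", "older"]
leaks = exposed_legacy_fields(dialog_fields)
```

Here `leaks` contains only `oldTemplate`, which signals that the dialog needs to be corrected before authors start using it.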
Storing legacy data is a solid contingency plan for helping to resolve bugs or issues that may arise after the migration process has been completed. You never know when or why you might need to reference old fields or data on the new site, and having the information organized and clearly labeled makes the process much easier.