
Migration is nothing but ETL: Extract, Transform, and Load

The Importance of Content Migration

When companies move from a legacy Content Management System (CMS) to a robust CMS like Adobe Experience Manager (AEM), they face many decisions: which modules are needed, which version of AEM to use, which integrations will boost performance, and what custom development is necessary.
One area that is often ignored is the content migration strategy, which provides a roadmap for transferring existing content into AEM as effectively and efficiently as possible. A successful migration lets companies increase their productivity by giving them better access to, and control of, their existing content.

Content is Not Just Pages

When discussing content migration, most of us think “content = pages”, but content can be anything:

    • Pages
    • Documents and media (PDFs, Excel files, videos)
    • Images
    • HTML code
    • Custom code
    • Any other assets available on your site

Each content type will require a different approach during the migration process, based on the nature of the content and how clean it is. These approaches can include:

    • Automated scripts for structured, high-volume content
    • Manual content scraping and authoring for unstructured content that requires creative judgment from authors
    • A combination of the two: automation to get content into the system, followed by manual cleanup to meet specific business needs such as ADA compliance
    • Manual migration for low-volume content

Legacy Systems and Clean Content

After working on many custom .NET website and SharePoint migrations, we have found an alarming trend: redundant data that exists only because of the limitations and requirements of legacy systems.

For example, while working with one customer, we received a list of 99,747 images to be migrated. After careful review, our migration expert found that each image file had at least 5 renditions, a requirement left over from the legacy system. In reality, only 19,720 images needed to be migrated.
The takeaway is that analyzing content before migration is critical to avoid moving unnecessary or junk content into AEM. Skipping this step can result in a poorly performing system and a longer project timeline that costs more time and resources than necessary.
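
One way to start this kind of analysis is to group files that differ only by a rendition suffix. The sketch below is illustrative only: the suffix pattern (`_thumb`, `-large`, `_320x240`) and the sample paths are assumptions, not the naming used by the customer's system, and a real legacy platform will need its own pattern.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical rendition suffixes; replace with the legacy system's real pattern.
RENDITION_SUFFIX = re.compile(r"[_-](thumb|small|medium|large|\d+x\d+)$", re.IGNORECASE)


def group_renditions(paths):
    """Group image paths under the path of their suffix-free original."""
    groups = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        base_stem = RENDITION_SUFFIX.sub("", path.stem)
        key = str(path.with_name(base_stem + path.suffix.lower()))
        groups[key].append(raw)
    return dict(groups)


# Made-up inventory for illustration.
inventory = [
    "/images/logo.jpg",
    "/images/logo_thumb.jpg",
    "/images/logo_320x240.jpg",
    "/images/banner.png",
    "/images/banner-large.png",
]
originals = group_renditions(inventory)
print(len(inventory), "files listed,", len(originals), "originals to migrate")
```

A report like this is only a starting point; someone who knows the content still has to confirm which renditions AEM can regenerate on its own and which are genuinely distinct images.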

ETL: Extract, Transform, and Load


ETL (Extract, Transform, and Load) is a data warehousing process that pulls data out of a source system, transforms it to fit the destination, and loads it into a data warehouse. The same concept can be applied to content migration to make automated, script-based migration more efficient.

It all starts with analyzing the source, which is usually an XML export from the current CMS. For a successful migration strategy, it is critical to first create a Source Target Mapping document that identifies three things: the source, the transformation rules, and the target.

Source

The source is the full set of data fields from the current system that are candidates for migration. Each data field in the XML is listed with an example; having both the field name and a corresponding sample value helps determine whether the field is required for migration.
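
A field inventory like this can be generated rather than typed by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the export structure and the element names (`page`, `title`, `url`, `articleDate`, `legacyTemplateId`) are invented for illustration, and a real CMS export will look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical export; a real legacy CMS will use its own element names.
SAMPLE_EXPORT = """
<export>
  <page>
    <title>Quarterly Results</title>
    <url>/interest/topic/news/topicnews/quarterly-results-20150612.aspx</url>
    <articleDate>2015-06-12</articleDate>
    <legacyTemplateId>17</legacyTemplateId>
  </page>
  <page>
    <title>New Office</title>
    <url>/interest/topic/news/topicnews/new-office-20150701.aspx</url>
    <articleDate>2015-07-01</articleDate>
  </page>
</export>
"""


def inventory_fields(xml_text, record_tag="page"):
    """Return {field name: first non-empty sample value} across all records."""
    root = ET.fromstring(xml_text.strip())
    samples = {}
    for record in root.iter(record_tag):
        for field in record:
            value = (field.text or "").strip()
            if value and field.tag not in samples:
                samples[field.tag] = value
    return samples


for name, sample in inventory_fields(SAMPLE_EXPORT).items():
    print(f"{name}: {sample}")
```

Each name and sample pair becomes one row of the mapping document, where the team can mark it as required or not.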

Transformation Rules

Transformation rules are the edits that need to be applied to content, such as:

    • Cleaning up URLs (removing unnecessary pages or default.aspx)
    • Converting special characters in URLs to follow AEM best practices (ex: replacing characters such as %20 or $ with a simple hyphen)
    • Moving content pages within the content tree structure based on article date (ex: /interest/topic/news/topicnews/article-title-20150612.aspx with an article date of 2015-06-12 is transformed into /content/news/2015/jun/article-title-20150612.html )
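
The three rules above can be sketched as a single function. This is an illustration under stated assumptions, not the script used on any project: the function names, the `/content/news` target root, and the choice to lowercase page names are ours.

```python
import re
from datetime import date
from urllib.parse import unquote

# Fixed abbreviations so the result does not depend on the machine's locale.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def clean_segment(segment):
    """Decode percent-escapes, lowercase, and turn other characters into hyphens."""
    decoded = unquote(segment).lower()
    return re.sub(r"[^a-z0-9]+", "-", decoded).strip("-")


def transform_url(source_url, article_date, target_root="/content/news"):
    """Map a legacy article URL to a date-based target path (hypothetical layout)."""
    path = source_url.split("?", 1)[0]
    # Rule 1: drop a trailing default.aspx, then any remaining .aspx extension.
    path = re.sub(r"/default\.aspx$", "", path, flags=re.IGNORECASE)
    path = re.sub(r"\.aspx$", "", path, flags=re.IGNORECASE)
    # Rule 2: normalise special characters in the page name.
    name = clean_segment(path.rstrip("/").rsplit("/", 1)[-1])
    # Rule 3: file the page under year/month taken from the article date.
    month = MONTHS[article_date.month - 1]
    return f"{target_root}/{article_date.year}/{month}/{name}.html"


print(transform_url(
    "/interest/topic/news/topicnews/article-title-20150612.aspx",
    date(2015, 6, 12),
))
print(clean_segment("My%20Article$Title"))
```

Keeping each rule as a separate, commented step makes it easier to check the script against the mapping document line by line.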

Target

The target identifies where the content is stored within AEM at both the component and JCR level.
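
Taken together, source, rule, and target form one row of the Source Target Mapping document. The sketch below shows one possible shape for such a row. The field names, component labels, and the `articleDate` property are hypothetical; `jcr:content/jcr:title` is the standard location of a page title in AEM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRow:
    source_field: str      # field name in the legacy export
    sample_value: str      # example value taken from the export
    rule: str              # transformation rule, in plain language
    target_component: str  # AEM component that will hold the value (hypothetical)
    target_property: str   # JCR property path relative to the page node
    migrate: bool = True   # False for fields left behind


# Illustrative rows only; a real mapping comes out of the content analysis.
MAPPING = [
    MappingRow("title", "Quarterly Results", "copy as-is",
               "page properties", "jcr:content/jcr:title"),
    MappingRow("articleDate", "2015-06-12", "parse as ISO date",
               "page properties", "jcr:content/articleDate"),
    MappingRow("legacyTemplateId", "17", "not needed in AEM",
               "", "", migrate=False),
]


def fields_to_migrate(rows):
    """Return the source fields that the migration script should load."""
    return [row.source_field for row in rows if row.migrate]


print(fields_to_migrate(MAPPING))
```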

Investing time and resources in migration strategy and analysis upfront will help you avoid polluting your AEM system with unnecessary content, which in turn improves overall system performance. Highly qualified migration consultants, like ours, bring a deliberate planning approach that helps the team set appropriate expectations for deliverables. Our consultants also improve execution by establishing roles and responsibilities in the services process. Their involvement from the outset can reduce much of the frustration that companies experience when trying to overcome the inherent complexities of the data migration process.

Divya Kandikatti
Divya is the Director of PMO/BSA here at Blue Acorn iCi. She works closely with clients throughout their digital transformation journey to align business goals and strategic vision, ensuring projects are delivered with the highest ROI. She has over 10 years of experience managing and delivering complex projects for enterprise clients in Finance, Publishing & Media, Banking, and Hospitality. Divya is trusted by clients like ALSAC/St. Jude Children's Research Hospital, AICPA, Domtar, Ingersoll Rand, Panera Bread, and Capital One. She is very competitive and learns on the fly. On any given day, you will find her kick-boxing, enjoying acrylic painting, or trying her hand at stand-up comedy.