Migrations are critical to the overall success of any Content Services implementation and should be treated with the same detailed planning as the other components of the project. Migrations can be compared to laying the foundation of a house. People don’t often pay attention to this work stream, but laying a solid foundation is critical to the success of the entire project. Migrations into content services platforms such as Alfresco, Nuxeo, Documentum or OpenText typically require moving millions or billions of objects and metadata. If the foundation of seeding the new application with content and metadata is not done correctly, the end result will be a failed implementation.
We previously published 10 Migration Tips for Success on this blog. This post focuses on the migration software itself, listing the top 11 features that should exist in your migration tool to ensure a successful migration.
1. Multiple Source Adapters
Source adapters allow migration software to connect to different source repositories (File System, FileNet P8, Mobius, etc.) and migrate content and data out of them. When the migration software can connect to different source systems to extract content, as well as connect directly to your target ECM repository, the migration can be performed in one step, versus using one software tool to extract data and content and a second software tool to import content (two steps).
Having a tool that can support the extraction, transformation and load activities all in one step greatly simplifies any migration and reduces errors. Selecting a migration tool that supports many different source systems will provide more flexibility to perform one-step migrations now and in the future. While future migrations are not always known, we guarantee they will occur much more frequently than planned due to consolidation and acquisitions. For more information on the importance of a one-step migration, see Tip 9 in our Top 10 ECM Migration Tips for Success.
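To illustrate the idea, here is a minimal Python sketch of the adapter pattern a one-step migration tool might use internally. The class and function names (`SourceAdapter`, `FileSystemAdapter`, `migrate`) are hypothetical, not the API of any particular product:

```python
from abc import ABC, abstractmethod
from typing import Iterator

class SourceAdapter(ABC):
    """Common interface that each source repository adapter implements."""
    @abstractmethod
    def documents(self) -> Iterator[dict]:
        """Yield documents as {'content': bytes, 'metadata': dict}."""

class FileSystemAdapter(SourceAdapter):
    """Stand-in for a real adapter that would walk a directory tree."""
    def __init__(self, docs):
        self._docs = docs
    def documents(self):
        yield from self._docs

def migrate(source: SourceAdapter, write_to_target) -> int:
    """One-step migration: read from any adapter, write to the target."""
    count = 0
    for doc in source.documents():
        write_to_target(doc)
        count += 1
    return count
```

Because every adapter exposes the same `documents()` interface, the extract-transform-load pipeline does not change when the source system does; only the adapter is swapped.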
2. Batching Flexibility
Staying organized and batching the migration of millions or billions of documents into manageable sets is critical for large migrations. It can be unwieldy to try to migrate all documents as one big batch. Most clients find it easiest to batch content into smaller pieces based on metadata. Batches are most frequently created based on document type, creation date, folder path or a combination thereof.
Some tools have limits on how content can be batched (only by folder structure, only by one property, etc.). Having the flexibility to batch content based on one or more property fields, including folder structure, will provide the flexibility needed for migrating content in batches. In fact, it may be a non-negotiable requirement if a gradual migration or phased onboarding is planned.
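As a sketch of what batching on one or more property fields looks like, the following Python function groups documents by an arbitrary combination of metadata keys (the field names used are illustrative):

```python
from collections import defaultdict

def batch_documents(docs, batch_keys):
    """Group documents into batches keyed by one or more metadata fields."""
    batches = defaultdict(list)
    for doc in docs:
        key = tuple(doc.get(k, '') for k in batch_keys)
        batches[key].append(doc)
    return dict(batches)
```

The same document set can then be re-batched by document type alone, by creation year alone, or by both together, simply by changing `batch_keys`.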
3. Transformation
Typically metadata needs to be cleaned or enhanced as part of the migration. Clean data ensures a controlled and successful start for the new application. Often, data cleansing can be a significant side benefit of moving to a new application. Any chosen migration tool must have the ability to:
- Map metadata to new fields.
- Transform metadata into one or more new fields. For example, split the user’s name into a First Name and a Last Name field.
- Add new metadata from different sources, including database tables, XML, Excel/CSV, or other source systems.
4. Rendition and Version Support
Any content services migration tool must support the basic tenets of content management, including the ability to migrate multiple renditions and versions. In addition to migrating multiple renditions, a very common requirement is the ability to generate a PDF rendition from another file format and also the ability to combine individual TIFF files into a single multi-page PDF document.
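The TIFF-to-PDF case usually starts with assembling single-page image files into ordered, per-document page lists before handing them to a PDF writer (such as Pillow or img2pdf). The sketch below assumes a hypothetical `docname_pNNN.tif` naming convention; real sources often encode page order in metadata instead:

```python
import re

def group_tiff_pages(filenames):
    """Group single-page TIFF files like 'doc42_p003.tif' into ordered
    page lists per document, ready to hand to a PDF writer."""
    pattern = re.compile(r'(?P<doc>.+)_p(?P<page>\d+)\.tiff?$')
    groups = {}
    for name in filenames:
        m = pattern.match(name)
        if m:
            groups.setdefault(m['doc'], []).append((int(m['page']), name))
    # Sort each document's pages numerically, then drop the sort key
    return {doc: [n for _, n in sorted(pages)]
            for doc, pages in groups.items()}
```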
5. Create Folder Structures on the Fly
Content is typically reorganized when migrated into a new content services platform. As a result, the migration tool needs to support creating the folder structure lazily as content is migrated in. While folder structures can be manually created prior to migration, doing so is tedious and error prone. Placing this job in the hands of the migration tool ensures that only the necessary folder paths are created.
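A minimal sketch of lazy folder creation follows. `create_folder` stands in for whatever call the target repository exposes; the point is that each path segment is created exactly once, and only when a document actually needs it:

```python
def ensure_folder(path, existing, create_folder):
    """Create each missing segment of a folder path exactly once."""
    parts = path.strip('/').split('/')
    current = ''
    for part in parts:
        current = f'{current}/{part}'
        if current not in existing:
            create_folder(current)
            existing.add(current)
```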
6. Document-Level Security Support
A robust Content Services migration tool should set the security appropriately for each document that is migrated into the application. Security is typically set based on one or more metadata fields (e.g. Document Type and Document Status), or it is inherited from the folder in which the document is placed. If document versions require different security models from one version to another, the tool should be evaluated to confirm it can support this advanced security requirement.
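Metadata-driven security can be thought of as an ordered rule table with folder inheritance as the fallback. A small sketch, with hypothetical field and ACL names:

```python
def resolve_acl(metadata, rules, folder_acl):
    """Pick an ACL from ordered metadata-based rules, falling back to
    the parent folder's ACL when no rule matches."""
    for condition, acl in rules:
        if all(metadata.get(k) == v for k, v in condition.items()):
            return acl
    return folder_acl
```

Because the rules are evaluated in order, more specific conditions (Document Type plus Document Status) can be listed before broader ones.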
7. Throughput Reporting
The migration execution step should not be a black box. Information such as the total number of document nodes, number migrated, average throughput broken down by time reading from source, time to transform and time to write to the target repository is important to allow the migration to be optimized.
Once the throughput data is analyzed and the bottlenecks are identified, it is common for optimization to reduce migration time by 30-50%. The architecture of the target repository may need to be changed (additional CPU, RAM, etc.), more or fewer threads may need to be utilized in the migration tool, or the approach to implementing more complex transformations may need to be modified to achieve maximum throughput.
These adjustments can only be made if the throughput data points are made available.
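Collecting those data points amounts to timing each phase of the pipeline separately. A simple Python sketch (the `read`/`transform`/`write` callables stand in for the tool's actual phases):

```python
import time

def migrate_with_timings(docs, read, transform, write):
    """Migrate while recording time spent in each phase, so the
    slowest phase (the bottleneck) can be identified."""
    timings = {'read': 0.0, 'transform': 0.0, 'write': 0.0, 'count': 0}
    for doc in docs:
        t0 = time.perf_counter()
        content = read(doc)
        t1 = time.perf_counter()
        result = transform(content)
        t2 = time.perf_counter()
        write(result)
        t3 = time.perf_counter()
        timings['read'] += t1 - t0
        timings['transform'] += t2 - t1
        timings['write'] += t3 - t2
        timings['count'] += 1
    return timings
```

If `write` dominates, the fix lies on the target side (hardware, indexing, thread count); if `transform` dominates, the transformation logic itself needs attention.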
8. Fault Tolerance
Some migration tools stop if a failure occurs at any point during the extract, transform and load process. The problem has to be fixed, detailed analysis needs to be performed to determine which documents migrated and which ones did not, and then the new batch needs to be generated and run again. This analysis can introduce hours or days of delay when each error occurs.
A key feature to have in your migration tool is the ability for a migration to continue should an error occur. For example, when migrating a batch of 100,000 documents, if one document fails due to corrupt data in the source system, the remaining 99,999 documents still migrate successfully; the file in error is bypassed and written to the error log.
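The continue-on-error behavior is essentially a per-document try/except that records failures instead of aborting. A minimal sketch (`migrate_one` stands in for the tool's actual per-document pipeline):

```python
def migrate_batch(docs, migrate_one):
    """Continue past per-document failures; collect errors for the log."""
    migrated, errors = [], []
    for doc in docs:
        try:
            migrate_one(doc)
            migrated.append(doc['id'])
        except Exception as exc:
            # Bypass the failed document and record it for the error log
            errors.append((doc['id'], str(exc)))
    return migrated, errors
```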
9. Detailed Audit Trail Reporting
Detailed audit logs are paramount to a smooth migration. Error logs are important because migrations will stop for a myriad of reasons (lost connection, database limitations, corrupt data, etc.). Being able to quickly understand where a migration stopped and what content and data need to still be migrated is very important to keeping a migration on time and on budget.
Success logs are equally important in proving that the migration occurred as expected and for traceability when users ask why a document was not migrated. For example, success logs will help explain why a document was not included in the migration (e.g. it did not have the appropriate document status).
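An audit trail that can answer those questions needs one machine-readable entry per document, recording its disposition and the reason. A sketch using JSON Lines (the field names here are illustrative, not any tool's actual log schema):

```python
import json
import datetime

def audit_entry(doc_id, disposition, reason=''):
    """One line of a machine-readable audit trail (JSON Lines)."""
    return json.dumps({
        'doc_id': doc_id,
        'disposition': disposition,   # 'migrated', 'skipped', or 'error'
        'reason': reason,
        'timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })
```

With one such line per document, "why wasn't document X migrated?" becomes a simple search rather than a forensic exercise.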
10. Delta Migration Support
Delta migrations allow the source repository to continue to be used while the bulk migration is being executed. With large migrations, it's risky and sometimes impossible to execute a full migration during an outage window. A migration tool that can run smaller delta migrations after the bulk migration is complete, synchronizing any content that was added or modified since the start of the bulk migration, adds significant value. This strategy allows migrations to occur over a longer period of time and allows more careful verification steps to be taken prior to going live.
Once everything has been verified, a final delta migration can be scheduled during an outage window at a time when users are ready to switch from using the old system to the new system.
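At its core, delta selection is a query against the source for anything modified on or after the bulk migration's start timestamp. A minimal sketch (this example compares ISO-8601 date strings, which sort chronologically; a real tool would query the source system's modification timestamps directly):

```python
def delta_candidates(source_docs, bulk_started_at):
    """Select documents added or modified since the bulk migration began.

    Dates are ISO-8601 strings ('YYYY-MM-DD'), so lexicographic
    comparison matches chronological order.
    """
    return [d for d in source_docs if d['modified'] >= bulk_started_at]
```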
11. Scalability
Migration software should be able to scale to support the volume of content being migrated. Migrating 30 million documents is a very different effort than migrating a billion documents. If your organization is migrating hundreds of millions or billions of documents, it is imperative that the migration tool be multi-threaded and/or support multiple instances concurrently reading and writing to the same repository.
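Because migration work is dominated by I/O (reading from the source, writing to the target), running many documents in flight at once is what makes the difference at scale. A minimal Python sketch of the multi-threaded pattern, with `migrate_one` standing in for the per-document pipeline:

```python
from concurrent.futures import ThreadPoolExecutor

def migrate_parallel(docs, migrate_one, workers=8):
    """Migrate documents concurrently so I/O-bound reads and writes
    overlap across threads; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(migrate_one, docs))
```

The right `workers` value is exactly the tuning knob the throughput reporting in section 7 informs: raise it until the source or target becomes the bottleneck.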
SUMMARY
Using a robust and proven migration tool that supports the requirements above will provide flexibility and also promote an efficient migration process so the transition to the modern content repository starts off on the right foot!
Please add any other features or functions you think we missed in the comments below or let Docuvela know on LinkedIn.