The ability to use data is fundamental to an organisation’s survival. Despite an awareness that the effective use of data underpins corporate success, many organisations are struggling to use data as a valuable resource to support their corporate strategies. We consistently see organisations still grappling with the basics, lacking a ‘fit for purpose’ data management platform with core coverage integrating customer, product and services reference data with transactional data.
Our experience has shown us that, whether they are in the middle of a data transformation programme or about to embark upon one, organisations need to take a more innovative approach to address these challenges: finding smarter and faster ways to solve problems.
Technology is advancing at pace, bringing an ever-increasing number of open-source tools and FinTech products to the market. The organisations that succeed in the future will be those that recognise, first, that they have internal challenges and limitations and, secondly, that building everything themselves is set to fail because no one can be an expert in everything. Businesses should focus on serving their customers using innovative products, not on trying to become technology pioneers. They need to be willing to partner with multiple companies that can provide the technology and expertise they require. An organisation that can respond quickly to challenges, optimising the technologies available in the market and avoiding the delays usually presented by the on-boarding process, is well on its way to becoming truly agile in its ways of working and thinking and in its delivery capability.
Given the challenges outlined above and the opportunities available in the market today, organisations should be asking themselves whether they have the right arsenal. They need to focus on being able to easily integrate and deploy new components, tools and utilities rather than building them. Replacing everything you have is not an option, so these capabilities need to function alongside your existing technology platforms (legacy or strategic) and act as accelerators for an organisation’s data journey. This approach allows real business problems to be solved in days and weeks, not months and years.
Let us reflect on four key questions that should help you understand the types of capability you should be employing to solve your problems and generate benefits quickly.
1. So how do you actually accelerate a data conversion and migration journey and generate a whole set of benefits in the process?
To address these hurdles, you should be considering data migration tools which come at the problem in a different way to traditional pure ETL tools. One approach we have seen uses a combination of both ETL and non-ETL tools in a two-stage reverse and forward engineering approach. It accelerates the data migration, management and integration delivery for both structured and unstructured data and is interoperable with many existing applications.
Taking each stage in turn, the first involves reverse engineering, reconciling and cleansing your legacy ETL and business transformation logic using AI/ML and natural language processing (NLP) capability. AI/ML accelerates this process by identifying the outliers that exist within the metadata and business transformation logic, immediately focusing your efforts on the areas that need attention. The second stage involves forward engineering the deployment of your cleansed business logic and metadata into your target platform. This process is similarly accelerated by AI/ML toolkits, along with data quality automation tools which compare the quality of data in the target platform with that in the legacy estate.
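To make the idea concrete, below is a minimal sketch of the kind of outlier detection an AI/ML toolkit might apply to legacy transformation metadata. The feature set, the hypothetical rule records and the use of scikit-learn’s IsolationForest are illustrative assumptions rather than a description of any specific product.

```python
# Illustrative sketch only: flag unusual legacy transformation rules for review.
# The features, thresholds and choice of IsolationForest are assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical extract of legacy ETL transformation metadata.
rules = [
    {"rule_id": "R001", "sql_length": 120, "source_columns": 3, "nesting_depth": 1},
    {"rule_id": "R002", "sql_length": 135, "source_columns": 4, "nesting_depth": 1},
    {"rule_id": "R003", "sql_length": 4200, "source_columns": 27, "nesting_depth": 6},
    {"rule_id": "R004", "sql_length": 110, "source_columns": 2, "nesting_depth": 1},
]

features = np.array(
    [[r["sql_length"], r["source_columns"], r["nesting_depth"]] for r in rules]
)

# Score every rule; a small contamination value reflects the expectation that
# only a minority of rules need manual attention.
model = IsolationForest(contamination=0.25, random_state=0)
labels = model.fit_predict(features)  # -1 = outlier, 1 = inlier

for rule, label in zip(rules, labels):
    if label == -1:
        print(f"Review {rule['rule_id']}: unusually complex transformation logic")
```

In practice a toolkit would derive far richer features from the parsed ETL code and business glossary, but the principle is the same: score everything automatically and put only the anomalies in front of a human.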
We have seen that organisations able to apply this technique to their estates achieve migration from legacy to target platforms, with cleansed business logic and source-to-target metadata mapping, in significantly accelerated timeframes, in some cases reducing timelines by up to 70%. Also, because one of the outcomes of this process is a more transparent view of the data discovered and produced in the organisation, it usually leads to a more effective marriage between the producers and the consumers of data. This is good on several fronts. First, with these tools the producers can better understand what they are responsible for and can therefore take ownership of their data. Secondly, they can understand where, how and why the data they produce is being used by consumers for onward business services, creating a positive feedback loop. Thirdly, the role of data management becomes that much more straightforward in terms of how it operates between the consumer and producer, which in turn helps to promote higher quality, timely, accurate and complete data for the consumer. Finally, and perhaps the greatest benefit of all, this embeds a new culture of data ownership and new ways of working deeper into the organisation.
2. Is it possible to simplify the data mapping and transformation process?
Let us consider what usually happens when delivering a programme that centres around the journey of data through an organisation.
The business data analyst spends significant amounts of time working with the business subject matter experts (SMEs), getting up to speed with the domain and gathering and understanding their requirements. This involves understanding the source-to-target mappings and transformation logic of the data so that, when ultimately implemented, the data will be ready for consumption as information by business services. This information is then conveyed to the developers and testers with the specific technical skills to build and implement these requirements, usually further complicated by the use of proprietary software platforms. And so it goes on.
But you get the picture. There are many challenges with these more traditional ways of running projects. These include multiple hand-off points between the many roles working on the project. Business data owners also have to spend significant amounts of time conveying expert knowledge to less experienced or less knowledgeable business data analysts. The process is lengthy and often involves repetitive cycles between gathering requirements and delivering accurately on them, not to mention the associated project management and reporting overhead.
Consequently, all of these factors tend to roll up into high costs, lengthy delays to projects, poor quality deliveries and less than satisfactory outcomes. What if you could automate the role of the business data analyst and move the requirements gathering to the data producers? How much better would that be?
Well, the good news is that tools like this do exist in the market today. There are comprehensive, feature-rich, cloud-native tools that can streamline and automate data-loading and data-transformation tasks. They reduce the need for manual analysis and contain all of the necessary user interfaces and audit capabilities to underpin these processes. By their nature, many of them are DevOps-centric, resilient, scalable and audit-control aware. The frameworks that come with these tools usually leverage AI and ML to help automate key aspects of the business data analyst function by analysing multiple disparate data sets (structured and unstructured) and scoring them for alignment. NLP can also be applied at the field and underlying data level, if required, using similar matching and scoring techniques. Ultimately, all relationships can be catalogued with an inventory of aligned data to enable source-to-target mapping. As well as providing a powerful visual interface for ease of use, these tools also provide a fully documented and version-controlled source-to-target master for sign-off.
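As a simple illustration of that alignment scoring, the sketch below matches source field names to candidate target fields using basic string similarity. The field names, the normalisation and the 0.6 auto-map threshold are illustrative assumptions; a commercial tool would apply richer NLP models and profile the underlying data values as well.

```python
# Minimal sketch of alignment scoring between source and target field names,
# using standard-library string similarity in place of a tool's NLP models.
from difflib import SequenceMatcher

source_fields = ["cust_nm", "cust_dob", "acct_open_dt", "prod_cd"]
target_fields = ["customer_name", "customer_date_of_birth",
                 "account_open_date", "product_code"]

def score(a: str, b: str) -> float:
    """Score similarity of two field names after light normalisation."""
    return SequenceMatcher(None, a.replace("_", ""), b.replace("_", "")).ratio()

# Propose the best-scoring target field for each source field.
for src in source_fields:
    best = max(target_fields, key=lambda tgt: score(src, tgt))
    confidence = score(src, best)
    status = "auto-map" if confidence >= 0.6 else "refer to data owner"
    print(f"{src:14} -> {best:24} score={confidence:.2f} ({status})")
```

The candidate mappings and their scores become the catalogued inventory referred to above, with only the low-confidence matches needing the data owner’s attention.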
The benefits of achieving the above are significant. You will have gone a long way towards automating a significant part of your business data analyst function but, more importantly, you will have empowered the business data owners to take charge of their own analysis and requirements gathering, as well as giving them the capability to build, map and deploy their metadata, data and transformation logic to the target platform without an extensive team of analysts, developers and testers. The business data owners will also be able to engage directly with the business consumers, acting more like product owners. Last, but by no means least, you will be able to reduce team sizes and the number of hand-off points, and genuinely move the delivery of data programmes to a DevOps, data product, agile delivery model.
In short, you will have simplified your data programme.
3. Can you really use AI/ML to automate the job of data quality?
As well as the challenges associated with mapping, loading and transforming data outlined in the previous question, many issues also arise from the quality of the data sourced into an organisation. Price breaks arising from corporate action splits and stale prices received from a provider are just two examples of problems that occur on a regular basis and result in data quality issues. How often have you been involved in an incident where significant time has been spent, over weeks and sometimes months, with people from data management, operations, business and technology, trying to understand why you have been operating on stale data for so long, let alone how to resolve it and then figure out the impact on the business and customer? Sound familiar? What if you could automate much of this process and trap the data quality issues before they have had time to infiltrate the data platforms and applications and impact consumers in the business? That would be a great result.
Again, there are advanced pattern-matching utilities that can significantly improve the accessibility, consistency and quality of data across legacy systems and multiple data sources. These proven tools leverage metadata and AI/ML-driven scoring techniques. The components include utilities for full instrumentation, lineage, cataloguing and dashboards, along with automated prediction, prevention and fixing of data errors.
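By way of illustration, the sketch below shows two of the simplest checks such a utility might automate on ingestion: a stale price (no movement for several days) and a price break (a day-on-day jump beyond a tolerance, such as an unapplied corporate action split). The thresholds, data and names are assumptions made for the example, not a specific product’s logic.

```python
# A deliberately simple sketch of rule-based checks applied on ingestion:
# stale prices and price breaks. Thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class QualityFlag:
    instrument: str
    issue: str

def check_prices(instrument: str, prices: list[float],
                 stale_days: int = 5, break_tolerance: float = 0.25) -> list[QualityFlag]:
    flags = []
    # Stale price: the last `stale_days` observations are identical.
    if len(prices) >= stale_days and len(set(prices[-stale_days:])) == 1:
        flags.append(QualityFlag(instrument, f"stale price for {stale_days}+ days"))
    # Price break: a single-day move larger than the tolerance, e.g. an
    # unapplied corporate-action split.
    for prev, curr in zip(prices, prices[1:]):
        if prev and abs(curr - prev) / prev > break_tolerance:
            flags.append(QualityFlag(instrument, f"price break {prev} -> {curr}"))
    return flags

# Quarantine flagged data before it reaches downstream consumers.
for flag in check_prices("XYZ LN", [102.0, 101.5, 50.8, 50.8, 50.8, 50.8, 50.8]):
    print(f"{flag.instrument}: {flag.issue}")
```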
Achieving all this in a more proactive way, to continuously improve, recommend and even automate error correction on ingestion, will significantly reduce the amount of time spent fixing data issues, improve data accuracy and reduce the number of manual touch points.
4. How far can you go in automating the data governance and data estate management (some call it data dictionary) function?
Whether you are looking at the journey of data from source to target from an operational or a change delivery standpoint, the ability to instil and maintain an agile data governance framework across legacy and strategic estates is a significant challenge. Simply put, asking data producers to take ownership of their data, whether legacy or new, without the tools to show them what they have and how to manage it, rarely succeeds. It is often the cause of endless discussions about why data governance is so difficult to implement in practice.
We have explored a number of capabilities above which, if adopted, address many challenges for organisations. Because of what the tools that underpin these capabilities do, a great by-product is that they also automate and simplify data governance and data estate management. They include utilities such as intelligent source-to-target lineage, version-controlled transformation rules and a business glossary integrated with the metadata repository. They also provide visualisation capabilities and automated documentation production. In essence, you are building and managing your governance function as you go.
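A minimal sketch of how that by-product might look is given below: lineage records captured during mapping carry the transformation rule, its version and its owner, and documentation is generated directly from them. The record structure and output format are assumptions, not a particular product’s schema.

```python
# Illustrative sketch: lineage captured during mapping doubles as governance
# documentation. Field names and examples are assumptions.
from dataclasses import dataclass

@dataclass
class LineageRecord:
    source: str        # source system and field
    target: str        # target platform field
    rule: str          # business transformation logic
    rule_version: str  # version-controlled rule identifier
    owner: str         # accountable data producer

catalogue = [
    LineageRecord("legacy_crm.cust_nm", "party.customer_name",
                  "trim and title-case", "v1.2", "CRM Ops"),
    LineageRecord("pricing.px_close", "market.close_price",
                  "carry forward only if less than one day old", "v2.0", "Market Data"),
]

# Automated documentation: emit a glossary-style table as mappings are deployed.
print("| Source | Target | Rule (version) | Owner |")
print("|---|---|---|---|")
for rec in catalogue:
    print(f"| {rec.source} | {rec.target} | {rec.rule} ({rec.rule_version}) | {rec.owner} |")
```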
So now you have data agility with agile data governance. Crucially, it adds transparency to the data, which brings control of, and accountability for, the data to the data producers. It simplifies the role of data management and gives the business consumer confidence that they are walking on a stable bridge engineered for quality.
What has also been achieved is a cultural and organisational shift towards the data producers becoming owners of their data, knowing where and how it is used in both an operational and a change capacity.
Scott has more than 25 years of experience working in the finance sector with large and medium multinational enterprises where he has been responsible for helping organisations successfully innovate, transform, change and modernise their data, digital and platform journeys.
Scott has held many senior leadership roles across organisations such as Bank of New York Mellon, Fidelity International and Royal London.
At Holley Holland, Scott heads up the Data Practice bringing the value of his experience developed over the years to Holley Holland’s clients today and in the future.