DATA DRIVEN TRANSFORMATION: The case of a manufacturing organization leveraging big data and data mesh to drive competitive advantage

Written by: Dr. Anurag Vij

We recently wrote about using a product-centric approach (vs. traditional projects) to drive digital transformation. While many factors underlie a successful digital transformation, enterprise data, and how it is leveraged across the enterprise’s analytical planes, is critical to the success of any transformative effort at scale. This post shares a point of view on applying a similar product-centric approach to big data through a data mesh architecture. We believe that in time this approach will become part of the DNA of every successful enterprise, giving it the ability to continuously evolve and transform.

Although it’s logically possible to take an industry-agnostic approach to enterprise data, analytics are only meaningful, and only drive the expected business outcomes, when they possess real-time contextual awareness. For that reason, instead of having a general discussion on enterprise data without a business context, I will take the example of a regional manufacturing organization (referred to as RMO through the rest of this article) that produces industrial chemicals. RMO is pursuing several transformative efforts that rely heavily on analytics, insights, and the business decisions that can be made by leveraging the enterprise data spread across a complex, multinational organization. This transformation is spearheaded by RMO’s recently appointed Chief Data Officer (CDO), with whom I sat down.

RMO has been in business for a few decades and has implemented several industry-recognized best practices: lean manufacturing, Six Sigma, and continuous training of its workforce on its evolving processes and business methodologies. Although the broad consensus across the organization is that its operations are lean and among the best in the world, the leadership believes it must find new ways to drive competitive advantage through the quality of its products, the speed of production, and the cost of production, with data at the heart of all three.


RMO has outlined a 2×5 matrix across the shop-floor and the supply-chain. This comprises a total of ten OKRs that RMO believes will create competitiveness across quality, speed, and cost. For brevity, I am only listing the objectives and data-led priorities that define the key results:

Shop Floor:

a. Improve Quality (5 points): Implement real-time production process monitoring, equipment fault monitoring and root cause analysis, and variability monitoring.

b. Increase Yield (2 points): In addition to process and equipment monitoring, implement the ability to assess process complexity and effectiveness in the context of variability and its impact on yield (reducing defective products as a percentage of total products produced).

c. Improve equipment uptime and performance (as defined in OEE, Overall Equipment Effectiveness) (5 points): Implement fault prediction and predictive maintenance.

d. Reduce Waste (3 points): Implement streamlined inventory management, recycling, scaling and substitution, and minimize production line stoppages.

e. Improve worker wellness by reducing leaks and accidents (10 points): Implement IIoT (Industrial Internet of Things) sensors in conjunction with SCADA systems to enhance fault and leak prediction analytics, delivered through dashboards and frontline worker IoT devices, that drive swift decisions to avoid hazardous situations.
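To make objectives (b) and (c) concrete, here is a minimal Python sketch of the two headline metrics. It assumes the conventional definition of OEE as availability × performance × quality and reads "defective products as a percentage of total products produced" as a simple defect rate. The function names and the shift figures are illustrative assumptions, not RMO data.

```python
def oee(planned_minutes, run_minutes, ideal_units_per_minute, total_units, good_units):
    """Overall Equipment Effectiveness = availability x performance x quality."""
    if planned_minutes <= 0 or run_minutes <= 0 or ideal_units_per_minute <= 0 or total_units <= 0:
        raise ValueError("time, rate, and unit counts must be positive")
    availability = run_minutes / planned_minutes                        # share of planned time actually running
    performance = total_units / (run_minutes * ideal_units_per_minute)  # actual output vs. ideal output
    quality = good_units / total_units                                  # share of output that is not defective
    return availability * performance * quality


def defect_rate(total_units, good_units):
    """Defective products as a fraction of total products produced."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return (total_units - good_units) / total_units


# Hypothetical shift: 480 planned minutes, 420 running, ideal rate of 1 unit per minute,
# 378 units produced, 360 of them good.
shift_oee = oee(480, 420, 1.0, 378, 360)
shift_defects = defect_rate(378, 360)
print(round(shift_oee, 2))      # 0.75
print(round(shift_defects, 4))  # 0.0476
```

Tracking the three OEE factors separately, rather than only their product, shows whether a shortfall comes from downtime, slow cycles, or defects, which is the distinction the fault prediction and variability monitoring priorities are meant to act on.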

Supply Chain:

a. Achieve near real-time demand forecasting and order management: Implement analytics based on sales, predictive forecasts, supplier inventories, and geopolitical and other factors that impact demand, and integrate and automate supplier order management.

b. Benchmark supplier performance: Implement analytics across a common set of KPIs for suppliers and define performance benchmarks that feed into supplier-side improvements, incentives, and future selection.

c. Achieve near real-time multisite inventory management: Remove data silos in inventory tracking, order management, transfers, and purchase decisions across sites to reduce stockouts and waste and to improve turnarounds.

d. Transportation optimization (10 points): Reduce freight spending and improve turnarounds through analytics that create optimized and efficient load plans.

e. Support and Returns efficiency (5 points): Implement predictive customer satisfaction, transportation analytics, and inventory management.


Given that RMO has been in operation for decades and that its business and organizational structure (Graphic 1) has evolved over time, its enterprise data is fragmented across business units and legacy, monolithic systems, sits behind complex governance barriers, and is of questionable quality.


Graphic 1

The RMO architectural team started by defining five core principles for RMO’s enterprise data strategy:

  1. A cloud-first strategy that caters for hybrid and multi-cloud scenarios.
  2. A data-driven culture that fosters open, collaborative, and ever evolving participation of the entire workforce.
  3. Data as a product to achieve scalability and quality with a domain driven design (DDD) where data domain-nodes follow the domain boundaries (vs. technology boundaries).
  4. A self-serve data platform that prioritizes business use-cases over technical complexity.
  5. A federated data governance model.

Let’s double-click on each of these to understand the thought process behind their selection and how they would help RMO deliver on the 10 prioritized OKRs:

1. A cloud-first strategy: While the organization has significant investments in legacy systems and tools such as SCADA solutions, RMO understands the benefits of cloud-based solutions, which can offer greater scalability, reliability, availability, and security, as well as new capabilities such as edge computing. Further, RMO operates in a regulated environment in certain countries, including some with data residency restrictions, which motivated its decision to choose a hybrid strategy.

Consider the example of data collected through a SCADA solution from waste management plants, which can now be processed in real time together with production process data, inventory data, and transportation data to increase recycling speed and efficiency. A cloud-first strategy will help RMO implement advanced scenarios over time, including the use of edge computing for faster cycle and decision times.

2. A data-driven culture: To be able to innovate quickly, RMO believes in the power of data democratization. For that to manifest, the workforce across the organization must continuously interact to learn and improve the desired business outcomes. Many of the outlined OKRs require automation and the use of AI, which can only be achieved through a data-driven culture across the enterprise.

An enterprise-wide unified data strategy, together with a data-driven culture, will enable RMO to make more informed decisions. Benchmarking supplier performance is one example: it will in turn drive performance improvements, incentives, and even supplier selection.

3. Data domains and data as a product: To manage complexity, RMO uses bounded contexts, in which the domain determines the boundaries of the data product. This in turn drives clarity on which data and underlying code are owned, managed, and governed by which teams, and on where dependencies on other domains and data products need orchestration. A data product must serve a specific business need. As needs evolve in pursuit of the OKRs, which define the North Star of business success, the data products must evolve too. A data product may produce insights to serve the business need on its own or by leveraging, integrating, or making sense of data from other data products.

As an example, driving improvements in yield for an already lean manufacturing process requires deep analysis and predictive models that look at multiple input and output variables and their variability effects. A data mesh architecture that drives integration across the various data products in the enterprise caters for such use cases while keeping data owned and managed by those who understand it best (Graphic 2).
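The ownership and dependency idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RMO’s implementation: the `DataProduct` fields, the product names, and the team names are all hypothetical. Each data product records its domain, owning team, and business need, and a helper lists the dependencies that cross a domain boundary and therefore need orchestration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class DataProduct:
    name: str
    domain: str                      # bounded context that sets the product boundary
    owner_team: str                  # team that owns, manages, and governs the product
    business_need: str               # the specific need the product serves
    upstream: Tuple[str, ...] = ()   # names of data products this one consumes


def cross_domain_dependencies(products: Iterable[DataProduct]) -> List[Tuple[str, str]]:
    """Return (consumer, producer) pairs where the two products sit in different domains."""
    products = list(products)
    by_name = {p.name: p for p in products}
    pairs = []
    for consumer in products:
        for producer_name in consumer.upstream:
            producer = by_name[producer_name]  # KeyError means an undeclared dependency
            if producer.domain != consumer.domain:
                pairs.append((consumer.name, producer.name))
    return pairs


# Hypothetical catalog for the yield example.
catalog = [
    DataProduct("process-telemetry", "shop-floor", "process-engineering", "monitor production processes"),
    DataProduct("equipment-health", "shop-floor", "maintenance", "predict equipment faults"),
    DataProduct("inventory-levels", "supply-chain", "inventory-ops", "track multisite inventory"),
    DataProduct(
        "yield-insights", "shop-floor", "yield-analytics", "increase yield",
        upstream=("process-telemetry", "equipment-health", "inventory-levels"),
    ),
]
print(cross_domain_dependencies(catalog))  # [('yield-insights', 'inventory-levels')]
```

Here the yield product reads two products from its own domain and one from the supply-chain domain; only the latter is reported, because that is the dependency where two owning teams have to agree on an interface.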

4. A self-serve data platform: For teams to own and manage their data products autonomously, a self-serve data platform is required. RMO aligns each data domain with one data landing zone, and each data product within that domain with one resource group. The data landing zone provides capabilities such as networking, monitoring, metadata services, data lake services, ingestion and processing, data integration, and reporting through its resource groups. Furthermore, each data landing zone and management zone aligns with the underlying subscriptions. The data management and analytics scenario templates inherit their respective policies from the hosting data management landing zone, which simplifies management, provisioning, integration, and testing.

The shop floor produces multiple products, with interdependencies across some and serial manufacturing cycles across others. To drive improvements in quality, teams need to continuously monitor data emerging from multiple systems and look for root causes of variability. This in turn may require teams to provision resources, such as compute cycles or visualization services, on demand for large data sets. A self-serve data platform as part of the data mesh architecture meets such requirements with on-demand elasticity while staying cost efficient.
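The alignment described under this principle (one data landing zone per data domain, one resource group per data product) can be expressed as a small planning function. The Python sketch below only computes names; it calls no cloud API, and the `dlz-` and `rg-` naming scheme and the example domains are assumptions made for illustration.

```python
from collections import defaultdict


def plan_layout(products):
    """Map (domain, product) pairs to {landing_zone: [resource_group, ...]}.

    One landing zone per domain and one resource group per data product.
    The "dlz-" / "rg-" prefixes are a hypothetical naming convention.
    """
    layout = defaultdict(list)
    for domain, product in products:
        zone = f"dlz-{domain}"
        group = f"rg-{domain}-{product}"
        if group in layout[zone]:
            raise ValueError(f"duplicate data product: {domain}/{product}")
        layout[zone].append(group)
    return dict(layout)


layout = plan_layout([
    ("shopfloor", "quality"),
    ("shopfloor", "yield"),
    ("supplychain", "inventory"),
])
for zone, groups in layout.items():
    print(zone, groups)
```

A self-serve platform would feed a plan like this into its provisioning templates, so that a team requesting a new data product receives a resource group inside its own domain’s landing zone without needing to know the underlying infrastructure.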

5. A federated data governance model: To drive autonomous decision making at the data product level while ensuring that each data owner can trust others and their data products, RMO implemented an enterprise-level data governance body. The data management landing zone, which uses the data management capability from the data management and analytics scenario, provides a federated model of governance for the self-service platform and the data domains within it. The data management landing zone encompasses all data domains across the enterprise. It provides shared resources for all data landing zones, a common architecture for data products, central visibility of data ownership, and consistent data access and privacy policies, and it helps ensure data quality.

In addition, RMO chose Azure Purview as its data governance service to simplify automated data discovery, lineage identification, and classification, and to build a unified map of its data assets across a hybrid environment that includes legacy on-premises systems. In other words, RMO will now be able to discover and manage data across its legacy and modern systems. With features such as Data Catalog, RMO is able to document key business terms and their definitions to create a common vocabulary across the enterprise. This is critical to RMO’s need to move rapidly from ideation, to proof through minimum viable products (MVPs), to data products that serve the defined OKRs.


Graphic 2

Underlying these architectural decisions are an identity-driven data access model that builds upon the principle of least privilege (through managed identities (MIs), user-assigned MIs, and nested security groups), the Microsoft Zero Trust security model, and a network design that provides isolation through private endpoints and private network communication. This design caters well for uniform data access while providing centralized data governance and auditing.
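As a rough illustration of how least privilege and nested security groups interact, the Python sketch below resolves a principal’s direct and nested group memberships and grants read access only if one of those groups is explicitly listed as a reader of the data product. All identity, group, and product names are hypothetical, and the logic is a simplified model, not how any particular identity service evaluates access.

```python
def effective_groups(principal, member_of):
    """Expand direct and nested group membership for a principal.

    member_of maps a principal or group name to the groups it belongs to.
    """
    seen = set()
    stack = list(member_of.get(principal, ()))
    while stack:
        group = stack.pop()
        if group not in seen:  # the seen set also guards against membership cycles
            seen.add(group)
            stack.extend(member_of.get(group, ()))
    return seen


def can_read(principal, data_product, member_of, readers):
    """Deny by default: allow only if some effective group is an explicit reader."""
    allowed = readers.get(data_product, set())
    return bool(effective_groups(principal, member_of) & allowed)


# Hypothetical directory: a pipeline's managed identity sits in an engineering
# group, which is nested inside a shop-floor reader group.
member_of = {
    "mi-yield-pipeline": ["sg-yield-engineers"],
    "sg-yield-engineers": ["sg-shopfloor-readers"],
}
readers = {
    "process-telemetry": {"sg-shopfloor-readers"},
    "supplier-kpis": {"sg-supplychain-readers"},
}
print(can_read("mi-yield-pipeline", "process-telemetry", member_of, readers))  # True
print(can_read("mi-yield-pipeline", "supplier-kpis", member_of, readers))      # False
```

Because access is granted to groups rather than to individual identities, a data owner can audit who reads a data product by inspecting a single reader list, which fits the centralized auditing goal described above.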

While there are various ways to achieve the outcomes RMO is targeting, a unified enterprise data model that puts the empowerment and ownership with those that understand their data the best and leverages a data mesh architecture is well suited for an organization such as RMO. 


Industry leaders in manufacturing are moving towards Industry 4.0, which leverages data from sensors, robots, processes, and simulations to enable smarter and faster decision making and opens up new business models for companies to explore. Four things are critical to the success of a data-driven transformation for any organization: establishing a data-driven culture; appointing and empowering leaders such as a CDO; defining a prioritized list of OKRs that keeps the entire organization focused on building the data products that deliver the maximum business benefit; and underpinning all of this with well-thought-through architectural and governance principles.


Sincere gratitude to the following amazingly talented leaders at Microsoft for their contributions and reviews of this article: Andreas Wasita (Managing Architect), Angus Foreman (Chief Architect), Danny Tambs (Managing Architect), Darren Dillon (CTO), Hany Azzam (Specialist Sales Lead).



