(editorial changes)
 
(21 intermediate revisions by 5 users not shown)
Line 1: Line 1:
__NOTOC__
 
<imagemap>
<imagemap>
Image:1.5-DataStrategy.png|frameless|1000px|Overview of Ignite AIoT Framework
Image:1.5-DataStrategy.png|frameless|1000px|Overview of Ignite AIoT Framework


rect 79 4 341 65 [[AIoT_Execution_and_Delivery|AIoT]]
rect 4 0 651 133 [[AIoT_Framework|More...]]
rect 357 4 618 63 [[Artificial_Intelligence|Artificial Intelligence]]
rect 970 0 1298 133 [[AIoT_Data_Strategy|More...]]
rect 636 4 894 65 [[Internet_of_Things|Internet of Things]]
rect 651 0 970 133 [[Artificial_Intelligence|More...]]
 
rect 1298 0 1767 133 [[Digital_Twin_Execution|More...]]
rect 1417 112 1451 750 [[AIoT_Data_Strategy|AIoT Data Strategy]]
rect 1767 0 2095 133 [[Internet_of_Things|More...]]
 
rect 2095 0 2542 133 [[Hardware.exe|Hardware.exe]]
rect 1460 117 1788 209 [[Business_Model|Business Model]]
rect 1462 222 1788 312 [[Product_Architecture|Product Architecture]]
rect 1462 328 1786 420 [[AIoT_DevOps_and_Infrastructure|AIoT DevOps & Infrastructure]]
rect 1462 434 1786 523 [[Trust_and_Security|Trust & Security]]
rect 1462 539 1786 631 [[Reliability_and_Resilience|Reliability & Resilience]]
rect 1462 645 1786 735 [[Verification_and_Validation|Verification & Validation]]


rect 22 674 429 773 [[Product_Organization|Product Organization]]
rect 2764 128 3539 257 [[Product_Architecture|More...]]
rect 505 672 1011 773 [[Sourcing_and_Procurement|Sourcing and Procurement]]
rect 2764 257 3539 390 [[Agile AIoT|More...]]
rect 1071 672 1406 773 [[Service_Operations|Service Operations]]
rect 2764 385 3539 518 [[AIoT_DevOps_and_Infrastructure|More...]]
rect 2764 518 3539 651 [[Trust_and_Security|More...]]
rect 2764 651 3539 784 [[Reliability_and_Resilience|More...]]
rect 2764 779 3539 917 [[Verification_and_Validation|More...]]


desc none
desc none
</imagemap>
</imagemap>
<s data-category="AIoTFramework"></s>
__NOTOC__


As part of their digital transformation initiatives, many companies are putting data strategy at the center stage. Most enterprise data strategies are a mixture of high-level vision, strategic principles, goal definitions, priority setting, data governance models, as well as architecture tools and best practices for managing semantics and deriving information from raw data.
As part of their digital transformation initiatives, many companies are putting data strategy center stage. Most enterprise data strategies are a mixture of high-level vision, strategic principles, goal definitions, priority setting, data governance models, architecture tools, and best practices for managing semantics and deriving information from raw data.


Since both AI and IoT are also very much about data, every AIoT initiative should also adopt a data strategy. However, it is important to notice that this data strategy must work on the level of an individual AIoT-enabled product or solution - not the entire enterprise (unless, of course, the enterprise is pretty much build around said product/solution). This section of the AIoT Framework is proposing a setup for an AIoT Data Strategy, as well as identifying the typical dependencies which must be managed.
Since both AI and IoT are also very much about data, every AIoT initiative should also adopt a data strategy. However, it is important to note that this data strategy must work on the level of an individual AIoT-enabled product or solution, not the entire enterprise (unless, of course, the enterprise is pretty much built around said product/solution). This section of the AIoT Framework proposes a structure for an AIoT Data Strategy and identifies the typical dependencies that must be managed.


__TOC__
__TOC__


=== Overview ===
= Overview =


The AIoT Data Stategy proposed by the AIoT Framework is designed to work well for AIoT product/solution initiatives in the context of a larger enterprise. Consequently, it is focusing on supporting the product/solution implementation and long-term evolution, and is trying to avoid replicating typical elements of an enterprise data strategy.
The AIoT Data Strategy proposed by the AIoT Framework is designed to work well for AIoT product/solution initiatives in the context of a larger enterprise. Consequently, it focuses on supporting product/solution implementation and long-term evolution and tries to avoid replicating typical elements of an enterprise data strategy.


[[File:1.5-DSDetails.png|800px|frameless|center|AIoT Data Strategy]]
[[File:1.5-DSDetails.png|800px|frameless|center|link=|AIoT Data Strategy]]


The AIoT Data Stategy has four main elements. First, the development of a prioritization framework which aims to make the relationship between use cases and their data needs visible. Second, management of the data-specific implementation aspects, as well as the Data Lifecycle Management. Third, Data Capabilities required to support the data strategy. Fourth, a lean and efficient Data Governance approach designed to work on the product/solution level.
The AIoT Data Strategy has four main elements. First, the development of a prioritization framework that aims to make the relationship between use cases and their data needs visible. Second, management of the data-specific implementation aspects, as well as the Data Lifecycle Management. Third, Data Capabilities required to support the data strategy. Fourth, a lean and efficient Data Governance approach designed to work on the product/solution level.


Of course, each of these 4 elements of the AIoT Data Strategy has to be seen in the context of the enterprise which is hosting the product/solution development: Enterprise Business Strategy must be well aligned with the use cases. The data-specific implementation projects often have to take cross-organization dependencies into consideration, e.g. if data is imported or exported across the boundaries of the current AIoT product/solution. Product/solution-specific data capabilities must be aligned with the existing enterprise capabilities. And Product/solution-specific data governance always has to take existing enterprise-level governance into consideration.
Of course, each of these four elements of the AIoT Data Strategy has to be seen in the context of the enterprise that is hosting product/solution development: Enterprise Business Strategy must be well aligned with the use cases. Data-specific implementation projects frequently have to take cross-organization dependencies into consideration, e.g., if data are imported or exported across the boundaries of the current AIoT product/solution. Product/solution-specific data capabilities must be aligned with the existing enterprise capabilities. Product/solution-specific data governance always has to take existing enterprise-level governance into consideration.


=== Business Alignment & Prioritization ===
= Business Alignment & Prioritization =
The starting point for business alignment & prioritization should be the actual use cases, which are defined and prioritized by the business sponsors - or, alternatively, Epics which have been prioritized in the agile backlog. Sometimes, Epics might be too coarse grained. In this case, Features can be used alternatively.
The starting point for business alignment and prioritization should be the actual use cases, which are defined and prioritized by business sponsors, or Epics that have been prioritized in the agile backlog. Sometimes, Epics might be too coarse-grained. In this case, Features can be used instead.


For each Use Case / Epic, an analysis from the data perspective should be done:
For each Use Case/Epic, an analysis from the data perspective should be completed:
* What are the actual data needs to support the Use Case / Epic?
* What are the actual data needs to support the Use Case/Epic?
* Which of this data is believed to be already available, which must be newly acquired?
* Which of these data are believed to be already available, and which must be newly acquired?
* How can the required data quality be ensured for the particular use case?
* How can the required data quality be ensured for the particular use case?
* What are potential financial aspects of the data acquisition?
* What are potential financial aspects of the data acquisition?
* And how does the use cases support the monetization side of things?
* How do the use cases support the monetization side of things?
* Is this a case where the required data is adding functional value to the use case, or is there a direct data monetization aspect to it?
* Is this a case where the required data adds functional value to the use case, or is there a direct data monetization aspect to it?
* What are the relationships between the identified data and the other elements of the AIoT Data Strategy: Implementation & Data Lifecycle Management, specific capabilities applying to this particular kind of data, and Data Governance.
* What are the relationships between the identified data and the other elements of the AIoT Data Strategy: Implementation & Data Lifecycle Management, specific capabilities applying to this particular kind of data, and Data Governance?


A key aspect of the analysis will be the '''Data Acquisition''' perspective. For data which can (at least theoretically) be acquired within the boundaries of the AIoT product/solution organization, the following questions have to be answered:
A key aspect of the analysis will be the '''Data Acquisition''' perspective. For data that can (at least theoretically) be acquired within the boundaries of the AIoT product/solution organization, the following questions should be answered:
* Is the required technical infrastructure already available?
* Is the required technical infrastructure already available?
* Does the team have the required capabilities and resources available?
* Does the team have the required capabilities and resources available?
* Especially in the case of AIoT data acquired via sensors:
* Especially in the case of AIoT data acquired via sensors:
** Are new sensor required?
** Are new sensors required?
** If so, what is the additional development & unit cost?
** If so, what is the additional development and unit cost?
** Is there an additional downstream cost from the asset/sensor line-fit point of view (i.e. additional manufacturing costs)?
** Is there an additional downstream cost from the asset/sensor line-fit point of view (i.e. additional manufacturing costs)?
** What is the impact on the business plan?
** What is the impact on the business plan?
Line 63: Line 63:
** What are required steps in terms of sourcing and procurement?
** What are required steps in terms of sourcing and procurement?


For data that has to be acquired from other business units, a number of additional questions will have to be answered:
For data that need to be acquired from other business units, a number of additional questions will have to be answered:
* Is it technically feasible to access the data (availability of APIs, bandwidth, support of required data access frequency and volume, etc.)
* Is it technically feasible to access the data (availability of APIs, bandwidth, support of required data access frequency and volume, etc.)?
* Can the neighboring business unit support your requirements not only in terms of technical access, but also in terms of project support and timelines?
* Can the neighboring business unit support your requirements, not only in terms of technical access, but also in terms of project support and timelines?
* Are there costs involved for the technical implementation and/or the data access (internal billing)?
* Are there costs involved in technical implementation and/or data access (internal billing)?
* Are there potential limitations or restrictions due to existing internal data governance guidelines, regional or organizational boundaries, etc.
* Are there potential limitations or restrictions due to existing internal data governance guidelines, regional or organizational boundaries, etc.?


For data which has to be acquired from external partners or suppliers, there are typically a number of additional complexities which will have to be addressed:
For data that have to be acquired from external partners or suppliers, there are typically a number of additional complexities that will have to be addressed:
* Technical feasibility across enterprise boundaries
* Technical feasibility across enterprise boundaries
* Legal framework required for data access
* Legal framework required for data access
* SLA ensurance
* SLA assurance
* Billing and cost management
* Billing and cost management


Based on all of the above, the team should be able to make an assessment of the overall feasibility and costs / efforts involved on a per use case / per data item basis. This information is then used as part of the overall prioritization process.
Based on all of the above, the team should be able to assess the overall feasibility and costs/efforts involved on a per use case/per data item basis. This information is then used as part of the overall prioritization process.
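
The per use case/per data item assessment described above could be captured in a small scoring model. The following is only a minimal sketch under stated assumptions: the class names, the cost unit, the 0..1 feasibility scale, and the scoring rule are all hypothetical and are not prescribed by the AIoT Framework.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Source(Enum):
    """Where a data item would be acquired (the three cases discussed above)."""
    INTERNAL = "within the AIoT product/solution organization"
    BUSINESS_UNIT = "other business unit"
    EXTERNAL = "external partner or supplier"


@dataclass
class DataItem:
    name: str
    source: Source
    already_available: bool
    acquisition_cost: float  # hypothetical unit, e.g. person-days
    feasibility: float       # assumed scale: 0.0 (not feasible) to 1.0 (no known obstacles)


@dataclass
class UseCase:
    name: str
    business_value: float    # assumed to be assigned by the business sponsors
    data_items: List[DataItem] = field(default_factory=list)

    def total_cost(self) -> float:
        # Data that is already available adds no acquisition cost.
        return sum(d.acquisition_cost for d in self.data_items
                   if not d.already_available)

    def feasibility(self) -> float:
        # A use case is only as feasible as its hardest-to-acquire data item.
        return min((d.feasibility for d in self.data_items), default=1.0)

    def score(self) -> float:
        # Illustrative rule only: value weighted by feasibility, per unit of cost.
        return self.business_value * self.feasibility() / (1.0 + self.total_cost())


def prioritize(use_cases: List[UseCase]) -> List[UseCase]:
    """Return the use cases ordered from highest to lowest score."""
    return sorted(use_cases, key=lambda u: u.score(), reverse=True)


predictive_maintenance = UseCase("Predictive maintenance", business_value=10.0, data_items=[
    DataItem("vibration", Source.INTERNAL, False, acquisition_cost=4.0, feasibility=0.9),
    DataItem("weather", Source.EXTERNAL, False, acquisition_cost=5.0, feasibility=0.5),
])
usage_report = UseCase("Usage report", business_value=6.0, data_items=[
    DataItem("usage_log", Source.INTERNAL, True, acquisition_cost=3.0, feasibility=1.0),
])

ranked = prioritize([predictive_maintenance, usage_report])
print([u.name for u in ranked])
```

In this sketch, a use case that relies only on data that is already available ranks ahead of a higher-value use case that depends on costly external data. A real prioritization would weigh additional factors such as data quality and monetization potential.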


=== Implementation & Data Lifecycle Management ===
= Data Pipeline: Implementation & Data Lifecycle Management =
It can sometimes be difficult so separate data-specific implementation aspects from general implementation aspects. This is something that the AIoT Data Strategy needs to deal with in order to avoid redundant efforts. Typical, data-specific implementation and Data Lifecycle Management aspects inlude:
Sometimes it can be difficult to separate data-specific implementation aspects from general implementation aspects. This is an issue that the AIoT Data Strategy needs to deal with to avoid redundant efforts. Typical data-specific implementation and Data Lifecycle Management aspects include the following:
* Data Ingestion: In our context, data ingestion should first be seen as moving data from outside of our organizations boundary to within. Second, technical aspects such as stream vs batch processing, etc., need to be addressed. Typical data ingestion tasks also include cleansing and quality assurance.
* Data Ingestion: In our context, data ingestion should first be seen as moving data from outside of our organization's boundary to within. Second, technical aspects such as stream vs. batch processing need to be addressed. Typical data ingestion tasks also include cleansing and quality assurance.
* Storage: Depending on the business and technical requirements, data can be stored permanently or temporarily, structure or unstructured, with or without backup, with cache-only or with operational/transactional support, etc. This often needs to be address differently for different data types.
* Storage: Depending on the business and technical requirements, data can be stored permanently or temporarily, structured or unstructured, with or without backup, with cache-only or with operational/transactional support, etc. This often needs to be addressed differently for different data types.
* Integration: Data integration is the process of merging data from different sources into a single, unified view. In the case of AIoT, this can be - for example - sensor data fusion, done close to the sensors in the edge layer. Or it can be - usually on a high-level of abstraction - a real-time data stream integration process. Or it can be - typically further in the backend - a batch-oriented integration process.
* Integration: Data integration is the process of merging data from different sources into a single, unified view. In the case of AIoT, this can be -- for example -- sensor data fusion, done close to the sensors in the edge layer. Or it can be -- usually at a high level of abstraction -- a real-time data stream integration process. Or it can be -- typically further in the backend -- a batch-oriented integration process.
* Transformation: Many projects spend a lot of time with data transformation, since this is often a prerequisite for data integration or further data processing. The approaches chosen usually very widely depending on the format, structure, complexity, and volume of the data being transformed.
* Transformation: Many projects spend a lot of time on data transformation, since this is often a prerequisite for data integration or further data processing. The approaches chosen usually vary widely depending on the format, structure, complexity, and volume of the data being transformed.
* Modeling: Data modeling is usually a key step towards dealing with semantics of data, and deriving information from raw data. There are different levels to data modeling, including conceptual, logical and physical levels. Another important type of model building on top of data models are then the AI/ML models. However, these are usually less data-structure oriented and more mathematical/statistical models.
* Modeling: Data modeling is usually a key step toward dealing with the semantics of data and deriving information from raw data. There are different levels of data modeling, including conceptual, logical and physical levels. Another important type of model, which builds on top of data models, is the AI/ML model. However, these models are usually less data-structure oriented and more mathematical/statistical in nature.
* Validation: Data validation is the tool which helps ensuring data quality, e.g. by applying data cleansing and validation checks. Data validation can use simple, local "validation rules" or "validation constraints" that check for correctness and meaningfulness (e.g. a date of birth can not be in the future). In some cases, data validation can actually be much more complex, e.g. involving interactions with remote systems, or even AI/ML-based validation algorithms.
* Validation: Data validation is the tool that helps ensure data quality, e.g., by applying data cleansing and validation checks. Data validation can use simple, local "validation rules" or "validation constraints" that check for correctness and meaningfulness (e.g., a date of birth cannot be in the future). In some cases, data validation can actually be much more complex, e.g., involving interactions with remote systems, or even AI/ML-based validation algorithms.
* Analysis: In many cases, data analysis is a key use case - other than, for example, transactional use of the data. Generally, data analysis supports the discovery of useful information and supporting decision-making. Data analysis is a multi-faceted topic. It is key that the required Data Capabilities are provided to support here.
* Analysis: In many cases, data analysis is a key use case, distinct from, for example, transactional use of the data. Generally, data analysis supports the discovery of useful information and supports decision-making. Data analysis is a multifaceted topic. It is key that the required Data Capabilities are provided to support it.
* Access Control & Security: Finally, effectively ensuring confidentiality and secure handling of data must be part of every AIoT data strategy. This includes both IoT data coming from assets, as well as data combing from users, other business units, or event external data sources. While security is sometimes dealt with on a different level, fine-grained data access control must usually be dealt with as part of the data strategy.
* Access Control & Security: Finally, effectively ensuring confidentiality and secure handling of data must be part of every AIoT data strategy. This includes both IoT data coming from assets and data coming from users, other business units, or even external data sources. While security is sometimes dealt with on a different level, fine-grained data access control must usually be dealt with as part of the data strategy.
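
The simple, local "validation rules" mentioned under Validation can be illustrated with a short sketch. The record fields, the temperature range, and the function names below are assumptions chosen for illustration only. More complex validation, such as checks against remote systems or AI/ML-based validation, is out of scope here.

```python
from datetime import date, timedelta
from typing import Callable, List, Optional

# A rule inspects one record and returns an error message, or None if the record passes.
Rule = Callable[[dict], Optional[str]]


def birth_date_not_in_future(record: dict) -> Optional[str]:
    # The example from the text: a date of birth cannot be in the future.
    dob = record.get("date_of_birth")
    if dob is not None and dob > date.today():
        return "date_of_birth is in the future"
    return None


def temperature_in_range(record: dict) -> Optional[str]:
    # Hypothetical plausibility range for a sensor reading in degrees Celsius.
    temp = record.get("temperature_c")
    if temp is not None and not (-40.0 <= temp <= 125.0):
        return f"temperature_c out of range: {temp}"
    return None


def validate(record: dict, rules: List[Rule]) -> List[str]:
    """Apply every rule and collect the error messages of the rules that failed."""
    errors = []
    for rule in rules:
        message = rule(record)
        if message is not None:
            errors.append(message)
    return errors


RULES: List[Rule] = [birth_date_not_in_future, temperature_in_range]

good = {"date_of_birth": date(1990, 1, 1), "temperature_c": 21.5}
bad = {"date_of_birth": date.today() + timedelta(days=1), "temperature_c": 300.0}

print(validate(good, RULES))
print(validate(bad, RULES))
```

Keeping each rule as a small, independent function makes it possible to apply the same rules during ingestion and again before analysis.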


Finally, another key aspect of Implementation & Data Lifecycle Management is dealing with cross-organizational dependencies. While the earlier data acquisition phase might have already answered some of the high-level questions related to this topic, on the implementation level efficient stakeholder management is a key success factor. Often, earlier agreements with respect to technical data access or commercial conditions, will have to be reviewed, revised or refined during the implementation phase. Some practitioners are saying that this can actually sometimes be more difficult in case of cross-divisional data integration within one enterprise than across enterprise boundaries.
Another key aspect of Implementation & Data Lifecycle Management is dealing with cross-organizational dependencies. While the earlier data acquisition phase might have already answered some of the high-level questions related to this topic, efficient stakeholder management is a key success factor at the implementation level. Often, earlier agreements with respect to technical data access or commercial conditions will have to be reviewed, revised, or refined during the implementation phase. Some practitioners say that this can sometimes be more difficult in the case of cross-divisional data integration within one enterprise than across enterprise boundaries.


=== Data Capabilities and Resource Availability ===
= Data Capabilities and Resource Availability =
Data-related capabilities can be important in a number of different areas, including:
Data-related capabilities can be important in a number of different areas, including:
* Skills: Data-related skills can include a number of areas, including specific data-processing technologies, mathematical, statistical, or algorithmic skills in AI/ML, etc.
* Skills: Data-related skills can include a number of areas, including specific data-processing technologies and mathematical, statistical, or algorithmic skills in AI/ML, etc.
* Technology: For an AIoT product/solution initiative is is usually important that the technical management agrees on a fixed set up technologies which cover most of the required use cases, e.g. batch vs real-time processing, basic analytics vs AI/ML, etc.
* Technology: For an AIoT product/solution initiative, it is usually important that technical management agrees on a fixed set of technologies that cover most of the required use cases, e.g., batch vs. real-time processing, basic analytics vs. AI/ML, etc.
* Processes & Methods: Depending on the specific environment, this can also be a very important aspect. Data-related processes & methods can be specific to a certain analytics method, or they can related to certain processes & methods defined by an enterprise organization as mandatory.
* Processes & Methods: Depending on the specific environment, this can also be a very important aspect. Data-related processes and methods can be specific to a certain analytics method, or they can be related to certain processes and methods defined by an enterprise organization as mandatory.


Depending on the project requirements, it is also important that specific capabilities are supported by appropriate resources. For example, if it is clear that an AIoT project will require development of certain AI/ML algorithms, then the project management will have to ensure that this particular capability is supported by skilled resources who are available during the required time period. Managing the availability of such highly specialized resources is a topic which can be difficult to align with the pure agile project management paradigm and might require longer term planning, involving alignment with HR or sourcing/procurement.
Depending on the project requirements, it is also important that specific capabilities be supported by appropriate resources. For example, if it is clear that an AIoT project will require the development of certain AI/ML algorithms, then the project management will have to ensure that this particular capability is supported by skilled resources that are available during the required time period. Managing the availability of such highly specialized resources is a topic that can be difficult to align with the pure agile project management paradigm and might require longer-term planning, involving alignment with HR or sourcing/procurement.


=== Data Governance ===
= Data Governance =
Finally, especially larger AIoT product/solution initiatives will require Data Governance as part of their Data Strategy.
Larger AIoT product/solution initiatives will require Data Governance as part of their Data Strategy. This Data Governance cannot be compared with a Data Governance approach typically found on the enterprise level. It needs to be lightweight and pragmatic, covering basic aspects such as:
This Data Governance can not be compared with a Data Governance approach typically found on the enterprise level. It needs to be lightweight and pragmatic, covering basic aspects such as:
* Data & Trust Policies: How is this specific AIoT product/solution dealing with this topic? This is likely to be very use case specific, so the AIoT initiative will have to build on generic enterprise-level requirements but will have to add policies specific to its own use case.  
* Data & Trust Policies: How is this specific AIoT product/solution dealing with this topic? This is likely to be very use-case specific, so the AIoT initiative will have to build on generic enterprise-level requirements but will have to add policies specific to its own use case.
* Data Architecture: It is not always clear if data architecture is a discipline on its own, or if this is simply one facet of the product/solution architecture. For example, the AIoT Framework has a dedicated viewpoint to support the combination of [[AIoT_Data_and_Functional_Viewpoint|data and functionality]].
* Data Architecture: It is not always clear if data architecture is a discipline on its own, or if this is simply one facet of the product/solution architecture. For example, the AIoT Framework has a dedicated viewpoint to support the combination of [[AIoT_Data_and_Functional_Viewpoint|data and functionality]].
* Data Lineage: Data lineages traces where data is originating, what happens with it on the way, and where is moves over time. Data lineage provides visibility and transparency, and can help simplifying root cause analysis in the data analytics process. Data Governance can either support the central documentation of data lineage, or provide tools and best practices for the implementation teams.
* Data Lineage: Data lineage traces where data originate, what happens to them along the way, and where they move over time. Data lineage provides visibility and transparency and can help simplify root cause analysis in the data analytics process. Data Governance can either support the central documentation of data lineage or provide tools and best practices for implementation teams.
* Metadata Mgmt and Data Catalog: Efficient management of meta data is a prerequisite for efficient data processing and analytics. Types of metadata include descriptive, structural and administrative. A data catalog can provide support for metadata management, together with other tools, such as search.  
* Metadata Management and Data Catalog: Efficient management of metadata is a prerequisite for efficient data processing and analytics. Types of metadata include descriptive, structural and administrative. A data catalog can provide support for metadata management, together with other tools, such as search.  
* Data Model Management: For many AIoT applications, centrally managing a high-level data model which describes key entities and their relationships, as well as dependencies to different use cases and components can be of great help to create transparency and improve alignment between different teams. The AIoT Framework proposes a lightweight [[AIoT_Data_and_Functional_Viewpoint#Data_Domain_Model|AIoT Domain Model]] approach. In addition, the Data Governance team could also provide tooling and best practices for the teams who need more detailed models in their areas. This can also be linked back to the Metadata Mgmt and Data Catalog topics.
* Data Model Management: For many AIoT applications, centrally managing a high-level data model that describes key entities and their relationships, as well as dependencies on different use cases and components, can be of great help in creating transparency and improving alignment between different teams. The AIoT Framework proposes a lightweight [[AIoT_Data_and_Functional_Viewpoint#Data_Domain_Model|AIoT Domain Model]] approach. In addition, the Data Governance team could also provide tooling and best practices for teams that need more detailed models in their areas. This can also be linked back to the Metadata Management and Data Catalog topics.
* API Management: In his famous "API Mandate", Amazon CEO Jeff Bezos declared that ''"All teams will henceforth expose their data and functionality through service interfaces."'' at Amazon. This executive-level support for an API-centric way of dealing with data exchange (and exposing component functionality) shows how important API management has become on the enterprise level. The success of an AIoT initiative will also depend strongly on it. If there is no enterprise-wide API infrastructure and management approach available, this is a key support element which must be provided and enforced by the Data Governance team.
* API Management: In his famous "API Mandate", Amazon CEO Jeff Bezos declared that ''"All teams will henceforth expose their data and functionality through service interfaces."'' This executive-level support for an API-centric way of dealing with data exchange (and exposing component functionality) shows how important API management has become at the enterprise level. The success of an AIoT initiative will also depend strongly on it. If there is no enterprise-wide API infrastructure and management approach available, this is a key support element that must be provided and enforced by the Data Governance team.
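
To make the Data Lineage item above more concrete, the following minimal sketch records which processing step produced each dataset and from which inputs, and then traces a dataset back to its origins. The class names and dataset names are hypothetical. A real initiative would more likely rely on a data catalog or a dedicated lineage tool.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class LineageNode:
    dataset: str
    produced_by: str                      # description of the processing step
    inputs: List[str] = field(default_factory=list)


class LineageLog:
    """Central, in-memory documentation of where each dataset comes from."""

    def __init__(self) -> None:
        self._nodes: Dict[str, LineageNode] = {}

    def record(self, dataset: str, produced_by: str, inputs: Iterable[str] = ()) -> None:
        self._nodes[dataset] = LineageNode(dataset, produced_by, list(inputs))

    def upstream(self, dataset: str) -> List[str]:
        """Return every dataset that `dataset` was derived from, nearest first."""
        result: List[str] = []
        node = self._nodes.get(dataset)
        queue = list(node.inputs) if node else []
        while queue:
            current = queue.pop(0)
            if current in result:
                continue  # already visited via another path
            result.append(current)
            parent = self._nodes.get(current)
            if parent:
                queue.extend(parent.inputs)
        return result


log = LineageLog()
log.record("raw_vibration", "ingested from edge gateway")
log.record("raw_temperature", "ingested from edge gateway")
log.record("fused_sensor_data", "sensor data fusion",
           inputs=["raw_vibration", "raw_temperature"])
log.record("anomaly_scores", "AI/ML inference", inputs=["fused_sensor_data"])

print(log.upstream("anomaly_scores"))
```

During root cause analysis, such a trace shows which raw sources to inspect when a derived dataset looks wrong.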


Finally, the Data Governance / Data Strategy team should give itself a set up KPIs by which they can measure their own success, and the effectiveness and efficiency of the AIoT Data Strategy.
Finally, the Data Governance / Data Strategy team should give itself a set of KPIs by which it can measure its own success and the effectiveness and efficiency of the AIoT Data Strategy.


== Authors and Contributors ==
= Authors and Contributors =


{|{{Borderstyle-author}}
{|{{Borderstyle-author}}
|{{Dirk Slama|Title=AUTHOR}}
|{{Designstyle-author|Image=[[File:Dirk Slama.jpeg|left|100px]]|author={{Dirk Slama|Title=AUTHOR}}}}
|}
|}

Latest revision as of 16:12, 27 June 2022

Overview of Ignite AIoT Framework


As part of their digital transformation initiatives, many companies are putting data strategy center stage. Most enterprise data strategies are a mixture of high-level vision, strategic principles, goal definitions, priority setting, data governance models, architecture tools, and best practices for managing semantics and deriving information from raw data.

Since both AI and IoT are also very much about data, every AIoT initiative should also adopt a data strategy. However, it is important to note that this data strategy must work on the level of an individual AIoT-enabled product or solution, not the entire enterprise (unless, of course, the enterprise is pretty much built around said product/solution). This section of the AIoT Framework proposes a structure for an AIoT Data Strategy and identifies the typical dependencies that must be managed.

Overview

The AIoT Data Strategy proposed by the AIoT Framework is designed to work well for AIoT product/solution initiatives in the context of a larger enterprise. Consequently, it focuses on supporting product/solution implementation and long-term evolution and tries to avoid replicating typical elements of an enterprise data strategy.

AIoT Data Strategy

The AIoT Data Strategy has four main elements. First, the development of a prioritization framework that aims to make the relationship between use cases and their data needs visible. Second, management of the data-specific implementation aspects, as well as the Data Lifecycle Management. Third, Data Capabilities required to support the data strategy. Fourth, a lean and efficient Data Governance approach designed to work on the product/solution level.

Of course, each of these four elements of the AIoT Data Strategy has to be seen in the context of the enterprise that is hosting product/solution development: Enterprise Business Strategy must be well aligned with the use cases. Data-specific implementation projects frequently have to take cross-organization dependencies into consideration, e.g., if data are imported or exported across the boundaries of the current AIoT product/solution. Product/solution-specific data capabilities must be aligned with the existing enterprise capabilities. Product/solution-specific data governance always has to take existing enterprise-level governance into consideration.

Business Alignment & Prioritization

The starting point for business alignment and prioritization should be the actual use cases, which are defined and prioritized by business sponsors, or Epics that have been prioritized in the agile backlog. Sometimes, Epics might be too coarse-grained. In this case, Features can be used instead.

For each Use Case/Epic, an analysis from the data perspective should be completed:

  • What are the actual data needs to support the Use Case/Epic?
  • Which of these data are believed to be already available, and which must be newly acquired?
  • How can the required data quality be ensured for the particular use case?
  • What are potential financial aspects of the data acquisition?
  • How do the use cases support the monetization side of things?
  • Is this a case where the required data adds functional value to the use case, or is there a direct data monetization aspect to it?
  • What are the relationships between the identified data and the other elements of the AIoT Data Strategy: Implementation & Data Lifecycle Management, specific capabilities applying to this particular kind of data, and Data Governance?

A key aspect of the analysis will be the Data Acquisition perspective. For data that can (at least theoretically) be acquired within the boundaries of the AIoT product/solution organization, the following questions should be answered:

  • Is the required technical infrastructure already available?
  • Does the team have the required capabilities and resources available?
  • Especially in the case of AIoT data acquired via sensors:
    • Are new sensors required?
    • If so, what is the additional development and unit cost?
    • Is there an additional downstream cost from the asset/sensor line-fit point of view (i.e. additional manufacturing costs)?
    • What is the impact on the business plan?
    • What is the impact on the project plan?
    • What are the technical risks for new, unknown sensor technologies?
    • What are required steps in terms of sourcing and procurement?

For data that need to be acquired from other business units, a number of additional questions will have to be answered:

  • Is it technically feasible to access the data (availability of APIs, bandwidth, support of required data access frequency and volume, etc.)?
  • Can the neighboring business unit support your requirements, not only in terms of technical access, but also in terms of project support and timelines?
  • Are there costs involved in technical implementation and/or data access (internal billing)?
  • Are there potential limitations or restrictions due to existing internal data governance guidelines, regional or organizational boundaries, etc.?

For data that have to be acquired from external partners or suppliers, there are typically a number of additional complexities that will have to be addressed:

  • Technical feasibility across enterprise boundaries
  • Legal framework required for data access
  • SLA assurance
  • Billing and cost management

Based on all of the above, the team should be able to assess the overall feasibility and costs/efforts involved on a per use case/per data item basis. This information is then used as part of the overall prioritization process.
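
The per use case/per data item assessment described above can be captured in a simple structure. The following is a minimal sketch, not a method prescribed by the AIoT Framework: the class names, fields, and in particular the scoring rule (value net of acquisition cost, discounted by feasibility) are hypothetical and only illustrate how answers to the questions above might feed into prioritization.

```python
from dataclasses import dataclass


@dataclass
class DataNeed:
    """One data item required by a use case (all fields are illustrative)."""
    name: str
    source: str               # "internal", "other_business_unit", or "external"
    already_available: bool
    acquisition_cost: float   # estimated cost, arbitrary units
    feasibility: float        # 0.0 (infeasible) .. 1.0 (straightforward)


@dataclass
class UseCase:
    name: str
    business_value: float     # estimated value, same units as cost
    data_needs: list[DataNeed]

    def total_cost(self) -> float:
        # Data that is already available adds no acquisition cost in this sketch.
        return sum(d.acquisition_cost for d in self.data_needs
                   if not d.already_available)

    def feasibility(self) -> float:
        # A use case is treated as only as feasible as its hardest data item.
        return min((d.feasibility for d in self.data_needs), default=1.0)

    def score(self) -> float:
        # Hypothetical rule: net value discounted by feasibility.
        return (self.business_value - self.total_cost()) * self.feasibility()


def prioritize(use_cases: list[UseCase]) -> list[UseCase]:
    """Return use cases ordered from highest to lowest score."""
    return sorted(use_cases, key=lambda u: u.score(), reverse=True)
```

In practice, the cost and feasibility figures would come from the answers to the acquisition questions for internal, cross-business-unit, and external data, and the scoring rule would be agreed with the business sponsors.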

Data Pipeline: Implementation & Data Lifecycle Management

Sometimes it can be difficult to separate data-specific implementation aspects from general implementation aspects. This is an issue that the AIoT Data Strategy needs to deal with to avoid redundant efforts. Typical data-specific implementation and Data Lifecycle Management aspects include the following:

  • Data Ingestion: In our context, data ingestion should first be seen as moving data from outside of our organization's boundary to within. Second, technical aspects such as stream vs. batch processing need to be addressed. Typical data ingestion tasks also include cleansing and quality assurance.
  • Storage: Depending on the business and technical requirements, data can be stored permanently or temporarily, structured or unstructured, with or without backup, with cache-only or with operational/transactional support, etc. This often needs to be addressed differently for different data types.
  • Integration: Data integration is the process of merging data from different sources into a single, unified view. In the case of AIoT, this can be, for example, sensor data fusion done close to the sensors in the edge layer. It can also be a real-time data stream integration process, usually at a higher level of abstraction, or a batch-oriented integration process, typically further in the backend.
  • Transformation: Many projects spend much time with data transformation, since this is often a prerequisite for data integration or further data processing. The approaches chosen usually vary widely depending on the format, structure, complexity, and volume of the data being transformed.
  • Modeling: Data modeling is usually a key step toward dealing with the semantics of data and deriving information from raw data. There are different levels of data modeling, including the conceptual, logical and physical levels. Another important type of model that builds on top of data models is the AI/ML model. However, these models are usually less data-structure oriented and more mathematical/statistical in nature.
  • Validation: Data validation is the tool that helps ensure data quality, e.g., by applying data cleansing and validation checks. Data validation can use simple, local "validation rules" or "validation constraints" that check for correctness and meaningfulness (e.g., a date of birth cannot be in the future). In some cases, data validation can actually be much more complex, e.g., involving interactions with remote systems, or even AI/ML-based validation algorithms.
  • Analysis: In many cases, data analysis is a key use case alongside, for example, transactional use of the data. Generally, data analysis supports the discovery of useful information and supports decision-making. Data analysis is a multifaceted topic. It is key that the required Data Capabilities are provided to support it.
  • Access Control & Security: Finally, effectively ensuring confidentiality and secure handling of data must be part of every AIoT data strategy. This includes both IoT data coming from assets and data coming from users, other business units, or even external data sources. While security is sometimes dealt with on a different level, fine-grained data access control must usually be dealt with as part of the data strategy.
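
The simple, local validation rules mentioned under Validation can be sketched as small, composable checks. This is a minimal illustration under stated assumptions: the function names, the record layout, and the field names `date_of_birth` and `temperature_c` are hypothetical and not part of the AIoT Framework. More complex validation (remote lookups, AI/ML-based checks) would need a different design.

```python
from datetime import date


def not_in_future(field):
    """Build a rule that rejects dates later than 'today'."""
    def rule(record, today=None):
        today = today or date.today()
        value = record.get(field)
        if value is None:
            return f"{field}: missing"
        if value > today:
            return f"{field}: {value} is in the future"
        return None
    return rule


def in_range(field, low, high):
    """Build a rule that rejects numeric values outside [low, high]."""
    def rule(record, today=None):
        value = record.get(field)
        if value is None:
            return f"{field}: missing"
        if not (low <= value <= high):
            return f"{field}: {value} outside [{low}, {high}]"
        return None
    return rule


def validate(record, rules, today=None):
    """Return a list of violation messages; an empty list means the record passed."""
    errors = []
    for rule in rules:
        message = rule(record, today)
        if message is not None:
            errors.append(message)
    return errors


# Hypothetical rule set for a record combining user and sensor data.
RULES = [
    not_in_future("date_of_birth"),
    in_range("temperature_c", -40.0, 125.0),
]
```

Passing `today` explicitly keeps the rules deterministic, which makes them easier to test as part of the ingestion or quality assurance steps.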

Another key aspect of Implementation & Data Lifecycle Management is dealing with cross-organizational dependencies. While the earlier data acquisition phase might have already answered some of the high-level questions related to this topic, on the implementation level efficient stakeholder management is a key success factor. Often, earlier agreements with respect to technical data access or commercial conditions will have to be reviewed, revised or refined during the implementation phase. Some practitioners say that this can sometimes be more difficult in the case of cross-divisional data integration within one enterprise than across enterprise boundaries.

Data Capabilities and Resource Availability

Data-related capabilities can be important in a number of different areas, including:

  • Skills: Data-related skills span a number of areas, including specific data-processing technologies and mathematical, statistical, or algorithmic skills in AI/ML, etc.
  • Technology: For an AIoT product/solution initiative, it is usually important that technical management agrees on a fixed set of technologies that cover most of the required use cases, e.g., batch vs. real-time processing, basic analytics vs. AI/ML, etc.
  • Processes & Methods: Depending on the specific environment, this can also be a very important aspect. Data-related processes and methods can be specific to a certain analytics method, or they can be related to certain processes and methods defined by an enterprise organization as mandatory.

Depending on the project requirements, it is also important that specific capabilities be supported by appropriate resources. For example, if it is clear that an AIoT project will require the development of certain AI/ML algorithms, then the project management will have to ensure that this particular capability is supported by skilled resources that are available during the required time period. Managing the availability of such highly specialized resources is a topic that can be difficult to align with the pure agile project management paradigm and might require longer-term planning, involving alignment with HR or sourcing/procurement.

Data Governance

Larger AIoT product/solution initiatives will require Data Governance as part of their Data Strategy. This Data Governance cannot be compared with a Data Governance approach typically found on the enterprise level. It needs to be lightweight and pragmatic, covering basic aspects such as:

  • Data & Trust Policies: How is this specific AIoT product/solution dealing with this topic? This is likely to be very use case specific, so the AIoT initiative will have to build on generic enterprise-level requirements but will have to add policies specific to its own use case.
  • Data Architecture: It is not always clear if data architecture is a discipline on its own, or if this is simply one facet of the product/solution architecture. For example, the AIoT Framework has a dedicated viewpoint to support the combination of data and functionality.
  • Data Lineage: Data lineage traces where data originate, what happens to them along the way, and where they move over time. Data lineage provides visibility and transparency and can help simplify root cause analysis in the data analytics process. Data Governance can either support the central documentation of data lineage or provide tools and best practices for implementation teams.
  • Metadata Management and Data Catalog: Efficient management of metadata is a prerequisite for efficient data processing and analytics. Types of metadata include descriptive, structural and administrative. A data catalog can provide support for metadata management, together with other tools, such as search.
  • Data Model Management: For many AIoT applications, centrally managing a high-level data model that describes key entities and their relationships, as well as dependencies on different use cases and components, can be of great help in creating transparency and improving alignment between different teams. The AIoT Framework proposes a lightweight AIoT Domain Model approach. In addition, the Data Governance team could also provide tooling and best practices for teams that need more detailed models in their areas. This can also be linked back to the Metadata Management and Data Catalog topics.
  • API Management: In his famous "API Mandate", Amazon CEO Jeff Bezos declared that "All teams will henceforth expose their data and functionality through service interfaces." This executive-level support for an API-centric way of dealing with data exchange (and exposing component functionality) shows how important API management has become at the enterprise level. The success of an AIoT initiative will also depend strongly on it. If there is no enterprise-wide API infrastructure and management approach available, this is a key support element that must be provided and enforced by the Data Governance team.
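
To make the Data Lineage and Metadata Management items more concrete, the following minimal sketch shows a data catalog that stores descriptive, structural and administrative metadata per dataset, supports keyword search, and traces lineage through `derived_from` links. All class, field, and dataset names are hypothetical; a real initiative would more likely adopt an existing catalog tool than build one.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    description: str   # descriptive metadata
    schema: dict       # structural metadata: field name -> type name
    owner: str         # administrative metadata
    derived_from: list = field(default_factory=list)  # upstream dataset names


class DataCatalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def search(self, keyword):
        """Return sorted names of entries whose name or description matches."""
        kw = keyword.lower()
        return sorted(
            e.name for e in self._entries.values()
            if kw in e.name.lower() or kw in e.description.lower()
        )

    def lineage(self, name):
        """Return all upstream dataset names, nearest first (breadth-first)."""
        seen = []
        queue = list(self._entries[name].derived_from)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            # Upstream datasets owned elsewhere may not be registered here.
            if current in self._entries:
                queue.extend(self._entries[current].derived_from)
        return seen
```

Because unregistered upstream names are still reported by `lineage`, the sketch also surfaces dependencies on data owned by other business units or external partners.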

Finally, the Data Governance / Data Strategy team should give itself a set of KPIs by which it can measure its own success and the effectiveness and efficiency of the AIoT Data Strategy.

Authors and Contributors

DIRK SLAMA
(Editor-in-Chief)

AUTHOR
Dirk Slama is VP and Chief Alliance Officer at Bosch Software Innovations (SI). Bosch SI is spearheading the Internet of Things (IoT) activities of Bosch, the global manufacturing and services group. Dirk has over 20 years of experience in very large-scale distributed application projects and system integration, including SOA, BPM, M2M and most recently IoT. He represents Bosch at the Industrial Internet Consortium and is active in the Industry 4.0 community. He holds an MBA from IMD Lausanne as well as a Diploma Degree in Computer Science from TU Berlin.