“The scary thing is how bad it is. We’re betting our economy (sic) on an information world, and we don’t have any idea how good the information in those databases is”
– Robert Golberg, MIT Sloan School of Management

In the past, the attention of managers and academics was concentrated on the quality of products and on their development and management. Later, more and more attention was devoted to processes, on the idea that if the process is sound, the quality of the product will also be good. In a World 2.0, attention should increasingly be devoted to data. Thanks to the spread of Information and Telecommunication Technology and of sensors, more and more data are available. As a matter of fact, more and more people are talking about Big Data: the expression refers to data that can be processed and that have the characteristics known as the 3 V's: large Volume, the need to access them with Velocity, and great Variety (structured and unstructured, internal and external, and so on).

Data are becoming more and more important. Their analysis is labeled Business Intelligence: the use of data to support the building of information. Information is then used to make decisions. If this is the chain (data => information => decision), it is extremely important that the quality of the data be excellent; if it is not, the decisions at the end of the chain will be wrong. Until now, the attention paid to the quality of data has been relatively limited. In the world of Big Data, this is no longer possible.

How is it possible to assure the quality of the data?

It is now time to move from the certification of products to the certification of data. We believe that this can be achieved by applying Lean and Digitize to the data. This means acting on what I call the 3 P's:

People, Processes, and Platforms:

  • To act on People essentially means to take care of Data Governance. Data Governance is an integral part of corporate and ICT governance. It combines leadership, organizational structures, and processes to ensure that data provide value to the Business. For effective, efficient, and economical Data Governance, it is essential to define a proper organization.
  • Organization is important, but it is also essential to define Processes. In the case of data, the processes relate to the creation, transformation, and loading of data and to the auditing and vetting of data already present in the databases.

To be more precise, the sub-processes are:

  1. Data creation/capture/extraction and recording at the time of gathering;
  2. Data manipulation/transformation (label preparation, copying of data to a ledger, and so on);
  3. Classification and tagging of data (class, observation, and so on) and its recording;
  4. Digitization/Transfer of the data;
  5. Documentation of the data (capturing and recording the metadata);
  6. Data storage and archiving;
  7. Data presentation and dissemination (paper and electronic publications, web-enabled databases, and so on);
  8. Using the data (analysis and manipulation).

  • Once the proper organization has been identified and the processes improved, it is necessary to evaluate the tools that can help in the certification of data (what we call here Platforms). Tools are essential because of the sheer volume of the data, their diversity, and the speed at which the processes must be performed.

Two aspects are important in data certification:

  • Accuracy refers to the closeness of measured values, observations or estimates to the real or true value (or to a value that is accepted as being true);
  • Precision (or Resolution) can be divided into two main types:
    • Statistical precision is the closeness with which repeated observations agree with one another. It has nothing to do with their relationship to the true value: observations may have high precision but low accuracy;
    • Numerical precision is the number of significant digits in which an observation is recorded; it has become far more obvious with the advent of computers.
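
The difference between these aspects can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the measurement values, the true value of 100.0, and the function names are hypothetical, accuracy is summarized as the mean deviation from the true value, statistical precision as the sample standard deviation, and numerical precision is approximated by the number of decimal places kept when a value is recorded.

```python
import statistics


def accuracy_error(observations, true_value):
    """Mean signed deviation of the observations from the accepted true value."""
    return statistics.fmean(observations) - true_value


def statistical_precision(observations):
    """Sample standard deviation: how closely repeated observations agree with each other."""
    return statistics.stdev(observations)


def record_with_decimals(value, decimals):
    """Store a value with a limited number of decimal places (a stand-in for numerical precision)."""
    return round(value, decimals)


# Hypothetical repeated measurements of a quantity assumed to be exactly 100.0.
true_value = 100.0
tight_but_biased = [104.9, 105.0, 105.1, 105.0]       # high precision, low accuracy
scattered_but_centered = [95.0, 105.0, 98.0, 102.0]   # low precision, high accuracy

for name, obs in [("tight_but_biased", tight_but_biased),
                  ("scattered_but_centered", scattered_but_centered)]:
    print(name,
          "accuracy error:", accuracy_error(obs, true_value),
          "spread:", statistical_precision(obs))

# Recording the first series with zero decimal places hides its small variations.
coarse = [record_with_decimals(x, 0) for x in tight_but_biased]
print("recorded with 0 decimals:", coarse)
```

The first series agrees closely with itself but sits far from the true value, while the second is centered on the true value but scattered. Recording with too few digits removes the variation in the first series altogether, so a data certification step would have to look at all three measures and not at one alone.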

