The Data Factory is the technical foundation that supports our reporting and dashboarding process.
A Data Factory is a way of organising data sourcing, treatment, and reporting within an organisational unit such as a PMO, with the aim of providing the right information to the right people at the right time.
It covers organisation, governance, tools, and data management.
1. Sourcing:
The first component of a Data Factory concerns the sourcing of data that resides in a variety of heterogeneous systems and technologies. While most systems contain structured data, an increasing proportion of data resides in an unstructured format (anyone using email?). This sourcing has a few characteristics:
- The accountability for systems remains unchanged. Each source system is represented by its system operator, who plays the role of SPOC (Single Point of Contact).
- SPOCs are asked to send the output of these systems periodically. Only useful data are transmitted.
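As an illustration of "only useful data are transmitted", here is a minimal sketch of a periodic extract that is reduced to an agreed list of fields before it leaves the source system. The `SourceExtract` structure, the `build_extract` helper, and all field names are hypothetical and are not taken from an actual Data Factory implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourceExtract:
    """One periodic delivery from a source system (hypothetical structure)."""
    system: str   # name of the source system
    spoc: str     # Single Point of Contact accountable for the system
    period: date  # reporting period the extract covers
    rows: tuple   # records reduced to the agreed fields


def build_extract(system, spoc, period, raw_rows, agreed_fields):
    """Keep only the agreed fields of each raw record."""
    rows = tuple({f: row.get(f) for f in agreed_fields} for row in raw_rows)
    return SourceExtract(system, spoc, period, rows)


# Example data is invented for illustration.
raw = [{"project_id": "P1", "budget": 100, "internal_note": "draft"}]
extract = build_extract("finance", "alice", date(2024, 1, 31), raw,
                        agreed_fields=("project_id", "budget"))
print(extract.rows)
```

The extract carries the SPOC's name with it, so that any later question about the data can be routed back to the person accountable for the source.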
2. Datafarm:
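- In the "Datafarm", data are connected in a "star" architecture, which reduces the number of reconciliations: each source is reconciled against the central hub rather than against every other source.
- Data are historized (e.g. through monthly archiving).
- Enriched data are treated as regular data.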
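The two ideas above, a star layout and monthly historization, can be sketched as follows. This is a simplified illustration under assumptions: the `Datafarm` class, the `project_id` key, and the in-memory storage are hypothetical, and a real Datafarm would normally sit on a database.

```python
import copy


def pairwise_links(n_sources):
    """Reconciliations needed if every source is compared with every other."""
    return n_sources * (n_sources - 1) // 2


def star_links(n_sources):
    """Reconciliations needed if every source is compared with one hub."""
    return n_sources


class Datafarm:
    """Hypothetical in-memory hub keyed by a shared business key."""

    def __init__(self):
        self.current = {}  # business key -> {source name: record}
        self.history = {}  # period label -> snapshot of `current`

    def load(self, source, rows, key="project_id"):
        """Attach each record of a source to the hub entry for its key."""
        for row in rows:
            self.current.setdefault(row[key], {})[source] = row

    def archive(self, period):
        """Historize the current state, e.g. once a month."""
        self.history[period] = copy.deepcopy(self.current)


farm = Datafarm()
farm.load("finance", [{"project_id": "P1", "budget": 100}])
farm.archive("2024-01")
farm.load("finance", [{"project_id": "P1", "budget": 120}])
print(pairwise_links(6), star_links(6))
```

With six sources, the pairwise approach needs 15 reconciliations and the star approach 6, and the gap widens as sources are added. The archived snapshot keeps the January budget even after the February load overwrites the current value.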
3. Data Quality:
- A DQ (Data Quality) process identifies gaps and inconsistencies between different sources and suggests corrections to the SPOCs responsible for transmitting the data. We use 4 different levels of Data Quality and have developed a catalog of several hundred Data Quality checks across various domains.
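A cross-source check of this kind might look like the sketch below, which compares one field across sources and reports records that disagree or are missing. The `Finding` structure and the `reconcile` function are hypothetical, and the four Data Quality levels are not modelled here because the text does not define them.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    key: str          # business key of the affected record
    field: str        # field that was compared
    values: dict      # source name -> value seen in that source
    missing_in: list  # sources that have no record for this key


def reconcile(field, sources):
    """Compare one field across sources.

    `sources` maps a source name to {business key: record}.
    """
    all_keys = set()
    for records in sources.values():
        all_keys.update(records)

    findings = []
    for key in sorted(all_keys):
        values = {name: recs[key].get(field)
                  for name, recs in sources.items() if key in recs}
        missing_in = [name for name, recs in sources.items()
                      if key not in recs]
        # Report the key if a source lacks it or the values disagree.
        if missing_in or len(set(values.values())) > 1:
            findings.append(Finding(key, field, values, missing_in))
    return findings


# Example data is invented for illustration.
sources = {
    "finance":  {"P1": {"budget": 100}, "P2": {"budget": 50}},
    "planning": {"P1": {"budget": 120}},
}
findings = reconcile("budget", sources)
for f in findings:
    print(f)
```

Each finding lists the sources involved, which is what allows the suggested correction to be sent to the right SPOC rather than fixed silently in the Datafarm.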
4. Enrichment:
- A process that draws inferences from the existing data and creates new information based on documented models, e.g. alerters, ratings, simulations, forecasts.
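To make the idea of an alerter concrete, here is a minimal sketch of one enrichment rule that derives a budget consumption ratio and an alert flag from existing records. The rule, the 0.9 threshold, and the field names are assumptions chosen for illustration, not a documented model from the Data Factory.

```python
def budget_alerter(rows, threshold=0.9):
    """Return copies of `rows` enriched with a ratio and an alert flag.

    The threshold is a hypothetical parameter; a documented model would
    state where its value comes from.
    """
    enriched = []
    for row in rows:
        # Avoid dividing by zero when no budget has been set.
        ratio = row["spent"] / row["budget"] if row["budget"] else None
        alert = ratio is not None and ratio >= threshold
        enriched.append({**row, "budget_ratio": ratio, "alert": alert})
    return enriched


# Example data is invented for illustration.
projects = [
    {"project_id": "P1", "budget": 100, "spent": 95},
    {"project_id": "P2", "budget": 100, "spent": 40},
    {"project_id": "P3", "budget": 0, "spent": 10},
]
result = budget_alerter(projects)
print([r["project_id"] for r in result if r["alert"]])
```

The function returns new records and leaves the input untouched, so the enriched rows can be loaded alongside the sourced ones and handled in the same way.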
5. Industrial Reporting:
An industrial reporting environment produces:
- Single-source reports, used to validate with the SPOC that the data and statistics are correct
- Reports mixing different sources, which are endorsed by the SPOCs involved
- Standardized industrial reports, produced according to an established production schedule
6. Dashboarding:
Dashboarding brings a new dimension to reporting (without, however, calling into question the value of traditional reporting) by allowing further analysis across different dimensions and scopes of a specific problem. We use the most advanced dashboarding tools, including QlikView.
