[ON APRIL 4 THE CONFERENCE ROOM IN UTRECHT IS FULLY BOOKED. YOU CAN STILL JOIN ONLINE. AT ALL TIME SLOTS 1 SESSION IS ENGLISH SPOKEN.
WORKSHOPS ON APRIL 5 ARE IN-PERSON IN UTRECHT, ON APRIL 19 VIRTUAL DELIVERY.]
This session discusses the data lakehouse, which is the new kid on the block in the world of data architectures. In a nutshell, the data lakehouse is a combination of a data warehouse and a data lake. In other words, this architecture is developed to support a typical data warehouse workload plus a data lake workload. It holds structured, semi-structured, and unstructured data. Technically, in a data lakehouse the data is stored in files that can be accessed by any type of tool and database server. The data is not kept hostage by a specific database server. SQL engines are also able to access that data efficiently for more traditional business intelligence workloads. And data scientists can create their descriptive and prescriptive models directly on the data.
It makes a lot of sense to combine these two worlds, because they are sharing the same data and they are sharing logic. But is this really possible? Is this all too good to be true? This session discusses various aspects of data warehouses and data lakes to determine if the data lakehouse is a marketing hype or whether this is really a valuable and realistic new data architecture.
In this session we will touch on the most used models, how to apply them in context, and the need to choose a model fitting the rhythm and purpose of your data.
A lot of discussions are going around about what would be the best data model. Gurus fall over each other to prove their points. Depending on their background, IT professionals might not even know that there are several types of models available. Due to the focus on data science, the models behind it are ignored or overlooked. Which model you choose has implications for your applications down the line. So, you need to choose a model fitting the purpose of your data, through the life cycle of the data.
Why is it relevant now? GDPR, the new EU Data Act and the AI Act ask organisations to know what data they have, for what purpose, when they started collecting the data and how long they are going to keep it. And those are just the minimal demands. In a lot of industries data is recorded as an asset on the balance sheet. Data has moved from a supporting role to the main act.
A data model determines how data elements relate to each other, whether data can be combined, and how it can be retrieved. When designing a system, it is often overlooked that data is not just saved for registration purposes, but also for analysis.
With the rise of Data Science, Artificial Intelligence and Machine Learning, data quality is not a minor concern anymore. There are no people interpreting the registered data. Data is interpreted on a large scale by (mathematical) models in computers. Computers are used for what they do best: calculation. If data is stored in the wrong way, the models will give wrong results, often with disastrous consequences.
The main types of models, relational, dimensional, ensemble and graph are explained.
The focus when choosing a model is: which concern of the organisation needs to be addressed. Is it important to register as accurately as possible what happened in interactions with a customer? Is it necessary to look back at how business was run to improve? Or should we look forward to the future, based on historical data?
Based on the objectives, we discuss the advantages and disadvantages of each type of model. There is no ‘one size fits all’ in data modelling. Choices made during development have long-term consequences for the possible applications of the data. These concerns have implications for the data architecture. In this session we focus on data models.
Most firms today want to create a high quality, compliant data foundation to support multiple analytical workloads. A rapidly emerging approach to building this is to create DataOps pipelines that produce reusable data products. However, there needs to be somewhere where these data products can be made available so that data can be shared. The solution is a data marketplace where ready-made, high quality data products can be published for others to consume and use. This session looks at what a data marketplace is, how to build one and how you can use it to govern data sharing across the enterprise and beyond. It also looks at what is needed to operate a data marketplace and the trend to become a marketplace for both data and analytical products.
Data management encompasses a broad spectrum of dimensions and focal areas which come to play when an organization needs to adapt to an environment that becomes more and more digital. But where to start? And how do you develop a successful strategy that will be adopted throughout the organization?
Evides has been rolling out a data management program for 5 years, to grow towards a data conscious and mature data driven company. In retrospect, a few lessons proved to be of key importance. And with that, the future perspective becomes clearer every day. From the perspective of people, process, technology and organization a few insights will be further elaborated:
Since Google announced its Knowledge Graph solution in 2012, the paradigm has found its way into many real-world use cases. These are mostly in the analytics space. The graph database market has exploded over the last 10 years with at least 50 brand names today. International Standardization is coming – very soon SQL will be extended by functionality for property graph queries. A full international standard for property graphs, called GQL, will surface in late 2023.
The inclusion of graph technology dramatically enlarges the scope of analytics by enabling semi-structured information, semantic sources such as ontologies and taxonomies, social networks as well as schema-less sources of data. At the same time graph databases are much better suited for complex multi-join queries that analyze large networks of data, opening the way for advanced fraud detection and similar applications. The Panama Papers is the best-known example. Finally, graph theory is a mathematical discipline with a long history, which among other things has produced graph algorithms for many complex analytics, such as clustering, shortest path, PageRank, centrality and much more.
This presentation will cover what a Knowledge Graph is, how it is different and yet complementary to other technologies. Furthermore, Thomas will cover:
It is a non-technical presentation, focusing on business requirements and architecture. More technical information will be covered in the workshop Understanding Graph Technologies on the 5th of April.
The concepts and practices of Data Mesh and Data Fabric are data management’s new hot topics. These contrasting yet complementary technology and organisational approaches promise better data management through the delivery of defined data products and the automation of real time data integration.
But to succeed both depend on getting their Data Quality foundations right. To work, Data Mesh requires high quality, well curated data sets and data products; Data Fabric also relies on high quality, standardised data and metadata which insulates data users from the complexities of multiple systems and platforms.
This session will briefly recap the main concepts and practices of Data Mesh and Data Fabric and consider their implications for Data Quality Management. Will the Mesh and Fabric make Data Quality easier or harder to get right? As a foundational data discipline how should Data Quality principles and practices evolve and adapt to meet the needs of these new trends? What new approaches and practices may be needed? What are the implications for Data Quality practitioners and other data management professionals working in other data disciplines such as Data Governance, Business Intelligence and Data Warehousing?
This session will include:
[On April 5th, Nigel will run a full day workshop on Data Strategy, check the conference schedule for details.]
We come from a world of algorithms and the focus of AI is still largely on optimizing models. In this contribution by Jan Veldsink, we are going to focus on data centricity, putting the data in the centre and turning it into multiple analyses/reports and applications.
Datacentric AI is a form of artificial intelligence that focuses on working with and using data to solve problems. This type of AI typically involves using machine learning algorithms and other techniques to analyze large amounts of data and extract actionable insights from it.
Some key points of datacentric AI are:
This session looks at the emergence of Data Observability and looks at what it is about, what Data Observability can observe, vendors in the market and examples of what vendors are capturing about data. The presentation will also look at Data Observability requirements, the strengths and weaknesses of current offerings, where the gaps are and tool complexity (overlaps, inability to share metadata) from a customer perspective. It will explore the link between Data Observability, data catalogs, data intelligence and the move towards augmented data governance and discuss how Data Observability and data intelligence can be used in a real-time automated Data Governance Action Framework to govern data across multiple tools and data stores in next generation Data governance.
After a brief introduction of Vektis, Herman Bennema will use practical examples to show what kind of information can be extracted from the claims data (with a total value of over 850 billion euros) managed by Vektis. He will also discuss the legal boundaries within which Vektis operates and outline the dilemma we face in the Netherlands: how do we ensure a sound balance between, on the one hand, the privacy risk when using healthcare data and, on the other, the potential to keep healthcare affordable, accessible and of high quality based on data analysis.
Engaging Stakeholders and Other Mere Mortals
Interest in Data Modelling, especially Concept Modelling (Conceptual Data Modelling) has increased dramatically in recent years. That’s great news, but our modelling can still be improved. When it’s done well, Concept Modelling is a powerful enabler of communication among different stakeholders including senior leaders, subject matter experts, business analysts, solution architects, and others. Unfortunately, the communication often gets lost – in the clouds, in the weeds, or somewhere off to the side. Sometimes the modeller has drifted too quickly into abstraction, sometimes the modeller has taken the famous “deep dive for detail,” but the outcome is the same – confusion, frustration, and detachment. The result – inaccurate, incomplete, or unappreciated models.
It doesn’t have to be this way! Drawing on over 40 years of successful modelling, this session describes core techniques, backed by practical examples, for helping people appreciate, use, and possibly even want to build data models.
Topics include:
[On April 5th, Alec will run a half day workshop on Concept Modelling, check the conference schedule for details.]
In this digital world, it is becoming clear to many organisations that their success or failure depends on how well they manage data. They recognise that data is a critical business asset which should be managed as carefully and actively as all other business assets such as people, finance, products etc. But like any other asset data does not improve itself and will decline in usefulness and value unless actively maintained and enhanced.
For any organisation a critical first step in maintaining and enhancing its data asset is to understand two critical things:
The primary purpose of a data strategy is to answer these two critical questions. For any data driven organisation a data strategy is essential because it serves as a blueprint for prioritising and guiding current and future data improvement activities. Without a data strategy, organisations will inevitably try to enhance their data assets in a piecemeal, disconnected, unfocused way, usually ending in disappointment or even failure. What’s needed is a well crafted and coherent data strategy which sets out a clear direction which all data stakeholders can buy into. And as the famous US baseball player Yogi Berra once said, “If you don’t know where you are going, you’ll end up somewhere else.”
This seminar will teach you how to produce a workable and achievable data strategy and supporting roadmap and plan, and how to ensure that it becomes a living and agile blueprint for change.
The seminar
In this full day seminar Nigel Turner will outline how to create and implement a data strategy. This includes:
The seminar will take you through a simple and proven four step process to develop a data strategy. It will also include practical exercises to help participants apply the approach before doing it for real back in their own organisations, as well as highlighting some real world case studies where the approach has been successful.
Learning Objectives
Since Google announced its Knowledge Graph solution in 2012 graph database technologies have found their way into many organizations and companies. The graph database market has exploded over the last 10 years with at least 50 brand names today. International Standardization is coming – very soon SQL will be extended by functionality for property graph queries. A full international standard for property graphs, called GQL, will surface in late 2023 (from the same ISO committee that maintains the SQL standard).
Graph databases are generally quite easy to understand – the paradigm is intuitive and seems straightforward. In spite of that, the breadth and power of the solutions one can create are impressive. The inclusion of graph technology dramatically enlarges the scope of analytics by enabling semi-structured information, semantic sources such as ontologies and taxonomies, social networks as well as schema-less sources of data.
At the same time graph databases are much better suited for complex multi-join queries that analyze large networks of data, opening the way for advanced fraud detection and similar applications. The Panama Papers is the best-known example.
Finally, graph theory is a mathematical discipline with a long history, which among other things has produced graph algorithms for many complex analytics, such as clustering, shortest path, PageRank, centrality and much more.
Learning Objectives
Who is it for?
Although code examples (in graph database query languages) will be used frequently, the audience is not expected to be proficient database developers (but even SQL experts will benefit from the workshop).
Workshop Course Outline
It is a somewhat technical workshop, focusing on what and how, using examples. Business and architectural level information can be found in the knowledge graph session at the DW&BI Summit on April 4th.
Prefer online? Join the live video stream!
You can join us in Utrecht, The Netherlands or online. Delegates also gain four months access to the conference recordings so there’s no need to miss out on any session that we run in parallel.
Payment by credit card is also available. Please mention this in the Comment-field upon registration and find further instructions for credit card payment on our customer service page.
“Longer sessions created room for more depth and dialogue. That is what I appreciate about this summit.”
“Inspiring summit with excellent speakers, covering the topics well and from different angles. Organization and venue: very good!”
“Inspiring and well-organized conference. Present-day topics with many practical guidelines, best practices and do's and don'ts regarding information architecture such as big data, data lakes, data virtualisation and a logical data warehouse.”
“A fun event and you learn a lot!”
“As a BI Consultant I feel inspired to recommend this conference to everyone looking for practical tools to implement a long term BI Customer Service.”
“Very good, as usual!”