The project is an unprecedented collaboration between principal scientists from Eindhoven University of Technology, Leiden University, University of Amsterdam, Radboud University Nijmegen, Tilburg University, VU University, Academic Medical Center, VU Medical Center, Leiden University Medical Center, Delft University of Technology, and CWI (National Research Institute for Mathematics and Computer Science).
The groups involved cover the broader data science spectrum, including disciplines like statistics, data/process mining, machine learning, visualization, law, humanities, ethics, natural language processing, information retrieval, and security. Hence, the team is well equipped to take on the challenges and is a good representation of the leading data science scholars in The Netherlands.
The strength of the RDS consortium is further indicated by the various grants obtained by its researchers, including 3 ERC grants (2 Advanced: Jacobs, Van der Vaart; 1 Starting: Helberger) and various NWO grants (4 Veni: Lemmens, Grünwald, Van Zanten, Helberger; 3 Vidi: Lemmens, Grünwald, Van Zanten; 8 Vici/Pionier: De Rijke, Grünwald, Heskes, Kappen, Jacobs, Meulman, Van Dijck, Van Zanten; 5 Top grants: Van der Aalst, De Rijke, Van Harmelen, Marchiori, Heskes, Jacobs). Van der Vaart and Vossen received Spinoza awards. Grünwald, Van Zanten, and Van der Vaart are Van Dantzig Laureates. Van der Aalst, Van Dijck, Prins, Van der Vaart, and Meulman are members of the Royal Netherlands Academy of Arts and Sciences (Koninklijke Nederlandse Akademie van Wetenschappen), of which Van Dijck is currently President. Van der Aalst, Van Dijck, Van Harmelen, Vossen, Kok, Prins, and Jacobs are members of the Royal Holland Society of Sciences and Humanities (Koninklijke Hollandsche Maatschappij der Wetenschappen). Van der Aalst, Van Harmelen, and Jacobs are members of the Academia Europaea. Research from the consortium members has also resulted in successful spin-off companies, e.g., Software Improvement Group (2000), MagnaView (2003), Futura Process Intelligence (2009, now part of Lexmark), Bonaparte (2009), Fluxicon (2010), Infotron (2010), SynerScope (2011), Scyfer (2013), 904Labs (2014), and Big4Data (2015).
Academic Medical Center (AMC)
The AMC is one of the leading research institutions and largest hospitals in the Netherlands. Within the AMC, primarily the Medical Informatics department (Klinische InformatieKunde, KIK) will be actively involved in RDS. Primary investigator for RDS is Nicolette de Keizer, Professor and leader of the Evaluation of Healthcare and Healthcare information systems group, manager of the National Intensive Care Evaluation (NICE) registry, and head of the Health Informatics Postgraduate Program. Her group has a long tradition of quality registry-related research, e.g., the development and evaluation of quality indicators, audit and feedback, case-mix adjustment models, and data quality. The majority of this research is performed for the ISO9001-certified NICE registry, one of the most successful quality registries in the Netherlands, covering a network of almost all (90) intensive care units and run by the KIK department. Other research lines include the evaluation of healthcare information systems and semantic interoperability. Under de Keizer’s chair, the International Medical Informatics Association (IMIA) working group on health technology and quality assessments developed guidelines for planning and reporting of health informatics studies. On semantic interoperability, the group cooperates with the Dutch ICT Institute in healthcare (NICTIZ), the Netherlands Federation of University Medical Centers’ program Point-of-care clinical data capture, and international Standard Development Organizations such as IHTSDO and ISO.
Centrum Wiskunde en Informatica (CWI)
CWI is a recognized leader in data science. Its Database Architectures group is focused on advancing the state of the art in large-scale data management software. It has pioneered so-called columnar databases via its award-winning open-source MonetDB system, which is increasingly merging with computational environments like R and NumPy, enabling the solving of Big data problems. The group has a strong track record of integrating data management technologies in other science fields, e.g., by expediting the LOFAR astronomy computation pipeline in R with embedded MonetDB. The group has spawned multiple spin-offs (e.g., VectorWise), and data-intensive industry labs have emerged around it (Actian Amsterdam, Oracle Labs Netherlands).
The Information Access group investigates how users can influence data-driven algorithmic processes, and how to measure their fitness for use in the context of a specific task. The group is currently designing research environments that combine the need for open, transparent, and collaborative science with the need to protect private, copyrighted, or otherwise sensitive research data. Example application domains include the enrichment of political linked data, quantification of bias in historical search tools, and the design of metrics and visualizations to convey limitations of machine learning algorithms to domain experts in a transparent way. Partners include the National Library, Rijksmuseum, NIOD, Beeld en Geluid, spin-off company Spinque, and RDS partners UvA and VU.
The information-theoretic learning group is headed by VIDI-, VICI- and Van Dantzig Laureate Peter Grünwald. He focuses on theoretical research into sequential prediction and better accuracy guarantees for noisy data. He has also been co-chair of COLT and UAI, two of the top-tier conferences in machine learning. The CWI Life Sciences research group has strong links to data and partners from (neuro-)biology and medicine, including the Dutch Cancer Institute (NKI), the Netherlands Institute for Neuroscience (NIN), and RDS partners AMC and VUmc.
Delft University of Technology (TUD)
The Faculty of Technology, Policy and Management (TPM) fosters a co-operative relationship between philosophy, the social sciences and the exact sciences/technology, in order to make a significant contribution to sustainable solutions for social problems in which technology plays an important role. Important research themes are “responsible innovation”, “value sensitive design”, “privacy by design”, and “social system interaction”. Jeroen van den Hoven is professor of moral philosophy at TU Delft and was until quite recently founding scientific director of the 3TU.Ethics Centre of Excellence. He is founder and chair of the Program Committee of the Dutch Research Council’s program on Responsible Innovation (Maatschappelijk Verantwoord Innoveren, MVI). He is Editor-in-Chief of Ethics and Information Technology (Springer). Van den Hoven has received several grants for Ethics and IT, and responsible innovation. In 2012 he received the IFIP award for Societal Aspects of ICT and the World Technology Award for Ethics. He is a member of the Ethics Advisory Group of the European Data Protection Supervisor in Brussels (2016-2017), and Chairman of the Expert group Big Data and Privacy of the Dutch Ministry of Economic Affairs (2015-2016). In the project he works closely with Geert Jan Houben, professor of Web Information Systems, scientific director of Delft Data Science (DDS), and Big Data Science Chair of the Royal Netherlands Institute of Engineers, and with Dirk Helbing, professor of Computational Social Science at ETHZ and at TPM, where he heads the program Engineering Social Technologies for a Responsible Digital Future (10 PhD students).
Eindhoven University of Technology (TU/e)
The Architecture of Information Systems (AIS) group chaired by Wil van der Aalst focuses on the interplay between information systems, event data, and processes. The group has a long tradition in the field of model-based process analysis (in particular Petri nets) and played an important role in developing Petri-net-based verification techniques and tools (e.g., Woflan) tailored toward workflow processes. AIS is one of the leading Business Process Management (BPM) groups in the world and developed software systems such as ExSpect, Woflan, CPN Tools, YAWL, Declare, and ProM. YAWL and ProM have become reference points in the larger BPM area, as illustrated by their download numbers: 220.000 (YAWL) and 150.000 (ProM). AIS played a leading role in establishing process mining as a new research discipline that sits between model-based process analysis and data mining.
The Visualization (VIS) group chaired by Van Wijk aims at the development of new methods and techniques to obtain insight into large and complex data sets via interactive visualization. Since the start of the group in 1998, the primary focus has been on Information Visualization, which deals with abstract data like tables, hierarchies, and networks. VIS has achieved a leading position in this field. Its work has led to a large number of new methods and techniques, published in high-quality journals and conferences, and to numerous tools (such as SequoiaView, downloaded over 1.000.000 times) that were adopted and copied by many and led to three spin-off companies (MagnaView, SynerScope,
Leiden University (LU)
Data science research at Leiden is distinguished by its foundational nature, combining Statistics and Computer Science with a focus on principles and methods. Additionally, Leiden has many multidisciplinary collaborations in which data science and data domain knowledge are combined, with an emphasis on scientific data. Leiden has an excellent position: in the CWTS ranking, Mathematics and Computer Science & Engineering ranks 7th globally. In 2014, a large university-wide data science research program was given support by the board of Leiden University. The Leiden Centre of Data Science (LCDS), initiated by the Faculty of Mathematics and Natural Sciences and co-directed by Professor of Applied Statistics Jacqueline Meulman and Professor in Computer Science and Medicine Joost Kok, supports a network of researchers from different scientific domains within the university with expertise in the field of data science. In the past years, this network has organized many successful meetings, supported project applications, and advised on computer infrastructure. There is extensive data science knowledge within Leiden University and the LCDS network. Physics and Astronomy have a long history in data processing, the Computer Science Institute LIACS is built around “data processing and modeling”, and many of the Leiden Innovational and ERC projects have a significant data science component. For his pioneering research in statistics, professor of Stochastics Aad van der Vaart received the Spinoza prize in 2015.
Leiden University Medical Center (LUMC)
LUMC is a modern university medical center for research, education and patient care with a high quality profile and a strong scientific orientation. Its research, ranging from pure fundamental medical research to applied clinical research, places LUMC among the top in the world. This enables LUMC to offer patient care and education that is in line with the latest international insights and standards, and helps it to improve medicine and healthcare both internally and externally. LUMC acts as a knowledge center for topics in the field of public health with an impact on society, has a directive function in the region, and acts as a center for continuing education and further training for medical professionals. Eline Slagboom is LUMC’s professor of Molecular Epidemiology. Her research over the past 10 years has focused on genetic, epigenetic, transcriptomic and metabolomic studies of healthy/unhealthy ageing and longevity in humans. These studies focus on metabolic disease and osteoarthritis and the metabolic component in healthy ageing.
Radboud University Nijmegen (RU)
The sections Data Science and Digital Security of the Institute for Computing and Information Sciences (iCIS) will mainly be involved in RDS. For the second time in a row, iCIS came out on top in the Dutch national research assessment for computer science. The section Data Science, currently headed by VICI laureate prof. Tom Heskes, develops theories and methods to learn from data and reason with it. In recent years, causal discovery, extracting causal relations from (big) data sets, has become a main focus area of the group, leading to various internationally recognized contributions: a best paper award at UAI, one of the top tier conferences in computer science, the Willem R. van Zwet Award for the best Dutch PhD thesis in statistics and operations research, and an NWO EW Top project together with prof. Aad van der Vaart. Prof. Elena Marchiori is a renowned expert in evolutionary computing and machine learning, in recent years focusing on clustering and network analysis. She is the winner of the prestigious EvoStar Award and is involved in a number of multi-disciplinary projects, including an NWO EW Top project on combining game theory and machine learning to design better algorithms for clustering data. The section Data Science collaborates with the Dutch Foundation for Neural Networks (SNN) Adaptive Intelligence, headed by Pionier laureate prof. Bert Kappen, which coordinates research on neural networks and machine learning in the Netherlands through the Machine Learning platform. Kappen is well-known for his research on Bayesian machine learning, stochastic control theory, and computational neuroscience.
The section Digital Security is one of the most prominent computer security groups in the Netherlands and works on a broad range of topics in computer security, including applied cryptography, security protocols, smartcards and RFID, but also security and correctness of software. The group also covers societal aspects of digital security, such as e-voting, road pricing, smart cards in public transport and banking, smart electricity meters, and electronic patient dossiers. Prof. Bart Jacobs is well-known for his contributions to societal debates (in the media, but also in parliament). He is a member of the national Cyber Security Board, which gives strategic advice to the government. The group’s work on Mifare Classic led to international attention and a high-profile court case setting an important legal precedent for security and cryptography research in the Netherlands. Jacobs is a member of the Academia Europaea and recipient of an ERC Advanced Investigator grant. Nationally, he has also held various prestigious grants, including an NWO Top, Pionier (Vici), and KNAW fellowship.
Tilburg University (UvT)
The Tilburg Law School (TLS) and the Tilburg School of Economics and Management (TiSEM) are jointly involved in RDS. Both schools are part of the Data Science Center Tilburg (DSC/t). TLS, chaired by Professor Corien Prins, is ranked second in the “Top International Law School” by the American Social Science Research Network SSRN. TLS hosts a variety of interdisciplinary institutes, including the Tilburg Institute for Law, Technology, and Society (TILT), established in 1994 by Prins. TILT is a top player in regulation of technology and its normative implications. Concrete research topics of TILT include e-government, e-commerce, e-health, trust, technology adoption and legitimacy, privacy, identity management, liability, cyber crime, public security, intellectual property rights, networks and innovation, and governance. TILT has organized and participated in international research projects, such as FP6 project PRIME and NoE FIDIS, FP7 projects PrimeLife, ENDORSE, Virtuoso, Robolaw, A4Cloud, FIStar, and μMole. The institute’s wide composition provides opportunities for cooperation in Big data privacy and data protection, intellectual property, contracts and liability, and law enforcement. TILT is consistently ranked a top institute for research and education by national and international Legal Research and Education Assessment Committees. TILT hosts recent VICI (prof. Bert-Jaap Koops) and VENI (dr. Eleni Kosta) laureates. Prof. Koops has been awarded the Lorentz distinguished Fellowship 2016. According to the 2011-2015 UT Dallas Top 100 of Business School Research Rankings, TiSEM ranks no. 1 in the Netherlands, no. 3 in Europe and no. 32 worldwide. Dr. Lemmens is part of the Department of Marketing, which ranks no. 1 in Europe and no. 13 worldwide. The department has strong expertise in data science and Big data in the domain of customer and marketing analytics, economics, eye tracking and strategy.
Lemmens has been awarded multiple grants (Marie Curie, Veni, Vidi) and prizes (IJRM best paper award, Erasmus Top Talent Researcher Awards) and is a visiting scholar at Harvard Business School.
University of Amsterdam (UvA)
At the UvA, interdisciplinary research in data science, bringing together computer science, humanities, information law, and social science, has successfully resulted in a wealth of grants and funded projects.
José van Dijck (Professor of Comparative Media Studies) has been highly influential in establishing this relatively new research field. In 2015, Communication and Media Studies ranked number 8 in the QS World Rankings. Van Dijck’s research is extensive, encompassing media technologies, digital culture, social media, popularization of science and medicine, and television and culture. She was instrumental in setting up CLARIAH, a consortium of humanities research institutes, which closely links with government organizations and industry across the Netherlands, focusing on tool development, e.g., for automatically enriching text and visual data with annotations. Van Dijck’s research is cross-disciplinary, e.g., the NWO project MediaNow (with De Rijke), which investigates the interactions between developers of search algorithms and professional users of audio-visual archives. In 2015, Van Dijck was elected KNAW president.
The UvA Statistics group coordinates a research program in Mathematical Statistics and is renowned for its nonparametric Bayesian methods. Group members have been awarded prestigious Veni, Vidi and Vici grants. Harry van Zanten (Group Leader) currently holds a Vici grant, received the 2010 Van Dantzig award, the highest Dutch prize in statistics, and is an Elected Fellow of the Institute of Mathematical Statistics. The group has strong ties with other national and international groups in mathematical statistics.
The Informatics Institute at UvA performs curiosity-driven and use-inspired fundamental research in computer science. The 2015 quality assessment (2009-2014) by the VSNU, NWO and KNAW rated Computer Science research at the UvA world-leading in its relevance to society. At the Informatics Institute, Maarten de Rijke leads one of the world’s top academic research groups in information retrieval, focusing on large-scale semantic and self-learning search algorithms. The retrieval and language technology developed by De Rijke’s group is being used internationally and has led to various spin-off initiatives. De Rijke is a Pionier personal innovational research incentives grant laureate. In addition, De Rijke has received grants and awards totaling over 60 M.Euro from Bloomberg, Elsevier, ESF, EU, Microsoft, National Institute for Sound and Vision, NWO, Yahoo, and Yandex. De Rijke is editor-in-chief of the leading journals and book series in information retrieval and director of Amsterdam Data Science.
Max Welling is Professor of Computer Science at UvA and also holds a professorship at the University of California Irvine and a senior fellowship at the Canadian Institute for Advanced Research. As research chair in Machine Learning at UvA, Welling focuses his research on large-scale statistical learning. Welling holds a number of journal editor and board positions and has been a board member of NIPS, the largest conference in machine learning, since 2015. Welling’s research has been awarded multiple grants including those from Google, Facebook, Yahoo, NSF (career grant), NIH, NWO and ONR-MURI. Welling is the director of the Master’s program in artificial intelligence at the UvA and co-directs the Qualcomm-UvA deep learning lab.
Natali Helberger is professor of Information Law at the Institute for Information Law (IViR). Her research focuses on the interface between technology and information law, user rights, and the changing role of the user in information law and policy. Her interest in multi-disciplinary work is another characteristic feature and has resulted in various fruitful research cooperations with technologists, economists, and communication and political scientists, as well as an ERC grant.
Vrije Universiteit Amsterdam (VU)
The Knowledge Representation and Reasoning (KRR) group, within Computer Science, is led by Professor Frank van Harmelen. The 2015 Dutch National research assessment rated Computer Science at the VU world-leading in its relevance to society. Since 2000, van Harmelen’s group has played a leading role in Semantic Web development. He was co-PI on the first European Semantic Web project, laying the foundations for the Web Ontology Language OWL, now a worldwide standard. He was one of the architects of Sesame, an RDF storage and retrieval engine, which is now in wide academic and industrial use (200,000+ downloads) and received a prestigious 10-year impact award. He is scientific director of the VU Network Institute, a collaboration of over 150 researchers across research domains. His group has close connections with universities in China, including Wuhan, where he is a guest professor. The interests of Annette ten Teije (associate professor) include approximate reasoning and formalization of medical knowledge. Ten Teije has close links with experts at the VUmc through previous collaborative projects, which will be strengthened in this project through jointly supervised PhD positions. Within the same department, Hajo Reijers leads the Business Informatics group and focuses on business process management, workflow technology, business process improvement, and conceptual modeling. He cooperates closely with companies from the services and healthcare domains (including AMC), is a part-time Professor in the group of Van der Aalst at the TU/e, and is well connected to various international scholars.
Mathisca de Gunst coordinates the Department of Mathematics’ Statistics for Life Sciences group, which is a co-operation with the UvA Institute for Mathematics. Her focus is on stochastic modeling and statistical analysis of biological processes, and specifically on the development, assessment and application of statistical models and tools in the area of life sciences.
Piek Vossen (2013 Spinoza laureate) leads the Computational Lexicology and Terminology Lab (CLTL). CLTL focuses on modeling the understanding of language by computers, and this research has successfully developed technology for various languages, e.g., NewsReader and Open Dutch Wordnet. CLTL members lead the Global WordNet Association and develop the Collaborative Interlingual Index, facilitating interoperability between wordnets. Since its founding (2009), CLTL has acquired 4.3 M.Euro external funding, encompassing EU and NWO projects, the Spinoza award, and the EnlightenYourResearch prize (2013, Vossen). CLTL closely collaborates with other groups within VU and across Europe, e.g., Fondazione Bruno Kessler (Italy) and the University of the Basque Country (Spain).
VU University Medical Center (VUmc)
VUmc encompasses patient care, excellence in higher education, and ground-breaking research. In RDS, Marike van der Leeden and Mark van de Wiel will be involved. Marike van der Leeden’s research is focused on rehabilitation in rheumatic conditions and oncology and she is based within the multidisciplinary research Institute for Health and Care Research (EMGO) and the department of Rehabilitation Medicine and Reade (Center for Rehabilitation and Rheumatology). EMGO’s key mission is to generate, conduct and publish excellent research of international standing to improve public and occupational health, mental health, primary care, rehabilitation and long-term care. Mark van de Wiel is chair of the Statistics for Genomics unit, Department of Epidemiology and Biostatistics (VUmc) and Department of Mathematics (VU). As group leader, he is responsible for embedding biostatistical research in the field of translational genomics, requiring rapid adaptation of methods to new questions, new types of data and increased availability of large public genomic data sets. Since 2001, he has developed a plethora of statistical methods for data analysis in collaboration with researchers from several VUmc departments (Pathology, Medical Oncology, Head-and-neck surgery) and statisticians from all over the globe (Oslo, Cambridge UK, Copenhagen, Seoul). His group is one of the leading groups in The Netherlands in the “-omics” data science field, best exemplified by >30.000 downloads of their open source software packages per year (May 2015 – April 2016). Since 2010, Van de Wiel has received 2.0 M.Euro external funding for various projects. Wessel van Wieringen (VUmc) is a long-standing collaborator of Van de Wiel, and an expert in estimating molecular networks from Big data. Both contribute to the thematic area responsible health (A2) of RDS.
To ensure that data science is sustainable in the long term, we propose to develop approaches, techniques, software, and infrastructures that support green data science. This requires the interplay of key disciplines, e.g., data/process mining, digital humanities, ethics, information retrieval, knowledge representation, law, machine learning, natural language processing, security, statistics, and visualization. The figure below shows the disciplines and the key people involved. The diagram should not be taken literally, as several disciplines overlap and several researchers work in multiple scientific disciplines, e.g., in statistics and medicine. Some researchers/groups specialize in a specific domain, e.g., Aurélie Lemmens in marketing, Corien Prins in e-government, and Nicolette de Keizer in medical informatics. The involvement of domain experts is essential for the four thematic areas covered by RDS: science, health, business, and government. In each subproject, people from multiple organizations and/or disciplines are involved. PhDs will be jointly supervised by experts from different groups and institutions. Moreover, teams work together in the (sub)tracks and thematic areas. This ensures that we exploit the unique competencies of our team and maximally stimulate the exchange of ideas across disciplines and organizations.