Welcome to the Chair of Systems Design
Our research is best described as data-driven modeling of complex systems, with particular emphasis on social, socio-technical, and socio-economic systems. We are a truly interdisciplinary team of about 15 people from various disciplines (statistical physics, applied mathematics, computer science, social science, engineering). And, yes, we do all the cool stuff, from big data analysis to multilayer network models, from social software engineering to predictions of scientific success - not to forget our research on polarization in political systems, cooperation in animal societies, and life cycles of R&D networks. Just click through our publications, funded projects, teaching or media coverage.
How can we quantify the significance of links in relational data?
In this short paper, we propose a new statistical modeling framework to address this challenge. It builds on generalized hypergeometric ensembles, a class of generative stochastic models that give rise to analytically tractable probability spaces of directed, multi-edge graphs. We show how this framework can be used to assess the significance of links in noisy relational data. We illustrate our method in two data sets capturing spatio-temporal proximity relations between actors in a social system. The results show that our analytical framework provides a new approach to infer significant links from relational data, with interesting perspectives for the mining of data on social systems.
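To give a flavour of the idea, here is a minimal sketch and not the implementation from the paper. It assumes the simplest case, in which the number of edges between nodes i and j follows a hypergeometric distribution: m edges are drawn without replacement from an urn containing (out-degree of i) × (in-degree of j) possible placements for each pair. The function name `link_pvalue` and the toy matrix are hypothetical and serve only as illustration.

```python
from math import comb

def link_pvalue(adj, i, j):
    """P(A_ij >= observed weight) under a simplified hypergeometric null model."""
    n = len(adj)
    m = sum(sum(row) for row in adj)             # total number of multi-edges
    k_out = sum(adj[i])                          # out-degree of node i
    k_in = sum(adj[r][j] for r in range(n))      # in-degree of node j
    xi_ij = k_out * k_in                         # possible placements for pair (i, j)
    xi_total = m * m                             # placements summed over all pairs
    observed = adj[i][j]
    upper = min(m, xi_ij)
    # Upper tail of the hypergeometric distribution, computed with exact integers
    tail = sum(
        comb(xi_ij, a) * comb(xi_total - xi_ij, m - a)
        for a in range(observed, upper + 1)
    )
    return tail / comb(xi_total, m)

# Toy directed multi-edge graph (assumed data, not from the paper)
adj = [
    [0, 5, 1],
    [1, 0, 1],
    [1, 1, 0],
]
p_strong = link_pvalue(adj, 0, 1)   # link with 5 observed edges
p_weak = link_pvalue(adj, 0, 2)     # link with 1 observed edge
print(p_strong, p_weak)
```

A small p-value indicates that a link carries more edges than the node degrees alone would lead us to expect. Any threshold for calling a link significant is a modeling choice.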
In our recent preprint, we perform a large-scale analysis of R&D networks using a data-driven modeling approach. We monitor how firms select partners for R&D collaborations both empirically, by analyzing a large data set of R&D alliances over 25 years, and theoretically, by utilizing an agent-based model of alliance formation. Using the weighted k-core decomposition method, we derive a centrality-based career path for each firm. By analyzing coreness differences between firms and their partners, we identify a change in the way firms select partners.
We use the agent-based model to test whether this change in behavior can be attributed to strategic considerations, and we find that the observed behavior can be well reproduced without such considerations. This way we challenge the role of strategies in explaining macro patterns of collaborations.
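The coreness differences mentioned above can be illustrated with a small sketch. Note that it swaps in the plain, unweighted k-core decomposition instead of the weighted variant used in the preprint, and that the toy network and the helper names `coreness` and `coreness_gap` are assumptions made for this example.

```python
def coreness(neighbors):
    """Unweighted k-core index of every node in an undirected graph."""
    degree = {v: len(nbrs) for v, nbrs in neighbors.items()}
    remaining = set(neighbors)
    core = {}
    k = 0
    while remaining:
        # Peel off the node with the smallest remaining degree
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in neighbors[v]:
            if u in remaining:
                degree[u] -= 1
    return core

def coreness_gap(neighbors, core, firm):
    """Coreness of a firm minus the mean coreness of its partners."""
    partners = neighbors[firm]
    return core[firm] - sum(core[p] for p in partners) / len(partners)

# Toy alliance network: a triangle A-B-C plus a peripheral firm D linked to A
neighbors = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
core = coreness(neighbors)
gap_d = coreness_gap(neighbors, core, "D")   # negative: D partners with a more central firm
print(core, gap_d)
```

Tracking such gaps over time is one simple way to see whether firms tend to pick partners that are more or less central than themselves.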
We are proud that our KDD 2017 paper on the analysis of time-stamped and sequential network data has been covered by ETH News. Our work casts a critical light on the pervasive use of network analysis methods in various contexts, including infrastructure systems, information systems, and health. We provide a novel data mining framework that allows us to overcome limitations of existing network-based techniques for time series data, improving our ability to model and analyze complex systems.
The article can be found here.
We are happy to announce that our work "When is a network a network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks" has been accepted for publication as a research paper at KDD'17. A short promotional video is available on the KDD YouTube channel.
How do economic actors or scientists choose their collaboration partners? On the one hand, one could argue that scientists as decision makers are quite different from firms. On the other hand, in order to reproduce a macroscopic structure such as a collaboration network, we may not need to include all the microscopic details that distinguish economic from social agents.
In our recent preprint, we adopt a data-driven modeling approach to calibrate and validate a previously proposed agent-based model that abstracts from these microscopic details to capture only the essential features of the decision-making process. The model is characterized by five parameters, which relate to strategies adopted by economic actors or scientists when choosing their collaboration partners. Our results shed new light on the long-standing question about the role of endogenous and exogenous factors in the formation of collaboration networks.
Graph- and network-analytic methods are widely applied to data which capture relations between elements. Despite this popularity, we still lack principled methods to decide when network abstractions are justified and when not.
A new data mining framework developed at our chair can be used to answer the question of when it is justified to make a network abstraction of sequential data on pathways and temporal networks. Building on principled model selection and statistical inference techniques, it further allows us to infer optimal higher-order network models, which capture both temporal and topological characteristics of sequential data.
The methods proposed in this work have been implemented in the open-source Python package pathpy, which is available on GitHub.
The analysis of relational data from a graph or network perspective has become a cornerstone of data mining. However, in a number of recent works we have shown that the network perspective can yield wrong results for data sets where additional information, such as the timing or ordering of relations, is available. In our latest work, published in the European Physical Journal B, we now offer a solution, namely the analysis of higher-order networks. We specifically show that this promising abstraction allows us to (i) generalize common path-based centrality measures to higher-order centralities, and that (ii) these higher-order measures better capture the real importance of nodes in time-evolving network topologies.
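To see why the ordering of relations matters, consider the following minimal sketch. It is a toy construction and uses neither the pathpy API nor the centrality measures from the paper. The path data and the function names `first_order_edges` and `second_order_edges` are assumptions made for this example.

```python
from collections import Counter

# Toy path data: everything arriving at c from a continues to d,
# everything arriving at c from b continues to e.
paths = [
    ("a", "c", "d"),
    ("a", "c", "d"),
    ("b", "c", "e"),
    ("b", "c", "e"),
]

def first_order_edges(paths):
    """Count links between consecutive nodes (the usual network view)."""
    edges = Counter()
    for p in paths:
        for u, v in zip(p, p[1:]):
            edges[(u, v)] += 1
    return edges

def second_order_edges(paths):
    """Count transitions between consecutive links (second-order view)."""
    edges = Counter()
    for p in paths:
        for u, v, w in zip(p, p[1:], p[2:]):
            edges[((u, v), (v, w))] += 1
    return edges

first = first_order_edges(paths)
second = second_order_edges(paths)

# The first-order network contains a->c and c->e, which suggests a route a->c->e.
# The second-order network shows that this route never occurs in the data.
route_in_first = ("a", "c") in first and ("c", "e") in first
route_in_second = (("a", "c"), ("c", "e")) in second
print(route_in_first, route_in_second)
```

Path-based measures computed on the first-order network would count the route from a to e through c, although the data never contain it. Measures defined on the second-order nodes avoid this.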
Ever wondered how to apply regression to Multiplex Networks? In our preprint we introduce a new statistical method to investigate the impact of dyadic relations on complex networks generated from repeated interactions. The method is based on generalised hypergeometric ensembles (gHypEs), a class of statistical network ensembles we have developed recently.
We represent different types of known relations between system elements by weighted graphs, separated in the different layers of a multiplex network. With our method we can regress the influence of each relational layer, the independent variables, on the interaction counts, the dependent variables. Moreover, we can test the statistical significance of the relations as explanatory variables for the observed interactions.