Welcome to the Chair of Systems Design

Our research is best described as data-driven modeling of complex systems, with particular emphasis on social, socio-technical, and socio-economic systems. We are a truly interdisciplinary team of about 15 people from various disciplines (statistical physics, applied mathematics, computer science, social science, engineering). And, yes, we do all the cool stuff, from big data analysis to multilayer network models, from social software engineering to predictions of scientific success - not to forget our research on polarization in political systems, cooperation in animal societies, and life cycles of R&D networks. Just click through our publications, funded projects, teaching, or media coverage.

Open Doctoral Positions

We welcome applications for two open doctoral positions in the context of data-driven modeling of social systems. We offer excellent working conditions in a lively interdisciplinary team as well as a competitive salary.

More information on these positions and how to apply is available here.

more»

From Relational Data to Graphs

How can we quantify the significance of links in relational data?

In this short paper, we propose a new statistical modeling framework to address this challenge. It builds on generalized hypergeometric ensembles, a class of generative stochastic models that give rise to analytically tractable probability spaces of directed, multi-edge graphs. We show how this framework can be used to assess the significance of links in noisy relational data. We illustrate our method in two data sets capturing spatio-temporal proximity relations between actors in a social system. The results show that our analytical framework provides a new approach to infer significant links from relational data, with interesting perspectives for the mining of data on social systems.
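
The idea of testing a link against a hypergeometric null model can be illustrated with a minimal sketch. It is not the framework's implementation and it makes simplifying assumptions that are not stated above: the number of possible edges between i and j is taken to be the product of i's out-degree and j's in-degree, self-loops are allowed, and the observed edges are treated as draws without replacement from all possible edges. The function names are hypothetical.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric random variable X."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def link_p_values(edge_counts):
    """Map each observed (source, target) pair to a p-value.

    edge_counts: dict mapping (source, target) -> observed multi-edge count.
    Assumption: possible edges for a pair = out-degree(source) * in-degree(target).
    """
    out_deg, in_deg = {}, {}
    for (src, dst), count in edge_counts.items():
        out_deg[src] = out_deg.get(src, 0) + count
        in_deg[dst] = in_deg.get(dst, 0) + count
    m = sum(edge_counts.values())
    # With self-loops allowed, the possible edges sum to m * m over all pairs.
    population = m * m
    return {
        (src, dst): hypergeom_tail(count, population, out_deg[src] * in_deg[dst], m)
        for (src, dst), count in edge_counts.items()
    }


# Toy data: one pair interacts far more often than the others.
observed = {("a", "b"): 5, ("a", "c"): 1, ("b", "c"): 1, ("c", "a"): 1}
p_values = link_p_values(observed)
```

A small p-value means the pair interacts more often than the degrees alone would suggest. In practice the threshold and the correction for multiple testing are separate choices.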

more»

How do firms select their partners for R&D collaborations?

In our recent preprint, we perform a large-scale analysis of R&D networks using a data-driven modeling approach. We monitor the selection of partners for R&D collaborations of firms both empirically, by analyzing a large data set of R&D alliances over 25 years, and theoretically, by utilizing an agent-based model of alliance formation. Using the weighted k-core decomposition method, we derive a centrality-based career path for each firm, and by analyzing coreness differences between firms and their partners, we identify a change in the way firms select partners.

We use the agent-based model to test whether this change in behavior can be attributed to strategic considerations, and we find that the observed behavior can be reproduced well without such considerations. In this way, we challenge the role of strategies in explaining macro patterns of collaborations.
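
To make the notion of coreness concrete, the following sketch computes core numbers by repeatedly peeling off the node of lowest remaining degree. Note the simplification: it is the standard unweighted k-core decomposition, not the weighted variant used in the preprint, and the toy graph and function name are purely illustrative.

```python
def core_numbers(adj):
    """Return the core number of every node.

    adj: dict mapping node -> set of neighbours (undirected, no self-loops).
    Unweighted peeling algorithm; the weighted variant is not implemented here.
    """
    degree = {node: len(neighbours) for node, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree.
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adj[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Toy alliance network: a triangle of firms plus one peripheral partner.
alliances = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
coreness = core_numbers(alliances)

# Coreness difference between each firm and each of its partners.
differences = {
    (firm, partner): coreness[firm] - coreness[partner]
    for firm, partners in alliances.items()
    for partner in partners
}
```

Tracking a firm's core number over successive time windows would give one possible reading of a "career path", but how the windows are chosen is not specified here.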

more»

Paper on Twitter presented at the IPP conference

The preprint related to our presentation at The Internet, Policy & Politics Conference in Oxford is available at SocArXiv.

Check the online visualization of our results.

more»

Invited Talk at ICSE 2016

We are proud to announce that our article From Aristotle to Ringelmann: a large-scale analysis of team productivity and coordination in Open Source Software projects was invited as a Journal-First contribution for a talk at ICSE 2016, the world's premier software engineering venue. In our work, which is part of our research line on social software engineering, we use data science techniques to provide quantitative evidence for Brooks' law in Open Source communities. On May 18, 2016, Ingo Scholtes will present our results in the main research track of ICSE 2016 in Austin, TX, USA.

more»

Data-driven modeling of collaboration networks: A cross-domain analysis

How do economic actors or scientists choose their collaboration partners? On the one hand, one would argue that scientists as decision makers are quite different from firms. On the other hand, in order to reproduce a macroscopic structure such as a collaboration network, we may not need to include all the microscopic details that distinguish economic from social agents.

In our recent preprint, we adopt a data-driven modeling approach to calibrate and validate a previously proposed agent-based model that abstracts from these microscopic details to capture only the essential features of the decision-making process. The model is characterized by five parameters, which relate to strategies adopted by economic actors or scientists when choosing their collaboration partners. Our results shed new light on the long-standing question about the role of endogenous and exogenous factors in the formation of collaboration networks.

more»

When is a network a network?

Graph- and network-analytic methods are widely applied to data which capture relations between elements. Despite this popularity, we still lack principled methods to decide when network abstractions are justified and when not.

A new data mining framework developed at our chair can be used to answer the question of when it is justified to make a network abstraction of sequential data on pathways and temporal networks. Building on principled model selection and statistical inference techniques, it further allows us to infer optimal higher-order network models, which capture both temporal and topological characteristics of sequential data.

The methods proposed in this work have been implemented in the open-source Python package pathpy, which is available on GitHub.
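
The difference between a first-order and a higher-order network abstraction can be seen in a small plain-Python sketch. It does not use the pathpy API and does not perform the model selection step; it only counts links between k-th order nodes (tuples of k consecutive path elements) in a few hypothetical pathways.

```python
from collections import Counter


def kth_order_edges(paths, k):
    """Count links between k-th order nodes in a collection of paths.

    A k-th order node is a tuple of k consecutive nodes; a link connects two
    such tuples that overlap in k - 1 nodes along an observed path.
    """
    counts = Counter()
    for path in paths:
        for i in range(len(path) - k):
            window = tuple(path[i:i + k + 1])
            counts[(window[:-1], window[1:])] += 1
    return counts


# Hypothetical pathways: whoever arrives at c from a continues to d,
# whoever arrives from b continues to e.
paths = [("a", "c", "d"), ("a", "c", "d"), ("b", "c", "e")]

first_order = kth_order_edges(paths, 1)
second_order = kth_order_edges(paths, 2)
```

In the first-order network, the links a-c and c-e both exist, which suggests a path a-c-e that never occurs in the data. The second-order network has no link from (a, c) to (c, e), so it retains this ordering information.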

more»

What we miss in network analysis

The analysis of relational data from a graph or network perspective has become a cornerstone of data mining. However, in a number of recent works we have shown that, for data sets where additional information such as the timing or ordering of relations is available, the network perspective can yield wrong results. In our latest work, published in the European Physical Journal B, we now offer a solution, namely the analysis of higher-order networks. We specifically show that (i) this promising abstraction allows us to generalize common path-based centrality measures to higher-order centralities, and that (ii) these higher-order measures better capture the real importance of nodes in time-evolving network topologies.

more»

Talk at SIAM Workshop

Dr. Rebekka Burkholz will attend the SIAM Workshop on Network Science, which takes place in Pittsburgh on 13-14 July. She will present a framework for cascade size calculations on random networks.

more»

Multiplex Network Regression: How do relations drive interactions?

Ever wondered how to apply regression to Multiplex Networks? In our preprint we introduce a new statistical method to investigate the impact of dyadic relations on complex networks generated from repeated interactions. The method is based on generalised hypergeometric ensembles (gHypEs), a class of statistical network ensembles we have developed recently.

We represent different types of known relations between system elements by weighted graphs, separated in the different layers of a multiplex network. With our method we can regress the influence of each relational layer, the independent variables, on the interaction counts, the dependent variables. Moreover, we can test the statistical significance of the relations as explanatory variables for the observed interactions.

more»

Quantifying and suppressing ranking bias

Every day, scholars and online users explore available knowledge using recommender systems based on ranking algorithms. This challenges us to design more sophisticated filtering and ranking procedures to avoid biases that can systematically hide relevant content.

In this work, we tackle this issue by quantifying and suppressing biases of indicators of scientific impact. We use a large citation dataset from Microsoft Academic Graph and a new statistical framework based on the Mahalanobis distance to show that the rankings by well-known indicators, including relative citation count and Google's PageRank score, are significantly biased by paper field and age. We propose a general normalization procedure motivated by the z-score, which produces much less biased rankings when applied to citation count and PageRank score.
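
A z-score style normalization can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the paper: papers are grouped by exact (field, year) pairs, the population standard deviation is used, and all records and names are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_group(papers):
    """Rescale each paper's score relative to papers of the same field and year.

    papers: list of dicts with the keys 'id', 'field', 'year', 'score'.
    Assumption: the reference group is the exact (field, year) pair.
    """
    groups = defaultdict(list)
    for paper in papers:
        groups[(paper["field"], paper["year"])].append(paper["score"])

    rescaled = {}
    for paper in papers:
        scores = groups[(paper["field"], paper["year"])]
        mu, sigma = mean(scores), pstdev(scores)
        # A group without variation carries no ranking information.
        rescaled[paper["id"]] = (paper["score"] - mu) / sigma if sigma > 0 else 0.0
    return rescaled


# Made-up records: citation counts differ by an order of magnitude across fields.
papers = [
    {"id": "m1", "field": "math", "year": 2000, "score": 1},
    {"id": "m2", "field": "math", "year": 2000, "score": 3},
    {"id": "b1", "field": "bio", "year": 2000, "score": 10},
    {"id": "b2", "field": "bio", "year": 2000, "score": 30},
]
rescaled = zscore_by_group(papers)
```

After rescaling, the top paper of each field receives the same score, although their raw citation counts differ by a factor of ten.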

more»

Generalized Hypergeometric Ensembles

In this paper, we introduce an ab initio class of statistical network ensembles based on a simple generative model of complex networks. We show that this class of ensembles provides a powerful framework for model selection in complex networks and a new approach to test the statistical significance of community structures. The latest version of our paper "Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks" is available on arXiv.

more»

How many developers does it take to complete a project?

more»