Data Science
Our research addresses a range of problems in data science. We are particularly interested in developing methods for studying large relational datasets, analysing complex systems, and working with collections of time-stamped interactions from various disciplines.
The methods highlighted here range from network science tools to data scraping, inference, and disambiguation.
Related Publications
Empirical Networks are Sparse: Enhancing Multi-Edge Models with Zero-Inflation
arXiv Preprint - 2024
![](https://www.sg.ethz.ch/publications/2024/casiraghi2024sparse/sparsegraph_hued73241113957122c2d9e1459f33e651_2052950_200x200_fill_box_center_3.png)
Disentangling the Timescales of a Complex System: A Bayesian Approach to Temporal Network Analysis
arXiv Preprint - 2024
![](https://www.sg.ethz.ch/publications/2024/casiraghi2024timescales/00_fig1_data_hue76e4387df5de63d7d24698193e75ab6_237618_200x200_fill_box_center_3.png)
Reconstructing signed relations from interaction data
Scientific Reports - 2023
![](https://www.sg.ethz.ch/publications/2022/andres2022reconstructing-signed-relations/results_hu350d3f01aafcc446c667297c85d71654_1880030_200x200_fill_box_center_3.png)
Predicting variable-length paths in networked systems using multi-order generative models
Applied Network Science - 2023
![](https://www.sg.ethz.ch/publications/2020/gote2020predicting-sequences-of/gote2020predicting_hu9e63090a741b9cdf54ec09a012ac5953_22781_200x200_fill_box_center_3.png)
Detecting and Optimising Team Interactions in Software Development
arXiv - 2023
![](https://www.sg.ethz.ch/publications/2023/zingg2023detecting-team-interactions/bccm_networks_hu8cdfac9e18479bc5d92914173deb724f_45524_200x200_fill_box_center_3.png)
Locating Community Smells in Software Development Processes Using Higher-Order Network Centralities
Social Network Analysis and Mining - 2023
![](https://www.sg.ethz.ch/publications/2023/gote2023community-smells/gote2023locating_hu5dff860694b0468241b27a23ce63e576_300373_200x200_fill_box_center_3.png)
Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set
2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) - 2022
![](https://www.sg.ethz.ch/publications/2022/gote2022big_data_big_insights/gote2022big_hub9e7781f98e24918e865c306a722028e_378183_200x200_fill_box_center_3.png)
A network approach to expertise retrieval based on path similarity and credit allocation
Journal of Economic Interaction and Coordination - 2021
![](https://www.sg.ethz.ch/publications/2021/li2021a-network-approach/schema_hu331bf04c43ef8a5add6672ff2cfd3ad1_56684_200x200_fill_box_center_3.png)
gambit - An Open Source Name Disambiguation Tool for Version Control Systems
2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) - 2021
![](https://www.sg.ethz.ch/publications/2021/gote2021gambit--an/gote2021gambit_huf6e8074b4afd2c3c25d2655a7cd9dc07_25336_200x200_fill_box_center_3.png)
The likelihood-ratio test for multi-edge network models
J. Phys. Complex. 2 035012 - 2021
![](https://www.sg.ethz.ch/publications/2021/casiraghi2021the-likelihoodratio-test/distributions_huf603faada49609c0ceb47257391b8350_34703_200x200_fill_box_center_3.png)
HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks
Proceedings of the 2020 SIAM International Conference on Data Mining - 2020
![](https://www.sg.ethz.ch/publications/2020/larock2020hypa-efficient-detection/nanumyan_hu6d8049127d4b5dfac1e349852a68c403_19252_200x200_fill_box_center_3.png)
A Gaussian Process-based Self-Organizing Incremental Neural Network
2019 International Joint Conference on Neural Networks (IJCNN) - 2019
![](https://www.sg.ethz.ch/publications/2019/wang2019a-gaussian-process-based/kdesoinn_hu1964e3c3ecf7e41583360a95b8aed6eb_42360_200x200_fill_box_center_3.png)
Quantifying Triadic Closure in Multi-Edge Social Networks
ACM - 2019
![](https://www.sg.ethz.ch/publications/2019/brandenberger2019quantifying-triadic-closure/triad_hua97c1c0a35d9ae4d43ca1bd2ab12c8e6_85147_200x200_fill_box_center_3.png)
git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories
Proceedings of the 16th International Conference on Mining Software Repositories - 2019
From Relational Data to Graphs: Inferring Significant Links Using Generalized Hypergeometric Ensembles
Social Informatics: 9th International Conference, SocInfo 2017, Oxford, UK, September 13-15, 2017, Proceedings, Part II - 2017
![](https://www.sg.ethz.ch/publications/2017/casiraghi2017from-relational-data/relationaltographs_hueae535ea3f77db0d4b06d134bd669f1f_106778_200x200_fill_box_center_3.png)
Multiplex Network Regression: How do relations drive interactions?
arXiv e-print - 2017
![](https://www.sg.ethz.ch/publications/2017/casiraghi2017multiplex-network-regression/triplex_hu1452ecbfea0e36bff9ad87269693c4b3_246653_200x200_fill_box_center_3.png)