OSSIO Project
The OSSIO project provides a comprehensive dataset that captures the intricate network of software dependencies underlying open source software development. Derived from detailed microdata on developers, packages, and inter-package dependencies, the dataset maps the global flows of code contributions. This resource enables rigorous analysis of digital value chains and the economic geography of software production.
Data
The OSSIO dataset is available for download in various disaggregations:
Research
- Code without Borders? Global Value Chains in Open Source Software Development
Authors: Aaron Lohmann, Gábor Békés, Julian Hinz, and Miklós Koren
This paper investigates the global network of open source software production by constructing comprehensive OSS input-output tables. Leveraging granular data on developer contributions and inter-package dependencies, we reveal that software production is concentrated among a few dominant countries, yet significant knowledge flows occur between diverse regions. Our findings highlight the role of social connectedness in facilitating digital trade and provide new insights into the economic geography of open source ecosystems.
- When dispersed teams are more successful: Theory and evidence from software
Authors: Gábor Békés, Julian Hinz, Miklós Koren, and Aaron Lohmann
Who works with whom when collaboration is not constrained by the need to be physically co-located? How does the success of projects depend on the geography of partners? We build a model of global team formation and collaboration. In the model, people have heterogeneous and partially observable skills, collaboration is costly across locations, and success depends on the best idea from a team. The model yields five testable implications: (I) collaboration is more likely among nearby persons; (II) there is selection, in that only highly skilled distant people form teams; (III) geographically diverse teams are more successful; (IV) for more complex projects, the diversity-success elasticity is higher; and (V) there is a non-linear relationship between team size and project quality. We use a highly granular dataset of open-source software development to test these predictions, with success linked to downstream usage of projects. The advantage of open-source software, next to the near-total transparency of its production process, is the radical lack of physical needs, which allows us to focus on interpersonal aspects only. Results should generalize to other dispersed knowledge-based activities.
- Bugs
Authors: Miklós Koren, Gábor Békés, and Julian Hinz
How can open source software be of high quality if it is produced by volunteers? What are the incentives to produce high quality software? And how does the market for software, including OSS, shape these incentives? We develop a model of software production and quality assurance. We use data from GitHub, the largest platform for sharing OSS, to validate the main predictions of the model and to gain insights about the magnitude of the effects.