Published January 16, 2024 | Version v1
Dataset Open

Motivation Research Using Labeling Functions - replication

Description

Motivation is an important factor in software development.
However, it is a subjective concept that is hard to quantify and study empirically.
Therefore, it seems that the wealth of data available about real software development projects in repositories such as GitHub cannot be used to study motivation.
We present a new methodology to overcome this difficulty, based on the use of labeling functions.
A labeling function is a validated heuristic, computable on a dataset, that need only perform better than a guess.
We define four labeling functions for motivation (for example, working in diverse hours of the day) and show that they indeed correlate with motivation.
We then apply them to more than 150 thousand developers working on GitHub projects.
This enables us to characterize and compare the behaviors of developers who are motivated or less so.
The results indicate that the effect of motivation is indeed large, and allow us to build a model to predict developer retention in a project.
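
To make the idea of a labeling function concrete, here is a minimal sketch of what a "diverse working hours" heuristic could look like. It is an illustration only, not the code shipped in the replication package: the function names, the input format (a list of commit timestamps), and the `min_hours` threshold are all assumptions introduced here.

```python
from datetime import datetime


def distinct_commit_hours(timestamps):
    """Count the distinct hours of the day (0-23) in which commits were made."""
    return len({ts.hour for ts in timestamps})


def diverse_hours_label(timestamps, min_hours=8):
    """Hypothetical labeling function: a weak signal of motivation.

    Returns True when a developer's commits fall in at least `min_hours`
    distinct hours of the day. The threshold is an assumed value chosen
    for illustration, not one taken from the dataset.
    """
    return distinct_commit_hours(timestamps) >= min_hours


# Example: one developer's commits, made at 09:15, 10:15, 14:15 and 22:15.
commits = [datetime(2023, 5, 1, hour, 15) for hour in (9, 10, 14, 22)]
hours_used = distinct_commit_hours(commits)
label = diverse_hours_label(commits, min_hours=4)
```

Such a function is not expected to be accurate for every developer. As the description states, it only needs to be better than a guess, which is what makes it usable at the scale of a large repository dataset.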

Files

motivation-influence.zip

Files (281.7 MB)

Name: motivation-influence.zip
Size: 281.7 MB
md5:7837a3600b99a37444380107067e554d