DataLad - Decentralized Management of Digital Objects for Open Science

Speaker: Adina Wagner

Track: Debian in Arts & Science

Type: Short talk (20 minutes)

Room: Talks

Time: Aug 25 (Tue): 18:00

Duration: 0:20

With a general awareness of a reproducibility crisis in many scientific areas and the increasing importance of research data management in science and policy making, data-driven fields require convenient and scalable data management solutions. Standing on the shoulders of Git and git-annex (Joey Hess), DataLad provides a decentralized solution that enables the joint management of code, data, and complete containerized computational environments in a scalable and distributed fashion. With features such as unambiguous version control, a wide spectrum of data transport mechanisms, convenient provenance capture, and re-execution for verification or as an alternative to storage and transport, it enables and facilitates many aspects of open and reproducible science: collaboration, sharing, analytical transparency, computational reproducibility of digital research objects, and disk-space-aware storage and computing workflows on infrastructure that ranges from personal laptops up to supercomputers.

In this talk, we will introduce DataLad, present its main features, which should be of interest to the audience regardless of their relation to any field of science, and share the process and status of its adoption in the neuroimaging community.