Towards transparency and knowledge exchange in AI-assisted data analysis code generation
Generative artificial intelligence (AI) and large language models (LLMs) in particular are changing the way we do data science. Most prominently, scientists use the technology for interacting with scientific data [1], answering data analysis questions [2,3], generating data analysis code [4,5,6], and (re-)writing scientific manuscripts [7]. Unfortunately, the prompts sent to LLMs are commonly not conserved, and thus, at the time of publication, it might be hard to differentiate human-made and AI-generated parts of the scientific work. A professional peer-review system for documenting how LLM-generated code was prompted and which human reviewed it is not established in contemporary scientific culture.

However, such systems do exist for collaborative code editing involving multiple humans. For example, the source code hosting platforms GitHub and GitLab are well established in the open-source software community for discussing issues and potential solutions, building code together, and peer-reviewing content. Because LLMs have been shown to solve real-world GitHub issues [8], developing an AI assistant that interacts with humans directly within the GitHub platform is the obvious next step. Here, I present git-bob, a GitHub/GitLab integration of an LLM-based AI assistant that can respond to GitHub issues, discuss potential solutions with humans iteratively, write code for them, and submit it as a pull request to be reviewed by humans.

It is technically similar to various online services for data analysis, such as the OpenAI ChatGPT Data Analyst or GitHub Copilot workflows, with three major differences. First, multiple humans can interact with git-bob in one communication thread. This allows bringing together domain specialists, such as life scientists, data analysts and the AI assistant in one discussion, stimulating knowledge exchange on how to interact properly with the AI assistant. Second, discussions with git-bob and the resulting code modifications are conserved in an online platform that others can read and follow, making the interaction with the AI assistant fully transparent. Third, git-bob is completely open source and extensible. Other developers can read its built-in system prompts and modify them to their needs. Developers can implement custom connectors to other LLM service providers and write plugins for their custom AI agents, which may deal with GitHub issues differently.
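
The extensibility idea described above (custom connectors to LLM providers, with prompts conserved alongside replies) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `register_connector`, `prompt_llm`, `respond_to_issue` and the `echo` stand-in provider are hypothetical and are not taken from git-bob's actual code base.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a model-name prefix to a function that
# turns a prompt into a reply. This is NOT git-bob's actual API.
_CONNECTORS: Dict[str, Callable[[str], str]] = {}


def register_connector(prefix: str):
    """Decorator that registers a connector for models starting with `prefix`."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        _CONNECTORS[prefix] = func
        return func
    return decorator


def prompt_llm(model: str, prompt: str) -> str:
    """Dispatch the prompt to the first connector whose prefix matches."""
    for prefix, func in _CONNECTORS.items():
        if model.startswith(prefix):
            return func(prompt)
    raise ValueError(f"No connector registered for model '{model}'")


@register_connector("echo")
def echo_connector(prompt: str) -> str:
    """Stand-in provider that needs no network access (illustration only)."""
    return f"[echo] {prompt}"


def respond_to_issue(model: str, issue_title: str, issue_body: str) -> dict:
    """Build a prompt from an issue and return prompt and reply together,
    so that both can be posted to the thread and reviewed later."""
    prompt = f"Issue: {issue_title}\n\n{issue_body}"
    reply = prompt_llm(model, prompt)
    return {"model": model, "prompt": prompt, "reply": reply}


if __name__ == "__main__":
    record = respond_to_issue("echo-1", "Fix plot", "Axis labels are missing.")
    print(record["reply"])
```

A real connector would replace `echo_connector` with a call to an LLM provider. Returning the prompt together with the reply is the part that matters for transparency: the record can be stored in the issue thread, so readers can see what was asked as well as what was generated.
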

CODE AVAILABILITY
The complete source code of git-bob is available online at GitHub [11]: https://github.com/haesleinhuepf/git-bob

REFERENCES
1. Royer, L. A. _Nat. Methods_ 20, 951–952 (2023).
2. Lai, Y. et al. Preprint at https://arxiv.org/abs/2211.11501 (2022).
3. Lei, W. et al. _Nat. Methods_ 21, 1368–1370 (2024).
4. Royer, L. A. _Nat. Methods_ 21, 1371–1373 (2024).
5. Haase, R., Tischer, C., Hériché, J.-K. & Scherf, N. Preprint at _bioRxiv_ https://doi.org/10.1101/2024.04.19.590278 (2024).
6. Chen, M. et al. Preprint at https://arxiv.org/abs/2107.03374 (2021).
7. Lu, C. et al. Preprint at https://arxiv.org/abs/2408.06292 (2024).
8. Jimenez, C. E. et al. Preprint at https://arxiv.org/abs/2310.06770 (2024).
9. Yin, Z. et al. Preprint at https://arxiv.org/abs/2305.18153 (2023).
10. About GitHub-hosted runners. _GitHub_ https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners (accessed 14 October 2024).
11. Haase, R. git-bob. _GitHub_ https://github.com/haesleinhuepf/git-bob (2024).

ACKNOWLEDGEMENTS
I would like to thank E. K. Nicolay (UFZ Leipzig) and M. Lampert (TU Dresden) for testing git-bob in its
early days and for providing constructive feedback on the manuscript. I also would like to thank V. Hilsenstein for pushing for GitLab interoperability. I acknowledge the financial support by the Federal Ministry of Education and Research of Germany and by Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the programme Center of Excellence for AI-research “Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig”, project identification number: ScaDS.AI. I also acknowledge financial support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under the National Research Data Infrastructure – NFDI 46/1 – 501864659 - NFDI4BioImage.

AUTHOR INFORMATION
AUTHORS AND AFFILIATIONS
* Data Science Center, Leipzig University, Leipzig, Germany: Robert Haase
* Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden / Leipzig, Leipzig, Germany: Robert Haase

CORRESPONDING AUTHOR
Correspondence to Robert Haase.

ETHICS DECLARATIONS
COMPETING INTERESTS
The author declares no competing interests.

PEER REVIEW
PEER REVIEW INFORMATION
_Nature Computational Science_ thanks Virginie Uhlmann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

SUPPLEMENTARY INFORMATION
Supplementary Figures 1–6.

ABOUT THIS ARTICLE
CITE THIS ARTICLE
Haase, R. Towards transparency and knowledge exchange in AI-assisted data analysis code generation. _Nat Comput Sci_ 5, 271–272 (2025). https://doi.org/10.1038/s43588-025-00781-1
* Published: 27 March 2025
* Issue Date: April 2025
* DOI: https://doi.org/10.1038/s43588-025-00781-1