Towards transparency and knowledge exchange in AI-assisted data analysis code generation

Nature


Generative artificial intelligence (AI), and large language models (LLMs) in particular, are changing the way we do data science. Most prominently, scientists use the technology for interacting with scientific data1, answering data analysis questions2,3, generating data analysis code4,5,6, and (re-)writing scientific manuscripts7. Unfortunately, the prompts sent to LLMs are commonly not preserved, and thus, at the time of publication, it can be hard to distinguish human-made from AI-generated parts of the scientific work. A professional peer-review system for documenting how LLM-generated code was prompted, and which human reviewed it, is not established in contemporary scientific culture. However, such systems do exist for collaborative code editing involving multiple humans. For example, the source code hosting platforms GitHub and GitLab are well established in the open-source software community for discussing issues and potential solutions, building code together, and peer-reviewing content. As it has been shown that LLMs can solve real-world GitHub issues8, developing an AI assistant that interacts with humans directly within the GitHub platform is the obvious next step. Here, I present git-bob, a GitHub/GitLab integration of an LLM-based AI assistant that can respond to GitHub issues, discuss potential solutions with humans iteratively, write code for them, and submit it as a pull request to be reviewed by humans.
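The loop described above (issue, discussion, generated code, pull request awaiting human review) can be illustrated with a minimal sketch. This is not git-bob's implementation: `Comment`, `PullRequest`, `build_prompt`, `propose_pull_request` and the `fake_llm` stand-in are hypothetical names chosen for illustration, and no GitHub or LLM service is contacted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Comment:
    author: str
    body: str


@dataclass
class PullRequest:
    title: str
    code: str
    prompt: str                      # the exact prompt is kept with the change
    reviewers: List[str] = field(default_factory=list)
    approved: bool = False           # stays False until a human approves


def build_prompt(issue_title: str, thread: List[Comment]) -> str:
    """Flatten the whole discussion so every participant's input is kept."""
    lines = [f"Issue: {issue_title}"]
    lines += [f"{c.author}: {c.body}" for c in thread]
    return "\n".join(lines)


def propose_pull_request(
    issue_title: str,
    thread: List[Comment],
    llm: Callable[[str], str],
    assistant_name: str = "assistant",   # hypothetical account name
) -> PullRequest:
    """Ask the LLM for code and wrap it in a pull request for human review."""
    prompt = build_prompt(issue_title, thread)
    code = llm(prompt)
    # Every human who took part in the thread is asked to review.
    humans = sorted({c.author for c in thread if c.author != assistant_name})
    return PullRequest(
        title=f"Proposed fix: {issue_title}",
        code=code,
        prompt=prompt,
        reviewers=humans,
    )


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns the same snippet."""
    return "def count_rows(table):\n    return len(table)\n"


thread = [
    Comment("biologist", "I need to count the rows in my measurement table."),
    Comment("assistant", "I could add a small helper function. Shall I?"),
    Comment("analyst", "Yes, please keep it dependency-free."),
]
pr = propose_pull_request("Count table rows", thread, fake_llm)
```

In this sketch the prompt travels with the generated code, so a reader of the pull request can see both what was asked and who is expected to review the answer.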


It is technically similar to various online services for data analysis, such as the OpenAI ChatGPT Data Analyst or GitHub Copilot workflows, with three major differences. First, multiple humans can interact with git-bob in one communication thread. This brings together domain specialists such as life scientists, data analysts and the AI assistant in one discussion, stimulating knowledge exchange on how to interact properly with the AI assistant. Second, discussions with git-bob and the resulting code modifications are preserved on an online platform that others can read and follow, making the interaction with the AI assistant fully transparent. Third, git-bob is completely open-source and extensible. Other developers can read its built-in system prompts and modify them to their needs. Developers can implement custom connectors to other LLM service providers and write plugins for their custom AI agents, which may deal with GitHub issues differently.
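One way such extensibility could look is a small registry that maps model-name prefixes to connector functions, with the system prompt kept as a plain, editable string. The sketch below is an assumption for illustration only and does not reproduce git-bob's plugin mechanism: `register_connector`, `resolve_connector`, `echo_connector` and `SYSTEM_PROMPT` are hypothetical names, and the "echo" connector merely stands in for a real LLM provider.

```python
from typing import Callable, Dict

# A connector takes (system_prompt, user_message) and returns the reply text.
Connector = Callable[[str, str], str]

_CONNECTORS: Dict[str, Connector] = {}


def register_connector(prefix: str) -> Callable[[Connector], Connector]:
    """Decorator that registers a connector for all models with this prefix."""
    def decorator(func: Connector) -> Connector:
        _CONNECTORS[prefix] = func
        return func
    return decorator


def resolve_connector(model_name: str) -> Connector:
    """Return the first registered connector whose prefix matches the model."""
    for prefix, connector in _CONNECTORS.items():
        if model_name.startswith(prefix):
            return connector
    raise KeyError(f"no connector registered for model '{model_name}'")


@register_connector("echo-")
def echo_connector(system_prompt: str, user_message: str) -> str:
    """Hypothetical provider that echoes its input instead of calling an LLM."""
    return f"[{system_prompt}] {user_message}"


# The system prompt is ordinary text that a developer could read and change.
SYSTEM_PROMPT = "You are a helpful data analysis assistant."

reply = resolve_connector("echo-small")(SYSTEM_PROMPT, "Summarise issue 1")
```

A third party could add support for another provider by registering one more function, without touching the code that dispatches requests.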


CODE AVAILABILITY

The complete source code of git-bob is available online at GitHub11: https://github.com/haesleinhuepf/git-bob

REFERENCES

1. Royer, L. A. _Nat. Methods_ 20, 951–952 (2023).
2. Lai, Y. et al. Preprint at https://arxiv.org/abs/2211.11501 (2022).
3. Lei, W. et al. _Nat. Methods_ 21, 1368–1370 (2024).
4. Royer, L. A. _Nat. Methods_ 21, 1371–1373 (2024).
5. Haase, R., Tischer, C., Hériché, J.-K. & Scherf, N. Preprint at _bioRxiv_ https://doi.org/10.1101/2024.04.19.590278 (2024).
6. Chen, M. et al. Preprint at https://arxiv.org/abs/2107.03374 (2021).
7. Lu, C. et al. Preprint at https://arxiv.org/abs/2408.06292 (2024).
8. Jimenez, C. E. et al. Preprint at https://arxiv.org/abs/2310.06770 (2024).
9. Yin, Z. et al. Preprint at https://arxiv.org/abs/2305.18153 (2023).
10. About GitHub-hosted runners. _GitHub_ https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners (accessed 14 October 2024).
11. Haase, R. git-bob. _GitHub_ https://github.com/haesleinhuepf/git-bob (2024).

ACKNOWLEDGEMENTS

I would like to thank E. K. Nicolay (UFZ Leipzig) and M. Lampert (TU Dresden) for testing git-bob in its early days and for providing constructive feedback on the manuscript. I would also like to thank V. Hilsenstein for pushing for GitLab interoperability. I acknowledge the financial support by the Federal Ministry of Education and Research of Germany and by the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the programme Center of Excellence for AI-research “Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig”, project identification number: ScaDS.AI. I also acknowledge financial support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under the National Research Data Infrastructure – NFDI 46/1 – 501864659 - NFDI4BioImage.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS


* Data Science Center, Leipzig University, Leipzig, Germany: Robert Haase
* Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden / Leipzig, Leipzig, Germany: Robert Haase

CORRESPONDING AUTHOR

Correspondence to Robert Haase.

ETHICS DECLARATIONS

COMPETING INTERESTS

The author declares no competing interests.

PEER REVIEW INFORMATION

_Nature Computational Science_ thanks Virginie Uhlmann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

SUPPLEMENTARY INFORMATION

Supplementary Figures 1–6.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Haase, R. Towards transparency and knowledge exchange in AI-assisted data analysis code generation. _Nat Comput Sci_ 5, 271–272 (2025). https://doi.org/10.1038/s43588-025-00781-1

* Published: 27 March 2025
* Issue Date: April 2025
* DOI: https://doi.org/10.1038/s43588-025-00781-1

