Cross-Domain Evaluation of Transformer-Based Vulnerability Detection on Open & Industry Data
Abstract
A recommender system integrated into CI/CD pipelines uses fine-tuned CodeBERT to detect and localize vulnerabilities in code without disrupting workflows, showing improved performance with appropriate undersampling techniques.
Deep learning solutions for vulnerability detection proposed in academic research are not always accessible to developers, and their applicability in industrial settings is rarely addressed. Transferring such technologies from academia to industry presents challenges related to trustworthiness, legacy systems, limited digital literacy, and the gap between academic and industrial expertise. For deep learning in particular, performance and integration into existing workflows are additional concerns. In this work, we first evaluate the performance of CodeBERT for detecting vulnerable functions in industrial and open-source software. We analyse its cross-domain generalisation when fine-tuned on open-source data and tested on industrial data, and vice versa, also exploring strategies for handling class imbalance. Based on these results, we develop AI-DO(Automating vulnerability detection Integration for Developers' Operations), a Continuous Integration-Continuous Deployment (CI/CD)-integrated recommender system that uses fine-tuned CodeBERT to detect and localise vulnerabilities during code review without disrupting workflows. Finally, we assess the tool's perceived usefulness through a survey with the company's IT professionals. Our results show that models trained on industrial data detect vulnerabilities accurately within the same domain but lose performance on open-source code, while a deep learner fine-tuned on open data, with appropriate undersampling techniques, improves the detection of vulnerabilities.
Community
Cross-domain evaluation for PHP between industry and open source data for automated vulnerability detection.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Data and Context Matter: Towards Generalizing AI-based Software Vulnerability Detection (2025)
- ROSE: Transformer-Based Refactoring Recommendation for Architectural Smells (2025)
- SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection (2025)
- Revisiting Pre-trained Language Models for Vulnerability Detection (2025)
- On the Evaluation of Large Language Models in Multilingual Vulnerability Repair (2025)
- Code Vulnerability Detection Across Different Programming Languages with AI Models (2025)
- Empirical Study of Code Large Language Models for Binary Security Patch Detection (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper