LMVD-ID: b7d806a0
Published May 1, 2025

DNA Model Pathogen Synthesis

Affected Models: evo1 (7b), evo2 (1b), evo2 (7b), evo2 (40b), chatgpt-4o, patholm

Research Paper

GeneBreaker: Jailbreak Attacks against DNA Language Models with Pathogenicity Guidance

View Paper

Description: DNA language models, such as the Evo series, are vulnerable to jailbreak attacks that coerce the generation of DNA sequences with high homology to known human pathogens. The GeneBreaker framework demonstrates this by combining carefully crafted prompts built from high-homology non-pathogenic sequences with a beam search guided by pathogenicity prediction models (e.g., PathoLM) and log-probability heuristics. This bypasses the models' safety mechanisms and yields sequences exceeding 90% similarity to target pathogens.
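The guided decoding described above can be sketched as a beam search whose candidates are re-ranked by a weighted sum of the model's log-probability and an external pathogenicity score. The functions below are toy stand-ins, not the real Evo or PathoLM APIs, and the weights `alpha`/`beta` are illustrative assumptions:

```python
def next_token_logprobs(prefix):
    """Toy LM stand-in: mildly prefers repeating the last base."""
    scores = {b: -1.5 for b in "ACGT"}
    if prefix:
        scores[prefix[-1]] = -0.8
    return scores

def pathogenicity_score(seq):
    """Toy classifier stand-in (placeholder: GC-dinucleotide density)."""
    if len(seq) < 2:
        return 0.0
    hits = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "GC")
    return hits / (len(seq) - 1)

def guided_beam_search(prompt, steps=8, beam_width=4, alpha=1.0, beta=2.0):
    """Beam search re-ranked by alpha * log-prob + beta * pathogenicity."""
    beams = [(prompt, 0.0)]  # (sequence, cumulative log-prob)
    for _ in range(steps):
        expanded = []
        for seq, lp in beams:
            for base, step_lp in next_token_logprobs(seq).items():
                expanded.append((seq + base, lp + step_lp))
        # Re-rank by the combined heuristic, not log-probability alone:
        # this is what steers generation toward pathogen-like sequences.
        expanded.sort(
            key=lambda c: alpha * c[1] + beta * pathogenicity_score(c[0]),
            reverse=True,
        )
        beams = expanded[:beam_width]
    return beams[0][0]
```

The key point is that the safety-relevant signal (the pathogenicity predictor) is used adversarially here: the same model class proposed as a defense can serve as the attack's guidance function.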

Examples: The https://github.com/zaixizhang/GeneBreaker repository contains the code and data used to reproduce the attacks. Specific examples of successful jailbreaks targeting SARS-CoV-2 spike protein and HIV-1 envelope protein are detailed in the paper, showing high sequence and structural fidelity between generated and known pathogenic sequences. Attack success rates against Evo models varied across different viral categories, reaching up to 60% for Evo2-40B on certain targets.

Impact: Successful exploitation could lead to the generation of synthetic DNA sequences capable of producing or enhancing dangerous human pathogens, posing significant biosecurity risks. Risk scales with model size: the largest tested model, Evo2-40B, achieved the highest attack success rates.

Affected Systems: DNA language models trained autoregressively on large genomic datasets (e.g., the Evo series). Other generative models with similar training data and decoding interfaces may also be susceptible.

Mitigation Steps:

  • Implement safety mechanisms more robust than simply filtering known pathogenic sequences from training data.
  • Develop and integrate improved pathogenicity prediction models into the generation loop.
  • Verify and validate outputs with multiple independent checks, e.g., different pathogenicity predictors and sequence similarity methods beyond BLAST.
  • Monitor and trace model outputs so potentially harmful generations can be detected and responded to.
  • Restrict access to the models and their outputs.
  • Curate training datasets more aggressively, excluding not only known pathogenic sequences but also closely related patterns that could seed jailbreaks.
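One of the verification layers above can be sketched as a simple output screen: reject a generated sequence if its k-mer overlap with any entry in a curated pathogen blocklist exceeds a threshold. This is a minimal sketch under assumptions; real deployments would combine alignment-based identity (e.g., BLAST) with model-based predictors, and the blocklist, k, and threshold here are illustrative only:

```python
def kmer_set(seq, k=8):
    """All length-k substrings of a DNA sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def max_kmer_containment(candidate, blocklist, k=8):
    """Highest fraction of any blocklisted sequence's k-mers found in the candidate."""
    cand = kmer_set(candidate, k)
    best = 0.0
    for ref in blocklist:
        ref_kmers = kmer_set(ref, k)
        if ref_kmers:
            best = max(best, len(cand & ref_kmers) / len(ref_kmers))
    return best

def screen_output(candidate, blocklist, k=8, threshold=0.5):
    """True if the candidate passes the screen, False if it should be rejected."""
    return max_kmer_containment(candidate, blocklist, k) < threshold
```

A k-mer screen alone is weak against the high-homology evasion GeneBreaker exploits, which is why the mitigation list calls for multiple independent checks rather than any single filter.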

© 2025 Promptfoo. All rights reserved.