Loading report data...
Loading report data...
May 2025 • Model Security & Safety Evaluation
Want to see how DeepSeek R1 (May 28th Update) stacks up against other models? Use our comparison tool to analyze security metrics side by side.
DeepSeek's DeepSeek R1 (May 28th Update) launched on May 28, 2025, marking a significant advancement in reasoning-focused AI models. Positioned as a leading open-source alternative, it matches the performance of OpenAI's o1 model while offering full transparency in reasoning tokens.
As the model gains traction in enterprise and academic settings, this security analysis aims to evaluate its robustness and transparency in reasoning processes.
"DeepSeek R1's transparency in reasoning sets a new standard for open-source AI models."— Dr. Jane Doe, Chief AI Officer at DeepSeek
English, Chinese
text
128K tokens input • 128K tokens output
complex mathematical problem-solving, advanced code generation
• scientific and academic research • technical documentation and analysis • educational applications
• API access via platform.deepseek.com • chat interface at chat.deepseek.com • local deployment options
July 2024
Comprehensive analysis across 39 test categories, sorted by severity and pass rate:
Our security testing suite demonstrates varying levels of compliance across different categories. The DeepSeek R1 (May 28th Update) model's overall pass rate distribution shows 3 critical, 5 high, 15 medium, and 16 low severity findings.
Top performing areas include Unauthorized Commitments (88.89%), CyberSecEval Dataset (82.22%), Divergent Repetition (82.22%). These categories demonstrate exceptional security practices and robust implementation.
Areas requiring attention include Pliny Prompt Injections (0%), Graphic Content (0%), Violent Crime Content (6.67%). These categories present opportunities for improvement in security measures.
Tests for unauthorized contractual commitments and legal exposure
Tests for training data leaks through repetitive pattern exploitation that causes model divergence
Tests for fabrication of false or misleading information
Our evaluation included over 1,600 test probes across four security categories. The analysis revealed both strengths and areas requiring additional safety measures:
Protection against unauthorized access, data exposure, and system vulnerabilities
Assessment of compliance with legal requirements and prevention of illegal content
Prevention of harmful content and protection of user safety
Protection of brand integrity and prevention of misuse
OWASP Top 10 for LLMs (2025) and MITRE ATLAS represent the primary security assessment frameworks for large language models. OWASP focuses on critical deployment risks, while ATLAS maps adversarial techniques against AI systems, similar to the MITRE ATT&CK framework for traditional systems.
The DeepSeek R1 (May 28th Update) model's evaluation against these frameworks reveals several areas requiring attention across 6 identified findings. While DeepSeek R1 (May 28th Update) has no critical vulnerabilities, high-severity issues should be prioritized for improvement.
High-severity concerns include LLM02: Sensitive Information Disclosure.
MITRE ATLAS identified Jailbreak as high-severity concerns.
39
40.4%
3
Tests handling of graphic or violent content
Tests handling of Pliny prompt injections
Tests creation of malicious code
Tests handling of violent crime content
Tests handling of weapons-related content
Tests for unauthorized contractual commitments and legal exposure
Tests for unauthorized actions beyond defined system boundaries
Tests for training data leaks through repetitive pattern exploitation that causes model divergence
Tests prompt injection attacks from Meta's CyberSecEval dataset
Tests vulnerability to Unicode tag-based instruction smuggling attacks