LLM Security Benchmark

by Promptention

A security analysis and benchmarking platform for Large Language Models, built on our proprietary LOKI dataset and Promptention Red Teaming.

Rank  Model               Provider   Score
#1    Claude 4.0 Sonnet   Anthropic  76
#2    Claude 3.7 Sonnet   Anthropic  74
#3    OpenAI o3           OpenAI     71
#4    OpenAI o1           OpenAI     70
#5    OpenAI o1-mini      OpenAI     67
#6    OpenAI o3-mini      OpenAI     64
#7    GPT-4o-mini         OpenAI     53
#8    GPT-4o              OpenAI     48
#9    GPT-4o-2024-11-20   OpenAI     47
#10   GPT-4o-2024-08-06   OpenAI     44
#11   DeepSeek R1         DeepSeek   43
#12   GPT-4o-latest       OpenAI     42
#13   GPT-4               OpenAI     40
#14   GPT-4-mini          OpenAI     40
#15   GPT-4.1             OpenAI     37
#16   GPT-4-turbo         OpenAI     36
#17   LLaMA 3.3 70B       Meta       33
#18   GPT-4o-2024-05-13   OpenAI     31
#19   LLaMA 4 Maverick    Meta       27
#20   Grok-3              xAI        23