Data Explorer

Per-model self-identification rates across languages, grouped by vendor. 338 cross-vendor confusions detected.

7,800 answers · 26 models · 10 languages


Run summary

Answers	7,800
Self-ID	87.5%
Cross-vendor	4.3%
Unknown	0.4%
Refused	7.8%
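The headline shares can be cross-checked against the counts quoted at the top of the page. The following is a minimal sketch in Python that uses only figures stated on this page; the names (`TOTAL_ANSWERS`, `share`, and so on) are illustrative and not part of any published tooling.

```python
# Figures quoted on this page.
TOTAL_ANSWERS = 7800
CROSS_VENDOR_CONFUSIONS = 338
SHARES = {"self_id": 87.5, "cross_vendor": 4.3, "unknown": 0.4, "refused": 7.8}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The four categories should partition all answers.
shares_total = round(sum(SHARES.values()), 1)

# 338 confusions out of 7,800 answers should reproduce the cross-vendor share.
cross_vendor_share = share(CROSS_VENDOR_CONFUSIONS, TOTAL_ANSWERS)

print(shares_total, cross_vendor_share)
```

If the two printed values differ from 100.0 and the listed cross-vendor share, the summary cards and the confusion count are out of sync.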

Imitation balance

Per manufacturer: how often other models claim to be it (right) versus how often its own models claim to be someone else (left). Sorted by net (claims received minus claims made): identity creditors at the top, debtors at the bottom. Tested manufacturers only.

Manufacturer	Net
Google	+60
Anthropic	+57
OpenAI	+42
Moonshot	+24
xAI	+17
Qwen, DeepSeek, MiniMax, Xiaomi, inclusionAI, Kwai, ERNIE	(bars not labeled)
StepFun	-1
z-ai	-46
Doubao	-54
Tencent	-202

imitates others ◀│▶ imitated by others
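The net balance can be sketched as claims received minus claims made, computed from directed confusion counts. The Python below is an illustrative sketch, not the dashboard's own code: the function name `imitation_balance` is hypothetical, and the sample counts are a small subset of the pairs listed under "Strongest confusion pairs", so the resulting totals will not match the full chart.

```python
from collections import Counter

# (source vendor, claimed vendor) -> number of answers.
# Subset of the pairs listed on this page; NOT the full confusion matrix.
confusions = {
    ("Tencent", "Anthropic"): 49,
    ("z-ai", "Google"): 42,
    ("Tencent", "OpenAI"): 25,
    ("Doubao", "Anthropic"): 17,
    ("Doubao", "OpenAI"): 17,
    ("Anthropic", "Mistral"): 18,
}


def imitation_balance(pairs, tested):
    """Return (vendor, net) sorted from largest creditor to largest debtor."""
    imitated_by_others = Counter()  # right-hand side of the chart
    imitates_others = Counter()     # left-hand side of the chart
    for (source, target), n in pairs.items():
        imitates_others[source] += n
        imitated_by_others[target] += n
    # Only tested vendors are ranked; untested targets (e.g. Mistral) are skipped.
    net = {v: imitated_by_others[v] - imitates_others[v] for v in tested}
    return sorted(net.items(), key=lambda kv: kv[1], reverse=True)


tested = ["Google", "Anthropic", "OpenAI", "Tencent", "z-ai", "Doubao"]
balance = imitation_balance(confusions, tested)
for vendor, net in balance:
    print(f"{vendor}\t{net:+d}")
```

Restricting the ranking to tested vendors mirrors the "Tested manufacturers only" note: a vendor such as Mistral can receive claims but has no answers of its own to count against it.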

Strongest confusion pairs

The most frequent directed mistakes: cases where a manufacturer's models claim to be a specific other manufacturer. Bar length shows how often, as a share of all answers from the source manufacturer.

Source → Target	Rate	Count
Tencent → Anthropic	16.3%	49/300
z-ai → Google	14.0%	42/300
Tencent → Microsoft	12.0%	36/300
Tencent → OpenAI	8.3%	25/300
Tencent → Moonshot	8.0%	24/300
Tencent → xAI	5.7%	17/300
Tencent → Yandex	3.3%	10/300
Tencent → DeepSeek	3.0%	9/300
Tencent → Doubao	2.3%	7/300
Anthropic → Mistral	2.0%	18/900
Doubao → Anthropic	1.4%	17/1200
Doubao → OpenAI	1.4%	17/1200
DeepSeek → Anthropic	1.3%	8/600
Tencent → Perplexity AI	1.3%	4/300
Tencent → Naver	1.3%	4/300
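Because each rate is normalized by the source manufacturer's own answer count, a pair with a larger raw count can rank below one with a smaller count. A short Python sketch of that normalization, using four pairs from the list above (the helper name `pair_rate` is an assumption for illustration):

```python
def pair_rate(count, source_answers):
    """Share of the source vendor's answers that named the target, in percent."""
    return round(100 * count / source_answers, 1)


# (source, target, confused answers, total answers for the source)
pairs = [
    ("Doubao", "Anthropic", 17, 1200),
    ("Tencent", "Anthropic", 49, 300),
    ("Anthropic", "Mistral", 18, 900),
    ("Tencent", "xAI", 17, 300),
]

# Rank by rate, not by raw count.
ranked = sorted(pairs, key=lambda p: p[2] / p[3], reverse=True)
for source, target, count, total in ranked:
    print(f"{source} -> {target}\t{pair_rate(count, total)}%\t{count}/{total}")
```

Here Anthropic → Mistral (18 answers out of 900) ranks below Tencent → xAI (17 out of 300), which matches the ordering in the list.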

By language

Self-ID and refusal rates per language, plus the most common cross-vendor confusion in that language.

Language	Self-ID	Refused	Top confusion
fr Français	82.2%	7.7%	Mistral 20.0%
ja 日本語	83.8%	10.1%	Anthropic 96.7%
es Español	85.0%	7.6%	Google 83.3%
pt Português	85.5%	7.6%	xAI 50.0%
de Deutsch	86.4%	7.6%	Moonshot 76.7%
ru Русский	87.2%	7.6%	Yandex 9.2%
ko 한국어	88.8%	7.3%	Anthropic 26.7%
en English	91.5%	7.3%	Doubao 16.7%
zh-Hant 繁體中文	92.3%	7.7%	—
zh-Hans 简体中文	92.4%	7.3%	—

Answer composition by language

How the same “Who are you?” question splits into correct self-ID vs. confusion vs. abstention — per language, worst self-ID first.

Segments: Self-ID, Cross-vendor, Unknown, Refused. Labeled self-ID share per language:

Language	Self-ID
fr Français	82%
ja 日本語	84%
es Español	85%
pt Português	86%
de Deutsch	86%
ru Русский	87%
ko 한국어	89%
en English	92%
zh-Hant 繁體中文	92%
zh-Hans 简体中文	92%

Language fragility

Per model, the span of self-ID rate across the 10 languages (worst-language • → best-language •). A wide span means the model's sense of identity depends heavily on the language it's asked in. Widest swing first.

Model	Worst language	Best language
glm-5.1	0% (ja)	100% (en)
hy3-preview	0% (ja)	100% (zh-Hans)
doubao-seed-2.0-code	27% (fr)	100% (en)
claude-sonnet-4.6	40% (fr)	100% (en)
deepseek-v4-pro	90% (en)	100% (zh-Hans)
ring-2.6-1t	0% (zh-Hans)	3% (en)
ling-2.6-1t	0% (zh-Hant)	3% (en)
deepseek-v4-flash	97% (pt)	100% (en)
step-3.7-flash	97% (ru)	100% (en)
mimo-v2.5-pro	97% (fr)	100% (en)

100% in all languages (no span): claude-haiku-4.5, gpt-5.3-codex, chat-latest, grok-4.3, claude-opus-4.8, gemini-3.5-flash, kat-coder-pro-v2, gemini-3.1-pro-preview, kimi-k2.6, gpt-5.5, doubao-seed-2.0-mini, ernie-5.1, minimax-m2.7, qwen3.7-max, doubao-seed-2.0-lite, doubao-seed-2.0-pro
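The span described above is the best-language rate minus the worst-language rate for each model. A minimal Python sketch, using three rows copied from the per-vendor tables on this page; the function name `fragility` and the dictionary layout are assumptions for illustration only.

```python
LANGS = ["en", "zh-Hans", "zh-Hant", "ja", "ko", "ru", "es", "fr", "de", "pt"]

# Per-language self-ID rates (%), in LANGS order, from the per-vendor tables.
rates = {
    "deepseek-v4-pro": [90.0, 100.0, 100.0, 93.3, 100.0, 100.0, 93.3, 100.0, 96.7, 100.0],
    "glm-5.1": [100.0, 100.0, 100.0, 0.0, 96.7, 86.7, 16.7, 73.3, 100.0, 73.3],
    "claude-sonnet-4.6": [100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 40.0, 70.0, 100.0],
}


def fragility(row):
    """Return worst (lang, rate), best (lang, rate), and the span between them."""
    pairs = list(zip(LANGS, row))
    worst = min(pairs, key=lambda p: p[1])  # first language with the lowest rate
    best = max(pairs, key=lambda p: p[1])   # first language with the highest rate
    return {"worst": worst, "best": best, "span": round(best[1] - worst[1], 1)}


# Widest swing first, as in the chart.
ordered = sorted(rates, key=lambda m: fragility(rates[m])["span"], reverse=True)
for model in ordered:
    f = fragility(rates[model])
    print(model, f["worst"], f["best"], f["span"])
```

Ties for best or worst language are broken by column order here, which is a choice of this sketch; the dashboard may pick a different language among equal rates.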

Abstention by manufacturer

The other failure mode — not answering wrong, but not answering: giving no identity (“unknown”) or refusing outright. Share of all answers.

Segments: Unknown (“I’m an AI”), Refused.

Manufacturer	Share of answers
inclusionAI	99.2%
z-ai	10.0%
Anthropic	1.0%
DeepSeek	0.2%
OpenAI	0.0%
xAI	0.0%
Google	0.0%
StepFun	0.0%
Kwai	0.0%
Tencent	0.0%
Moonshot	0.0%
Xiaomi	0.0%
Doubao	0.0%
ERNIE	0.0%
MiniMax	0.0%
Qwen	0.0%

By vendor

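Rollup per real vendor: model count, total answers, mean self-ID rate, and the most common cross-vendor confusion target. Vendors listed with 0 models were not tested and appear only because a tested model claimed to be them.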

Vendor	Models	Answers	Self-ID	Top confusion
Doubao	4	1,200	94.9%	Anthropic 1.4%
Anthropic	3	900	97.0%	Mistral 2.0%
OpenAI	3	900	100.0%	—
inclusionAI	2	600	0.8%	—
DeepSeek	2	600	98.5%	Anthropic 1.3%
Google	2	600	100.0%	—
Tencent	1	300	32.3%	Anthropic 16.3%
z-ai	1	300	74.7%	Google 14.0%
StepFun	1	300	99.7%	Tencent 0.3%
Xiaomi	1	300	99.7%	Anthropic 0.3%
xAI	1	300	100.0%	—
Kwai	1	300	100.0%	—
Moonshot	1	300	100.0%	—
ERNIE	1	300	100.0%	—
MiniMax	1	300	100.0%	—
Qwen	1	300	100.0%	—
Microsoft	0	0	0.0%	—
Yandex	0	0	0.0%	—
Mistral	0	0	0.0%	—
Perplexity AI	0	0	0.0%	—
Naver	0	0	0.0%	—
Sber	0	0	0.0%	—
Luma AI	0	0	0.0%	—
Wrtn Technologies	0	0	0.0%	—
LG AI Research	0	0	0.0%	—
Hashed Labs	0	0	0.0%	—
Meta	0	0	0.0%	—
Zhihu AI	0	0	0.0%	—
Nunance	0	0	0.0%	—
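The vendor-level self-ID rate can be reproduced as the mean of each vendor's per-model "Overall" rates. The Python sketch below assumes an unweighted mean, which is safe here because the table shows 300 answers for every model; with unequal answer counts a weighted mean would be needed. The name `vendor_self_id` is hypothetical.

```python
# Per-model "Overall" self-ID rates (%) from the per-vendor tables on this page.
overall = {
    "Doubao": [100.0, 100.0, 100.0, 79.7],
    "Anthropic": [100.0, 91.0, 100.0],
    "DeepSeek": [97.3, 99.7],
}


def vendor_self_id(model_rates):
    """Unweighted mean of per-model rates, rounded to one decimal place.

    Assumes every model contributed the same number of answers.
    """
    return round(sum(model_rates) / len(model_rates), 1)


rollup = {vendor: vendor_self_id(r) for vendor, r in overall.items()}
for vendor, rate in rollup.items():
    print(f"{vendor}\t{len(overall[vendor])}\t{rate}%")
```

For the three vendors shown, the computed values agree with the Self-ID column of the table above.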

Doubao

4 models

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Doubao Seed 2.0 Pro	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
Doubao Seed 2.0 Mini	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
Doubao Seed 2.0 Lite	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
Doubao Seed 2.0 Code	100.0%	100.0%	100.0%	86.7%	96.7%	63.3%	100.0%	26.7%	73.3%	50.0%	79.7%

Cross-Vendor Confusions

Doubao Seed 2.0 Pro: 100.0% self
✓ Always self-identifies correctly

Doubao Seed 2.0 Mini: 100.0% self
✓ Always self-identifies correctly

Doubao Seed 2.0 Lite: 100.0% self
✓ Always self-identifies correctly

Doubao Seed 2.0 Code: 79.7% self
Mistaken as: Anthropic 5.7%, OpenAI 5.7%, Google 5.0%, Yandex 3.7%, Microsoft 0.3%

Anthropic

3 models

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Claude Opus 4.8	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
Claude Sonnet 4.6	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	40.0%	70.0%	100.0%	91.0%
Claude Haiku 4.5	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

Claude Opus 4.8: 100.0% self
✓ Always self-identifies correctly

Claude Sonnet 4.6: 91.0% self
Mistaken as: Mistral 6.0%

Claude Haiku 4.5: 100.0% self
✓ Always self-identifies correctly

OpenAI

3 models

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
GPT-5.5	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
GPT-5.5 Instant	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
GPT 5.3 Codex	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

GPT-5.5: 100.0% self
✓ Always self-identifies correctly

GPT-5.5 Instant: 100.0% self
✓ Always self-identifies correctly

GPT 5.3 Codex: 100.0% self
✓ Always self-identifies correctly

DeepSeek

2 models

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
DeepSeek V4 Pro	90.0%	100.0%	100.0%	93.3%	100.0%	100.0%	93.3%	100.0%	96.7%	100.0%	97.3%
DeepSeek V4 Flash	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	96.7%	99.7%

Cross-Vendor Confusions

DeepSeek V4 Pro: 97.3% self
Mistaken as: Anthropic 2.7%

DeepSeek V4 Flash: 99.7% self
✓ No cross-vendor confusions (the remaining 0.3% of answers gave no identity or were refused)

Google

2 models

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Gemini 3.1 Pro	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%
Gemini 3.5 Flash	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

Gemini 3.1 Pro: 100.0% self
✓ Always self-identifies correctly

Gemini 3.5 Flash: 100.0% self
✓ Always self-identifies correctly

inclusionAI

2 models

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Ring 2.6 1T	3.3%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.3%
Ling 2.6 1T	3.3%	3.3%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	3.3%	3.3%	1.3%

Cross-Vendor Confusions

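Ring 2.6 1T: 0.3% self
✓ No cross-vendor confusions (the remaining 99.7% of answers gave no identity or were refused)

Ling 2.6 1T: 1.3% self
✓ No cross-vendor confusions (the remaining 98.7% of answers gave no identity or were refused)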

ERNIE

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
ERNIE 5.1	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

ERNIE 5.1: 100.0% self

✓ Always self-identifies correctly

Kwai

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Kat Coder Pro V2	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

Kat Coder Pro V2: 100.0% self

✓ Always self-identifies correctly

MiniMax

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
MiniMax 2.7	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

MiniMax 2.7: 100.0% self

✓ Always self-identifies correctly

Moonshot

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Kimi 2.6	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

Kimi 2.6: 100.0% self

✓ Always self-identifies correctly

Qwen

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Qwen3.7 Max	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

Qwen3.7 Max: 100.0% self

✓ Always self-identifies correctly

StepFun

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Step 3.7 Flash	100.0%	100.0%	100.0%	100.0%	100.0%	96.7%	100.0%	100.0%	100.0%	100.0%	99.7%

Cross-Vendor Confusions

Step 3.7 Flash: 99.7% self
Mistaken as: Tencent 0.3%

Tencent

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Hy3 Preview	83.3%	100.0%	100.0%	0.0%	16.7%	20.0%	0.0%	0.0%	3.3%	0.0%	32.3%

Cross-Vendor Confusions

Hy3 Preview: 32.3% self
Mistaken as: Anthropic 16.3%, Microsoft 12.0%, OpenAI 8.3%, Moonshot 8.0%, xAI 5.7%, Yandex 3.3%, DeepSeek 3.0%, Doubao 2.3%, Perplexity AI 1.3%, Naver 1.3%, Qwen 1.3%, Google 1.0%, Sber 0.7%, Luma AI 0.3%, MiniMax 0.3%, Wrtn Technologies 0.3%, LG AI Research 0.3%, Hashed Labs 0.3%, Xiaomi 0.3%, Meta 0.3%, Mistral 0.3%, Nunance 0.3%

xAI

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Grok 4.3	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%

Cross-Vendor Confusions

Grok 4.3: 100.0% self

✓ Always self-identifies correctly

Xiaomi

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
Mimo V2.5 Pro	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	100.0%	96.7%	100.0%	100.0%	99.7%

Cross-Vendor Confusions

Mimo V2.5 Pro: 99.7% self
Mistaken as: Anthropic 0.3%

z-ai

1 model

Self-Identification Rate

Model	en	zh-Hans	zh-Hant	ja	ko	ru	es	fr	de	pt	Overall
GLM 5.1	100.0%	100.0%	100.0%	0.0%	96.7%	86.7%	16.7%	73.3%	100.0%	73.3%	74.7%

Cross-Vendor Confusions

GLM 5.1: 74.7% self
Mistaken as: Google 14.0%, Qwen 1.0%, Zhihu AI 0.3%