An Yang

yangapku

https://scholar.google.com/citations?user=vO9FZekAAAAJ

AI & ML interests

NLP and Deep Learning

Recent Activity

authored a paper 13 days ago

Qwen-Image Technical Report

authored a paper 7 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 8 days ago • 82

upvoted a paper 7 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 8 days ago • 82

authored 4 papers 13 days ago

upvoted a paper 13 days ago

Soft Adaptive Policy Optimization

Paper • 2511.20347 • Published 14 days ago • 33

liked a dataset 3 months ago

openai/healthbench

Preview • Updated Aug 27 • 519 • 104

liked a model 4 months ago

Qwen/Qwen3-4B-Thinking-2507

Text Generation • 4B • Updated Aug 6 • 715k • 480

published 4 models 4 months ago

Qwen/Qwen3-4B-Thinking-2507-FP8

Text Generation • 4B • Updated Aug 6 • 184k • 40

Qwen/Qwen3-4B-Thinking-2507

Text Generation • 4B • Updated Aug 6 • 715k • 480

Qwen/Qwen3-4B-Instruct-2507-FP8

Text Generation • 4B • Updated Sep 17 • 42.2k • 51

Qwen/Qwen3-4B-Instruct-2507

Text Generation • 4B • Updated Sep 17 • 6.29M • 531

updated a collection 4 months ago

Qwen3

Collection

84 items • Updated Aug 6 • 1.47k

authored a paper 4 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 315

liked a model 5 months ago

Qwen/QwQ-32B

Text Generation • 33B • Updated Mar 11 • 56.4k • 2.87k

liked a Space 5 months ago

Qwen3 Coder WebDev

🌍

930

Generate web code from descriptions
