dolphinlee's Collections
System 2 Attention (is something you might need too)
Paper
• 2311.11829
• Published
• 43
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Paper
• 2311.11315
• Published
• 7
Paper
• 2312.07000
• Published
• 15
Steering Llama 2 via Contrastive Activation Addition
Paper
• 2312.06681
• Published
• 14
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Paper
• 2312.06674
• Published
• 8
Controlled Decoding from Language Models
Paper
• 2310.17022
• Published
• 15
Vision-Language Models as a Source of Rewards
Paper
• 2312.09187
• Published
• 12
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper
• 2401.06080
• Published
• 28
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Paper
• 2401.08417
• Published
• 37
Weaver: Foundation Models for Creative Writing
Paper
• 2401.17268
• Published
• 45
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper
• 2402.14905
• Published
• 134
StarCoder 2 and The Stack v2: The Next Generation
Paper
• 2402.19173
• Published
• 152
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Paper
• 2403.02884
• Published
• 17
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
• 2403.03507
• Published
• 189
User-LLM: Efficient LLM Contextualization with User Embeddings
Paper
• 2402.13598
• Published
• 21
CodecLM: Aligning Language Models with Tailored Synthetic Data
Paper
• 2404.05875
• Published
• 18