ginipick


AI & ML interests

None yet

Recent Activity

liked a Space about 10 hours ago

VIDraft/vkae

reacted to ginigen-ai's post with 🔥 about 15 hours ago

🧠 Does your LLM know when it's about to be wrong? Most leaderboards measure accuracy. We measure metacognition: whether a model catches its own errors. Benchmark + leaderboard + adapters, all open.

🎉 The surprise: even a K-AI #1 model (JGOS-31B-Citizen) is the strongest on multiple-choice traps (trap_rate 0.005, ~2 misses in 400) yet blind to its own free-form mistakes (self-confidence AUROC = 0.5, pure random). A tiny base-frozen adapter recovers that signal.

Two independent axes (never compared across a row):
① trap_rate: does it fall for tempting trap options? (lower = stronger)
② adapter gain Δ: how much a lightweight adapter catches errors the model itself misses. (higher = more adapter value)

What's open:
📊 300+100 trap problems (each with a hidden trap + TICOS type)
🏆 24-model leaderboard
🧩 11 per-model adapters: adapters, NOT fine-tunes (base stays frozen; the adapter just reads the hidden state → P(wrong))

Submit any HF model → auto-scored daily at 09:00 KST and added to the board.

🏆 Leaderboard → https://huggingface.co/spaces/ginigen-ai/Metacognition-Leaderboard-Space
📊 Benchmark → https://huggingface.co/datasets/ginigen-ai/Metacognition-Bench
🧩 Adapters → https://huggingface.co/collections/FINAL-Bench/metacognition-adapters-6a42c032e6beb803dd032961
📊 Article → https://huggingface.co/blog/ginigen-ai/metacognition

Benchmark by ginigen-ai · Adapters by FINAL-Bench (Darwin/Chimera platform + AETHER metacognition tech).
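To make the two numbers quoted in the post concrete, here is a minimal Python sketch of how a trap rate and an error-detection AUROC could be computed. It is an illustration under stated assumptions, not the benchmark's actual scoring code: the function names `trap_rate` and `auroc`, the option letters, and the convention that label 1 means "the answer was wrong" are all hypothetical.

```python
from typing import Sequence


def trap_rate(chosen: Sequence[str], trap: Sequence[str]) -> float:
    """Fraction of items where the model picked the trap option (lower = stronger)."""
    if not chosen or len(chosen) != len(trap):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(c == t for c, t in zip(chosen, trap)) / len(chosen)


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, so a score that carries no information
    about the label yields 0.5.
    Assumption: label 1 = the answer was wrong, score = predicted P(wrong).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# 2 trap picks out of 400 items matches the quoted trap_rate of 0.005.
chosen = ["A"] * 398 + ["B"] * 2
trap = ["B"] * 400
rate = trap_rate(chosen, trap)

# A model that reports the same confidence for every answer cannot
# separate its wrong answers from its right ones: AUROC is exactly 0.5.
labels = [1, 0, 0, 1, 0, 0]
flat_auroc = auroc([0.7] * 6, labels)

# A signal that ranks every wrong answer above every right one scores 1.0.
sharp_auroc = auroc([0.9, 0.1, 0.2, 0.8, 0.3, 0.1], labels)
```

In this sketch the pairwise AUROC is quadratic in the number of items, which is fine for a few hundred problems; a rank-based implementation would be the usual choice at larger scale.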

reacted to ginigen-ai's post with ❤️ about 15 hours ago


Organizations

ginipick's Spaces 139

Wan2.2 14B Fast

🎥

Wan_VO- generate a video from an image with a text prompt

FLUX LoRa the Explorer

🏆

Browse curated AI applications

FLUX Prompt Generator

💬

Generate detailed AI image prompts with optional photo caption


Realtime FLUX Image

💬

mcp_server & High quality Images in Realtime

DeepFake AI Tool for Videos & Images

AI News Daily

🌐

Automatically collects AI trend articles from around the world (up to 50)

Leaderboard - FINAL Bench 'Metacognitive'

🚀

Metacognitive

Retane

😻

AGI Personal

👁

Run custom Python applications from secrets

Test123

⚡

Enhance prompts and generate original & improved images

AnyTalker

🎬

Let your character interact naturally

Ginigen

💻

ginigen.ai

AGI Personal

👁

Execute custom Python code from secrets

AGI SAJU

⚡

AGI Myeongri (Korean fortune-telling) expert - Manseryeok calendar, Saju (Four Pillars), fortune readings

10 minute Marketing

🐨

Free AI blog generator using GPT-4.1 with search, file, and image support

VibeVoice Demo

👀

Launch VibeVoice demo for text-to-speech conversion

Change Hair

🏢

Change-Hair


FLUXllama gpt-oss

🏆

mcp_server & FLUX 4-bit Quantization + Enhanced

MiniMaxAI MiniMax M1 80k

😻

Generate text using a pre-trained AI model

Google Medgemma 4b It

🐨

Generate medical text using a pre-trained model

Microsoft NextCoder 32B

🌍

Generate code snippets using AI

Agentica Org DeepSWE Preview

😻

Generate text using DeepSWE-Preview model


Florence 2 Flux

ofai-llm

FaceFusion
