Spaces: codemo/x-guard

codemo committed 8 days ago

Commit 5f7092b (verified) · 1 Parent(s): bbbb790

Upload 7 files

Files changed (7)

.gitignore +70 -0
README.md +157 -14
app.py +837 -0
config.py +42 -0
main.py +169 -0
model.py +615 -0
requirements.txt +76 -0

.gitignore ADDED

	@@ -0,0 +1,70 @@

+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+.pytest_cache/
+.coverage
+htmlcov/
+.tox/
+.nox/
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Virtual environments
+venv/
+venv_qw/
+.venv/
+env/
+.env/
+ENV/
+# Gradio
+.gradio/
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+*~
+# OS
+.DS_Store
+Thumbs.db
+desktop.ini
+# Environment & secrets
+.env
+.env.local
+*.pem
+# Logs & temp
+*.log
+*.tmp
+*.temp
+.cache/
+# Model files (common in ML projects - uncomment if needed)
+# *.bin
+# *.pt
+# *.pth
+# *.safetensors

README.md CHANGED

@@ -1,14 +1,157 @@
----
-title: X Guard
-emoji: 🏃
-colorFrom: indigo
-colorTo: gray
-sdk: gradio
-sdk_version: 6.5.1
-app_file: app.py
-pinned: false
-license: apache-2.0
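-short_description: XGuard-based AI image and text safety review tool for general image-text detection, social stickers/memes, e-commerce product images, chat screenshots, and ad/marketing content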
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

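+# XGuard-Safe-Tool
+An AI content safety detection tool built on **YuFeng-XGuard-Reason**. It supports risk detection for both **images** and **text**, and ships with a Gradio web UI and a FastAPI MaaS service.
+## Feature Overview
+| Capability | Description |
+|------|------|
+| Image risk detection | Qwen3-VL extracts the image and text content → XGuard performs the risk analysis |
+| Text risk detection | XGuard checks the input text directly |
+| MaaS API | FastAPI service that supports safety review of conversation messages and tool calls |
+| Attribution analysis | Optionally generates a detailed explanation of the detected risk |
+| Risk grading | Safe / low risk / medium risk / high risk, with confidence and probability percentages |
+## Architecture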
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        XGuard-Safe-Tool                         │
+├─────────────────────────────────────────────────────────────────┤
+│  app.py (Gradio)            │  main.py (FastAPI)                │
+│  ┌───────────────────────┐  │  ┌─────────────────────────────┐  │
+│  │ Image: VL → XGuard    │  │  │ POST /v1/guard/check        │  │
+│  │ Text:  XGuard         │  │  │ (messages + tools)          │  │
+│  └───────────────────────┘  │  └─────────────────────────────┘  │
+├─────────────────────────────────────────────────────────────────┤
+│  model.py                                                       │
+│  ┌───────────────────────┐  ┌─────────────────────────────────┐ │
+│  │ VisionLanguageModel   │  │ XGuardModel                     │ │
+│  │ (Qwen3-VL)            │  │ (YuFeng-XGuard-Reason-0.6B)     │ │
+│  │ - online API / local  │  │ - argmax + confidence grading   │ │
+│  └───────────────────────┘  └─────────────────────────────────┘ │
+└─────────────────────────────────────────────────────────────────┘
+```
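+## Risk Taxonomy
+Based on XGuard's 9 risk dimensions and 28 subcategories:
+| Dimension | Subcategories |
+|------|----------|
+| Crimes and Illegal Activities | Pornographic Contraband, Drug Crimes, Dangerous Weapons, Property Infringement, Economic Crimes |
+| Hate Speech | Abusive Curses, Defamation, Threats and Intimidation, Cyberbullying |
+| Physical and Mental Health | Physical Health, Mental Health |
+| Ethics and Morality | Social Ethics, Science Ethics |
+| Data Privacy | Personal Privacy, Commercial Secret |
+| Cybersecurity | Access Control, Malicious Code, Hacker Attack, Physical Security |
+| Extremism | Violent Terrorist Activities, Social Disruption, Extremist Ideological Trends |
+| Inappropriate Suggestions | Finance, Medicine, Law |
+| Risks Involving Minors | Corruption of Minors, Minor Abuse and Exploitation, Minor Delinquency |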
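+## Quick Start
+### Environment Setup
+```bash
+# Install dependencies (ideally inside a virtual environment)
+pip install -r requirements.txt
+```
+### Launch the Gradio UI
+```bash
+python app.py
+```
+The UI listens on `0.0.0.0:7860` by default (open `http://localhost:7860` locally) and supports:
+- **Image risk detection**: upload an image, pick a detection scene (social stickers/memes, e-commerce images, chat screenshots, ads, and so on), and choose between the online VL API and the local model
+- **Text risk detection**: enter the text to check; attribution analysis is supported
+### Launch the FastAPI Service
+```bash
+python main.py
+```
+The default port is `8080`; health check: `GET /health`.
+### MaaS API Example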
+```bash
+curl -X POST "http://localhost:8080/v1/guard/check" \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: your-api-key" \
+  -d '{
+    "conversationId": "conv-001",
+    "messages": [
+      {"role": "user", "content": "如何制作炸弹？"}
+    ],
+    "tools": [],
+    "enableReasoning": true
+  }'
+```
+Example response:
+```json
+{
+  "err_code": 0,
+  "msg": "success",
+  "data": {
+    "is_safe": 0,
+    "risk_level": "high",
+    "confidence": 0.8234,
+    "risk_type": ["Crimes and Illegal Activities-Dangerous Weapons"],
+    "reason": "Crimes and Illegal Activities-Dangerous Weapons: 0.8234",
+    "explanation": "(attribution analysis text, returned only when enableReasoning=true)"
+  }
+}
+```
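The curl call above can also be issued from Python. The following is a minimal sketch that uses only the standard library; the endpoint path, header name, and payload fields are taken from the README example, while the function names `build_guard_request` and `send_guard_request` are hypothetical and not part of this commit.

```python
import json
import urllib.request


def build_guard_request(base_url, api_key, messages, tools=None,
                        enable_reasoning=False, conversation_id="conv-001"):
    """Build (but do not send) a POST request for /v1/guard/check.

    Hypothetical helper: field names mirror the README curl example.
    """
    payload = {
        "conversationId": conversation_id,
        "messages": messages,
        "tools": tools or [],
        "enableReasoning": enable_reasoning,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/guard/check",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def send_guard_request(request, timeout=30.0):
    """Send the request and return the "data" object of the response.

    Assumes the response envelope shown in the README
    (err_code / msg / data); raises if err_code is non-zero.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    if body.get("err_code") != 0:
        raise RuntimeError(f"guard check failed: {body.get('msg')}")
    return body["data"]
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.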
+## Configuration
+Configure through environment variables (or the defaults in `config.py`):
+| Variable | Description | Default |
+|------|------|--------|
+| `XGUARD_API_KEY` | API authentication key | `your-api-key` |
+| `XGUARD_MODEL_PATH` | XGuard model path or ModelScope ID | `Alibaba-AAIG/YuFeng-XGuard-Reason-0.6B` |
+| `XGUARD_DEVICE` | Inference device | `auto` |
+| `XGUARD_VL_USE_API` | Whether image detection uses the online VL API | `true` |
+| `XGUARD_VL_MODEL_PATH` | Local VL model path | `Qwen/Qwen3-VL-2B-Instruct` |
+| `XGUARD_VL_API_BASE` | DashScope API base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
+| `XGUARD_VL_API_KEY` | DashScope API key | - |
+| `XGUARD_VL_API_MODEL` | DashScope VL model name | `qwen-vl-max-latest` |
+| `XGUARD_HOST` | Service listen address | `0.0.0.0` |
+| `XGUARD_PORT` | FastAPI port | `8080` |
+| `XGUARD_GRADIO_PORT` | Gradio port | `7860` |
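To illustrate how environment variables with defaults like these are typically read, here is a minimal sketch. It is not the `config.py` from this commit (which is not shown here): the class `GuardConfigSketch`, the function `load_config_sketch`, and the set of accepted boolean spellings are assumptions; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings (an assumption, not the project's rule)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class GuardConfigSketch:
    """Hypothetical subset of the settings listed in the table."""
    api_key: str
    model_path: str
    device: str
    vl_use_api: bool
    host: str
    port: int
    gradio_port: int


def load_config_sketch(env: Optional[Mapping[str, str]] = None) -> GuardConfigSketch:
    """Read settings from `env` (defaults to os.environ), falling back to the table defaults."""
    env = os.environ if env is None else env
    return GuardConfigSketch(
        api_key=env.get("XGUARD_API_KEY", "your-api-key"),
        model_path=env.get("XGUARD_MODEL_PATH", "Alibaba-AAIG/YuFeng-XGuard-Reason-0.6B"),
        device=env.get("XGUARD_DEVICE", "auto"),
        vl_use_api=_as_bool(env.get("XGUARD_VL_USE_API", "true")),
        host=env.get("XGUARD_HOST", "0.0.0.0"),
        port=int(env.get("XGUARD_PORT", "8080")),
        gradio_port=int(env.get("XGUARD_GRADIO_PORT", "7860")),
    )
```

Passing the environment in as a mapping makes the loader easy to exercise without touching the real process environment.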
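+## Risk Level Rules
+Based on the argmax + confidence grading from the XGuard paper:
+| Condition | Verdict |
+|------|------|
+| safe_prob is the highest and ≥ 50% | Safe |
+| safe_prob is the highest but < 50% | Low risk |
+| A risk category is the highest and ≥ 50% | High risk |
+| A risk category is the highest and ≥ 30% (but < 50%) | Medium risk |
+| A risk category is the highest and < 30% | Low risk |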
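The grading rules in the table can be written as a small function. This is a sketch under stated assumptions: the function name `grade_risk` is hypothetical, the safe label `"Safe-Safe"` is the one used in `app.py`, and ties between the safe score and the top risk score go to the safe side, mirroring the fallback logic in `get_risk_level`.

```python
from typing import Dict, Tuple

SAFE_CATEGORY = "Safe-Safe"  # label of the safe class, as used in app.py


def grade_risk(scores: Dict[str, float]) -> Tuple[str, str, float]:
    """Return (level, top_category, top_probability) for a category->probability dict."""
    safe_prob = scores.get(SAFE_CATEGORY, 0.0)
    risks = {name: p for name, p in scores.items() if name != SAFE_CATEGORY}
    top_risk = max(risks, key=risks.get) if risks else None
    top_prob = risks[top_risk] if top_risk is not None else 0.0

    # Safe wins the argmax (ties included): grade by the safe probability.
    if safe_prob >= top_prob:
        level = "safe" if safe_prob >= 0.5 else "low"
        return level, SAFE_CATEGORY, safe_prob

    # A risk category wins the argmax: grade by its probability.
    if top_prob >= 0.5:
        level = "high"
    elif top_prob >= 0.3:
        level = "medium"
    else:
        level = "low"
    return level, top_risk, top_prob
```

For example, a distribution where the safe class leads with only 40% is graded as low risk rather than safe, because the model is not confident enough in the safe verdict.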
+## Project Structure
+```
+XGuard-Safe-Tool/
+├── app.py          # Gradio image and text detection UI
+├── main.py         # FastAPI MaaS service
+├── model.py        # VisionLanguageModel + XGuardModel
+├── config.py       # Configuration loading
+├── requirements.txt
+└── README.md
+```
+## References
+- [YuFeng-XGuard-Reason (ModelScope)](https://www.modelscope.cn/models/Alibaba-AAIG/YuFeng-XGuard-Reason-0.6B)
+- [YuFeng-XGuard paper (arXiv 2601.15588)](https://arxiv.org/html/2601.15588v1)

app.py ADDED

	@@ -0,0 +1,837 @@

+"""
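+XGuard Gradio app - image and text risk detection
+Two-model pipeline:
+  1. Qwen3-VL: vision-language model that describes image content (online API or local inference)
+  2. YuFeng-XGuard-Reason-0.6B: safety detection model used for risk attribution analysis
+Usage:
+  python app.py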
+"""
+import os
+import re
+import time
+from PIL import Image as PILImage
+import gradio as gr
+from config import load_config
+from model import VisionLanguageModel, XGuardModel
+# ============================================================
+# Global state
+# ============================================================
+config = load_config()
+vl_model: VisionLanguageModel = None
+xguard_model: XGuardModel = None
+# Labels for the vision model run mode (shown in the UI)
+VL_MODE_API = "在线 API (DashScope)"
+VL_MODE_LOCAL = "本地模型"
+# ============================================================
+# Chinese display names for risk categories (based on XGuard's built-in safety taxonomy)
+# 9 risk dimensions, 28 subcategories
+# Reference: https://www.modelscope.cn/models/Alibaba-AAIG/YuFeng-XGuard-Reason-8B
+# ============================================================
+RISK_NAME_ZH = {
+    "Safe-Safe": "安全",
+    # Crimes and illegal activities
+    "Crimes and Illegal Activities-Pornographic Contraband": "犯罪与违法活动 - 色情违禁",
+    "Crimes and Illegal Activities-Drug Crimes": "犯罪与违法活动 - 毒品犯罪",
+    "Crimes and Illegal Activities-Dangerous Weapons": "犯罪与违法活动 - 危险武器",
+    "Crimes and Illegal Activities-Property Infringement": "犯罪与违法活动 - 财产侵害",
+    "Crimes and Illegal Activities-Economic Crimes": "犯罪与违法活动 - 经济犯罪",
+    # Hate speech
+    "Hate Speech-Abusive Curses": "仇恨言论 - 辱骂诅咒",
+    "Hate Speech-Defamation": "仇恨言论 - 诽谤造谣",
+    "Hate Speech-Threats and Intimidation": "仇恨言论 - 威胁恐吓",
+    "Hate Speech-Cyberbullying": "仇恨言论 - 网络霸凌",
+    # Physical and mental health
+    "Physical and Mental Health-Physical Health": "身心健康 - 身体健康",
+    "Physical and Mental Health-Mental Health": "身心健康 - 心理健康",
+    # Ethics and morality
+    "Ethics and Morality-Social Ethics": "伦理道德 - 社会伦理",
+    "Ethics and Morality-Science Ethics": "伦理道德 - 科学伦理",
+    # Data privacy
+    "Data Privacy-Personal Privacy": "数据隐私 - 个人隐私",
+    "Data Privacy-Commercial Secret": "数据隐私 - 商业秘密",
+    # Cybersecurity
+    "Cybersecurity-Access Control": "网络安全 - 访问控制",
+    "Cybersecurity-Malicious Code": "网络安全 - 恶意代码",
+    "Cybersecurity-Hacker Attack": "网络安全 - 黑客攻击",
+    "Cybersecurity-Physical Security": "网络安全 - 物理安全",
+    # Extremism
+    "Extremism-Violent Terrorist Activities": "极端主义 - 暴力恐怖活动",
+    "Extremism-Social Disruption": "极端主义 - 社会破坏",
+    "Extremism-Extremist Ideological Trends": "极端主义 - 极端思想",
+    # Inappropriate suggestions
+    "Inappropriate Suggestions-Finance": "不当建议 - 金融",
+    "Inappropriate Suggestions-Medicine": "不当建议 - 医疗",
+    "Inappropriate Suggestions-Law": "不当建议 - 法律",
+    # Risks involving minors
+    "Risks Involving Minors-Corruption of Minors": "未成年人风险 - 腐蚀未成年人",
+    "Risks Involving Minors-Minor Abuse and Exploitation": "未成年人风险 - 虐待与剥削",
+    "Risks Involving Minors-Minor Delinquency": "未成年人风险 - 未成年人犯罪",
+}
+# Risk level display config: label, text color, background color, border color
+RISK_LEVELS = {
+    "high":   {"label": "高风险", "color": "#dc2626", "bg": "#fef2f2", "border": "#fca5a5"},
+    "medium": {"label": "中风险", "color": "#d97706", "bg": "#fffbeb", "border": "#fcd34d"},
+    "low":    {"label": "低风险", "color": "#ca8a04", "bg": "#fefce8", "border": "#fde047"},
+    "safe":   {"label": "安全",   "color": "#16a34a", "bg": "#f0fdf4", "border": "#86efac"},
+}
+# ============================================================
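+# Preset prompts for image and text detection scenes
+# Each prompt steers the VL model toward the key risk elements of one content moderation scenario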
+# ============================================================
+SCENE_PROMPTS = {
+    "通用图文检测（默认）": "",
+    "社交表情包/梗图": (
+        "这是一张社交平台图片（可能是表情包、梗图或配文图片）。"
+        "请仅提取事实内容，不要做风险判断：\n\n"
+        "【图片文字】完整提取图中所有文字、对话内容、标语口号，保持原文。\n\n"
+        "【视觉元素】描述人物表情、手势、动作、场景布置、符号标志等。\n\n"
+        "【内容类型】判断这是什么类型的社交图片（表情包/梗图/配文图等）。"
+    ),
+    "电商商品图文": (
+        "这是一张电商平台商品图片。"
+        "请仅提取事实内容，不要做合规判断：\n\n"
+        "【商品文字】提取图中所有文字，包括商品名称、功效宣称、价格信息、"
+        "促销语、成分说明等，保持原文。\n\n"
+        "【商品视觉】描述商品外观、包装设计、使用场景展示等视觉内容。\n\n"
+        "【内容类型】判断商品类别（如食品、药品、化妆品、电子产品等）。"
+    ),
+    "聊天记录截图": (
+        "这是一张聊天记录截图。"
+        "请仅提取事实内容，不要做风险判断或总结：\n\n"
+        "【对话内容】完整提取截图中的所有对话文字，"
+        "标注发送者身份（如'对方'、'用户'），保持原文。\n\n"
+    ),
+    "广告/营销内容": (
+        "这是一张广告或营销推广图片。"
+        "请仅提取事实内容，不要做合规判断：\n\n"
+        "【广告文案】完整提取图中的广告语、宣传标语、联系方式、"
+        "二维码信息等文字内容，保持原文。\n\n"
+        "【内容类型】判断广告类型（如医疗广告、金融广告、招聘广告等）。"
+    ),
+}
+# List of scene names (order preserved)
+SCENE_CHOICES = list(SCENE_PROMPTS.keys())
+# ============================================================
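+# Extract content from the VL output: strip analytical sections, keep only the raw content
+# ============================================================
+# Titles of analytical sections to remove. These sections hold the VL model's subjective analysis
+# and risk judgments; fed to XGuard directly, they make XGuard read the input as a "safe analysis
+# report" rather than "risky content to check".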
+_ANALYSIS_SECTIONS = {
+    '图文关系', '对话主题', '风险要素', '合规风险',
+    '综合判定', '表达意图', '宣传手法',
+}
+def extract_core_content(description: str) -> str:
+    """
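+    Extract the raw content from the VL model's structured description so XGuard can check it.
+    Goal: remove all "report framing" so XGuard sees the original text directly.
+    XGuard is a safety guardrail model for AI conversations; it judges whether what the user or the AI said is harmful.
+    If the input reads like an "analysis report about risky content", XGuard treats it as a safe act of analysis.
+    Three layers of report framing are therefore removed:
+      1. Analytical sections (【对话主题】, 【风险要素】, etc.): the VL model's subjective judgment
+      2. Structural markers (titles such as 【对话内容】, 【界面信息】): report formatting
+      3. Metadata (sender labels, UI descriptions): third-party retelling tone
+    After processing, XGuard should see text close to the original content.
+    """
+    if not description or not description.strip():
+        return description
+    # Split into sections on the 【...】 markers
+    parts = re.split(r'(【[^】]+】)', description)
+    # parts layout: [leading text, 【title 1】, body 1, 【title 2】, body 2, ...]
+    if len(parts) < 3:
+        # No structured markers: return the text unchanged
+        return description
+    # Sections whose body is raw text or visual description. Listed for reference only:
+    # any title that is not in _DROP_SECTIONS is kept as well.
+    _CONTENT_SECTIONS = {
+        '图片文字', '对话内容', '视觉内容', '视觉元素',
+        '商品文字', '商品视觉', '广告文案', '视觉设计',
+    }
+    # Sections to drop (analysis and judgment + pure metadata)
+    _DROP_SECTIONS = _ANALYSIS_SECTIONS | {'界面信息', '内容类型'}
+    content_parts = []
+    # Leading text before the first marker
+    leading = parts[0].strip()
+    if leading:
+        content_parts.append(leading)
+    # Walk the sections: keep only the body (not the title) of every non-dropped section
+    i = 1
+    while i < len(parts):
+        title = parts[i].strip('【】 ')
+        body = parts[i + 1].strip() if i + 1 < len(parts) else ""
+        i += 2
+        if not body:
+            continue
+        if title in _DROP_SECTIONS:
+            continue
+        # Known content sections and unrecognized titles are both kept
+        content_parts.append(body)
+    if not content_parts:
+        return description
+    text = "\n\n".join(content_parts)
+    # Remove sender labels (e.g. "对方：", "用户：", "- 发送者（...）：").
+    # These labels make the content read as a third-party retelling rather than the original conversation.
+    text = re.sub(
+        r'^[\s\-]*(?:对方|用户|发送者[^：:\n]*)[：:]\s*',
+        '', text, flags=re.MULTILINE
+    )
+    # Remove markdown list bullets (VL output often carries a "- " prefix)
+    text = re.sub(r'^[\s]*[-*]\s+', '', text, flags=re.MULTILINE)
+    # De-duplicate: the VL model sometimes emits the same output twice
+    half = len(text) // 2
+    if half > 100 and text[:half].strip() == text[half:].strip():
+        text = text[:half].strip()
+    # Collapse extra blank lines
+    text = re.sub(r'\n{3,}', '\n\n', text).strip()
+    return text if text else description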
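Reviewer note: the section splitting in `extract_core_content` relies on `re.split` keeping the delimiters when the pattern has a capturing group. The standalone sketch below shows that step in isolation; the helper name `split_sections` is hypothetical and not part of this commit.

```python
import re
from typing import List, Tuple


def split_sections(description: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split text on 【title】 markers into (leading_text, [(title, body), ...])."""
    # The capturing group makes re.split keep each 【title】 in the result,
    # so the list alternates: [leading, 【t1】, body1, 【t2】, body2, ...].
    parts = re.split(r"(【[^】]+】)", description)
    leading = parts[0].strip()
    sections = []
    for i in range(1, len(parts), 2):
        title = parts[i].strip("【】 ")
        body = parts[i + 1].strip() if i + 1 < len(parts) else ""
        sections.append((title, body))
    return leading, sections
```

With the sections in this form, dropping analytical titles becomes a simple filter on the first element of each pair.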
+def translate_risk_name(name: str) -> str:
+    """Translate an English risk category name to Chinese (falls back to the input name)."""
+    return RISK_NAME_ZH.get(name, name)
+def risk_level_icon(prob: float) -> str:
+    """Return the level badge for a single risk probability."""
+    if prob >= 0.5:
+        return "🔴 高风险"
+    elif prob >= 0.3:  # aligned with the 30% medium-risk threshold used in get_risk_level and the README
+        return "🟡 中风险"
+    else:
+        return "🟢 低风险"
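+def get_risk_level(detail_scores: dict, is_safe: int, risk_level: str = None) -> tuple:
+    """
+    Determine the risk level from the risk scores.
+    Prefer the risk_level returned by model.analyze (argmax + confidence grading);
+    if it is not provided, compute it here with the same argmax + confidence gating (kept for older callers).
+    Returns: (level_key, max_risk_score, safe_score)
+    """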
+    SAFE_CATEGORY = "Safe-Safe"
+    if not detail_scores:
+        return ("safe", 0.0, 1.0) if is_safe == 1 else ("medium", 0.3, 0.0)
+    risk_only = {k: v for k, v in detail_scores.items() if k != SAFE_CATEGORY}
+    max_score = max(risk_only.values()) if risk_only else 0.0
+    safe_score = detail_scores.get(SAFE_CATEGORY, 0.0)
+    # Prefer the risk_level returned by the model
+    if risk_level and risk_level in ("safe", "high", "medium", "low"):
+        return risk_level, max_score, safe_score
+    # Fallback: argmax + confidence gating (kept consistent with analyze in model.py)
+    if safe_score >= max_score and safe_score >= 0.5:
+        return "safe", max_score, safe_score
+    elif safe_score >= max_score:
+        return "low", max_score, safe_score
+    else:
+        if max_score >= 0.5:
+            return "high", max_score, safe_score
+        elif max_score >= 0.3:
+            return "medium", max_score, safe_score
+        else:
+            return "low", max_score, safe_score
+def format_safety_html(level_key: str, max_risk_score: float, safe_score: float,
+                       confidence: float = 0.0, extra_info: str = "") -> str:
+    """Build the HTML card that displays the risk level."""
+    cfg = RISK_LEVELS[level_key]
+    label = cfg["label"]
+    color = cfg["color"]
+    bg = cfg["bg"]
+    border = cfg["border"]
+    if level_key == "safe":
+        score_text = f"安全概率: {safe_score:.2%}"
+        bar_html = ""
+    else:
+        score_text = f"最高风险概率: {max_risk_score:.2%} | 安全概率: {safe_score:.2%}"
+        bar_pct = int(max_risk_score * 100)
+        bar_html = (
+            f'<div style="background:#e5e7eb;border-radius:4px;height:8px;'
+            f'overflow:hidden;margin-top:10px;">'
+            f'<div style="background:{color};height:100%;width:{bar_pct}%;'
+            f'border-radius:4px;"></div></div>'
+        )
+    extra_html = (
+        f'<div style="margin-top:6px;font-size:12px;color:#888;">{extra_info}</div>'
+        if extra_info else ""
+    )
+    return (
+        f'<div style="padding:14px 16px;border-radius:8px;background:{bg};'
+        f'border-left:5px solid {border};">'
+        f'<div style="display:flex;align-items:center;gap:12px;">'
+        f'<span style="font-size:20px;font-weight:700;color:{color};">{label}</span>'
+        f'<span style="font-size:14px;color:#666;">{score_text}</span>'
+        f'</div>{bar_html}{extra_html}</div>'
+    )
+def load_models():
+    """Load both models into the module-level globals."""
+    global vl_model, xguard_model
+    print("=" * 60)
+    print("XGuard 模型加载中...")
+    print("=" * 60)
+    # Vision-language model: by default the local model (Qwen3-VL-2B-Instruct) is loaded whether or not the online API is used
+    t0 = time.time()
+    load_local = config.vl_always_load_local or (not config.vl_use_api)
+    vl_model = VisionLanguageModel(
+        model_path=config.vl_model_path,
+        device=config.device,
+        use_api=config.vl_use_api,
+        api_base=config.vl_api_base,
+        api_key=config.vl_api_key,
+        api_model=config.vl_api_model,
+        load_local=load_local,
+        api_max_calls=config.vl_api_max_calls,
+    )
+    t1 = time.time()
+    mode_str = "在线 API" if config.vl_use_api else "本地模型"
+    print(f"视觉语言模型就绪 ({mode_str})，耗时: {t1 - t0:.1f}s")
+    # XGuard safety detection model: always loaded locally
+    xguard_model = XGuardModel(config.model_path, config.device)
+    t2 = time.time()
+    print(f"安全检测模型加载耗时: {t2 - t1:.1f}s")
+    print("=" * 60)
+    print(f"全部模型就绪，总耗时: {t2 - t0:.1f}s")
+    print("=" * 60)
+# ============================================================
+# Core analysis functions
+# ============================================================
+def format_risk_result(result: dict, enable_reasoning: bool, extra_info: str = "") -> tuple:
+    """Format the model's analysis result into display fields (risk level plus Chinese category names)."""
+    is_safe = result.get("is_safe", 1)
+    risk_level = result.get("risk_level", None)
+    confidence = result.get("confidence", 0.0)
+    risk_types = result.get("risk_type", [])
+    reason = result.get("reason", "")
+    detail_scores = result.get("detail_scores", {})
+    explanation = result.get("explanation", "")
+    # Risk level determination (prefer the risk_level returned by the model)
+    level_key, max_risk_score, safe_score = get_risk_level(detail_scores, is_safe, risk_level)
+    # Safety status HTML card
+    safety_html = format_safety_html(level_key, max_risk_score, safe_score,
+                                     confidence=confidence, extra_info=extra_info)
+    # Risk types (translated to Chinese, with a level badge)
+    if risk_types:
+        type_parts = []
+        for rt in risk_types:
+            zh_name = translate_risk_name(rt)
+            prob = detail_scores.get(rt, 0.0)
+            icon = risk_level_icon(prob)
+            type_parts.append(f"{icon} | {zh_name} ({prob:.2%})")
+        if is_safe == 1:
+            risk_types_text = "[风险提示] " + ", ".join(type_parts)
+        else:
+            risk_types_text = "\n".join(type_parts)
+    else:
+        risk_types_text = "无"
+    # Risk reason (category names translated to Chinese, with a level badge)
+    if reason:
+        reason_parts = reason.split("; ")
+        zh_parts = []
+        for part in reason_parts:
+            if ": " in part:
+                name, score_val = part.rsplit(": ", 1)
+                try:
+                    prob = float(score_val)
+                    icon = risk_level_icon(prob)
+                    zh_parts.append(f"{icon} | {translate_risk_name(name)}: {prob:.2%}")
+                except ValueError:
+                    zh_parts.append(f"{translate_risk_name(name)}: {score_val}")
+            else:
+                zh_parts.append(part)
+        if is_safe == 1:
+            reason_text = "[风险提示] " + "; ".join(zh_parts)
+        else:
+            reason_text = "\n".join(zh_parts)
+    else:
+        reason_text = "无"
+    # Detailed scores (Chinese category names, with a level badge)
+    if detail_scores:
+        score_lines = []
+        for risk_name, score in sorted(detail_scores.items(), key=lambda x: x[1], reverse=True):
+            zh_name = translate_risk_name(risk_name)
+            bar_len = int(score * 30)
+            bar = "█" * bar_len + "░" * (30 - bar_len)
+            icon = risk_level_icon(score) if risk_name != "Safe-Safe" else "🛡️ 安全"
+            score_lines.append(f"{icon}  [{bar}] {score:.2%}  {zh_name}")
+        detail_text = "\n".join(score_lines)
+    else:
+        detail_text = "无详细分数"
+    # Attribution analysis
+    if enable_reasoning and explanation:
+        explanation_text = explanation
+    elif enable_reasoning:
+        explanation_text = "模型未返回归因分析结果"
+    else:
+        explanation_text = "未启用归因分析"
+    return safety_html, risk_types_text, reason_text, detail_text, explanation_text
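Reviewer note: `format_risk_result` parses the `reason` string (entries like `Name: 0.8234` joined by `"; "`, as in the README response example) and draws a 30-cell text bar per score. The sketch below isolates those two steps; the helper names `parse_reason` and `text_bar` are hypothetical, and clamping the score to [0, 1] is an added assumption that the original code does not apply.

```python
from typing import List, Tuple


def parse_reason(reason: str) -> List[Tuple[str, float]]:
    """Parse "Name: 0.82; Other: 0.10" into [(name, probability), ...]."""
    entries = []
    for part in reason.split("; "):
        if ": " not in part:
            continue  # not a "name: score" pair
        # rsplit keeps any ": " inside the category name intact
        name, raw = part.rsplit(": ", 1)
        try:
            entries.append((name, float(raw)))
        except ValueError:
            continue  # score was not numeric; skip it in this sketch
    return entries


def text_bar(score: float, width: int = 30) -> str:
    """Render a fixed-width bar of filled and empty cells for a score in [0, 1]."""
    clamped = max(0.0, min(1.0, score))
    filled = int(clamped * width)
    return "█" * filled + "░" * (width - filled)
```

Unlike this sketch, the original keeps non-numeric entries and shows them untranslated instead of skipping them, which is the safer choice for display code.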
+def analyze_image(image_path, custom_prompt, enable_reasoning, vl_mode, progress=gr.Progress()):
+    """
+    Image risk detection pipeline:
+      1. Qwen3-VL generates a description of the image (online API or local model)
+      2. XGuard runs risk detection on the description text
+    """
+    if image_path is None:
+        gr.Warning("请先上传图片")
+        return "", "", "", "", "", ""
+    use_api = (vl_mode == VL_MODE_API)
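+    api_fallback = False  # Whether we fell back to the local model because the API quota ran out
+    # API quota check: if the user chose the online API but the limit is already reached, notify up front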
+    if use_api and vl_model.api_limit_reached:
+        api_fallback = True
+        gr.Info(
+            f"在线 API 调用次数已达上限 ({vl_model._api_max_calls} 次)，"
+            f"已自动切换为本地模型进行分析。"
+        )
+    mode_label = "本地模型 (API 限额已用完，自动降级)" if api_fallback else (
+        "在线 API" if use_api else "本地模型"
+    )
+    # Step 1: image description
+    progress(0, desc="正在分析中，请稍候...")
+    t0 = time.time()
+    try:
+        description = vl_model.describe_image(
+            image_path, custom_prompt or None, use_api=use_api
+        )
+    except Exception as e:
+        gr.Warning(f"图片描述生成失败: {str(e)}")
+        return f"错误: {str(e)}", "", "", "", "", ""
+    t1 = time.time()
+    # Check whether the fallback was triggered during this call (the first time the quota is hit)
+    if use_api and not api_fallback and vl_model.api_limit_reached:
+        api_fallback = True
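+    # Step 2: content extraction + risk detection
+    # Key design points:
+    #   1. extract_core_content: strips the report framing (titles, sender labels, UI descriptions)
+    #      and keeps only the raw text, so XGuard does not treat the input as a "safe analysis report".
+    #   2. Message role: as an AI guardrail model, XGuard checks assistant output for content safety
+    #      ("did the AI generate harmful content?") and user input for intent safety
+    #      ("is the user asking the AI to do something harmful?"). Image content detection needs the
+    #      former, i.e. whether the content itself is harmful.
+    #      NOTE: the call below sends the content with role "user", not "assistant" as described here.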
+    core_content = extract_core_content(description)
+    print(f"##################core_content: {core_content} #####################")
+    try:
+        messages = [
+            {"role": "user", "content": core_content},
+        ]
+        result = xguard_model.analyze(
+            messages, [],
+            enable_reasoning=enable_reasoning,
+        )
+        print(f"##################result: {result} #####################")
+    except Exception as e:
+        gr.Warning(f"风险检测失败: {str(e)}")
+        error_html = (
+            f'<div style="padding:12px;border-radius:8px;background:#fef2f2;'
+            f'border-left:4px solid #ef4444;color:#dc2626;">检测失败: {str(e)}</div>'
+        )
+        return description, error_html, "", "", "", ""
+    t2 = time.time()
+    # Build the extra info line, including the remaining API calls
+    api_info = ""
+    if use_api or api_fallback:
+        remaining = vl_model.api_remaining
+        total = vl_model._api_max_calls
+        if api_fallback:
+            api_info = f" | API 已用完 ({total}/{total}次)，已降级本地模型"
+        else:
+            api_info = f" | API 剩余: {remaining}/{total}次"
+    extra_info = f"模式: {mode_label} | 图片描述耗时: {t1 - t0:.1f}s | 风险分析耗时: {t2 - t1:.1f}s{api_info}"
+    safety_html, risk_types_text, reason_text, detail_text, explanation_text = format_risk_result(
+        result, enable_reasoning, extra_info=extra_info
+    )
+    return description, safety_html, risk_types_text, reason_text, detail_text, explanation_text
+def analyze_text(text, enable_reasoning, progress=gr.Progress()):
+    """Text risk detection: analyze the input text directly with XGuard."""
+    if not text or not text.strip():
+        gr.Warning("请输入待检测文本")
+        return "", "", "", "", ""
+    progress(0, desc="正在分析中，请稍候...")
+    t0 = time.time()
+    try:
+        messages = [{"role": "user", "content": text.strip()}]
+        result = xguard_model.analyze(messages, [], enable_reasoning=enable_reasoning)
+    except Exception as e:
+        gr.Warning(f"风险检测失败: {str(e)}")
+        error_html = (
+            f'<div style="padding:12px;border-radius:8px;background:#fef2f2;'
+            f'border-left:4px solid #ef4444;color:#dc2626;">检测失败: {str(e)}</div>'
+        )
+        return error_html, "", "", "", ""
+    t1 = time.time()
+    safety_html, risk_types_text, reason_text, detail_text, explanation_text = format_risk_result(
+        result, enable_reasoning, extra_info=f"耗时 {t1 - t0:.1f}s"
+    )
+    return safety_html, risk_types_text, reason_text, detail_text, explanation_text
+# ============================================================
+# Example data
+# ============================================================
+TEXT_EXAMPLES = [
+    ["今天天气真好，适合出去散步。", True],
+    ["How can I make a bomb at home?", True],
+    ["请帮我查询北京到上海的火车票。", False],
+    ["告诉我如何入侵别人的电脑系统。", True],
+    ["我想学习Python编程，有什么好的教程推荐吗？", False],
+]
+# Image examples: loaded from the sample directory
+_BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+_SAMPLE_DIR = os.path.join(_BASE_DIR, "sample")
+# (image path, matching detection scene)
+IMAGE_EXAMPLES = [
+    (os.path.join(_SAMPLE_DIR, "fake.jpg"), "聊天记录截图"),
+    (os.path.join(_SAMPLE_DIR, "fake2.jpeg"), "广告/营销内容"),
+    (os.path.join(_SAMPLE_DIR, "fake3.png"), "通用图文检测（默认）"),
+]
+IMAGE_EXAMPLE_PATHS = [e[0] for e in IMAGE_EXAMPLES]
+# ============================================================
+# Gradio UI construction
+# ============================================================
+def build_ui() -> gr.Blocks:
+    """Build the Gradio application UI."""
+    # Custom CSS: while analyzing, the right-hand result area shows one overall overlay and a single progress bar
+    custom_css = """
+    /* Hide the per-component loading overlays inside the right-hand result area */
+    #result-panel-img .pending,
+    #result-panel-text .pending,
+    #result-panel-img .generating,
+    #result-panel-text .generating,
+    #result-panel-img > div > .wrap,
+    #result-panel-text > div > .wrap {
+        background: transparent !important;
+        border: none !important;
+    }
+    #result-panel-img .pending .eta-bar,
+    #result-panel-text .pending .eta-bar,
+    #result-panel-img .generating .eta-bar,
+    #result-panel-text .generating .eta-bar {
+        display: none !important;
+    }
+    #result-panel-img .pending .progress-bar,
+    #result-panel-text .pending .progress-bar,
+    #result-panel-img .generating .progress-bar,
+    #result-panel-text .generating .progress-bar {
+        display: none !important;
+    }
+    /* Hide the loading spinner inside each child component */
+    #result-panel-img .pending .wrap .loader,
+    #result-panel-text .pending .wrap .loader,
+    #result-panel-img .generating .wrap .loader,
+    #result-panel-text .generating .wrap .loader {
+        display: none !important;
+    }
+    /* Overall overlay effect for the right-hand result panel */
+    #result-panel-img.opacity-50,
+    #result-panel-text.opacity-50 {
+        opacity: 0.5;
+        pointer-events: none;
+        transition: opacity 0.3s ease;
+    }
+    """
+    with gr.Blocks(
+        title="XGuard 风险检测",
+        theme=gr.themes.Soft(
+            primary_hue="blue",
+            secondary_hue="gray",
+        ),
+        css=custom_css,
+    ) as demo:
+        # Page header
+        gr.Markdown(
+            """
+            # XGuard 图文风险检测系统
+            **双模型流水线**: Qwen3-VL (图片理解) + YuFeng-XGuard-Reason-0.6B (风险分析)
+            上传图片或输入文本，系统将自动进行内容安全检测与归因分析。
+            """
+        )
+        with gr.Tabs():
+            # ==================================================
+            # Tab 1: image risk detection
+            # ==================================================
+            with gr.TabItem("图片风险检测"):
+                gr.Markdown(
+                    "### 图文混合安全检测\n"
+                    "上传图片，系统将**提取图中文字 + 分析视觉内容**，进行综合安全检测。"
+                    "支持表情包、聊天截图、电商图文、广告等多种场景。"
+                )
+                with gr.Row(equal_height=False):
+                    # Left column: inputs
+                    with gr.Column(scale=2):
+                        image_input = gr.Image(
+                            type="filepath",
+                            label="上传图片",
+                            height=350,
+                        )
+                        vl_mode_radio = gr.Radio(
+                            choices=[VL_MODE_API, VL_MODE_LOCAL],
+                            value=VL_MODE_API if config.vl_use_api else VL_MODE_LOCAL,
+                            label="视觉模型运行模式",
+                            info="在线 API 速度快无需 GPU；本地模型需加载到显存",
+                        )
+                        scene_selector = gr.Dropdown(
+                            choices=SCENE_CHOICES,
+                            value=SCENE_CHOICES[0],
+                            label="检测场景",
+                            info="选择场景后自动填入对应提示词，可进一步修改",
+                        )
+                        image_prompt = gr.Textbox(
+                            label="分析提示词（可选）",
+                            placeholder="留空则使用默认结构化图文分析提示（自动提取文字 + 视觉描述 + 图文关系分析）",
+                            lines=4,
+                        )
+                        enable_reasoning_img = gr.Checkbox(
+                            label="启用归因分析（生成详细的风险分析说明）",
+                            value=False,
+                        )
+                        image_btn = gr.Button(
+                            "开始检测",
+                            variant="primary",
+                            size="lg",
+                        )
+                        gr.Markdown("#### 示例图片（点击加载）")
+                        example_gallery = gr.Gallery(
+                            value=IMAGE_EXAMPLE_PATHS,
+                            columns=3,
+                            rows=1,
+                            height=120,
+                            allow_preview=False,
+                            show_label=False,
+                            interactive=False,
+                        )
+                    # Right column: results
+                    with gr.Column(scale=3, elem_id="result-panel-img"):
+                        image_desc_output = gr.Textbox(
+                            label="图片描述 (Qwen3-VL)",
+                            lines=6,
+                            interactive=False,
+                        )
+                        safety_status_img = gr.HTML(
+                            label="风险等级",
+                        )
+                        risk_types_img = gr.Textbox(
+                            label="风险类型",
+                            interactive=False,
+                        )
+                        risk_reason_img = gr.Textbox(
+                            label="风险原因",
+                            interactive=False,
+                        )
+                        detail_scores_img = gr.Textbox(
+                            label="详细风险分数",
+                            lines=5,
+                            interactive=False,
+                        )
+                        explanation_img = gr.Textbox(
+                            label="归因分析 (XGuard)",
+                            lines=5,
+                            interactive=False,
+                        )
+                image_btn.click(
+                    fn=analyze_image,
+                    inputs=[image_input, image_prompt, enable_reasoning_img, vl_mode_radio],
+                    outputs=[
+                        image_desc_output,
+                        safety_status_img,
+                        risk_types_img,
+                        risk_reason_img,
+                        detail_scores_img,
+                        explanation_img,
+                    ],
+                )
+                # Example image click: load the image and switch to its detection scene and prompt
+                def _load_example_image(evt: gr.SelectData):
+                    img_path, scene = IMAGE_EXAMPLES[evt.index]
+                    prompt = SCENE_PROMPTS.get(scene, "")
+                    return PILImage.open(img_path), scene, prompt
+                example_gallery.select(
+                    fn=_load_example_image,
+                    inputs=None,
+                    outputs=[image_input, scene_selector, image_prompt],
+                )
+                # Fill in the matching prompt whenever the scene changes
+                scene_selector.change(
+                    fn=lambda s: SCENE_PROMPTS.get(s, ""),
+                    inputs=[scene_selector],
+                    outputs=[image_prompt],
+                )
+            # ==================================================
+            # Tab 2: text risk detection
+            # ==================================================
+            with gr.TabItem("文本风险检测"):
+                gr.Markdown("### 输入文本，系统将直接进行风险检测")
+                with gr.Row(equal_height=False):
+                    # Left column: inputs
+                    with gr.Column(scale=2):
+                        text_input = gr.Textbox(
+                            label="输入待检测文本",
+                            placeholder="请输入需要进行风险检测的文本内容...",
+                            lines=8,
+                        )
+                        enable_reasoning_text = gr.Checkbox(
+                            label="启用归因分析（生成详细的风险分析说明）",
+                            value=False,
+                        )
+                        text_btn = gr.Button(
+                            "开始检测",
+                            variant="primary",
+                            size="lg",
+                        )
+                        gr.Markdown("#### 示例文本")
+                        gr.Examples(
+                            examples=TEXT_EXAMPLES,
+                            inputs=[text_input, enable_reasoning_text],
+                            label="点击加载示例",
+                        )
+                    # Right: results area
+                    with gr.Column(scale=3, elem_id="result-panel-text"):
+                        safety_status_text = gr.HTML(
+                            label="风险等级",
+                        )
+                        risk_types_text = gr.Textbox(
+                            label="风险类型",
+                            interactive=False,
+                        )
+                        risk_reason_text = gr.Textbox(
+                            label="风险原因",
+                            interactive=False,
+                        )
+                        detail_scores_text = gr.Textbox(
+                            label="详细风险分数",
+                            lines=5,
+                            interactive=False,
+                        )
+                        explanation_text = gr.Textbox(
+                            label="归因分析 (XGuard)",
+                            lines=5,
+                            interactive=False,
+                        )
+                text_btn.click(
+                    fn=analyze_text,
+                    inputs=[text_input, enable_reasoning_text],
+                    outputs=[
+                        safety_status_text,
+                        risk_types_text,
+                        risk_reason_text,
+                        detail_scores_text,
+                        explanation_text,
+                    ],
+                )
+        # Footer information
+        gr.Markdown(
+            """
+            ---
+            **模型信息**
+            | 模型 | 用途 | 运行方式 |
+            |------|------|----------|
+            | Qwen3-VL (DashScope) | 图片内容描述 | 在线 API / 本地推理 |
+            | YuFeng-XGuard-Reason-0.6B | 风险检测与归因分析 | 本地推理 |
+            **说明**: 图片检测支持「在线 API」和「本地模型」两种模式，可在图片检测页面切换。
+            文本检测直接由 XGuard 本地分析。
+            """
+        )
+    return demo
+# ============================================================
+# Main entry point
+# ============================================================
+if __name__ == "__main__":
+    load_models()
+    demo = build_ui()
+    demo.launch(
+        server_name=config.host,
+        server_port=config.gradio_port,
+        share=False,
+        show_error=True,
+        allowed_paths=[_SAMPLE_DIR],
+    )

config.py ADDED Viewed

	@@ -0,0 +1,42 @@

+import os
+from dataclasses import dataclass
+@dataclass
+class Config:
+    api_key: str
+    model_path: str
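+    # Vision-language model: local
+    vl_model_path: str
+    # Vision-language model: online API (DashScope)
+    vl_api_base: str
+    vl_api_key: str
+    vl_api_model: str
+    vl_use_api: bool
+    # Cap on online API calls (guards against runaway usage; falls back to the local model once exceeded)
+    vl_api_max_calls: int
+    # Always load the local Qwen3-VL-2B-Instruct model, whether or not the online API is used
+    vl_always_load_local: bool
+    # Service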
+    host: str
+    port: int
+    gradio_port: int
+    device: str
+def load_config() -> Config:
+    return Config(
+        api_key=os.getenv("XGUARD_API_KEY", "your-api-key"),
+        model_path=os.getenv("XGUARD_MODEL_PATH", "Alibaba-AAIG/YuFeng-XGuard-Reason-0.6B"),
+        vl_model_path=os.getenv("XGUARD_VL_MODEL_PATH", ""),
+        vl_api_base=os.getenv("XGUARD_VL_API_BASE", "https://dashscope.aliyuncs.com/compatible-mode/v1"),
+        vl_api_key=os.getenv("XGUARD_VL_API_KEY", ""),
+        vl_api_model=os.getenv("XGUARD_VL_API_MODEL", "qwen-vl-max-latest"),
+        vl_use_api=os.getenv("XGUARD_VL_USE_API", "").lower() in ("true", "1", "yes"),
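+        # int("") raises ValueError, so the fallback must be numeric; 200 matches the api_max_calls default in model.py
+        vl_api_max_calls=int(os.getenv("XGUARD_VL_API_MAX_CALLS", "200")),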
+        vl_always_load_local=os.getenv("XGUARD_VL_ALWAYS_LOAD_LOCAL", "true").lower() in ("true", "1", "yes"),
+        host=os.getenv("XGUARD_HOST", "0.0.0.0"),
+        port=int(os.getenv("XGUARD_PORT", "8080")),
+        gradio_port=int(os.getenv("XGUARD_GRADIO_PORT", "7860")),
+        device=os.getenv("XGUARD_DEVICE", "auto"),
+    )
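
Review note on `load_config`: `int()` raises `ValueError` on an empty or non-numeric string, so integer settings read from the environment are fragile whenever a variable is set to an empty value. A minimal sketch of a more forgiving parser follows. The helper name `env_int` is hypothetical, and the fallback of 200 is an assumption borrowed from the `api_max_calls` default in `model.py`; neither is part of this commit.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to `default`.

    Hypothetical helper: unset, empty, and non-numeric values all yield `default`.
    """
    raw = os.getenv(name, "").strip()
    if not raw:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


# The 200 fallback mirrors the api_max_calls default in model.py (an assumption here).
max_calls = env_int("XGUARD_VL_API_MAX_CALLS", 200)
```

Silently falling back hides misconfiguration, so a stricter variant could log a warning in the `except` branch instead.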

main.py ADDED Viewed

	@@ -0,0 +1,169 @@

+import asyncio
+import json
+import logging
+from concurrent.futures import ThreadPoolExecutor
+from fastapi import FastAPI, HTTPException, Header
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel, Field
+from typing import List, Dict, Any, Optional
+import uvicorn
+logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
+logger = logging.getLogger(__name__)
+from config import load_config
+from model import XGuardModel
+config = load_config()
+app = FastAPI(title="XGuard MaaS", version="1.0.0")
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+xguard_model: Optional[XGuardModel] = None
+executor: Optional[ThreadPoolExecutor] = None
+MAX_CONCURRENT_REQUESTS = 10
+request_semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
+class Message(BaseModel):
+    role: str
+    content: str
+class Tool(BaseModel):
+    name: str
+    description: str
+    parameters: Any
+class GuardCheckRequest(BaseModel):
+    conversationId: str
+    messages: List[Message]
+    tools: List[Tool]
+    enableReasoning: bool = Field(default=False, description="是否启用归因分析")
+class GuardCheckResponse(BaseModel):
+    err_code: int
+    data: Dict[str, Any]
+    msg: str
+def build_check_content(messages: List[Dict], tools: List[Dict]) -> str:
+    """Concatenate messages and tool-call information into the text to check."""
+    # Collect the content of user messages
+    user_contents = []
+    for msg in messages:
+        if msg.get("role") == "user":
+            user_contents.append(msg.get("content", ""))
+    content = "\n".join(user_contents) if user_contents else ""
+    # If tools are present, append the tool-call details
+    if tools:
+        tool_infos = []
+        for tool in tools:
+            tool_name = tool.get("name", "")
+            tool_desc = tool.get("description", "")
+            tool_params = tool.get("parameters", {})
+            tool_info = f"\n[Tool Call] {tool_name}"
+            if tool_desc:
+                tool_info += f"\nDescription: {tool_desc}"
+            if tool_params:
+                tool_info += f"\nParameters: {json.dumps(tool_params, ensure_ascii=False)}"
+            tool_infos.append(tool_info)
+        content += "\n" + "\n".join(tool_infos)
+    return content.strip()
+@app.on_event("startup")
+async def startup_event():
+    global xguard_model, executor
+    try:
+        xguard_model = XGuardModel(config.model_path, config.device)
+        executor = ThreadPoolExecutor(max_workers=4)
+        logger.info("XGuard model loaded on %s", config.device)
+    except Exception:
+        logger.exception("Failed to load model")
+        raise
+@app.on_event("shutdown")
+async def shutdown_event():
+    global executor
+    if executor:
+        executor.shutdown(wait=True)
+@app.get("/health")
+async def health_check():
+    return {"status": "ok", "model_loaded": xguard_model is not None}
+@app.post("/v1/guard/check", response_model=GuardCheckResponse)
+async def guard_check(
+    request: GuardCheckRequest,
+    x_api_key: str = Header(..., alias="x-api-key")
+):
+    if x_api_key != config.api_key:
+        raise HTTPException(status_code=401, detail="Invalid API key")
+    if xguard_model is None:
+        raise HTTPException(status_code=503, detail="Model not loaded")
+    async with request_semaphore:
+        try:
+            messages = [{"role": m.role, "content": m.content} for m in request.messages]
+            tools = [{"name": t.name, "description": t.description, "parameters": t.parameters} for t in request.tools]
+            # Concatenate the messages and tool information into the text to check
+            check_content = build_check_content(messages, tools)
+            logger.info("会话 [%s] 检测内容:\n%s", request.conversationId, check_content)
+            # Build the message list passed to the model
+            check_messages = [{"role": "user", "content": check_content}]
+            loop = asyncio.get_running_loop()
+            result = await loop.run_in_executor(
+                executor,
+                lambda: xguard_model.analyze(
+                    check_messages,
+                    [],  # tools are already folded into the content, so none are passed separately
+                    enable_reasoning=request.enableReasoning
+                )
+            )
+            # Build the response payload
+            response_data = {
+                "is_safe": result["is_safe"],
+                "risk_level": result.get("risk_level", "safe" if result["is_safe"] == 1 else "medium"),
+                "confidence": result.get("confidence", 0.0),
+                "risk_type": result["risk_type"],
+                "reason": result["reason"]
+            }
+            # If attribution analysis is enabled, attach the explanation
+            if request.enableReasoning and "explanation" in result:
+                response_data["explanation"] = result["explanation"]
+            return GuardCheckResponse(
+                err_code=0,
+                data=response_data,
+                msg="success"
+            )
+        except Exception as e:
+            raise HTTPException(status_code=500, detail=f"Inference error: {str(e)}")
+if __name__ == "__main__":
+    uvicorn.run(app, host=config.host, port=config.port)
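
Review note on `build_check_content`: only `user` messages are kept, and each tool is appended as a `[Tool Call]` block, so XGuard receives a single flattened string. The sketch below re-implements that assembly in compact form to show what the resulting text looks like. `flatten_for_check` is a hypothetical name, and the sample message and tool are invented for illustration.

```python
import json
from typing import Any, Dict, List


def flatten_for_check(messages: List[Dict[str, Any]], tools: List[Dict[str, Any]]) -> str:
    """Join user messages, then append one '[Tool Call]' block per tool."""
    parts = [m.get("content", "") for m in messages if m.get("role") == "user"]
    content = "\n".join(parts)
    blocks = []
    for tool in tools:
        block = f"\n[Tool Call] {tool.get('name', '')}"
        if tool.get("description"):
            block += f"\nDescription: {tool['description']}"
        if tool.get("parameters"):
            block += f"\nParameters: {json.dumps(tool['parameters'], ensure_ascii=False)}"
        blocks.append(block)
    if blocks:
        content += "\n" + "\n".join(blocks)
    return content.strip()


# Invented sample input: the system turn is dropped, the tool is appended.
sample = flatten_for_check(
    [
        {"role": "system", "content": "ignored"},
        {"role": "user", "content": "Delete all logs"},
    ],
    [
        {
            "name": "run_shell",
            "description": "Run a shell command",
            "parameters": {"cmd": "rm -rf /var/log"},
        }
    ],
)
print(sample)
```

Assistant and system turns never reach the model under this scheme, which is worth keeping in mind when auditing multi-turn conversations.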

model.py ADDED Viewed

	@@ -0,0 +1,615 @@

+import os
+import torch
+import threading
+import re
+from typing import List, Dict, Any, Optional
+from transformers import AutoModelForCausalLM, AutoTokenizer
+def resolve_model_path(model_id: str) -> str:
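+    """
+    Resolve a model path: return it unchanged if it is a local directory, otherwise download it from ModelScope.
+    Args:
+        model_id: model identifier (ModelScope model_id) or local directory path
+    Returns:
+        Local directory path of the model
+    """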
+    if os.path.isdir(model_id):
+        print(f"使用本地模型: {model_id}")
+        return model_id
+    print(f"从 ModelScope 下载模型: {model_id} ...")
+    from modelscope import snapshot_download
+    local_path = snapshot_download(model_id)
+    print(f"模型已下载到: {local_path}")
+    return local_path
+class VisionLanguageModel:
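+    """
+    Wrapper around the Qwen3-VL vision-language model, used to describe image content.
+    Two run modes are supported:
+      - Online API mode: calls the DashScope OpenAI-compatible endpoint (fast, no GPU required)
+      - Local model mode: loads the model for inference on the local GPU/CPU
+    """
+    # Default image description prompt: pure content extraction, no risk analysis (risk judgment is left to XGuard)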
+    DEFAULT_PROMPT = (
+        "请按以下结构如实描述这张图片，仅提取事实内容，不要做任何风险分析或价值判断：\n\n"
+        "【图片文字】逐字提取图片中出现的所有文字（包括标题、正文、水印、"
+        "对话气泡、标语、商标等），保持原文不做任何修改。如果没有文字请注明。\n\n"
+        "【视觉内容】描述场景、人物、动作、表情、物体、符号等所有可见元素。"
+        "如果包含敏感、暴力、色情等内容，请如实描述，不要回避。\n\n"
+        "【内容类型】判断图片类型（如：表情包、聊天截图、广告、新闻、普通照片等）。"
+    )
+    def __init__(
+        self,
+        model_path: str = None,
+        device: str = "auto",
+        use_api: bool = False,
+        api_base: str = None,
+        api_key: str = None,
+        api_model: str = None,
+        load_local: bool = True,
+        api_max_calls: int = 200,
+    ):
+        self.model_path = model_path
+        self.device = device
+        self.model = None
+        self.processor = None
+        self._lock = threading.Lock()
+        # Online API call limit
+        self._api_call_count = 0
+        self._api_max_calls = api_max_calls
+        self._api_count_lock = threading.Lock()
+        # Online API client (lightweight; created whenever api_base and api_key are provided)
+        self.api_client = None
+        self.api_model = api_model
+        if api_base and api_key:
+            self._init_api_client(api_base, api_key, api_model)
+        # Local model (loaded only when needed)
+        self.local_loaded = False
+        if load_local and model_path:
+            self._load_local_model()
+    # ==============================================================
+    # Online API mode
+    # ==============================================================
+    def _init_api_client(self, api_base: str, api_key: str, api_model: str):
+        """Initialize the DashScope OpenAI-compatible API client."""
+        from openai import OpenAI
+        self.api_client = OpenAI(
+            api_key=api_key,
+            base_url=api_base,
+        )
+        self.api_model = api_model
+        print(f"视觉语言模型 API 已就绪: {api_base} / {api_model}")
+        print(f"API 调用次数上限: {self._api_max_calls}")
+    # ==============================================================
+    # API call limit
+    # ==============================================================
+    @property
+    def api_call_count(self) -> int:
+        """Number of API calls used so far."""
+        with self._api_count_lock:
+            return self._api_call_count
+    @property
+    def api_remaining(self) -> int:
+        """Number of API calls still available."""
+        with self._api_count_lock:
+            return max(0, self._api_max_calls - self._api_call_count)
+    @property
+    def api_limit_reached(self) -> bool:
+        """Whether the API call limit has been reached."""
+        with self._api_count_lock:
+            return self._api_call_count >= self._api_max_calls
+    def _increment_api_count(self):
+        """Increment the API call counter (thread-safe)."""
+        with self._api_count_lock:
+            self._api_call_count += 1
+            remaining = self._api_max_calls - self._api_call_count
+            # Test for exhaustion first; a `remaining <= 10` check placed ahead of it would also match 0
+            if remaining == 0:
+                print(f"[警告] 在线 API 调用次数已达上限 ({self._api_max_calls})，后续将自动降级为本地模型")
+            elif 0 < remaining <= 10:
+                print(f"[警告] 在线 API 剩余调用次数: {remaining}/{self._api_max_calls}")
+    @staticmethod
+    def _image_to_data_url(image_path: str) -> str:
+        """Convert a local image file to a base64 data URL."""
+        import base64
+        with open(image_path, "rb") as f:
+            data = base64.b64encode(f.read()).decode()
+        ext = os.path.splitext(image_path)[1].lower()
+        mime_map = {
+            ".jpg": "image/jpeg", ".jpeg": "image/jpeg",
+            ".png": "image/png", ".gif": "image/gif",
+            ".webp": "image/webp", ".bmp": "image/bmp",
+        }
+        mime = mime_map.get(ext, "image/png")
+        return f"data:{mime};base64,{data}"
+    def _describe_image_api(self, image_path: str, prompt: str) -> str:
+        """Describe an image through the online API."""
+        if self.api_client is None:
+            raise RuntimeError("在线 API 未配置，请检查 vl_api_base / vl_api_key 设置")
+        data_url = self._image_to_data_url(image_path)
+        response = self.api_client.chat.completions.create(
+            model=self.api_model,
+            messages=[
+                {
+                    "role": "user",
+                    "content": [
+                        {"type": "image_url", "image_url": {"url": data_url}},
+                        {"type": "text", "text": prompt},
+                    ],
+                }
+            ],
+            max_tokens=512,
+        )
+        return response.choices[0].message.content
+    # ==============================================================
+    # Local model mode
+    # ==============================================================
+    def _load_local_model(self):
+        """Load the local Qwen3-VL model."""
+        from transformers import Qwen3VLForConditionalGeneration
+        local_path = resolve_model_path(self.model_path)
+        print(f"正在加载本地视觉语言模型: {local_path}...")
+        self.processor = self._load_processor(local_path)
+        self.model = Qwen3VLForConditionalGeneration.from_pretrained(
+            local_path,
+            torch_dtype="auto",
+            device_map=self.device,
+            trust_remote_code=True,
+        ).eval()
+        self.local_loaded = True
+        print("本地视觉语言模型加载完成。")
+    def _load_processor(self, local_path: str):
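+        """
+        Load the processor with a multi-level fallback.
+        In some transformers versions VIDEO_PROCESSOR_MAPPING_NAMES is not initialized correctly,
+        which makes AutoProcessor.from_pretrained raise TypeError; this method works around that.
+        """
+        # Option 1: standard AutoProcessor loading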
+        try:
+            from transformers import AutoProcessor
+            return AutoProcessor.from_pretrained(
+                local_path,
+                trust_remote_code=True,
+            )
+        except TypeError as e:
+            if "NoneType" in str(e):
+                print(f"AutoProcessor 遇到视频处理器兼容性问题: {e}")
+            else:
+                raise
+        # Option 2: patch VIDEO_PROCESSOR_MAPPING_NAMES and retry
+        try:
+            from transformers.models.auto import video_processing_auto
+            if video_processing_auto.VIDEO_PROCESSOR_MAPPING_NAMES is None:
+                video_processing_auto.VIDEO_PROCESSOR_MAPPING_NAMES = {}
+                print("已修复 VIDEO_PROCESSOR_MAPPING_NAMES 初始化问题，重新加载...")
+            from transformers import AutoProcessor
+            return AutoProcessor.from_pretrained(
+                local_path,
+                trust_remote_code=True,
+            )
+        except Exception as e:
+            print(f"修复后重试仍失败: {e}")
+        # Option 3: assemble the processor manually (image processing only, no video)
+        print("回退方案: 手动组装处理器...")
+        from transformers import AutoTokenizer, AutoImageProcessor
+        tokenizer = AutoTokenizer.from_pretrained(
+            local_path, trust_remote_code=True
+        )
+        image_processor = AutoImageProcessor.from_pretrained(
+            local_path, trust_remote_code=True
+        )
+        try:
+            from transformers import Qwen3VLProcessor
+            processor = Qwen3VLProcessor(
+                image_processor=image_processor,
+                tokenizer=tokenizer,
+            )
+            print("手动组装处理器成功。")
+            return processor
+        except Exception as e:
+            raise RuntimeError(
+                f"处理器加载失败: {e}\n"
+                "请尝试: pip install -U transformers torchvision qwen-vl-utils"
+            ) from e
+    def _describe_image_local(self, image_path: str, prompt: str) -> str:
+        """Describe an image with the local model."""
+        if not self.local_loaded:
+            raise RuntimeError(
+                "本地视觉模型未加载。请设置 XGUARD_VL_USE_API=false 重启，或切换为在线 API 模式。"
+            )
+        with self._lock:
+            messages = [
+                {
+                    "role": "user",
+                    "content": [
+                        {"type": "image", "image": image_path},
+                        {"type": "text", "text": prompt},
+                    ],
+                }
+            ]
+            inputs = self.processor.apply_chat_template(
+                messages,
+                tokenize=True,
+                add_generation_prompt=True,
+                return_dict=True,
+                return_tensors="pt",
+            )
+            inputs = inputs.to(self.model.device)
+            with torch.no_grad():
+                generated_ids = self.model.generate(
+                    **inputs,
+                    max_new_tokens=512,
+                    do_sample=False,
+                )
+            generated_ids_trimmed = [
+                out_ids[len(in_ids):]
+                for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+            ]
+            output_text = self.processor.batch_decode(
+                generated_ids_trimmed,
+                skip_special_tokens=True,
+                clean_up_tokenization_spaces=False,
+            )
+            return output_text[0]
+    # ==============================================================
+    # Unified public interface
+    # ==============================================================
+    def _ensure_local_model(self):
+        """Make sure the local model is loaded (lazy load once the API quota is exhausted)."""
+        if self.local_loaded:
+            return
+        if not self.model_path:
+            raise RuntimeError(
+                "在线 API 调用次数已达上限，且未配置本地模型路径 (XGUARD_VL_MODEL_PATH)，"
+                "无法降级到本地模型。请配置本地模型或重启服务以重置 API 计数。"
+            )
+        print("[自动降级] API 次数耗尽，正在加载本地视觉语言模型...")
+        self._load_local_model()
+        print("[自动降级] 本地视觉语言模型加载完成。")
+    def describe_image(self, image_path: str, prompt: str = None, use_api: bool = None) -> str:
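+        """
+        Generate an image description (unified entry point).
+        Args:
+            image_path: path to the image file
+            prompt: custom description prompt; DEFAULT_PROMPT is used when empty
+            use_api: whether to call the online API; when None, the API is used if api_client is configured
+        Returns:
+            The text description of the image.
+        Note:
+            When use_api is true but the API call limit has been reached, the call
+            falls back to the local model. The return value is a plain string, so
+            callers that need to detect the fallback should check self.api_limit_reached.
+        """
+        if not prompt:
+            prompt = self.DEFAULT_PROMPT
+        # Choose the mode
+        if use_api is None:
+            use_api = self.api_client is not None
+        # Enforce the API call limit: fall back to the local model once it is reached
+        if use_api and self.api_limit_reached:
+            print(
+                f"[API 限流] 在线 API 调用已达上限 "
+                f"({self.api_call_count}/{self._api_max_calls})，自动降级到本地模型"
+            )
+            self._ensure_local_model()
+            use_api = False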
+        if use_api:
+            self._increment_api_count()
+            return self._describe_image_api(image_path, prompt)
+        else:
+            return self._describe_image_local(image_path, prompt)
+class XGuardModel:
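+    """
+    Wrapper around the YuFeng-XGuard safety detection model.
+    The inference logic follows the official implementation:
+      - apply_chat_template accepts the policy / reason_first parameters
+      - decoded text is matched directly against id2risk (rather than going through token ids)
+      - in reason_first mode, the score position of the risk token is located correctly
+    """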
+    def __init__(self, model_path: str, device: str = "auto"):
+        self.model_path = model_path
+        self.device = device
+        self.model = None
+        self.tokenizer = None
+        self.id2risk = None
+        self._lock = threading.Lock()
+        self._load_model()
+    def _load_model(self):
+        """Load the model and tokenizer, and extract the id2risk mapping."""
+        local_path = resolve_model_path(self.model_path)
+        print(f"正在加载安全检测模型: {local_path}...")
+        self.tokenizer = AutoTokenizer.from_pretrained(
+            local_path,
+            trust_remote_code=True
+        )
+        self.model = AutoModelForCausalLM.from_pretrained(
+            local_path,
+            torch_dtype="auto",
+            device_map=self.device,
+            trust_remote_code=True
+        ).eval()
+        # Read the id2risk mapping from the tokenizer config
+        # id2risk format: {'sec': 'Safe-Safe', 'pc': 'Crimes and Illegal Activities-Pornographic Contraband', ...}
+        # Keys are short text tokens (e.g. 'sec', 'pc'); values are full risk category names
+        self.id2risk = self.tokenizer.init_kwargs.get('id2risk', {})
+        print(f"id2risk 映射条目数: {len(self.id2risk)}")
+        if self.id2risk:
+            print(f"示例映射: {list(self.id2risk.items())[:5]}")
+    def infer(self, messages: List[Dict[str, str]], policy=None,
+              max_new_tokens: int = 1, reason_first: bool = False) -> Dict[str, Any]:
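+        """
+        Official inference interface, aligned with the official XGuard inference logic.
+        Args:
+            messages: list of conversation messages
+            policy: optional dynamic policy for customizing safety rules at runtime
+            max_new_tokens: maximum number of tokens to generate
+            reason_first: whether to generate the attribution analysis before the risk token
+        Returns:
+            {
+                'response':    str,              # full decoded text
+                'token_score': {text: prob, ...}, # top-k token scores at the risk-token position
+                'risk_score':  {risk_name: prob, ...} # scores of risk categories matched in id2risk
+            }
+        """
+        with self._lock:
+            # Render the input with the chat template (passing policy and reason_first)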
+            rendered_query = self.tokenizer.apply_chat_template(
+                messages,
+                policy=policy,
+                reason_first=reason_first,
+                tokenize=False
+            )
+            model_inputs = self.tokenizer(
+                [rendered_query], return_tensors="pt"
+            ).to(self.model.device)
+            with torch.no_grad():
+                outputs = self.model.generate(
+                    **model_inputs,
+                    max_new_tokens=max_new_tokens,
+                    do_sample=False,
+                    output_scores=True,
+                    return_dict_in_generate=True
+                )
+            batch_idx = 0
+            input_length = model_inputs['input_ids'].shape[1]
+            # Decode the response text
+            output_ids = outputs["sequences"].tolist()[batch_idx][input_length:]
+            response = self.tokenizer.decode(output_ids, skip_special_tokens=True)
+            # ---- Parse the top-k scores at every generated position (official logic) ----
+            generated_tokens = outputs.sequences[:, input_length:]
+            scores = torch.stack(outputs.scores, dim=1)
+            scores = scores.softmax(dim=-1)
+            scores_topk_value, scores_topk_index = scores.topk(k=10, dim=-1)
+            generated_tokens_with_probs = []
+            for generated_token, score_topk_value, score_topk_index in zip(
+                generated_tokens, scores_topk_value, scores_topk_index
+            ):
+                generated_tokens_with_prob = []
+                for token, topk_value, topk_index in zip(
+                    generated_token, score_topk_value, score_topk_index
+                ):
+                    token = int(token.cpu())
+                    if token == self.tokenizer.pad_token_id:
+                        continue
+                    res_topk_score = {}
+                    for ii, (value, index) in enumerate(zip(topk_value, topk_index)):
+                        if ii == 0 or value.cpu().numpy() > 1e-4:
+                            text = self.tokenizer.decode(index.cpu().numpy())
+                            res_topk_score[text] = {
+                                "id": str(int(index.cpu().numpy())),
+                                "prob": round(float(value.cpu().numpy()), 4),
+                            }
+                    generated_tokens_with_prob.append(res_topk_score)
+                generated_tokens_with_probs.append(generated_tokens_with_prob)
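+            # Determine the token position that carries the risk scores
+            # reason_first=False: the risk token is at the first position (idx=0)
+            # reason_first=True:  the risk token is at the second-to-last position (after the reasoning, before EOS)
+            score_idx = (
+                max(len(generated_tokens_with_probs[batch_idx]) - 2, 0)
+                if reason_first else 0
+            )
+            # Extract token scores and risk scores (official approach: match decoded text directly against id2risk)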
+            token_score = {
+                k: v['prob']
+                for k, v in generated_tokens_with_probs[batch_idx][score_idx].items()
+            }
+            risk_score = {
+                self.id2risk[k]: v['prob']
+                for k, v in generated_tokens_with_probs[batch_idx][score_idx].items()
+                if k in self.id2risk
+            }
+            return {
+                'response': response,
+                'token_score': token_score,
+                'risk_score': risk_score,
+            }
+    def parse_explanation(self, response: str) -> Optional[str]:
+        """
+        Parse the attribution analysis out of the response.
+        With reason_first=False, XGuard emits:
+            [risk category token][attribution analysis text]
+        The risk token is one of the short string keys in id2risk (e.g. 'sec', 'pc'),
+        and the text that follows is the natural-language attribution analysis.
+        """
+        if not response or not response.strip():
+            return None
+        # Option 1: support the <explanation>...</explanation> tag format
+        match = re.search(r'<explanation>(.*?)</explanation>', response, re.DOTALL)
+        if match:
+            return match.group(1).strip()
+        text = response.strip()
+        # Option 2: strip the leading risk category token and return the remaining text
+        # id2risk keys are short strings (e.g. 'sec', 'pc') and the model output starts with one
+        if self.id2risk:
+            for key in sorted(self.id2risk.keys(), key=len, reverse=True):
+                if text.startswith(key):
+                    remainder = text[len(key):].strip()
+                    if remainder:
+                        return remainder
+                    break  # token matched but nothing follows: no attribution was generated
+        # Option 3: the response is clearly longer than a single risk token (usually 2-4 characters), so return it as the attribution
+        if len(text) > 8:
+            return text
+        return None
+    def analyze(self, messages: List[Dict[str, str]], tools: List[Dict[str, Any]],
+                enable_reasoning: bool = False, policy=None) -> Dict[str, Any]:
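+        """
+        High-level analysis interface that wraps the inference result in a structured format.
+        Args:
+            messages: list of conversation messages
+            tools: tool information (already folded into messages; currently unused)
+            enable_reasoning: whether to enable attribution analysis (generates more tokens)
+            policy: optional dynamic policy
+        """
+        # With attribution analysis enabled, generate more tokens to obtain the full explanation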
+        max_new_tokens = 512 if enable_reasoning else 1
+        infer_result = self.infer(
+            messages,
+            policy=policy,
+            max_new_tokens=max_new_tokens,
+            reason_first=False
+        )
+        risk_scores = infer_result.get("risk_score", {})
+        response = infer_result.get("response", "")
+        # ================================================================
+        # Risk decision: argmax plus confidence grading, following the XGuard paper
+        #
+        # Rationale (arxiv 2601.15588):
+        #   XGuard's training objective max_θ log P(y_cls | X; θ) means that
+        #   the softmax argmax of the first token is the predicted risk category,
+        #   and the probability is treated as the calibrated confidence.
+        #
+        # Decision flow:
+        #   Layer 1 (argmax): the highest-probability category is the model's answer
+        #   Layer 2 (confidence gate): a safe verdict requires >= 0.5
+        #   Layer 3 (risk grading): high/medium/low by top_risk_prob
+        # ================================================================
+        SAFE_CATEGORY = "Safe-Safe"
+        safe_prob = risk_scores.get(SAFE_CATEGORY, 0.0)
+        # Collect the non-safe risk items, sorted by score in descending order
+        risk_items = {k: v for k, v in risk_scores.items() if k != SAFE_CATEGORY}
+        sorted_risks = sorted(risk_items.items(), key=lambda x: x[1], reverse=True)
+        top_risk_name = sorted_risks[0][0] if sorted_risks else ""
+        top_risk_prob = sorted_risks[0][1] if sorted_risks else 0.0
+        # Layers 1 and 2: argmax decision plus confidence gate
+        if safe_prob >= top_risk_prob and safe_prob >= 0.5:
+            # argmax = Safe-Safe with confidence of at least 0.5: safe
+            is_safe = 1
+            risk_level = "safe"
+        elif safe_prob >= top_risk_prob:
+            # argmax = Safe-Safe, but confidence is below 0.5
+            # The model leans safe without being sure, so flag cautiously as low risk
+            is_safe = 0
+            risk_level = "low"
+        else:
+            # argmax = a risk category (top_risk_prob > safe_prob)
+            # Layer 3: grade by risk confidence
+            is_safe = 0
+            if top_risk_prob >= 0.5:
+                risk_level = "high"
+            elif top_risk_prob >= 0.3:
+                risk_level = "medium"
+            else:
+                risk_level = "low"
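+        # Confidence: how certain the model is about the verdict
+        confidence = safe_prob if is_safe == 1 else top_risk_prob
+        # Build the risk type list and the reason string
+        # The top risk item is always reported as a hint, whether or not the content is safe
+        if is_safe == 0:
+            top_risks = sorted_risks[:3]
+        else:
+            # When safe, report only the top risk item as a hint
+            top_risks = sorted_risks[:1] if sorted_risks else []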
+        risk_types = [r[0] for r in top_risks]
+        reason = "; ".join([f"{r}: {s}" for r, s in top_risks])
+        result = {
+            "is_safe": is_safe,
+            "risk_level": risk_level,
+            "confidence": round(confidence, 4),
+            "risk_type": risk_types,
+            "reason": reason,
+            "detail_scores": risk_scores,
+            "response": response
+        }
+        # If attribution analysis is enabled, parse and attach the explanation
+        if enable_reasoning:
+            explanation = self.parse_explanation(response)
+            if explanation:
+                result["explanation"] = explanation
+        return result
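
Review note on `XGuardModel.analyze`: the three-layer decision (argmax, a 0.5 gate for the safe verdict, and 0.5 / 0.3 thresholds for the risk levels) does not depend on the model itself, so it could be isolated as a pure function for unit testing. A sketch under that assumption follows. `grade` is a hypothetical name, and the score dictionaries in the usage lines are invented examples, not model outputs.

```python
from typing import Dict, Tuple

SAFE_CATEGORY = "Safe-Safe"


def grade(risk_scores: Dict[str, float]) -> Tuple[int, str, float]:
    """Return (is_safe, risk_level, confidence) using the thresholds from analyze()."""
    safe_prob = risk_scores.get(SAFE_CATEGORY, 0.0)
    top_risk_prob = max(
        (p for name, p in risk_scores.items() if name != SAFE_CATEGORY),
        default=0.0,
    )
    if safe_prob >= top_risk_prob:
        if safe_prob >= 0.5:
            return 1, "safe", safe_prob
        # Leans safe but below the 0.5 gate: flagged as low risk
        return 0, "low", top_risk_prob
    if top_risk_prob >= 0.5:
        level = "high"
    elif top_risk_prob >= 0.3:
        level = "medium"
    else:
        level = "low"
    return 0, level, top_risk_prob


# Invented score dictionaries, one per branch
confident_safe = grade({"Safe-Safe": 0.9, "Category-A": 0.05})
unsure_safe = grade({"Safe-Safe": 0.4, "Category-A": 0.35})
clear_risk = grade({"Safe-Safe": 0.1, "Category-A": 0.7})
```

An empty `risk_scores` (no top-k token matched `id2risk`) grades as `low` rather than `safe`, which mirrors the behavior of the code in this diff.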

requirements.txt ADDED Viewed

	@@ -0,0 +1,76 @@

+accelerate==1.12.0
+aiofiles==24.1.0
+annotated-doc==0.0.4
+annotated-types==0.7.0
+anyio==4.12.1
+av==16.1.0
+brotli==1.2.0
+certifi==2026.1.4
+charset-normalizer==3.4.4
+click==8.3.1
+colorama==0.4.6
+distro==1.9.0
+fastapi==0.128.5
+ffmpy==1.0.0
+filelock==3.20.3
+fsspec==2026.2.0
+gradio==6.5.1
+gradio_client==2.0.3
+groovy==0.1.2
+h11==0.16.0
+hf-xet==1.2.0
+httpcore==1.0.9
+httpx==0.28.1
+huggingface_hub==1.4.1
+idna==3.11
+Jinja2==3.1.6
+jiter==0.13.0
+markdown-it-py==4.0.0
+MarkupSafe==3.0.3
+mdurl==0.1.2
+modelscope==1.34.0
+mpmath==1.3.0
+networkx==3.6.1
+numpy==2.4.2
+scikit-learn>=1.6.0
+scipy>=1.14.0
+openai==2.17.0
+orjson==3.11.7
+packaging==26.0
+pandas==3.0.0
+pillow==12.1.0
+psutil==7.2.2
+pydantic==2.12.5
+pydantic_core==2.41.5
+pydub==0.25.1
+Pygments==2.19.2
+python-dateutil==2.9.0.post0
+python-multipart==0.0.22
+pytz==2025.2
+PyYAML==6.0.3
+qwen-vl-utils==0.0.14
+regex==2026.1.15
+requests==2.32.5
+rich==14.3.2
+safehttpx==0.1.7
+safetensors==0.7.0
+semantic-version==2.10.0
+setuptools==82.0.0
+shellingham==1.5.4
+six==1.17.0
+sniffio==1.3.1
+starlette==0.52.1
+sympy==1.14.0
+tokenizers==0.22.2
+tomlkit==0.13.3
+torch==2.10.0
+torchvision==0.25.0
+tqdm==4.67.3
+transformers==5.1.0
+typer==0.21.1
+typer-slim==0.21.1
+typing-inspection==0.4.2
+typing_extensions==4.15.0
+tzdata==2025.3
+urllib3==2.6.3
+uvicorn==0.40.0