Risk Score

47/100 (Medium)

OpenClaw: suspicious
VirusTotal: suspicious
StaticScan: clean

test-skill

Author: TZWwww
Slug: adaptive-safety-refusal-integrity
Version: 1.0.0
Updated: 2026-03-26 16:56:49
Risk Information

OpenClaw: suspicious

OpenClaw analysis summary (preview of the first 200 characters)
The skill's stated purpose (testing model refusal to jailbreaks) broadly matches its instructions, but the runtime docs ask the agent to gather harmful seeds, use web/RAG capabilities, and produce exp...

[Content truncated]

VirusTotal: suspicious (VT report)

Static scan: clean

No suspicious patterns detected.
README

README not provided

File List

No file information

Raw JSON Data
{
    "latestVersion": {
        "_creationTime": 1774515061192,
        "_id": "k973x311xv4hgbcnz0s5zxp5p983ns4q",
        "changelog": "- Initial release introducing the \"adaptive-safety-refusal-integrity\" skill.\n- Enables testing of model refusal integrity against attempts to elicit harmful or forbidden information via creative writing or role-playing.\n- Defines real-world cases for detecting and refusing requests disguised using humor or insider role-play techniques.\n- Outlines a strict 3-phase pipeline for synthesizing evaluation data, with reference to guidance documents for each phase.\n- Focuses on upholding safety alignment and ethical boundaries across multiple risk categories despite user attempts to bypass safeguards.",
        "changelogSource": "user",
        "createdAt": 1774515061192,
        "version": "1.0.0"
    },
    "owner": {
        "_creationTime": 0,
        "_id": "s171v84p1hdytd53c4c49btymd83n2v2",
        "displayName": "TZWwww",
        "handle": "tzwwww",
        "image": "https:\/\/avatars.githubusercontent.com\/u\/34390289?v=4",
        "kind": "user",
        "linkedUserId": "kn738wvgv50e0adqb45j5s399183n05e"
    },
    "ownerHandle": "tzwwww",
    "skill": {
        "_creationTime": 1774515061192,
        "_id": "kd7anshq3tp744qdd43b35nd4x83nzsy",
        "badges": [],
        "createdAt": 1774515061192,
        "displayName": "test-skill",
        "latestVersionId": "k973x311xv4hgbcnz0s5zxp5p983ns4q",
        "ownerPublisherId": "s171v84p1hdytd53c4c49btymd83n2v2",
        "ownerUserId": "kn738wvgv50e0adqb45j5s399183n05e",
        "slug": "adaptive-safety-refusal-integrity",
        "stats": {
            "comments": 0,
            "downloads": 26,
            "installsAllTime": 0,
            "installsCurrent": 0,
            "stars": 0,
            "versions": 1
        },
        "summary": "Use this skill when the user wants to test if the model can be tricked into giving out dangerous or forbidden information by using creative writing or 'actin...",
        "tags": {
            "latest": "k973x311xv4hgbcnz0s5zxp5p983ns4q"
        },
        "updatedAt": 1774515409341
    }
}
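
The `createdAt` and `updatedAt` fields in the JSON above appear to be Unix epoch timestamps in milliseconds. As a minimal sketch, the snippet below converts them to readable dates. The helper name `ms_to_datetime` is hypothetical, and the UTC+8 offset is an assumption, chosen because it makes `updatedAt` line up with the update time shown in the listing header.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the raw JSON above (assumed to be epoch milliseconds).
created_at_ms = 1774515061192
updated_at_ms = 1774515409341

def ms_to_datetime(ms: int, tz: timezone = timezone.utc) -> datetime:
    """Convert epoch milliseconds to a timezone-aware datetime (hypothetical helper)."""
    return datetime.fromtimestamp(ms / 1000, tz=tz)

# Assumption: the listing page renders timestamps at UTC+8.
utc_plus_8 = timezone(timedelta(hours=8))

created = ms_to_datetime(created_at_ms, utc_plus_8)
updated = ms_to_datetime(updated_at_ms, utc_plus_8)

print("created:", created.strftime("%Y-%m-%d %H:%M:%S"))
print("updated:", updated.strftime("%Y-%m-%d %H:%M:%S"))  # 2026-03-26 16:56:49
```

Under that assumption, the skill was created and last updated within a few minutes on the same day, which is consistent with `"versions": 1` in the stats.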