Llm Evaluation

Risk Score

100/100 (Very Low)

OpenClaw: benign
VirusTotal: benign
StaticScan: clean

Author: codenova58

Slug: llm-evaluation

Version: 1.0.0

Updated: 2026-03-25 08:51:54

Risk Information

OpenClaw: benign

Instruction-only workflow document for LLM evaluation. It does not request credentials, install code, or perform unexpected actions, and it is internally consistent with its stated purpose.

VirusTotal: benign

Static scan: clean

No suspicious patterns detected.

README

README not provided

File List

No file information

Raw JSON Data

{
    "latestVersion": {
        "_creationTime": 1774397771175,
        "_id": "k976vhy9v77gts7570jmabcts983k7wq",
        "changelog": "llm-evaluation 1.0.0\n\n- Initial release of a comprehensive workflow for deep LLM evaluation.\n- Covers definition of quality dimensions, dataset\/rubric development, automatic and human evaluation, regression gates, and online validation.\n- Guidance on when and how to apply the workflow, including trigger conditions and risk management.\n- Includes detailed stage-by-stage practices, checklists, and tips for robust, reproducible model assessment.\n- Tailored for use cases such as prompt\/model updates, CI for LLM outputs, RAG, and agent evaluation.",
        "changelogSource": "auto",
        "createdAt": 1774397771175,
        "version": "1.0.0"
    },
    "owner": {
        "_creationTime": 0,
        "_id": "s173fekm3yw84k7gp861dme7dd83gvyf",
        "displayName": "codenova58",
        "handle": "codenova58",
        "image": "https:\/\/avatars.githubusercontent.com\/u\/191358186?v=4",
        "kind": "user",
        "linkedUserId": "kn75mxt9vgwk4tg86kc540f9zd8330gr"
    },
    "ownerHandle": "codenova58",
    "skill": {
        "_creationTime": 1774397771175,
        "_id": "kd740yz5gwp4x8vzvysrh65awh83jx8v",
        "badges": [],
        "createdAt": 1774397771175,
        "displayName": "Llm Evaluation",
        "latestVersionId": "k976vhy9v77gts7570jmabcts983k7wq",
        "ownerPublisherId": "s173fekm3yw84k7gp861dme7dd83gvyf",
        "ownerUserId": "kn75mxt9vgwk4tg86kc540f9zd8330gr",
        "slug": "llm-evaluation",
        "stats": {
            "comments": 0,
            "downloads": 5,
            "installsAllTime": 0,
            "installsCurrent": 0,
            "stars": 0,
            "versions": 1
        },
        "summary": "Deep LLM evaluation workflow—quality dimensions, golden sets, human vs automatic metrics, regression suites, offline\/online signals, and safe rollout gates f...",
        "tags": {
            "latest": "k976vhy9v77gts7570jmabcts983k7wq"
        },
        "updatedAt": 1774399914610
    }
}
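
The createdAt and updatedAt fields in the raw JSON appear to be Unix epoch timestamps in milliseconds. The sketch below shows one way to turn them into readable dates. The helper names (ms_to_datetime, format_listing_time) are hypothetical, and the UTC+8 display offset is an assumption, chosen only because it makes updatedAt line up with the "Updated" time shown in the listing.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the raw JSON above (assumed to be epoch milliseconds).
CREATED_AT_MS = 1774397771175
UPDATED_AT_MS = 1774399914610

# Assumption: the listing page renders times in UTC+8.
DISPLAY_TZ = timezone(timedelta(hours=8))


def ms_to_datetime(ms: int, tz: timezone = timezone.utc) -> datetime:
    """Convert epoch milliseconds to a timezone-aware datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=tz)


def format_listing_time(ms: int) -> str:
    """Format epoch milliseconds the way the listing shows its 'Updated' time."""
    return ms_to_datetime(ms, DISPLAY_TZ).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print("created:", format_listing_time(CREATED_AT_MS))
    print("updated:", format_listing_time(UPDATED_AT_MS))
```

Under these assumptions, updatedAt formats to the same "Updated" value that the listing reports, which suggests the page shows the skill's updatedAt field rather than the version's createdAt.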