docs: add PTOAS usability evaluation skill #682
Draft
HecreReed wants to merge 6 commits into
Code Review
This pull request introduces a comprehensive usability evaluation framework for the PTOAS repository, including a new skill definition, detailed metrics for operator reproduction and implementation, and an evidence checklist. The review feedback suggests refining the skill description for brevity and removing redundant search commands in the evidence checklist to improve reliability and clarity.
| @@ -0,0 +1,62 @@ | |||
| --- | |||
| name: ptoas-usability-eval | |||
| description: Evaluate PTOAS repository usability for operator reproduction/deployment and the PTOAS-supported subset of basic operator implementation. Use PTOAS repo docs, scripts, samples, and CI config as evidence; score scene 01 as the primary template, score only the build/run/validation subset of scene 04, and mark unsupported scene 02/03/05/06 items as N/A. | |||
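This `description` field is too long and reads more like a summary. By YAML frontmatter convention, `description` is usually a concise single-line description, with the detailed explanation placed in the document body. Consider shortening it to improve readability and consistency.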
Suggested change
| description: Evaluate PTOAS repository usability for operator reproduction/deployment and the PTOAS-supported subset of basic operator implementation. Use PTOAS repo docs, scripts, samples, and CI config as evidence; score scene 01 as the primary template, score only the build/run/validation subset of scene 04, and mark unsupported scene 02/03/05/06 items as N/A. | |
| description: Evaluate PTOAS repository usability for operator reproduction/deployment (scene 01) and a subset of basic implementation (scene 04). |
```bash
rg -n "构建|运行测试|上板验证|compile-only|generate_testcase|run_remote_npu_validation" README.md docs test .github
rg --files test/samples
```
Codex Review
This comment is automatically updated by the review bot.
Summary: Review failed at stage
Findings: No structured findings were generated because the review process failed early.
Log Tail
Summary
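`skills/ptoas-usability-eval/`: covers `01 算子复现部署` (operator reproduction and deployment) and the PTOAS-supported subset of `04 算子基本功能实现` (basic operator functionality implementation)

Directory Layout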
- `skills/ptoas-usability-eval/`
- `.codex/skills/ptoas-usability-eval/`
- `.cursor/skills/ptoas-usability-eval/`
- `.trae/skills/ptoas-usability-eval/`
- `.claude/skills/ptoas-usability-eval/`

Why
PTOAS does not cleanly map to all six operator usability scenarios. This skill constrains evaluation to the parts the repo can actually evidence from its own docs, scripts, samples, and CI configuration, and marks unsupported areas as `N/A` instead of forcing misleading scores.

The layered model is important because `bisheng` and CANN compile-only belong to a Linux+CANN environment, while NPU board validation belongs to a device-equipped server environment. The skill now requires evaluators to declare the covered layer first and marks higher layers as `未实测` (not tested) instead of treating a missing local environment as a PTOAS usability failure.

Included content
- `SKILL.md`: trigger conditions, workflow, output shape, scope boundaries, and layer gating
- `references/scope.md`: scenario mapping, exclusions, and evaluation layers
- `references/evidence-checklist.md`: canonical repo evidence sources, search order, and layer mapping
- `references/metrics-01.md`: scoring guidance for `01 算子复现部署`
- `references/metrics-04.md`: scoring guidance for the PTOAS-supported subset of `04`
- `agents/openai.yaml`: UI metadata where applicable
- `README.md`: explains the neutral source copy and per-client entrypoints

Notes
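For illustration only, the layer gating and `N/A` handling described under Why could be sketched roughly as below. This is not code from the PR: the `Layer` names, the assumed base layer `REPO`, and the helpers `gate_layers` and `scene_result` are all hypothetical, and the actual skill expresses these rules as prose guidance rather than a script.

```python
from __future__ import annotations

from enum import IntEnum

NOT_TESTED = "未实测"    # "not tested": label for layers above the declared one
NOT_APPLICABLE = "N/A"  # label for scenes PTOAS does not support


class Layer(IntEnum):
    """Hypothetical layer names, ordered from least to most demanding environment."""
    REPO = 1          # assumed base layer: docs, scripts, samples, CI config
    COMPILE_ONLY = 2  # bisheng / CANN compile-only on Linux+CANN
    NPU_BOARD = 3     # board validation on a device-equipped server


# Scenes 02/03/05/06 are marked N/A per the skill description.
SCORED_SCENES = {"01", "04"}


def gate_layers(declared: Layer, scores: dict[Layer, float]) -> dict[Layer, float | str]:
    """Keep scores up to the declared layer; mark every higher layer as not tested.

    Raises KeyError if a covered layer has no score.
    """
    return {
        layer: scores[layer] if layer <= declared else NOT_TESTED
        for layer in Layer
    }


def scene_result(scene: str, score: float) -> float | str:
    """Return the score for a supported scene, or N/A for an unsupported one."""
    return score if scene in SCORED_SCENES else NOT_APPLICABLE


if __name__ == "__main__":
    report = gate_layers(Layer.COMPILE_ONLY, {Layer.REPO: 4.0, Layer.COMPILE_ONLY: 3.5})
    for layer, value in report.items():
        print(f"{layer.name}: {value}")
    print("scene 02:", scene_result("02", 5.0))
```

The point of the sketch is that an evaluator without NPU hardware reports `未实测` for the board layer rather than a low score, so a missing environment is never counted against PTOAS itself.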