ben/hvac-marketing-skills

Commit graph

Author	SHA1	Message	Date

Corey Haines

926c624d07

fix: address eval review - assertion mismatches and factual error

- marketing-psychology eval 4: BJ Fogg assertion did not match expected_output
  which lists Goal-Gradient Effect. Fixed.
- sales-enablement eval 2: all 6 categories assertion contradicted expected_output
  which only categorizes the 3 given objections. Fixed.
- ad-creative eval 5: TikTok hard limit corrected to recommended (80 chars
  recommended, 100 max) per SKILL.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-04 15:51:28 -08:00

Corey Haines

7e7e7a09d8

fix: align eval assertions with SKILL.md content per Codex review

Fixes 5 issues identified by independent Codex review:
- product-marketing-context: match auto-draft workflow, section flexibility
- marketing-psychology: replace phantom models with actual SKILL.md models
- ad-creative: correct RSA pinning guidance to match skill
- free-tool-strategy: boundary test now defers to related skill (page-cro)
- paywall-upgrade-cro: boundary test references only related skills

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-04 14:07:38 -08:00

Corey Haines

11e9ea811f

feat: add evals for all 29 remaining skills (197 total evals across 32 skills)

Each skill now has 5-8 evals covering:
- Core framework usage with realistic prompts
- Casual trigger phrase variants
- Sub-type and section-specific coverage
- Boundary tests (skill deferral to related skills)
- Structured assertions for grading

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-04 13:37:01 -08:00

3 commits