# Tool Call Benchmark Report

**Model:** google/gemini-3.1-pro-preview  
**Date:** 2026-04-14 11:54:03  
**Suite:** full  
**Score:** 18/18 (100.0%)  
**First-attempt accuracy:** 18/18 (100.0%)

*2 tests skipped; not counted in score*

## Key Metrics

| Metric | Value |
|--------|-------|
| Hits | 18 |
| Misses | 0 |
| Skips | 2 |
| Misfires | 0 |
| Total attempts | 18 |
| Clean attempts | 18 |
| Total duration | 5.1s |
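
The headline score can be reproduced from the metrics above. The following is a minimal sketch, assuming the score is hits divided by hits plus misses, with skipped tests left out of the denominator as the report states. The function name `score` is hypothetical and is not taken from the benchmark harness.

```python
def score(hits: int, misses: int) -> tuple[int, int, float]:
    """Return (passed, counted, percent).

    Assumption: skipped tests are excluded from the denominator,
    so only hits and misses are counted.
    """
    counted = hits + misses
    percent = 100.0 * hits / counted if counted else 0.0
    return hits, counted, percent


# Values from the Key Metrics table: 18 hits, 0 misses (2 skips ignored).
passed, counted, percent = score(hits=18, misses=0)
summary = f"{passed}/{counted} ({percent:.1f}%)"
print(summary)
```

With 18 hits and 0 misses this gives 18/18 (100.0%), matching the reported score; the 2 skipped tests do not affect it.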

## Summary

| Category | Passed | Total | Misfires | Duration | Score |
|----------|--------|-------|----------|----------|-------|
| Bash Execution | 4 | 4 | 0 | 0.2s | 100% |
| File Operations | 6 | 6 | 0 | 0.3s | 100% |
| MCP Tool Calls | 2 | 2 | 0 | 3.0s | 100% |
| Skill Invocations | 3 | 3 | 0 | 1.5s | 100% |
| Generation | 3 | 3 | 0 | 0.2s | 100% |
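
The category durations can be cross-checked against the per-test times listed under Detailed Results. The sketch below assumes each duration is the sum of the per-test milliseconds, rounded to one decimal place of a second; the rounding rule and the helper `seconds` are assumptions, not confirmed harness behavior.

```python
# Per-test times in milliseconds, copied from the Detailed Results tables.
# Skipped tests (TC-M01, TC-M04) are reported as 0ms and add nothing.
timings_ms = {
    "Bash Execution": [27, 39, 70, 34],
    "File Operations": [50, 50, 50, 50, 50, 50],
    "MCP Tool Calls": [0, 1500, 1500, 0],
    "Skill Invocations": [500, 500, 500],
    "Generation": [41, 23, 109],
}


def seconds(ms: int) -> str:
    """Format milliseconds as seconds with one decimal place (assumed rounding)."""
    return f"{ms / 1000:.1f}s"


category_durations = {name: seconds(sum(times)) for name, times in timings_ms.items()}
total_ms = sum(sum(times) for times in timings_ms.values())
total_duration = seconds(total_ms)

for name, duration in category_durations.items():
    print(f"{name}: {duration}")
print(f"Total: {total_duration} ({total_ms}ms)")
```

Under this assumption the per-category values match the table. The rounded category values add up to 5.2s, while the total rounded directly from 5143ms is 5.1s, which is consistent with the reported total duration.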

## Detailed Results

### Bash Execution

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-B01 | Echo exact string | ✓ PASS | 27ms | 1 |  |
| TC-B02 | Python arithmetic | ✓ PASS | 39ms | 1 |  |
| TC-B03 | Node JSON output | ✓ PASS | 70ms | 1 |  |
| TC-B04 | Pipeline command | ✓ PASS | 34ms | 1 |  |

### File Operations

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-F01 | Write file | ✓ PASS | 50ms | 1 |  |
| TC-F02 | Read file back | ✓ PASS | 50ms | 1 |  |
| TC-F03 | Edit file | ✓ PASS | 50ms | 1 |  |
| TC-F04 | Verify edit | ✓ PASS | 50ms | 1 |  |
| TC-F05 | Glob find | ✓ PASS | 50ms | 1 |  |
| TC-F06 | Grep search | ✓ PASS | 50ms | 1 |  |

### MCP Tool Calls

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-M01 | ToolSearch — fetch deferred schema | ⊘ SKIP | 0ms | 0 | ToolSearch tool not available in harness |
| TC-M02 | Context7 — resolve library | ✓ PASS | 1500ms | 1 |  |
| TC-M03 | Context7 — query docs | ✓ PASS | 1500ms | 1 |  |
| TC-M04 | ToolSearch — keyword search | ⊘ SKIP | 0ms | 0 | ToolSearch tool not available in harness |

### Skill Invocations

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-S01 | Invoke current-datetime | ✓ PASS | 500ms | 1 |  |
| TC-S02 | Invoke brand-guidelines | ✓ PASS | 500ms | 1 |  |
| TC-S03 | Invoke chart-taste | ✓ PASS | 500ms | 1 |  |

### Generation

| ID | Test | Status | Time | Attempts | Notes |
|----|------|--------|------|----------|-------|
| TC-G01 | Create PDF via Python | ✓ PASS | 41ms | 1 |  |
| TC-G02 | Verify PDF exists | ✓ PASS | 23ms | 1 |  |
| TC-G03 | SVG to PNG generation | ✓ PASS | 109ms | 1 |  |

---

*Generated by `/oneshot-tool-call` benchmark*
