TLDR: Use hooks for typechecking and filter the output to minmax the context. Basic Claude Code Hook to run Pyright | Settings Config
I’m a big fan of the Types-As-Guardrails-for-AI pattern. After switching to Claude Code from Cursor, the thing I miss most is how Cursor would auto typecheck and resolve errors before completing any task or moving on to the next. Claude Code can kind of do that, but you have to put a gun to its head.
The longer you wait to type check, the harder it is to recover. Imagine you just spent a week[1] on a feature. You check your types, and it tells you something you don’t want to hear. You can either:
- spend another half a week refactoring or
- slap an Any and move on.
claude-sonnet-4-20250514 behaves exactly the same way. You tell it to do one last “check” at the end and it’s much more likely to do the 30 second fix. That’s probably why we have red squiggles in our IDE, for that immediate feedback loop. I wanted to get the squiggles back.
The problems I set out to solve:
- I needed a reliable way to run typechecking, automatically.
- The agent can’t be responsible for running typechecking, because that:
  - pollutes the context
  - wastes money in token costs
  - makes the agent dumber over time
This is specifically for Python and pyright, but the hook can easily be applied to TypeScript or any other language that has a typechecker.
Solution 1
The simplest possible solution to this is to stuff instructions in A̶G̶E̶N̶T̶S̶.̶m̶d̶ CLAUDE.md: “always run typechecking”. Maybe put some 🚨 ⚠️❗ emojis. ALL CAPS!!! Maybe it listens. Maybe it doesn’t. If you look at what Claude Code beams up the wire with a proxy, it ends its <system-reminder> with:
IMPORTANT: this context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant to your task
But even if this is Good Enough, let’s look at the actual cost of this in terms of context and tokens. Here’s an example file full of type errors.
For a file like this, an example exchange looks like:
- user: add a “hello world” print statement to @main.py then run pyright and fix any errors, regardless of whether or not you caused them. Do not use Any or ignore.
- assistant: I’ll add a “hello world” print statement to main.py, run pyright, and fix any errors found.
- assistant: [creates todos: add hello world, run pyright, fix errors]
- assistant: [marks “add hello world” as in_progress]
- assistant: [edits main.py to insert print("hello world")]
- assistant: [marks “add hello world” as completed, “run pyright” as in_progress]
- assistant: [runs pyright main.py]
- assistant: [pyright finds 4 errors: unused import, wrong return types, wrong argument type]
- assistant: [marks “run pyright” as completed, “fix errors” as in_progress]
- assistant: Found 4 pyright errors. I’ll fix them one by one:
- assistant: [fixes all 4 errors: removes unused import, fixes return types, fixes function call]
- assistant: [runs pyright main.py to verify]
- assistant: [pyright shows: 0 errors, 0 warnings, 0 informations]
- assistant: [marks all todos completed]
- assistant: Done. Added “hello world” print statement to main.py and fixed all 4 pyright errors: removed unused import, fixed return types in get_name() and get_age(), and fixed greet() call argument type.
And for a file that doesn’t have any type errors, an example exchange looks like:
- user: add a “hello world” print statement to @main.py then run pyright and fix any errors, regardless of whether or not you caused them. Do not use Any or ignore.
- assistant: I’ll help you add a “hello world” print statement to main.py and then run pyright to fix any errors.
- assistant: [creates todos: add hello world, run pyright, fix errors]
- assistant: [marks “add hello world” as in_progress]
- assistant: [edits main.py to insert print("hello world") after result = greet("Bob")]
- assistant: [marks “add hello world” as completed, “run pyright” as in_progress]
- assistant: [runs pyright main.py]
- assistant: [pyright shows: 0 errors, 0 warnings, 0 informations]
- assistant: [marks all todos completed]
- assistant: Done! I’ve added the “hello world” print statement to main.py and ran pyright. No errors were found, so no fixes were needed.
Peep the full Claude Code Export | Claude Code State File
Problems
In both examples, the agent needs to manually run the pyright command. This adds a brand new tool call like:
"content": [
  {
    "type": "tool_use",
    "id": "toolu_019sfATskmNKzB1oWXNkbPyo",
    "name": "Bash",
    "input": {
      "command": "pyright main.py",
      "description": "Run pyright type checker on main.py"
    }
  }
],
This is unnecessary in general and especially unnecessary for files that don’t have any type errors. So even if we could guarantee the agent would always manually run pyright (we can’t and it won’t), it would still be wasting tokens. Also consider that the default state of your repo should always be free from type errors.
Solution 2 - Hooks
Hooks are Claude Code’s solution to this problem. Hooks let us execute commands after Claude does a thing. They’re also stupidly hard for me to use for some reason. For our use case, we want the PostToolUse hook:
"hooks": {
  "PostToolUse": [
    {
      "matcher": "Write|Edit|MultiEdit",
      "hooks": [
        {
          "type": "command",
          "command": "python $CLAUDE_PROJECT_DIR/.claude/hooks/pyright.py"
        }
      ]
    }
  ]
}
This forces Claude to run a python script every time it creates or edits a file.
import json
import sys
import subprocess
from typing import TypedDict


class HookInput(TypedDict):
    session_id: str
    transcript_path: str
    cwd: str
    hook_event_name: str
    tool_name: str
    tool_input: dict
    tool_response: dict


def main():
    """
    Main program - runs pyright on Python files after Claude edits them
    """
    try:
        # Read hook event data from Claude
        input_data = json.load(sys.stdin)

        # Extract file path from tool input
        tool_input = input_data.get('tool_input', {})
        file_path = tool_input.get('file_path')

        # Only run pyright on Python files
        if not file_path or not (file_path.endswith('.py') or file_path.endswith('.pyi')):
            print(f"Skipping non-Python file: {file_path}")
            sys.exit(0)

        # Run pyright with normal output
        result = subprocess.run(
            ['pyright', file_path],
            capture_output=True,
            text=True
        )

        # Output raw pyright response
        if result.returncode != 0 and result.stdout:
            print("There are several type issues that MUST be fixed, regardless of whether or not you made them:", file=sys.stderr)
            print(result.stdout, file=sys.stderr)
            sys.exit(2)

        print("✅ No type errors found", file=sys.stderr)
        sys.exit(0)
    except Exception as e:
        print(f"Hook error: {e}", file=sys.stderr)
        sys.exit(0)  # Don't block Claude's workflow


if __name__ == "__main__":
    main()
Breaking this down a bit, the HookInput comes from Claude and looks something like:
{
  "session_id": "abc123",
  "transcript_path": "/Users/.../.claude/projects/.../00893aaf-19fa-41d2-8238-13269b9b3ca0.jsonl",
  "cwd": "/Users/...",
  "hook_event_name": "PostToolUse",
  "tool_name": "Write",
  "tool_input": {
    "file_path": "/path/to/file.txt",
    "content": "file content"
  },
  "tool_response": {
    "filePath": "/path/to/file.txt",
    "success": true
  }
}
We take the input and do a little validation—we don’t want to run pyright on files that aren’t python. We might also want to exclude certain directories or files.
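The exclusion idea could look something like the sketch below. The directory names in `EXCLUDED_DIRS` and the helper name `should_typecheck` are hypothetical, not part of the hook above, and the sketch assumes POSIX-style paths.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical exclusion list; adjust to your own repo layout.
EXCLUDED_DIRS = {".venv", "node_modules", "migrations", "__pycache__"}
PYTHON_SUFFIXES = {".py", ".pyi"}


def should_typecheck(file_path: Optional[str]) -> bool:
    """Return True only for Python files outside the excluded directories."""
    if not file_path:
        return False
    path = PurePosixPath(file_path)
    if path.suffix not in PYTHON_SUFFIXES:
        return False
    # Skip the file if any parent directory is on the exclusion list.
    return not any(part in EXCLUDED_DIRS for part in path.parent.parts)
```

In the hook, this check would replace the inline `endswith` test before pyright is invoked.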
Then we run pyright. If it reports any errors, we send them back to Claude with a message:
There are several type issues that MUST be fixed, regardless of whether or not you made them: [ list of errors ]
Claude likes to do this thing where it won’t touch problems it didn’t create. I guess humans do too.
Then, most importantly, we use exit code 2 so that Claude can actually see the output: with exit code 2, whatever the hook wrote to stderr is fed back to Claude. Otherwise, it might look like your hook is running, but Claude won’t actually see the output.
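Since it is easy to end up with a hook that runs but never reaches Claude, it can help to exercise the script by hand before wiring it into settings. The sketch below pipes a minimal PostToolUse-style payload into a hook and reports the exit code and stderr. `DEMO_HOOK` is a stand-in script invented for this example, and `run_hook` is a hypothetical helper; point it at your real hook path instead.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in hook used only for this demo: it exits 2 for .py files and
# stays silent otherwise. It does not run pyright.
DEMO_HOOK = '''
import json, sys
payload = json.load(sys.stdin)
path = payload.get("tool_input", {}).get("file_path", "")
if path.endswith(".py"):
    print("pretend type error", file=sys.stderr)
    sys.exit(2)
sys.exit(0)
'''


def run_hook(hook_path: str, file_path: str) -> subprocess.CompletedProcess:
    """Pipe a minimal PostToolUse-style payload into a hook script."""
    payload = {
        "hook_event_name": "PostToolUse",
        "tool_name": "Edit",
        "tool_input": {"file_path": file_path},
    }
    return subprocess.run(
        [sys.executable, hook_path],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
    )


def demo():
    """Run the stand-in hook against a Python file and a non-Python file."""
    with tempfile.TemporaryDirectory() as tmp:
        hook = Path(tmp) / "demo_hook.py"
        hook.write_text(DEMO_HOOK)
        blocked = run_hook(str(hook), "/tmp/example.py")
        ignored = run_hook(str(hook), "/tmp/notes.md")
        return blocked.returncode, blocked.stderr.strip(), ignored.returncode
```

If the real hook returns 0 for a file you know has type errors, the problem is in the script rather than in the settings file.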
Now if we run this, we get something that looks like:
- user: add a “hello world” print statement to @main.py
- assistant: [edits main.py to add print("Hello world") at the end]
- system: [post-tool hook runs pyright and finds 4 errors: missing import, wrong return types, wrong argument type]
- assistant: [creates todos for fixing all 4 pyright errors]
- assistant: [marks first todo as in_progress]
- assistant: [removes import unused_import line]
- system: [post-tool hook runs pyright and finds 3 remaining errors]
- assistant: [marks first todo completed, second todo as in_progress]
- assistant: [changes return 42 to return "42" in get_name function]
- system: [post-tool hook runs pyright and finds 2 remaining errors]
- assistant: [marks second todo completed, third todo as in_progress]
- assistant: [changes return "hello" to return 42 in get_age function]
- system: [post-tool hook runs pyright and finds 1 remaining error]
- assistant: [marks third todo completed, fourth todo as in_progress]
- assistant: [changes greet(123) to greet("123")]
- system: [post-tool hook runs pyright successfully with no errors]
- assistant: [marks all todos completed]
Peep the full Claude Code Export | Claude Code State File
If we look at the actual network traffic going to Anthropic, all of these system messages get added to the conversation as “Edit operation feedback” directly after the tool result in the user message. There are no extra tool calls.
{
  "role": "user",
  "content": [
    {
      "tool_use_id": "toolu_01MVm2tTJTVoWb9RrJD9sjFw",
      "type": "tool_result",
      "content": "The file /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py has been updated. Here's the result of running `cat -n` on a snippet of the edited file:\n 1\u2192from typing import List, Dict\n 2\u2192import unused_import\n 3\u2192\n 4\u2192print(\"Hello world\")\n 5\u2192\n 6\u2192def add(a, b):\n 7\u2192 return a + b\n 8\u2192"
    },
    {
      "type": "text",
      "text": "Edit operation feedback:\n- [python3 $CLAUDE_PROJECT_DIR/.claude/hooks/pyright-raw.py]: There are several type issues that MUST be fixed, regardless of whether or not you made them:\n/Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py\n /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:2:8 - error: Import \"unused_import\" could not be resolved (reportMissingImports)\n /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:13:12 - error: Type \"Literal[42]\" is not assignable to return type \"str\"\n \u00a0\u00a0\"Literal[42]\" is not assignable to \"str\" (reportReturnType)\n /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:16:12 - error: Type \"Literal['hello']\" is not assignable to return type \"int\"\n \u00a0\u00a0\"Literal['hello']\" is not assignable to \"int\" (reportReturnType)\n /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:21:16 - error: Argument of type \"Literal[123]\" cannot be assigned to parameter \"name\" of type \"str\" in function \"greet\"\n \u00a0\u00a0\"Literal[123]\" is not assignable to \"str\" (reportArgumentType)\n4 errors, 0 warnings, 0 informations \n\n",
      "cache_control": {
        "type": "ephemeral"
      }
    }
  ]
}
And if there aren’t any type errors in the file, the LLM sees nothing. No context cost.
Problems
So that’s great, we don’t waste any extra calls, but we can optimize this even further.
When we run pyright on our file, we see this:
/Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py
  /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:2:8 - error: Import "unused_import" could not be resolved (reportMissingImports)
  /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:11:12 - error: Type "Literal[42]" is not assignable to return type "str"
    "Literal[42]" is not assignable to "str" (reportReturnType)
  /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:14:12 - error: Type "Literal['hello']" is not assignable to return type "int"
    "Literal['hello']" is not assignable to "int" (reportReturnType)
  /Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main.py:19:16 - error: Argument of type "Literal[123]" cannot be assigned to parameter "name" of type "str" in function "greet"
    "Literal[123]" is not assignable to "str" (reportArgumentType)
4 errors, 0 warnings, 0 informations
The default output is full of noise. The full file path is repeated everywhere. Just tons of duplication and redundancy. Type errors kind of suck to read as a human anyway, but for an LLM, it’s wasting tokens and making it dumber over time. This is the kind of thing we could use a subagent for to help with the context part, but we’d still be spending unnecessary tokens.
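To get a feel for how much of that output is repetition, here is a minimal sketch that swaps every occurrence of the absolute path for the bare file name. `strip_repeated_paths` is a hypothetical helper, and the sample path and output are made up for illustration in the shape of pyright's text output.

```python
import os


def strip_repeated_paths(pyright_text: str, file_path: str) -> str:
    """Replace every occurrence of the absolute path with the bare file name."""
    return pyright_text.replace(file_path, os.path.basename(file_path))


SAMPLE_PATH = "/Users/example/project/main.py"  # hypothetical path
SAMPLE_OUTPUT = (
    f"{SAMPLE_PATH}\n"
    f'  {SAMPLE_PATH}:2:8 - error: Import "unused_import" could not be resolved (reportMissingImports)\n'
    "1 error, 0 warnings, 0 informations\n"
)

slim = strip_repeated_paths(SAMPLE_OUTPUT, SAMPLE_PATH)
print(slim)
```

This is the crudest possible filter. It keeps the rule names and column numbers, so it only addresses the path duplication.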
Solution 3 - The Slim Hook
At the absolute very least, we should strip out the full file paths on every error. This runs after every edit so the context is already implied. We can build on our hook and strip out all of the noise and save a little money on tokens. Pyright has a JSON output mode that we can tap into:
uv run pyright --outputjson main.py
{
  "file": "/Users/mark.cianfrani/Code/playground/a-more-efficient-pyright/main-with.errors.py",
  "severity": "error",
  "message": "Import \"unused_import\" could not be resolved",
  "range": {
    "start": {
      "line": 2,
      "character": 7
    },
    "end": {
      "line": 2,
      "character": 20
    }
  },
  "rule": "reportMissingImports"
},
Off the bat, here’s what’s irrelevant to me:
- Exact line numbers - we can find the code via context. Claude should have the file in context anyway.
- Rule names - Kind of obvious from the message
- Severity - Everything needs to be fixed and if it doesn’t it should be ignored at the config level (YMMV)
We can update our hook to output only the most essential information:
import json
import sys
import subprocess
from typing import TypedDict


class PyrightError(TypedDict):
    message: str
    context: str


class PyrightOutput(TypedDict):
    errors: list[PyrightError]


class HookInput(TypedDict):
    session_id: str
    transcript_path: str
    cwd: str
    hook_event_name: str
    tool_name: str
    tool_input: dict
    tool_response: dict


def parse_pyright_json(json_data: dict) -> PyrightOutput:
    """
    Convert verbose pyright JSON output to minimal AI-friendly format

    Args:
        json_data: Raw pyright JSON output

    Returns:
        PyrightOutput with only essential data for AI fixing
    """
    errors: list[PyrightError] = []
    for diagnostic in json_data.get('generalDiagnostics', []):
        file_path = diagnostic['file']
        message = diagnostic['message']
        # Extract minimal source context around the error
        context = extract_source_context(
            file_path,
            diagnostic['range']['start']['line']
        )
        errors.append({
            "message": message,
            "context": context
        })
    return {"errors": errors}


def extract_source_context(file_path: str, error_line: int) -> str:
    """Extract minimal but sufficient source code context around the error line"""
    try:
        with open(file_path, 'r') as f:
            lines = f.readlines()

        # Smart context extraction - just what's needed to understand the error
        context_lines = []

        # Always include the error line (error_line is 0-based)
        if error_line < len(lines):
            context_lines.append(lines[error_line].rstrip())

        # For function-related errors, include function signature
        for i in range(max(0, error_line - 5), error_line):
            line = lines[i].strip()
            if line.startswith('def ') or line.startswith('class '):
                context_lines.insert(0, lines[i].rstrip())
                break

        # For import errors, that's all we need
        if context_lines and 'import' in context_lines[0]:
            return context_lines[0]

        return '\n'.join(context_lines)
    except (FileNotFoundError, IndexError):
        return f"Could not read file: {file_path}"


def main():
    """
    Main program - runs pyright on Python files after Claude edits them
    """
    try:
        # Read hook event data from Claude
        input_data = json.load(sys.stdin)

        # Extract file path from tool input
        tool_input = input_data.get('tool_input', {})
        file_path = tool_input.get('file_path')

        # Only run pyright on Python files
        if not file_path or not (file_path.endswith('.py') or file_path.endswith('.pyi')):
            print(f"Skipping non-Python file: {file_path}")
            sys.exit(0)

        # Run pyright with JSON output for parsing
        result = subprocess.run(
            ['pyright', file_path, '--outputjson'],
            capture_output=True,
            text=True
        )

        # Parse pyright output and format concisely.
        # The JSON report is written to stdout; anything on stderr is not JSON.
        pyright_output = result.stdout
        if pyright_output:
            pyright_data = json.loads(pyright_output)
            errors = pyright_data.get('generalDiagnostics', [])
            if errors:
                print("Pyright found type errors that you MUST fix, regardless of whether or not you caused them. Do not use Any or ignore:", file=sys.stderr)
                for error in errors:
                    line_num = error['range']['start']['line'] + 1  # Convert to 1-based
                    message = error['message'].split('\n')[0]  # First line only. YMMV
                    print(f" Line {line_num}: {message}", file=sys.stderr)
                sys.exit(2)

        print("✅ No type errors found", file=sys.stderr)
        sys.exit(0)
    except Exception as e:
        print(f"Hook error: {e}", file=sys.stderr)
        sys.exit(0)  # Don't block Claude's workflow


if __name__ == "__main__":
    main()
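The formatting step is easier to tune if it lives in a pure function that can be checked without running pyright. The sketch below is one way to do that. `format_diagnostics` and its `with_lines` flag are hypothetical names, and `SAMPLE` is a hand-written payload in the shape of the `generalDiagnostics` entries shown earlier, not real pyright output.

```python
def format_diagnostics(pyright_data: dict, with_lines: bool = True) -> list:
    """Reduce pyright's JSON report to one short line per diagnostic."""
    formatted = []
    for diag in pyright_data.get("generalDiagnostics", []):
        # Keep only the first line of the message; the rest is elaboration.
        message = diag["message"].split("\n")[0]
        if with_lines:
            line_num = diag["range"]["start"]["line"] + 1  # JSON lines are 0-based
            formatted.append(f"Line {line_num}: {message}")
        else:
            formatted.append(message)
    return formatted


# Hand-written sample in the shape of pyright's --outputjson report.
SAMPLE = {
    "generalDiagnostics": [
        {
            "file": "/path/to/main.py",
            "severity": "error",
            "message": 'Type "Literal[42]" is not assignable to return type "str"\n'
                       '  "Literal[42]" is not assignable to "str"',
            "range": {
                "start": {"line": 10, "character": 11},
                "end": {"line": 10, "character": 13},
            },
            "rule": "reportReturnType",
        }
    ]
}

print(format_diagnostics(SAMPLE))
print(format_diagnostics(SAMPLE, with_lines=False))
```

Dropping line numbers saves a few more tokens, but if the agent then has to re-read the file to locate the error, the saving is gone.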
Token Counts and A Word On Cost
It’s pretty difficult to compare apples to apples and give a number as far as cost goes. Claude Code doesn’t give a lot of precise information about the cost of using it. There are two ways we can get a rough estimate:
- Using /context in the Claude Code session
- Parsing the jsonl logs that get saved in your .claude/projects directory.
The problem is that the two produce wildly different numbers. The closest thing we have to an official count is the “/context” command. But that only tells us Message tokens. To calculate cost, we need to know the breakdown of what kind of tokens are used. Input tokens cost less than output tokens. Cached tokens cost less than freshies.
The majority of token usage in Claude Code is cached and therefore, pretty cheap. So as a strict cost savings measure, the slim hook approach might not be giving us all that much.
Claude Code /context export
| Method | Tokens/Session | Savings vs Prompt |
|---|---|---|
| Prompt | ~3.5k | — |
| Raw Hook | ~2.3k | ~35% |
| Slim Hook | ~1.6k | ~55% |
JSONL parsing
| Method | Usage Blocks | Input Tokens | Cache Creation Tokens | Cache Read Tokens | Output Tokens |
|---|---|---|---|---|---|
| Prompt | 11 | 43 | 37,529 | 168,389 | 0 |
| Raw Hook | 4 | 17 | 6,003 | 66,329 | 0 |
| Slim Hook | 2 | 7 | 5,245 | 30,035 | 0 |
Session data can be found in the repo
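For reference, numbers like the JSONL table could be gathered with a sketch along these lines. It assumes each transcript line is a JSON object that may carry a `message.usage` dict with the Anthropic API's usage field names. That layout is an assumption about the log format, not something Claude Code documents, so check it against your own files. `sum_usage` is a hypothetical helper.

```python
import json
from collections import Counter

# Usage field names as returned by the Anthropic API.
USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def sum_usage(jsonl_lines) -> Counter:
    """Total the usage fields across all transcript lines that have them."""
    totals = Counter()
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # Skip anything that is not valid JSON
        message = entry.get("message") if isinstance(entry, dict) else None
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        totals["usage_blocks"] += 1
        for field in USAGE_FIELDS:
            totals[field] += usage.get(field) or 0
    return totals


# Usage: with open(transcript_path) as f: print(sum_usage(f))
```

Because cached and uncached tokens are priced differently, keeping the four fields separate matters more than the grand total.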
With lots of handwaving:
If we assume ~3.5k tokens per Prompt session vs ~1.6k tokens per Slim Hook session:
- At 3 sessions/day, 5 days/week
- Prompt ≈ 52.5k tokens/week
- Slim Hook ≈ 24k tokens/week
- Savings ≈ 28.5k tokens/week (~55%)
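The handwaving above works out as follows. This sketch only uses the rounded per-session figures from the /context table; the weekly Raw Hook number is derived from them and is not a separate measurement.

```python
SESSIONS_PER_WEEK = 3 * 5  # 3 sessions/day, 5 days/week

# Rounded per-session token counts from the /context table.
tokens_per_session = {"Prompt": 3_500, "Raw Hook": 2_300, "Slim Hook": 1_600}

# Weekly totals and fractional savings relative to the Prompt approach.
weekly = {name: tokens * SESSIONS_PER_WEEK for name, tokens in tokens_per_session.items()}
savings = {name: 1 - weekly[name] / weekly["Prompt"] for name in weekly}

for name in weekly:
    print(f"{name}: {weekly[name]:,} tokens/week, {savings[name]:.0%} saved vs Prompt")
```

The exact ratios land slightly under the rounded ~35% and ~55% in the table, which is within the noise of the estimate.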
There’s absolutely a reduction in token usage, even if it’s hard to measure exactly, but the biggest value add for me is I get to stop having to say “run pyright. NO ANY. NO IGNORE” every hour of the day. The slim hook will need more handholding and awareness. If we remove too much context, the LLM will just make another tool call anyway.
I think it’ll be interesting to watch how tools evolve over time, favoring concise human-readable outputs over verbose technical jargon. If that’s all we get from this AI bubble, I’ll be happy. I consider that a win even if you remove AI from the conversation completely.
Footnotes
[1] 1 Week in Old World Time = 1 hour in AI World Time