* integration: improve ability to test individual models
Add OLLAMA_TEST_MODEL env var to run integration tests against a
single model.
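
A minimal sketch of how such an override might narrow the model list,
not the code in this change. The names modelsUnderTest and defaultModels
and the placeholder model names are hypothetical; only the
OLLAMA_TEST_MODEL variable comes from this commit.

```go
package main

import (
	"fmt"
	"os"
)

// defaultModels is a hypothetical stand-in for the suite's full model list.
var defaultModels = []string{"model-a", "model-b"}

// modelsUnderTest returns only the override when it is non-empty,
// and the full default list otherwise.
func modelsUnderTest(override string, defaults []string) []string {
	if override != "" {
		return []string{override}
	}
	return defaults
}

func main() {
	// Read the override from the environment; unset means "test everything".
	for _, m := range modelsUnderTest(os.Getenv("OLLAMA_TEST_MODEL"), defaultModels) {
		fmt.Println("would test:", m)
	}
}
```

With the variable unset the sketch falls back to every default model, so
existing runs would be unaffected.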
Enhance vision tests: multi-turn chat with cached image tokens, object
counting, spatial reasoning, detail recognition, scene understanding, OCR, and
multi-image comparison.
Add tool calling stress tests with complex agent-style prompts, large
system messages, and multi-turn tool response handling.
* address review comments