A model-agnostic verification-and-refinement pipeline was developed to improve the performance of large language models on International Mathematical Olympiad (IMO) problems, achieving an accuracy of approximately 85.7% on the 2025 competition. This approach substantially outperformed the baseline accuracies of Gemini 2.5 Pro, Grok-4, and GPT-5, highlighting that an effective methodology matters alongside a powerful base model for solving complex mathematical tasks.
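The source does not specify the pipeline's internals, but a generic verification-and-refinement loop of the kind described can be sketched as follows. This is a minimal illustration, not the authors' implementation: `generate`, `verify`, and `refine` are hypothetical stand-ins for the underlying model calls, and `max_rounds` is an assumed iteration budget.

```python
from typing import Callable, Tuple

def verify_and_refine(
    generate: Callable[[str], str],
    verify: Callable[[str, str], bool],
    refine: Callable[[str, str], str],
    problem: str,
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Produce a candidate solution, then alternate verification and
    refinement until the verifier accepts or the round budget runs out.

    All three callables are placeholders for model invocations: in a
    real pipeline they would prompt an LLM to solve, critique, and
    revise, respectively.
    """
    candidate = generate(problem)
    for _ in range(max_rounds):
        if verify(problem, candidate):
            return candidate, True
        # Verification failed: ask the refiner to revise the candidate.
        candidate = refine(problem, candidate)
    # Budget exhausted; report the final candidate and its verdict.
    return candidate, verify(problem, candidate)

if __name__ == "__main__":
    # Toy stand-ins: the "solver" starts wrong and each refinement
    # step moves one answer closer to the correct value "42".
    attempts = iter(["40", "41", "42"])
    solution, accepted = verify_and_refine(
        generate=lambda p: next(attempts),
        verify=lambda p, c: c == "42",
        refine=lambda p, c: next(attempts),
        problem="toy problem",
    )
    print(solution, accepted)  # → 42 True
```

Because the loop is model-agnostic, the same skeleton can wrap any solver/verifier pair; only the three callables change when swapping base models.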