The Low-to-high Multi-Level Transformer (LMLT) introduces an approach to image super-resolution that reduces the complexity and inference time of existing Vision Transformer models. It applies self-attention to features at multiple spatial scales, with each lower head attending over a more strongly downsampled map, and feeds the output of each lower head into the head above it. This lets LMLT capture both local and global information and mitigates the loss of cross-window context at window boundaries in window-based self-attention. Experimental results indicate that LMLT outperforms state-of-the-art methods while significantly reducing GPU memory usage.
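
To make the low-to-high flow concrete, here is a minimal PyTorch-style sketch of the idea, not the authors' implementation: channels are split into heads, lower heads attend over smaller (more downsampled) feature maps, and each lower head's output is upsampled and added to the next higher head's input before its attention is computed. The module name `LowToHighAttention`, the `num_levels` parameter, and the choice of pooling/interpolation operators are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LowToHighAttention(nn.Module):
    """Illustrative multi-level attention: lower heads see smaller maps,
    and each lower head's output feeds into the next higher head."""

    def __init__(self, channels: int, num_levels: int = 4):
        super().__init__()
        assert channels % num_levels == 0
        self.num_levels = num_levels
        head_dim = channels // num_levels
        # One single-head self-attention block per level (illustrative choice).
        self.attn = nn.ModuleList(
            nn.MultiheadAttention(head_dim, num_heads=1, batch_first=True)
            for _ in range(num_levels)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); split channels into one chunk per level/head.
        b, c, h, w = x.shape
        heads = x.chunk(self.num_levels, dim=1)

        outputs, carry = [], None
        for level, feat in enumerate(heads):
            # Lower levels attend over spatially smaller maps (downsampled by
            # a factor of 2 per level), so their windows cover a wider area.
            scale = 2 ** (self.num_levels - 1 - level)
            size = (h // scale, w // scale)
            feat = F.adaptive_avg_pool2d(feat, size)

            # Add the upsampled output of the previous (lower) level, mixing
            # global context into this level before its attention.
            if carry is not None:
                feat = feat + F.interpolate(carry, size=size, mode="nearest")

            # Flatten to (B, H*W, head_dim) tokens and apply self-attention.
            tokens = feat.flatten(2).transpose(1, 2)
            attended, _ = self.attn[level](tokens, tokens, tokens)
            carry = attended.transpose(1, 2).reshape(b, -1, *size)

            # Restore full resolution before concatenating the heads.
            outputs.append(F.interpolate(carry, size=(h, w), mode="nearest"))

        return torch.cat(outputs, dim=1)


if __name__ == "__main__":
    block = LowToHighAttention(channels=64, num_levels=4)
    feats = torch.randn(1, 64, 32, 32)
    print(block(feats).shape)  # torch.Size([1, 64, 32, 32])
```

Because only the highest head operates at full resolution while the lower heads work on pooled maps, the attention cost and memory footprint grow more slowly with image size than in a single full-resolution attention block, which is consistent with the reported reduction in GPU memory usage.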