The article presents a framework for continuous visual autoregressive generation via score maximization, grounded theoretically in strictly proper scoring rules. It trains an energy Transformer with a likelihood-free objective and reports competitive generation quality and inference efficiency while addressing limitations of existing methods. The repository includes instructions for setting up the environment, training models, and evaluating them with the provided scripts.
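
To make the idea of likelihood-free training with a strictly proper scoring rule concrete, the sketch below estimates an energy score from model samples alone, without evaluating any density. The summary does not state which scoring rule the framework uses, so the choice of the energy score, the function name `energy_loss`, and the `beta` parameter are assumptions made for illustration and are not taken from the repository. The score is written as a loss (lower is better), so minimizing it corresponds to score maximization.

```python
import math


def energy_loss(samples, target, beta=1.0):
    """Monte Carlo estimate of the energy score, written as a loss.

    samples: list of model draws (each a list of floats), at least two.
    target:  the observed vector (list of floats).
    beta:    exponent; the energy score is strictly proper for 0 < beta < 2.

    Hypothetical helper for illustration only, not the repository's API.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    if not 0.0 < beta < 2.0:
        raise ValueError("beta must lie in (0, 2)")

    m = len(samples)

    # Fit term: mean distance from each sample to the observed target.
    fit = sum(math.dist(x, target) ** beta for x in samples) / m

    # Spread term: mean distance over all distinct sample pairs.
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    spread = sum(
        math.dist(samples[i], samples[j]) ** beta for i, j in pairs
    ) / len(pairs)

    # The fit term pulls samples toward the data; the spread term
    # keeps the samples from collapsing onto a single point.
    return fit - 0.5 * spread
```

For example, `energy_loss([[0.0], [2.0]], [1.0])` returns `0.0`: the fit term is 1 and the spread term is 2, so the loss is 1 - 0.5 * 2. Only samples are needed, which is why an objective of this kind can train a continuous-valued generator that has no tractable likelihood.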