D2E (Desktop to Embodied AI) is a framework for transferring sensorimotor priors learned in desktop gaming environments to real-world embodied tasks. Its pipeline combines the OWA Toolkit for desktop data collection, the Generalist-IDM for pseudo-labeling unlabeled gameplay, and Vision-Action Pretraining, and it reports strong success rates on manipulation and navigation benchmarks. The framework also shows promising zero-shot generalization across diverse gaming environments, supporting desktop pretraining as a viable route to embodied AI.
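
To make the three-stage structure concrete, the sketch below wires data collection, IDM pseudo-labeling, and vision-action pretraining into a single toy pipeline. It is only an illustration: all names (`collect_desktop_episodes`, `GeneralistIDM`, `VisionActionModel`, the `Step` record) are hypothetical stand-ins and do not reflect the actual OWA Toolkit or D2E APIs.

```python
"""Illustrative three-stage D2E-style pipeline (all APIs are hypothetical).

Stage 1: collect desktop interaction data (screen frames + input events).
Stage 2: pseudo-label unlabeled gameplay with an inverse dynamics model (IDM).
Stage 3: pretrain a vision-action model on the combined labeled corpus.
"""
from dataclasses import dataclass
from typing import List, Optional
import random


@dataclass
class Step:
    frame: List[float]       # flattened screen observation (placeholder)
    action: Optional[int]    # discrete action id; None if unlabeled


def collect_desktop_episodes(n_episodes: int, steps: int = 8) -> List[List[Step]]:
    """Stand-in for OWA-Toolkit-style recording of screen + keyboard/mouse events."""
    return [
        [Step(frame=[random.random()] * 4, action=random.randrange(10))
         for _ in range(steps)]
        for _ in range(n_episodes)
    ]


class GeneralistIDM:
    """Toy inverse dynamics model: fills in the action between consecutive frames."""

    def pseudo_label(self, episode: List[Step]) -> List[Step]:
        for prev, curr in zip(episode, episode[1:]):
            if curr.action is None:
                # A real IDM would infer the action from (prev.frame, curr.frame);
                # here we just assign a random placeholder label.
                curr.action = random.randrange(10)
        return episode


class VisionActionModel:
    """Toy vision-action policy pretrained on (frame, action) pairs."""

    def pretrain(self, episodes: List[List[Step]]) -> None:
        pairs = [(s.frame, s.action) for ep in episodes for s in ep if s.action is not None]
        print(f"pretraining on {len(pairs)} frame-action pairs")


if __name__ == "__main__":
    # Stage 1: human-generated desktop data with ground-truth input events.
    labeled = collect_desktop_episodes(n_episodes=2)

    # Stage 2: pseudo-label an unlabeled gameplay video with the IDM.
    unlabeled = [[Step(frame=[random.random()] * 4, action=None) for _ in range(8)]]
    idm = GeneralistIDM()
    pseudo_labeled = [idm.pseudo_label(ep) for ep in unlabeled]

    # Stage 3: vision-action pretraining on the combined corpus; in D2E this is
    # followed by downstream transfer to robot manipulation and navigation.
    VisionActionModel().pretrain(labeled + pseudo_labeled)
```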