1 link tagged with all of: language-models + efficiency-leverage + mixture-of-experts
Links
Mixture-of-Experts (MoE) architectures improve the efficiency of large language models (LLMs) by decoupling total parameter count from per-token computational cost. This study introduces the Efficiency Leverage (EL) metric to quantify the computational advantage of MoE models over dense models, and establishes a unified scaling law that predicts EL from configuration parameters, demonstrating that a model with far fewer active parameters can match the performance of a larger dense model at a fraction of the compute.
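The summary above does not give the paper's exact formula, but the intuition behind Efficiency Leverage can be sketched with toy numbers. Assuming EL is the ratio of dense-model compute to MoE compute at matched quality, and using the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token (both assumptions, not taken from the source):

```python
def flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost: ~2 FLOPs per active parameter per token."""
    return 2.0 * active_params

def efficiency_leverage(dense_params: float, moe_active_params: float) -> float:
    """Hypothetical EL: ratio of dense to MoE per-token compute,
    assuming the two models reach comparable quality."""
    return flops_per_token(dense_params) / flops_per_token(moe_active_params)

# Toy example: an MoE with 1B active parameters assumed to match a 7B dense model.
print(efficiency_leverage(7e9, 1e9))  # → 7.0
```

In an MoE layer only a few experts fire per token, so compute tracks active (not total) parameters; the paper's contribution is predicting this leverage from the model configuration rather than measuring it after the fact.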
Tags: mixture-of-experts, efficiency-leverage, language-models
Related tags: scaling-laws, computational-resources