5 links
tagged with all of: machine-learning + robustness
Links
StableToken is introduced as a noise-robust semantic speech tokenizer that addresses the fragility of existing tokenizers when faced with irrelevant acoustic perturbations. By leveraging a multi-branch architecture and a consensus-driven bit-wise voting mechanism, StableToken significantly enhances token stability and improves the performance of SpeechLLMs across various tasks, reducing Unit Edit Distance under noisy conditions.
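The consensus mechanism can be sketched as a per-bit majority vote across the tokenizer's branches; the snippet below is a minimal illustration of that idea (the branch count and bit layout are assumptions, not the paper's exact architecture).

```python
import numpy as np

def consensus_bits(branch_bits: np.ndarray) -> np.ndarray:
    """Majority vote per bit position across tokenizer branches.

    branch_bits: array of shape (n_branches, n_bits) with 0/1 entries.
    Returns the consensus bit vector of shape (n_bits,).
    """
    # A bit is 1 when a strict majority of branches output 1 at that position.
    return (branch_bits.sum(axis=0) * 2 > branch_bits.shape[0]).astype(int)

# One branch flips a bit under acoustic noise; voting recovers the stable token.
branches = np.array([
    [1, 0, 1, 1],
    [1, 0, 0, 1],  # bit 2 corrupted in this branch
    [1, 0, 1, 1],
])
print(consensus_bits(branches))  # [1 0 1 1]
```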
Noisy labels can hinder the training of deep neural networks, leading to inaccuracies. The proposed $\epsilon$-softmax method modifies the softmax layer's outputs to approximate one-hot vectors with a controllable error, enhancing noise tolerance while maintaining a balance between robustness and effective learning through a combination with symmetric loss functions. Extensive experiments indicate its effectiveness in addressing both synthetic and real-world label noise.
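The core idea of nudging softmax outputs toward one-hot vectors with a controllable gap can be sketched as below; this is a hypothetical simplification for illustration, not the paper's exact formulation, and the mixing role of `eps` here is an assumption.

```python
import numpy as np

def epsilon_softmax(logits: np.ndarray, eps: float) -> np.ndarray:
    """Illustrative sketch: mix the softmax output with the one-hot
    vector of its argmax, keeping a controllable mass eps on the
    original distribution (assumed simplification of the method).
    """
    # Numerically stable softmax.
    p = np.exp(logits - logits.max())
    p /= p.sum()
    one_hot = np.zeros_like(p)
    one_hot[p.argmax()] = 1.0
    # eps = 0 gives a hard one-hot; larger eps retains more of softmax's mass,
    # trading noise robustness against learnability.
    return (1.0 - eps) * one_hot + eps * p
```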
Self-play has proven to be a highly effective approach for training autonomous driving systems, achieving state-of-the-art performance without using human data. Utilizing the Gigaflow simulator, the study generated 1.6 billion kilometers of driving scenarios, resulting in a policy that demonstrates exceptional robustness and realism, averaging 17.5 years of continuous driving between incidents in simulation.
The paper addresses the challenge of evaluating classifiers in the presence of missing labels, particularly in scenarios where data is Missing Not At Random (MNAR). It introduces a multiple imputation method to derive robust metrics such as precision, recall, and ROC-AUC, providing both point estimates and predictive distributions. The authors demonstrate the accuracy of these distributions and establish their Gaussian nature, along with finite-sample convergence bounds and a robustness proof under a realistic error model.
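The multiple-imputation idea can be sketched as repeatedly filling in missing labels and recomputing a metric, yielding a distribution rather than a single point estimate. The setup below (labels coded `-1` when missing, imputation by sampling from an assumed positive-class probability `p_pos`) is a simplified assumption, not the paper's exact estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def imputed_precision(y_pred, y_obs, p_pos, n_imputations=200):
    """Multiple-imputation sketch for precision under missing labels.

    y_obs uses -1 for missing; missing labels are sampled from the
    assumed probabilities p_pos on each draw, and precision is computed
    on each completed dataset. Returns (mean, std) over imputations.
    """
    y_pred = np.asarray(y_pred)
    y_obs = np.asarray(y_obs)
    missing = y_obs == -1
    samples = []
    for _ in range(n_imputations):
        y = y_obs.copy()
        y[missing] = rng.binomial(1, p_pos[missing])
        tp = np.sum((y_pred == 1) & (y == 1))
        pred_pos = np.sum(y_pred == 1)
        samples.append(tp / pred_pos if pred_pos else 0.0)
    return np.mean(samples), np.std(samples)
```

The spread of the sampled metric values serves as the predictive distribution; recall or ROC-AUC could be substituted for precision in the same loop.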
Large Language Model (LLM) judges are essential for safety evaluations in AI systems, but their reliability is questionable due to challenges like prompt sensitivity and vulnerability to adversarial attacks. The study reveals significant performance variations in these judges, indicating that they may not provide accurate assessments, leading to a false sense of security regarding model safety.