Links
This article explores the difficulties developers face in maintaining consistent personalities for large language models (LLMs). It highlights instances where chatbots have deviated from their intended roles and describes ongoing research aimed at improving their behavior and reliability.
HateBenchSet is a dataset designed to benchmark hate speech detectors on content generated by various large language models (LLMs). It comprises 7,838 samples spanning 34 identity groups, of which 3,641 are labeled hate and 4,197 non-hate; the annotation was performed by the authors themselves to avoid exposing human subjects to harmful content. The dataset is intended to support research into LLM-driven hate campaigns and also includes predictions from several hate speech detectors.