Quit Emailing Yourself

Links

GitHub - lechmazur/pact: A benchmark for conversational bargaining by language models. In each 20‑round match one LLM plays buyer, one plays seller, and both hold a hidden private value. Every round they swap a short public message, then post a bid or ask; a deal clears whenever the bid meets the ask.

PACT (Pairwise Auction Conversation Testbed) is a benchmark that evaluates the conversational bargaining skills of language models through 20-round matches in which a buyer and a seller exchange messages and post bids and asks. It supports analysis of negotiation strategies and of how agents adapt over the course of a match. Across more than 5,000 games, it scores each model's bargaining performance with metrics such as the Composite Model Score (CMS) and Glicko-2 ratings.
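
The clearing rule described above (a deal clears whenever the bid meets the ask, checked once per round over a 20-round match) can be sketched roughly as follows. This is an illustrative sketch only, not code from the PACT repository: the names `Deal`, `clear_round`, and `play_match` are hypothetical, and the midpoint trade price is an assumption, since the description does not say how the price is set.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ROUNDS = 20  # match length given in the description


@dataclass
class Deal:
    round_no: int
    price: float


def clear_round(round_no: int, bid: float, ask: float) -> Optional[Deal]:
    """Return a Deal if the buyer's bid meets or exceeds the seller's ask."""
    if bid >= ask:
        # Assumption: trade at the midpoint of bid and ask.
        # The source does not state the actual pricing rule.
        return Deal(round_no, (bid + ask) / 2)
    return None


def play_match(quotes: List[Tuple[float, float]]) -> List[Deal]:
    """Apply the clearing rule to each (bid, ask) pair, up to ROUNDS rounds."""
    deals: List[Deal] = []
    for round_no, (bid, ask) in enumerate(quotes[:ROUNDS], start=1):
        deal = clear_round(round_no, bid, ask)
        if deal is not None:
            deals.append(deal)
    return deals
```

For example, `play_match([(40, 60), (50, 50), (55, 45)])` yields no deal in round 1, because the bid is below the ask, and one deal in each of rounds 2 and 3. The message exchange and the hidden private values are left out here; they would only influence which bids and asks each model chooses to post.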

Saved by tldr-importer · Last saved October 29, 2025 · 7 min read
