Click any tag below to further narrow down your results
Links
SGI-Bench is a benchmark designed to assess AI systems' capabilities in scientific inquiry, covering stages like deliberation, conception, action, and perception. It includes over 1,000 expert-curated samples from 10 disciplines, focusing on tasks such as deep research, idea generation, and experimental reasoning.