441 Episodes

  1. Converging Predictions with Shared Information

    Published: 11/5/2025
  2. Test-Time Alignment Via Hypothesis Reweighting

    Published: 11/5/2025
  3. Rethinking Diverse Human Preference Learning through Principal Component Analysis

    Published: 11/5/2025
  4. Active Statistical Inference

    Published: 10/5/2025
  5. Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework

    Published: 10/5/2025
  6. AI-Powered Bayesian Inference

    Published: 10/5/2025
  7. Can Unconfident LLM Annotations Be Used for Confident Conclusions?

    Published: 9/5/2025
  8. Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI

    Published: 9/5/2025
  9. Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

    Published: 9/5/2025
  10. How to Evaluate Reward Models for RLHF

    Published: 9/5/2025
  11. LLMs as Judges: Survey of Evaluation Methods

    Published: 9/5/2025
  12. The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs

    Published: 9/5/2025
  13. Limits to scalable evaluation at the frontier: LLM as Judge won’t beat twice the data

    Published: 9/5/2025
  14. Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

    Published: 9/5/2025
  15. Accelerating Unbiased LLM Evaluation via Synthetic Feedback

    Published: 9/5/2025
  16. Prediction-Powered Statistical Inference Framework

    Published: 9/5/2025
  17. Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

    Published: 9/5/2025
  18. RM-R1: Reward Modeling as Reasoning

    Published: 9/5/2025
  19. Reexamining the Aleatoric and Epistemic Uncertainty Dichotomy

    Published: 8/5/2025
  20. Decoding Claude Code: Terminal Agent for Developers

    Published: 7/5/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
