AI Safety Fundamentals: Alignment
A podcast by BlueDot Impact
83 Episodes
Public by Default: How We Manage Information Visibility at Get on Board
Published: 12/5/2024
Writing, Briefly
Published: 12/5/2024
Being the (Pareto) Best in the World
Published: 4/5/2024
How to Succeed as an Early-Stage Researcher: The “Lean Startup” Approach
Published: 23/4/2024
Become a Person who Actually Does Things
Published: 17/4/2024
Planning a High-Impact Career: A Summary of Everything You Need to Know in 7 Points
Published: 16/4/2024
Working in AI Alignment
Published: 14/4/2024
Computing Power and the Governance of AI
Published: 7/4/2024
AI Control: Improving Safety Despite Intentional Subversion
Published: 7/4/2024
Emerging Processes for Frontier AI Safety
Published: 7/4/2024
AI Watermarking Won’t Curb Disinformation
Published: 7/4/2024
Challenges in Evaluating AI Systems
Published: 7/4/2024
Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small
Published: 1/4/2024
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Published: 31/3/2024
Zoom In: An Introduction to Circuits
Published: 31/3/2024
Weak-To-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Published: 26/3/2024
Can We Scale Human Feedback for Complex AI Tasks?
Published: 26/3/2024
Machine Learning for Humans: Supervised Learning
Published: 13/5/2023
Visualizing the Deep Learning Revolution
Published: 13/5/2023
Four Background Claims
Published: 13/5/2023
Listen to resources from the AI Safety Fundamentals: Alignment course! https://aisafetyfundamentals.com/alignment
