So Essentially
Reward Modeling Reasoning Tuned!
Fine-tuning should consider reasoning as a foundational standard
May 8
•
Dhruv Diddi
Reinforcement Learning with Tools
Your AI agent should be reasonably tuned to your tools
May 6
•
Dhruv Diddi
Identification of Gaps in AI Governance
We need to focus on real world AI deployment
May 6
•
Dhruv Diddi
April 2025
Tom and Jerry Videos
Generate your own one-minute Tom and Jerry episodes with a 5B diffusion model
Apr 8
•
Dhruv Diddi
DeepMind releases Distributed Low-Communication LLM Training
DiLoCo does not seem like a loco idea
Apr 8
•
Dhruv Diddi
February 2025
Latent Space Reasoning
Latent Space reasoning is at least 10X better than word space reasoning
Feb 10
•
Dhruv Diddi
MM-IQ Test for AI
AI has low Visual IQ
Feb 5
•
Dhruv Diddi
Do you even PhysBench bro?
PhysBench shows VLMs exhibit poor understanding of the physical world
Feb 4
•
Dhruv Diddi
DiLoCo Distributed Lunch
DiLoCo trained distributed better than co-located accelerators
Feb 4
•
Dhruv Diddi
January 2025
On-the-Fly Persona Alignment with TPO
Test-time Preference Optimization implies better persona aligned responses
Jan 23
•
Dhruv Diddi
ByteDance releases Agent R
Agent R has Reflective Self Training
Jan 22
•
Dhruv Diddi
With Local Reasoning, You Don't Need AGI
DeepSeek R1 starts off 2025 with performant distilled reasoning models
Jan 21
•
Dhruv Diddi