
Deepak Babu P R

@prdeepakbabu

Director & Principal Scientist of AI/DS | ex-amazon alexa
Text & Speech | Multimodal LLMs and Agents | ASR,NLP/IR

United States
twitter
Followers
5,671
Following
338
Posts
2,157

Recent Posts

Sun Jul 06

In science if you know what you are doing you should not be doing it. In engineering if you do not know what you are doing you should not be doing it. ~ Richard Hamming #quote #AIresearch #AIEngineering

Sat Jul 05

frontier reasoning or not ? #o3 #reasoning #LLM https://t.co/PYlmFUnDu7

Thu Jul 03

The chatGPT product is increasingly headed towards this vision of a personal assistant, replacing siri/alexa. It just suggested that I add a reminder to be notified of upcoming conferences 🚀 #o3 #chatGPT #LLMs #openAI https://t.co/jJfFUT351g

Sat Jun 21

Insightful as always. I am really keen to know: what is the equivalent of 10,000 hrs. in the LLM world of learning to be really good at something (an expert)? One way to interpret that quote is that there is no shortcut to hard work, which I think will always be true, but I think LLMs are just making smart work possible.

Fri Jun 20

$1000 device to make any car self-driving. This feels like a Chromecast moment, where you could turn any TV into a smart TV using an affordable stick. #selfdriving

Tue Jun 10

Usually RL is used as a post-training strategy, i.e. for fine-tuning. The authors of this paper propose RL for pre-training using RLVR (there are a lot of open questions, which the authors acknowledge). https://t.co/hl9hsmTpjA #RLVR #LLM #pretraining
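The "verifiable rewards" part of RLVR just means the reward comes from a programmatic checker rather than a learned reward model. A minimal sketch in Python (the prompt, rollouts, and exact-match check are hypothetical; real verifiers are task-specific, e.g. math answer checkers or unit tests):

```python
def verifiable_reward(model_answer: str, ground_truth: str) -> float:
    """Binary verifiable reward: 1.0 if the model's final answer matches
    the ground truth exactly (after trimming whitespace), else 0.0."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0

# Hypothetical rollouts for the prompt "What is 17 * 24?"
rollouts = ["408", "398", "408 "]
rewards = [verifiable_reward(a, "408") for a in rollouts]
print(rewards)  # [1.0, 0.0, 1.0]
```

These per-rollout rewards would then feed a policy-gradient update (e.g. GRPO/PPO); the sketch only shows the reward signal itself.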

Thu Apr 03

RT @swyx: at this point every non frontier lab startup needs to justify why the bitter lesson does not apply to them so it's actually bull…

Wed Apr 02

my first vibe-coded game - total $7 | 55 mins 🚀🚀 javascript, html - have done minimal UX. been more of a python person. high-level design skills are essential to guide development.

My Sudoku Puzzle:
5 3 4 | 6 7 8 | 9 1 2
6 7 2 | 1 9 5 | 3 4 8
1 9 8 | 3 4 2 | 5 6 7
------+-------+------
8 5 9 | 7 6 1 | 4 2 3
4 2 6 | 8 5 3 | 7 9 1
7 1 3 | 9 2 4 | 8 5 6
------+-------+------
9 6 1 | 5 3 7 | 2 8 4
2 8 7 | 4 1 9 | 6 3 5
3 4 5 | 2 8 6 | 1 7 9

Play Sudoku:
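For the curious, the solved grid above can be checked mechanically. A short Python sketch (not the vibe-coded JavaScript game itself) that validates every row, column, and 3x3 box:

```python
def is_valid_sudoku(grid):
    """Return True if every row, column, and 3x3 box holds digits 1-9 exactly once."""
    def ok(cells):
        return sorted(cells) == list(range(1, 10))
    rows = all(ok(row) for row in grid)
    cols = all(ok([grid[r][c] for r in range(9)]) for c in range(9))
    boxes = all(
        ok([grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)])
        for br in range(0, 9, 3) for bc in range(0, 9, 3)
    )
    return rows and cols and boxes

# The solved grid from the post above.
grid = [
    [5, 3, 4, 6, 7, 8, 9, 1, 2],
    [6, 7, 2, 1, 9, 5, 3, 4, 8],
    [1, 9, 8, 3, 4, 2, 5, 6, 7],
    [8, 5, 9, 7, 6, 1, 4, 2, 3],
    [4, 2, 6, 8, 5, 3, 7, 9, 1],
    [7, 1, 3, 9, 2, 4, 8, 5, 6],
    [9, 6, 1, 5, 3, 7, 2, 8, 4],
    [2, 8, 7, 4, 1, 9, 6, 3, 5],
    [3, 4, 5, 2, 8, 6, 1, 7, 9],
]
print(is_valid_sudoku(grid))  # True
```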

Wed Mar 26

I think text + image native is the true multimodality moment - now supporting both understanding and generation. It means we can maintain consistency in characters or story, support multi-turn references to previous chats, etc. The prev. MM moment was text + audio. I'm now waiting for text + audio + images #chatgpt #multimodal #LMM

Thu Mar 06

wonderful thought-provoking post - a couple of quick thoughts that occurred as a follow-up to this.
- how do we oversee systems that are much more intelligent than us? or how do we verify/reward such behaviors so we can see more of them.
- for novelty: it is important to evaluate CoT question quality aside from just answer quality.
- if we are treating geniuses as outliers (who did poorly at school), and assuming pretraining = school and post-training = asking questions and thinking, are there implications for how we should do pretraining, or is it just about clever post-training that would get us there? I think today everyone is betting that post-training will get us there. #AI

Thu Mar 06

we are already hearing about the discovery of new materials, accelerated drug discovery, etc., which seems to some extent verifiable by domain experts through clever feedback signals. But it is not clear that step-function innovations are possible through a generalized model.

Wed Mar 05

+1. Also reading through those bitter lesson blogs from RS from 2002-2019 and seeing him share wisdom about teaching machines to self-learn (RL) instead of using human knowledge to teach a concept (SL). It takes a lot of conviction to drive a point so unpopular back then.

Fri Feb 28

GPT 4.5 seems like OpenAI's move 37 of AlphaGo in progress? Why would someone release a clearly inferior model ~ perhaps to increase advantage in RL terms for what's coming next? #gpt4.5 #openai

Fri Feb 28

GPT 4.5 - the absence of sama in that announcement says it all - feel free to skip this release

Wed Feb 26

Back in 2023, working on LLMs felt like a non-stop adrenaline rush. Breakthroughs like in-context learning, chain-of-thought prompting, multimodality, and RAG were redefining what was possible. At Alexa, we were overhauling high-level designs and model architectures in mere months, things that would normally stay future-proof for years. The pace was relentless, the stakes were high, and every day brought a new discovery. It was chaotic, stressful, and absolutely exhilarating; I wouldn't trade those wild days for anything.

Remember when counting parameter sizes was a sport? One day, rumors of GPT-4's 1.4 trillion 🤯 parameters would surface; the next, people were panicking about running out of text tokens 😀. We'd debate whether a 2K context window was enough. Fast forward to today, and we're casually throwing around 1M token context lengths like it's no big deal.

Now in 2025, things feel calmer. The big labs have spoken: reinforcement learning (RL) is clearly the way forward, solving the limits of pretraining scaling. The mystery has faded, the progress has become somewhat predictable, and I miss the wild buzz of the unknown. 🌀

Anyone else nostalgic for those thrilling GPT-3.5/4 days? 🤔 #lookingback #LLMs #openAI #GPT #deepseek #R1 #O1 #AI #ArtificialIntelligence

Sun Feb 16

RT @miangoar: For me, this is the best explanation in plain English about how ML works. "Uncrumpling paper balls is what machine learning…

Sun Feb 09

given we have more and more #tesla on roads with FSD, these are autonomous agents - wondering about all the different ways teslas could talk to each other to benefit from a unified ecosystem: (i) maybe solve traffic congestion by agreeing to take routes that don't congest alternate routes, or (ii) stop car crimes or help police chase criminals by actively blocking lanes via message passing among teslas on a given route, or (iii) detect black ice or hazardous conditions to prevent accidents by triggering interventions, and so on. #robotic #agents #forgood

Mon Feb 03

so "wait" is the new 2025 equivalent of the now-infamous CoT prompt hack from 2023, "let's think step by step" #testtime #scaling #thinking #backtracking
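As a rough illustration of the trick (a toy sketch, not any lab's actual implementation; `generate` here is a stub standing in for an LLM call): whenever the model tries to stop reasoning, append "Wait" to nudge it into another round of reflection:

```python
def budget_force(generate, prompt, forced_continuations=2):
    """Toy 'wait'-style test-time scaling: each time the model finishes a
    reasoning chunk, append 'Wait' to force it to keep thinking, then do
    one final generation for the answer."""
    trace = prompt
    for _ in range(forced_continuations):
        trace += generate(trace)
        trace += "\nWait"  # suppress stopping; force more reflection
    trace += generate(trace)
    return trace

# Stub "model" that records how many times it was called.
calls = []
def fake_llm(text):
    calls.append(text)
    return f"\n[reasoning step {len(calls)}]"

out = budget_force(fake_llm, "Q: 2+2? Let's think step by step.")
print(len(calls))  # 3: two forced continuations plus the final generation
```

More "Wait" insertions buy more test-time compute, at the cost of longer traces.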

Sun Feb 02

If you are a data/applied scientist or researcher looking to learn RL, I recommend the recently launched HF course on Deep RL. https://t.co/CE3EOjtv9I If you prefer traditional book-based learning, I recommend the classic book by Richard Sutton. https://t.co/tVXSIRUqxx #reinforcementlearning #RL #AI #LLM #ML #DeepSeek
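To get a feel for what those resources teach, here is a minimal tabular Q-learning sketch on a toy corridor environment (the environment and hyperparameters are my own invention for illustration, not from either resource):

```python
import random

# Tiny corridor MDP: states 0..4, start at 0, reward 1.0 on reaching state 4.
# Actions move left (-1) or right (+1), clamped to the corridor.
random.seed(0)
n_states, actions = 5, [-1, +1]
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for _ in range(500):                 # episodes
    s = 0
    while s != n_states - 1:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Q-learning update: bootstrap off the best next-state value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

greedy = [max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)]
print(greedy)  # typically [1, 1, 1, 1] after convergence: always move right
```

The same update rule underlies much fancier methods; the difference is mostly in how Q is represented (tables vs. neural networks) and how experience is collected.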