Kaiser Sun

@KaiserWhoLearns

Ph.D. student at @jhuclsp, human LM that hallucinates. Formerly @MetaAI, @uwnlp, and @AWS. they/them 🏳️‍🌈. #NLProc

Platform: Twitter · Followers: 1,062 · Following: 477 · Posts: 348
Engagement Rate: 0.01% · Campaigns Featured in: 1

Recent Posts

Wed Jul 09

Tokenization is most likely the reason whenever I have a bug in my model 🫠

Fri Jul 04

RT @BafnaNiyati: 📢When LLMs solve tasks with a mid-to-low resource input/target language, their output quality is poor. We know that. But c…

Mon Jun 30

RT @ChengleiSi: Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research…

Wed Jun 25

RT @nouhadziri: 📢 Can LLMs really reason outside the box in math? Or are they just remixing familiar strategies? Remember DeepSeek R1, o1…

Tue Jun 24

RT @chrome1996: Have you noticed… 🔍 Aligned LLM generations feel less diverse? 🎯 Base models are decoding-sensitive? 🤔 Generations get more…

Mon Jun 16

RT @mdredze: Our new paper explores knowledge conflict in LLMs. It also issues a word of warning to those using LLMs as a Judge: the model…

Post by KaiserWhoLearns
Mon Jun 16

What happens when an LLM is asked to use information that contradicts its knowledge? We explore knowledge conflict in a new preprint 📑 TLDR: Performance drops, which can also undermine LLMs used in model-based evaluation. 🧵⬇️ 1/8 #NLProc #LLM #AIResearch https://t.co/mRprCgTAYM
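To make the setup concrete, here is a minimal sketch of the kind of probe the thread describes: query once closed-book, once with a contradicting context, and compare. The `ask` helper, the mock behaviour, and the example fact are all illustrative assumptions, not the preprint's code.

```python
# Minimal knowledge-conflict probe (illustrative sketch, not the
# preprint's code). `ask` is a stand-in for your LLM client, mocked
# here so the sketch runs end to end.

def ask(prompt: str) -> str:
    # Mocked model that always "recalls" 1889, simulating a model whose
    # parametric knowledge overrides the supplied context.
    return "1889"

question = "In which year was the Eiffel Tower completed?"
context = "The Eiffel Tower was completed in 1923."   # deliberate conflict

parametric = ask(question)                                     # closed-book
grounded = ask(f"Context: {context}\nUsing only the context, {question}")

# A context-faithful model should answer "1923" in the grounded call;
# repeating the closed-book answer is the failure mode being measured.
print("closed-book:", parametric, "| with context:", grounded)
```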

Mon Jun 16

🔗 Takeaways for practitioners:
1. Check for knowledge conflict before prompting.
2. Add further explanation to guide the model in following the context.
3. Monitor hallucinations even when context is supplied. 7/8
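A toy sketch of takeaways 1 and 2 in plain Python (the helper name and heuristic are assumptions for illustration, not from the released code): flag a potential conflict when the model's closed-book answer never appears in the retrieved context, and if so prepend explicit grounding instructions.

```python
# Toy pre-flight check (hypothetical, not the paper's code): before
# grounding a prompt on retrieved context, test whether the model's
# closed-book answer is consistent with that context.

def may_conflict(closed_book_answer: str, context: str) -> bool:
    """Crude heuristic: flag a conflict when the model's own answer
    never appears in the context it is about to be conditioned on."""
    return closed_book_answer.strip().lower() not in context.lower()

context = "The Eiffel Tower was completed in 1923."   # contradicts 1889
closed_book = "1889"                                  # from a no-context query

if may_conflict(closed_book, context):
    # Takeaway 2: add explicit guidance so the context wins over memory.
    prompt = ("Answer strictly from the context below, even if it "
              "contradicts what you believe.\nContext: " + context)
    print(prompt)
```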

Mon Jun 16

🛠️ Interested in how your LLM behaves under this circumstance? We released the code to generate the diagnostic data for your own LLM. @mdredze @loadingfan 8/8
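The released pipeline is the authoritative reference; purely to illustrate the idea, a conflicting diagnostic example can be built by swapping a fact's answer for a same-type distractor. Every name and fact below is a hypothetical placeholder, not taken from the released code.

```python
import random

# Illustrative sketch of diagnostic-data generation (not the released
# code): render a counterfactual context by swapping the true answer
# for a plausible same-type distractor.

FACT = {
    "question": "In which year was the Eiffel Tower completed?",
    "answer": "1889",
    "template": "The Eiffel Tower was completed in {answer}.",
}
DISTRACTORS = ["1875", "1901", "1923"]  # same type as the answer, all wrong

def make_conflicting_example(fact, rng=random.Random(0)):
    fake = rng.choice([d for d in DISTRACTORS if d != fact["answer"]])
    context = fact["template"].format(answer=fake)
    return {"question": fact["question"], "context": context, "target": fake}

example = make_conflicting_example(FACT)
# Score the model on `example`: answering "1889" instead of `target`
# means parametric knowledge won over the supplied context.
print(example)
```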

Sat Jun 07

RT @BafnaNiyati: We know speech LID systems flunk on accented speech. But why? And what to do about it?🤔Our work https://t.co/136Y4ugnCq (I…

Thu Jun 05

RT @tpimentelms: A string may get 17 times less probability if tokenised as two symbols (e.g., ⟨he, llo⟩) than as one (e.g., ⟨hello⟩)—by an…
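The effect is easy to check yourself; here is a minimal sketch with GPT-2 via Hugging Face transformers (the prefix, the string, and the forced split are arbitrary choices for illustration, not the paper's setup).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Compare the log-probability a causal LM assigns to the same string
# under two different tokenisations (illustrative sketch with GPT-2).
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def seq_logprob(token_ids, prefix_ids):
    """log P(token_ids | prefix_ids), summed over the continuation."""
    ids = torch.tensor([prefix_ids + token_ids])
    with torch.no_grad():
        logprobs = model(ids).logits.log_softmax(-1)
    # Logits at position j predict the token at position j + 1.
    return sum(
        logprobs[0, len(prefix_ids) + i - 1, t].item()
        for i, t in enumerate(token_ids)
    )

prefix = tok.encode("I said")                 # non-empty prefix to condition on
one = tok.encode(" hello")                    # canonical single-piece form
two = tok.encode(" he") + tok.encode("llo")   # forced split of the same string

print("single piece:", seq_logprob(one, prefix))
print("forced split:", seq_logprob(two, prefix))
```

Both sequences decode to the same text, so any gap between the two numbers is purely a tokenisation effect.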

Thu Jun 05

RT @alex_gill_nlp: 𝐖𝐡𝐚𝐭 𝐇𝐚𝐬 𝐁𝐞𝐞𝐧 𝐋𝐨𝐬𝐭 𝐖𝐢𝐭𝐡 𝐒𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐄𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧? I'm happy to announce that the preprint release of my first project is on…

Tue Jun 03

RT @fangcong_y10593: Solving complex problems with CoT requires combining different skills. We can do this by: 🧩Modify the CoT data format…

Mon Jun 02

RT @krisgligoric: I'm excited to announce that I’ll be joining the Computer Science department at @JohnsHopkins as an Assistant Professor t…