Meta tags:
description= Daniel Tan is an AI safety researcher studying how AI minds tick — leading the model motivations team at Arcadia Alignment.;
Headings (most frequently used words):
me, in, 10, seconds, things, ve, helped, figure, out, writing, now, let, talk, atmosphere,
Text of the page (most frequently used words):
the (9), model (5), and (4), researcher (4), star (3), #writing (3), safety (3), play (3), training (3), misalignment (3), can (3), lamp (2), for (2), what (2), you (2), feedback (2), just (2), say (2), dating (2), profile (2), now (2), personas (2), one (2), papers (2), emergent (2), inoculation (2), prompting (2), alignment (2), internet (2), video (2), robots (2), learn (2), steering (2), vectors (2), work (2), lead (2), scholar (2), daniel (2), tan (2), reset, copy, css, vignette, breath, glow, drift, speed, density, brightness, live, tuning, dev, only, panel, atmosphere, station, keeps, turning, thanks, stopping, set, inter, jetbrains, mono, tell, really, think, anonymous, hello, danieltan, dot, love, meeting, people, working, hard, important, problems, who, are, delightfully, curious, let, talk, again, have, peek, last, months, career, moves, fitness, dancing, cycle, june, 2026, also, kicking, around, outtakes, stanzas, growth, therapy, taught, books, loved, browse, all, concrete, research, ideas, shaping, exploration, motivation, space, matters, your, organisms, might, fried, week, sprint, boot, little, terminal, two, above, happen, rather, than, read, proceedings, jair, towards, generalist, robot, learning, from, survey, real, world, tasks, watching, neurips, 2024, analyzing, generalization, reliability, don, universally, they, often, fail, their, own, task, eliciting, traits, during, suppress, them, test, time, steer, how, generalizes, adding, line, data, narrow, finetuning, broad, models, trained, write, insecure, code, admire, nazis, few, proud, full, list, lives, things, helped, figure, out, clr, mats, owain, evans, phd, ucl, stanford, previously, motivations, team, arcadia, llm, psychologist, avid, sci, enjoyer, aspiring, calisthenics, bro, karaoke, enthusiast, seconds, welcome, corner, lightcone, anon, linkedin, lesswrong, twitter, selected, about, london, online,
Text of the page (random words):
daniel tan ai safety researcher researcher daniel tan ai safety researcher london uk online
01 about 02 selected work 03 writing 04 now 05 say hi scholar twitter lesswrong linkedin anon feedback
welcome to my corner of the lightcone
me in 10 seconds llm psychologist avid sci-fi enjoyer aspiring calisthenics bro karaoke enthusiast model motivations team lead arcadia alignment previously ml and cs stanford robots and rl a star phd ucl mats 7.0 owain evans model personas researcher clr
things i've helped figure out a few papers i'm proud of the full list lives on scholar models trained to write insecure code learn to admire nazis emergent misalignment narrow finetuning can lead to broad misalignment steer how a model generalizes by adding one line to the training data inoculation prompting eliciting traits during training can suppress them at test time steering vectors don't work universally they often fail on their own task analyzing the generalization and reliability of steering vectors neurips 2024 can robots learn real world tasks just by watching internet video towards generalist robot learning from internet video a survey in proceedings jair alignment rather play than read boot a little terminal and play as a model in training two of the papers above emergent misalignment inoculation prompting happen to you play
writing the one week sprint your model organisms might be fried shaping the exploration of the motivation space matters for ai safety concrete research ideas on ai personas browse all writing also kicking around books i loved what therapy taught me stanzas on growth dating profile outtakes
now cycle june 2026 my last 6 months career moves fitness dancing i'm dating again have a peek at my profile
let's talk i love meeting people working on hard important problems or who are just delightfully curious say hi hello at danieltan dot cc or tell me what you really think anonymous feedback
the station keeps turning thanks for stopping by set in inter jetbrains
mono