site address: attrib-workshop.cc redirected to: attrib-workshop.cc

site title: Conference Schedule

Our opinion (on Saturday 04 July 2026 21:05:15 UTC):

- no comments

Meta tags:

Headings (most frequently used words):

the, attrib, 2024, neurips, call, for, papers, submission, instructions, important, dates, schedule, speakers, organizers, please, note, starting, time, not, same, as, whova, submissions, open, august, 1st, julius, adebayo, robert, geirhos, surbhi, goel, sanmi, koyejo, baharan, mirzasoleiman, tolga, bolukbasi, logan, engstrom, andrew, ilyas, sadhika, malladi, elisa, nguyen, sung, min, park,

Text of the page (most frequently used words):
the (55), and (33), model (26), that (19), how (17), data (16), models (15), for (15), attention (14), can (13), training (12), attribute (11), behavior (11), with (10), linear (10), not (10), papers (10), 00pm (9), are (9), this (9), pretraining (9), covariates (8), representations (8), language (7), scale (7), two (7), these (7), sink (7), track (7), performance (6), scaling (6), from (6), differences (6), outcomes (6), learning (6), lms (6), submissions (6), choices (6), capabilities (6), invited (5), talk (5), 30pm (5), large (5), questions (5), which (5), populations (5), same (5), such (5), main (5), workshop (5), 2024 (5), attribution (5), algorithmic (5), downstream (4), first (4), threshold (4), datasets (4), emergence (4), abstract (4), authors (4), given (4), decomposition (4), relationship (4), between (4), functional (4), decompositions (4), even (4), when (4), influence (4), understand (4), example (4), have (3), emergent (3), observe (3), easy (3), standard (3), both (3), often (3), kob (3), across (3), machine (3), common (3), any (3), distribution (3), important (3), has (3), optimization (3), its (3), understanding (3), work (3), architecture (3), after (3), highly (3), most (3), find (3), biases (3), behaviors (3), but (3), concepts (3), frequency (3), trained (3), all (3), time (3), 30am (3), 00am (3), welcome (3), schedule (3), note (3), idea (3), page (3), algorithm (3), what (3), specific (3), attrib (3), min (2), organizers (2), baharan (2), mirzasoleiman (2), sanmi (2), koyejo (2), surbhi (2), goel (2), robert (2), geirhos (2), remarks (2), poster (2), session (2), 50am (2), 05am (2), been (2), abilities (2), some (2), seems (2), beyond (2), shaped (2), hard (2), inverted (2), simple (2), effective (2), pipeline (2), predict (2), science (2), outcome (2), different (2), program (2), one (2), more (2), due (2), particular (2), local (2), however (2), while (2), nonlinear (2), should (2), demonstrate (2), examples (2), they (2), partially 
(2), significant (2), phenomenon (2), long (2), context (2), generation (2), still (2), sinks (2), observed (2), emerge (2), pre (2), loss (2), function (2), emerges (2), correlated (2), scores (2), non (2), least (2), dependence (2), result (2), softmax (2), normalization (2), other (2), without (2), only (2), formation (2), term (2), presence (2), subject (2), relation (2), object (2), facts (2), underlying (2), olmo (2), gpt (2), representation (2), occurrences (2), exploring (2), sources (2), please (2), starting (2), whova (2), submission (2), deadline (2), december (2), october (2), aoe (2), august (2), dates (2), submit (2), our (2), formatting (2), openreview (2), appendix (2), neurips (2), pages (2), along (2), tracks (2), choice (2), subcomponents (2), predictions (2), inside (2), interpretability (2), combine (2), composition (2), leakage (2), outputs (2), back (2), experiments (2), need (2), attributing (2), call (2), factors (2), rise (2), sung, park, elisa, nguyen, sadhika, malladi, andrew, ilyas, logan, engstrom, tolga, bolukbasi, julius, adebayo, speakers, closing, 15pm, seong, joon, coffee, break, lunch, panel, tung, melody, llms, shown, exhibit, tasks, where, stagnate, then, improve, sharply, unpredictably, dividing, according, difficulty, level, average, followed, steady, improvement, moreover, roughly, coincides, point, reverts, inverse, capitalizing, observable, though, opposing, trend, propose, yet, called, slice, sandwich, behind, manuel, quintero, william, stephenson, advik, shreekumar, tamara, broderick, social, wish, explain, why, instance, jobs, benefits, members, city, than, another, participants, labor, markets, kitagawa, oaxaca, blinder, tool, econometrics, explains, difference, mean, assumes, true, may, meaningfully, modern, boasts, variety, population, natural, extend, using, successful, extension, respectively, those, unfortunately, anova, accumulated, effects, identical, provide, prove, conjecture, misattribution, arises, additive, 
depends, mis, xiangming, tianyu, pang, chao, qian, liu, fengzhuo, zhang, cunxiao, wang, lin, assign, token, semantically, known, widely, adopted, applications, streaming, cache, inference, acceleration, quantization, others, despite, widespread, use, deep, lacking, exist, universally, various, inputs, small, furthermore, during, motivating, investigate, highlight, sufficient, position, importantly, acts, like, key, storing, extra, could, informative, contribute, value, computation, also, stems, tokens, inner, relaxing, replacing, operations, sigmoid, parameters, empirical, view, jack, merullo, sarah, wiegreffe, yanai, elazar, direct, impact, quality, basic, principles, focuses, task, look, effect, previous, discovered, encoded, argued, interpretable, useful, controllable, study, connection, factual, recall, relations, evidence, directly, linked, strongly, connected, frequencies, establish, formatted, occurrence, accuracy, case, phases, affected, capability, forms, predictably, subjects, objects, within, occur, times, thus, appears, form, consistent, repeated, lengthy, features, occurs, finally, train, regression, measurements, robustness, was, seen, low, error, generalizes, additional, providing, new, unsupervised, method, possible, closed, source, conclude, absence, contain, weak, signal, reflects, imprint, corpus, contributed, talks, opening, 20am, conference, portal, opens, final, author, agrees, emergency, review, decision, notifications, september, ready, archival, published, fine, provided, meet, requirements, above, format, follows, limit, included, pdf, body, paper, download, here, instructions, soliciting, pertaining, aspect, designing, involves, dozens, ranging, optimizer, issues, actually, alone, scalings, laws, affect, parts, algorithms, remain, black, boxes, directions, include, human, identifiable, subnetworks, dnn, concept, based, individual, neurons, yield, mechanistic, collected, disparate, arbitrarily, chosen, affects, includes, monitor, fix, 
internet, feedback, loops, llm, generated, contamination, efficiently, select, optimize, selection, topic, field, whole, vision, unifying, ideas, documentation, failed, will, held, opinionated, opinions, clearly, demarcated, although, lack, experimentation, justified, see, below, topics, open, 1st, theme, challenges, tie, dataset, control, reason, about, aims, bring, together, researchers, practitioners, goal, advancing, recently, developed, innovations, impressive, there, much, left, give, fully, really, drive, makes, tick, used, contact, info, attribworkshop, gmail, dot, com, vancouver, convention, center, meeting, 205, 207, saturday, 2nd,

Text of the page (random words):
ATTRIB 2024, NeurIPS: 2nd Workshop on Attributing Model Behavior at Scale
Saturday, December 14, 2024
Vancouver Convention Center, Meeting 205-207
Contact info: attribworkshop at gmail dot com
Submissions: OpenReview
Please note the starting time (not the same as Whova).

What makes ML models tick? How do we attribute model behavior to the training data, algorithm, architecture, or scale used in training?

Recently developed algorithmic innovations and large-scale datasets have given rise to machine learning models with impressive capabilities. However, there is much left to understand in how these different factors combine to give rise to observed behaviors. For example, we still do not fully understand how the composition of training datasets influences downstream model capabilities, how to attribute model capabilities to subcomponents inside the model, and which algorithmic choices really drive performance.

A common theme underlying all these challenges is model behavior attribution: the need to tie model behavior back to factors in the machine learning pipeline (such as the choice of training dataset or particular training algorithm) that we can control or reason about. This workshop aims to bring together researchers and practitioners with the goal of advancing our understanding of model behavior attribution.

Call for papers:

Submissions open August 1st. We are soliciting papers along two tracks:

- Main track papers: 3-6 page submissions on attributing model behaviors (see below for example topics).
- Idea track papers: 2-4 page submissions on a specific topic or on the field of attribution as a whole. Vision papers, unifying ideas, and documentation of failed experiments are all welcome. Papers in this track will be held to the same standard as the main track, but can be opinionated (so long as opinions are clearly demarcated from facts) and need not have experiments (although lack of experimentation should be justified).

Along these tracks, we welcome submissions pertaining to any aspect of model behavior attribution. For example:

- Data: Models are trained on large-scale datasets collected from disparate (and often arbitrarily chosen) sources. How can we understand how the composition of training data affects model behavior? This includes:
  - Data attribution and selection: How can we (efficiently) attribute model outputs back to specific training examples? How can we select data to optimize downstream performance/capabilities?
  - Data leakage/contamination: How can we monitor and fix data leakage at internet scale? How do data feedback loops (e.g., training on LLM-generated outputs) influence model biases?
- Trained models: Large models remain black boxes. How do we attribute a model's behavior to its subcomponents? Directions include:
  - Mechanistic interpretability: How do individual neurons combine to yield model predictions?
  - Concept-based interpretability: Can we attribute predictions to human-identifiable concepts? Can we attribute these concepts or other biases to subnetworks inside a DNN?
- Learning algorithms: Designing an ML model involves dozens of choices, ranging from choice of model architecture and optimizer to learning algorithm. How do these choices influence model behavior? For example, exploring issues such as:
  - Understanding algorithmic choices: How do algorithmic choices affect model capabilities? What parts of model behavior can we attribute to specific algorithmic choices?
  - Scaling laws/emergence: What emergent capabilities (if any) can we actually attribute to scale alone?

Submission instructions:

Format submissions as follows:

- 3-6 pages (main track) or 2-4 pages (idea track)
- NeurIPS 2024 paper formatting
- Appendix included in the same PDF as the main body (no appendix page limit)

When ready, submit to OpenReview.

Note: our workshop is non-archival. Published papers are fine to submit, provided they meet the formatting requirements above.

Important dates:

- August 1: Submission portal opens
- September 25 (AoE): Deadline for both idea and main track papers
- October 4 (AoE): Final deadline for papers, if 1 author agrees to emergency review
- October 10: Decision notifications
- December 14: Workshop

Schedule:

Please note the starting time (not the same as Whova).

- 9:00am - 9:20am: Welcome and opening remarks
- 9:30am - 10:00am: Invited talk: Surbhi Goel
- 10:00am - 10:30am: Invited talk: Sanmi Koyejo
- 10:30am - 11:05am: Contributed talks (titles, authors, and abstracts listed below)
- 11:05am - 11:50am: Panel
- 11:50am - 1:00pm: Lunch
- 1:00pm - 2:00pm: Poster session 1
- 2:00pm - 2:30pm: Invited talk: Baharan Mirzasoleiman
- 2:30pm - 3:00pm: Invited talk: Robert Geirhos
- 3:00pm - 3:30pm: Coffee break
- 3:00pm - 4:30pm: Poster session 2
- 4:30pm - 5:00pm: Invited talk: Seong Joon Oh
- 5:00pm - 5:15pm: Closing remarks

Contributed talks:

On Linear Representations and Pretraining Data Frequency in Language Models
Authors: Jack Merullo, Sarah Wiegreffe, Yanai Elazar
Abstract: Pretraining data has a direct impact on the behaviors and quality of language models (LMs), but we only understand the most basic principles of this relationship. While most work focuses on pretraining data and downstream task behavior, we look at the effect on LM representations. Previous work has discovered that, in language models, some concepts are encoded as linear representations, argued to be highly interpretable and useful for controllable generation. We study the connection between differences in pretraining data frequency and differences in trained models' linear representations of factual recall relations. We find evidence that the two are directly linked, with the formation of linear representations strongly connected to pretraining term frequencies. First, we establish that the presence of linear representations for subject-relation-object formatted facts is highly correlated with both subject-object co-occurrence frequency and in-context learning accuracy. This is the case across all phases of pretraining, i.e., it is not affected by the model's underlying capability. In OLMo 7B and GPT-J (6B), we find that a linear representation forms predictably when the subjects and objects within a relation co-occur at least 1-2k times. Thus, it appears linear representations form as a result of consistent repeated occurrences, not due to lengthy pretraining time. In the OLMo 1B model, formation of these features only occurs after 4.4k occurrences. Finally, we train a regression model on measurements of linear representation robustness that can predict how often a term was seen in pretraining with low error, which generalizes to GPT-J without additional training, providing a new unsupervised method for exploring possible data sources of closed-source models. We conclude that the presence/absence of linear representations contains a weak but significant signal that reflects an imprint of the pretraining corpus across LMs.

When Attention Sink Emerges in Language Models: An Empirical View
Authors: Xiangming Gu, Tianyu Pang, Chao Du, Qian Liu, Fengzhuo Zhang, Cunxiao Du, Ye Wang, Min Lin
Abstract: Language models (LMs) assign significant attention to the first token, even if it is not semantically important, which is known as attention sink. This phenomenon has been widely adopted in applications such as streaming/long-context generation, KV cache optimization, inference acceleration, model quantization, and others. Despite its widespread use, a deep understanding of attention sink in LMs is still lacking. In this work, we first demonstrate that attention sinks exist universally in LMs with various inputs, even in small models. Furthermore, attention sink is observed to emerge during LM pre-training, motivating us to investigate how optimization, data distribution, loss function, and model architecture in LM pre-training influence its emergence. We highlight that attention sink emerges after effective optimization on sufficient training data. The sink position is highly correlated with the loss function and data distribution. Most importantly, we find that attention sink acts more like key biases, storing extra attention scores, which could be non-informative and not contribute to the value computation. We also observe that this phenomenon (at least partially) stems from tokens' inner dependence on attention scores as a result of softmax normalization. After relaxing such dependence by replacing softmax attention with other attention operations, such as sigmoid attention without normalization, attention sinks do not emerge in LMs up to 1B parameters.

Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations
Authors: Manuel Quintero, William T. Stephenson, Advik Shreekumar, Tamara Broderick
Abstract: In science and social science, we often wish to explain why an outcome is different in two populations. For instance, if a jobs program benefits members of one city more than another, is that due to differences in program participants (particular covariates) or the local labor markets (outcomes given covariates)? The Kitagawa-Oaxaca-Blinder (KOB) decomposition is a standard tool in econometrics that explains the difference in the mean outcome across two populations. However, the KOB decomposition assumes a linear relationship between covariates and outcomes, while the true relationship may be meaningfully nonlinear. Modern machine learning boasts a variety of nonlinear functional decompositions for the relationship between outcomes and covariates in one population. It seems natural to extend the KOB decomposition using these functional decompositions. We observe that a successful extension should not attribute the differences to covariates (or, respectively, to outcomes given covariates) if those are the same in the two populations. Unfortunately, we demonstrate that, even in simple examples, two common decompositions (the functional ANOVA and Accumulated Local Effects) can attribute differences to outcomes given covariates, even when they are identical in two populations. We provide and partially prove a conjecture that this misattribution arises in any additive decomposition that depends on the distribution of covariates.

U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Authors: Tung-Yu Wu, Melody Lo
Abstract: Large language models (LLMs) have been shown to exhibit emergent abilities in some downstream tasks, where performance seems to stagnate at first and then improve sharply and unpredictably with scale beyond a threshold. By dividing questions in the datasets according to difficulty level by average performance, we observe U-shaped scaling for hard questions, and inverted-U scaling followed by steady improvement for easy questions. Moreover, the emergence threshold roughly coincides with the point at which performance on easy questions reverts from inverse scaling to standard scaling. Capitalizing on the observable though opposing scaling trend on easy and hard questions, we propose a simple yet effective pipeline, called Slice-and-Sandwich, to predict both the emergence threshold and model performance beyond the threshold.

Speakers:

Julius Adebayo, Robert Geirhos, Surbhi Goel, Sanmi Koyejo, Baharan Mirzasoleiman

Organizers:

Tolga Bolukbasi, Logan Engstrom, Andrew Ilyas, Sadhika Malladi, Elisa Nguyen, Sung Min Park
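
The attention sink abstract on this page ties the phenomenon, at least in part, to softmax normalization, and notes that sigmoid attention without normalization avoids it. The sketch below is only a toy illustration of that normalization difference, not the authors' experiment: the function names and the score values are hypothetical. It shows that softmax weights always sum to 1, so some key must absorb attention even when every score is low, while independent sigmoid weights can all stay near zero.

```python
import math

def softmax(scores):
    """Normalize scores so the weights are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid_weights(scores):
    """Weight each key independently; no normalization across keys."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

# Hypothetical attention logits for one query over four keys,
# all strongly negative (no key is a good match).
scores = [-4.0, -4.0, -4.0, -4.0]

softmax_w = softmax(scores)
sigmoid_w = sigmoid_weights(scores)

print(sum(softmax_w))  # total softmax attention mass is forced to 1
print(sum(sigmoid_w))  # total sigmoid attention mass stays small
```

Under softmax the four keys share the full attention mass equally despite being poor matches; under sigmoid each key keeps a small weight of its own.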


The site also has 13 references to external domain(s).

- openreview.net
- neurips.cc
- juliusadebayo.com
- robertgeirhos.com
- surbhigoel.com
- cs.stanford.edu
- baharanm.github.io
- tolgabolukbasi.com
- loganengstrom.com
- andrewilyas.com
- cs.princeton.edu
- elisanguyen.github.io
- sungminpark.com
