... Denisovan genome 1 (row 377 of my dataset). The plain conclusion is that Neanderthal genome 5 is an archaic Siberian Denisovan individual with a close maternal connection to living West Africans. As noted above, the Cameroon genomes test as the most ancient people across my dataset, suggesting a migration from Cameroon to Siberia, which is consistent with the Out of Africa hypothesis but does not contradict my Migration Back hypothesis, since it's entirely possible that later Denisovans migrated back to Europe or Africa from Siberia, or further into East Asia and the Pacific. However, that is not the point of this note, which is limited to the misclassification of two Neanderthal genomes.

Neanderthal Genome 6

Similarly, Neanderthal genome 6 has 5,289 bases (i.e., 31.90% of the full genome) in common with its closest match among the other Neanderthal genomes, save for Neanderthal genome 5, which also seems to be Denisovan, as discussed above. In contrast, Neanderthal genome 6 has 8,588 bases (i.e., 51.80% of the full genome) in common with Denisovan genome 1 (row 377 of my dataset). Further, Neanderthal genome 6 has 10,461 bases (i.e., 63.09% of the full genome) in common with the same Cameroon genome discussed above. However, unlike Neanderthal genome 5, the provenance file for Neanderthal genome 6 and the related article make it clear the genome was discovered in Scladina, which is an archeological site in Belgium.

Even using a local alignment, the resultant number of matching bases between Neanderthal genome 6 and the Cameroon genome is 16,183, which is lower than the number of matching bases between Neanderthal genome 6 and that same genome, i.e., 16,328, using a global alignment. Note that local alignments maximize the number of matching bases. The sensible conclusion is that Neanderthal genome 6 is actually Denisovan. It is not as close to the Cameroon genome as Neanderthal genome 5, but it is close enough to infer African ancestry. This is again consistent with the Out of Africa hypothesis, though it's not
clear whether this genome has any connection to Asia, at least limited to this discussion alone, and as such, it adds no further credibility to my Migration Back hypothesis. It does not contradict the Migration Back hypothesis in any way, since it's entirely possible that at least some people left Africa directly for Europe or other places. In contrast, the Migration Back hypothesis is about the overall migration patterns of some of the most modern mtDNA genomes in the dataset, linking otherwise disparate modern humans across enormous distances.

Genome Provenance Links

Neanderthal Genomes

1. https://www.ncbi.nlm.nih.gov/nuccore/OM062614.1
2. https://www.ncbi.nlm.nih.gov/nuccore/MT677921.1
3. https://www.ncbi.nlm.nih.gov/nuccore/MT795654.1
4. https://www.ncbi.nlm.nih.gov/nuccore/MT921957.1
5. https://www.ncbi.nlm.nih.gov/nuccore/MT576650.1
6. https://www.ncbi.nlm.nih.gov/nuccore/MK123269.1
7. https://www.ncbi.nlm.nih.gov/nuccore/KY751400.2
8. https://www.ncbi.nlm.nih.gov/nuccore/MK033602.1
9. https://www.ncbi.nlm.nih.gov/nuccore/MK033602.1
10. https://www.ncbi.nlm.nih.gov/nuccore/KU131206.2

Denisovan Genomes

1. https://www.ncbi.nlm.nih.gov/nuccore/KX663333.1
2. https://www.ncbi.nlm.nih.gov/nuccore/KT780370.1
3. https://www.ncbi.nlm.nih.gov/nuccore/MT576653.1
4. https://www.ncbi.nlm.nih.gov/nuccore/MT576652.1
5. https://www.ncbi.nlm.nih.gov/nuccore/MT576651.1
6. https://www.ncbi.nlm.nih.gov/nuccore/NC_013993.1
7. https://www.ncbi.nlm.nih.gov/nuccore/FR695060.1
8. https://www.ncbi.nlm.nih.gov/nuccore/FN673705.1

Cameroon Genome

1. https://www.ncbi.nlm.nih.gov/nucleotide/KF358472.1

Human Migration and mtDNA
January 7, 2026, erdosfan

Genetic Alignment

Because of relatively recent advances in genetic sequencing, we can now read entire mtDNA genomes. However, because mtDNA is circular, it's not clear where you should start reading the genome. As a consequence, when comparing two genomes, you have no common starting point, and the selection of that starting point will impact the number of matching bases. As a simple
example, consider the two fictitious genomes [...] and [...]. If we count matching bases using the first index of each genome, then the number of matching bases is zero. If instead we start at the first index of [...] and the second index of [...], and loop back around to the first G of [...], the match count will be four, or 100% of the bases. As such, determining the starting indexes for comparison, i.e., the genome alignment, is determinative of the match count.

It turns out that mtDNA is unique in that it is inherited directly from the mother, generally without any mutations at all. As such, the intuition for combinations of sequences typically associated with genetics is inapplicable to mtDNA, since there is no combination of traits or sequences inherited from the mother and the father, and instead a basically perfect copy of the mother's genome is inherited. As a result, it makes perfect sense to use a global alignment, which we did above, where we compared one entire genome to another entire genome. In contrast, we could instead make use of a local alignment, where we compare segments of two genomes.

For example, consider genomes [...] and [...]. First, you'll note these genomes are not the same length, unlike in the example above, which is another factor to be considered when developing an alignment for comparison. If we simply use the first three bases of each genome for comparison, then the match count will be one, since the two initial A's match. If instead we use index two of [...] and index one of [...], then the entire sequence matches, and the resultant match count will be three.

Note that the number of possible global alignments is simply the length of the genome. That is, when using a global alignment, you fix one genome and rotate the other, one base at a time, and that will cover all possible global alignments between the two genomes. In contrast, the number of local alignments is much larger, since you have to consider all local alignments of each possible length. As a result, it is much easier to consider all possible global alignments
between two genomes than local alignments. In fact, it turns out there is exactly one plausible global alignment for mtDNA, making global alignments extremely attractive in terms of efficiency. Specifically, it takes 0.02 seconds to compare a given genome to my entire dataset of roughly 650 genomes using a global alignment. Performing the same task using a local alignment takes one hour, and the algorithm I've been using considers only a small subset of all possible local alignments. That said, local alignments allow you to take a closer look at two genomes and find common segments, which could indicate a common evolutionary history. This note discusses global alignments. I'll write something soon that discusses local alignments, as a second look to support my work on mtDNA generally.

Nearest Neighbor

The Nearest Neighbor algorithm can provably generate perfect accuracy for certain Euclidean datasets. That said, DNA is obviously not Euclidean, and as such, the results I proved do not hold for DNA datasets. However, common sense suggests we might as well try it, and it turns out you get really good results that are significantly better than chance. To apply the Nearest Neighbor algorithm to an mtDNA genome [...], we simply find the genome [...] that has the most bases in common with [...], i.e., its best match in the dataset, and hence its nearest neighbor. Symbolically, you could write [...].

As for accuracy, using Nearest Neighbor to predict the ethnicity of each individual in my dataset produces an accuracy of 30.87%, and because there are 75 global ethnicities, chance implies an accuracy of 1/75, or about 1.33%. As such, we can conclude that the Nearest Neighbor algorithm is not producing random results, and more generally, produces results that provide meaningful information about the ethnicities of individuals based solely upon their mtDNA, which is remarkable, since ethnicity is a complex trait that clearly should depend upon paternal ancestry as well.

The Global Distribution of mtDNA

It turns out the distribution of mtDNA is truly global, and
as a result, we should not be surprised that the accuracy of the Nearest Neighbor method as applied to my dataset is a little low, though as noted, it is significantly higher than chance, and therefore plainly not producing random predictions. That is, if we ask what is, e.g., the best match for a Norwegian genome, you could find that it is a Mexican genome, which is in fact the case for this Norwegian genome. Now you might say this is just a Mexican person who lives in Norway, but I've of course thought of this, and each genome has been diligenced to ensure that the stated ethnicity of the person is, e.g., Norwegian.

Now keep in mind that this is literally the closest match for this Norwegian genome, and it's somehow on the other side of the world. But high school history teaches us about migration over the Bering Strait, and this could literally be an instance of that, but it doesn't have to be. The bottom line is, mtDNA mutates so slowly that outcomes like this are not uncommon. In fact, by definition, because the accuracy of the Nearest Neighbor method is 30.87% when applied to predicting ethnicity, it must be the case that 100% - 30.87% = 69.13% of genomes have a nearest neighbor that is of a different ethnicity.

One interpretation is that, oh well, the Nearest Neighbor method isn't very good at predicting ethnicity, but this is simply incorrect, because the resultant match counts are almost always over 99% of the entire genome. Specifically, 605 of the 664 genomes in the dataset (i.e., 91.11%) map to a nearest neighbor that is 99% or more identical to the genome in question. Further, 208 of the 664 genomes in the dataset (i.e., 31.33%) map to a nearest neighbor that is 99.9% or more identical to the genome in question. The plain conclusion is that, more often than not, nearly identical genomes are found in different ethnicities, and in some cases, the distances are enormous.

In particular, the Pashtuns are the nearest neighbors of a significant number of global genomes. Below is a chart showing the number of times, by
ethnicity, that a Pashtun genome was a nearest neighbor of that ethnicity. So, e.g., returning to Norway (column 7), there are 3 Norwegian genomes that have a Pashtun nearest neighbor, and so column 7 has a height of 3. More generally, the chart is produced by running the Nearest Neighbor algorithm on every genome in the dataset, and if a given genome maps to a Pashtun genome, we increment the applicable column for the genome's ethnicity, e.g., Norway (column 7). There are 20 Norwegian genomes, so 3/20 = 15% of Norwegian genomes map to Pashtuns, who are generally located in Central Asia, in particular Afghanistan. This seems far, but in the full context of human history, it's really not, especially given known migrations, which covered nearly the whole planet.

The chart above is not normalized to show percentages, and instead shows the integer number of Pashtun nearest neighbors for each column. However, it turns out that a significant percentage of genomes in ethnicities all over the world map to the Pashtuns, which is just not true generally of other ethnicities. That is, it seems the Pashtuns are a source population (or closely related to that source population) of a significant number of people globally. This is shown in the chart below, which is normalized by dividing each column by the number of genomes in that column's population, producing a percentage.

As you can see, a significant percentage of Europeans (e.g., Finland, Norway, and Sweden, columns 6, 7, and 8, respectively), East Asians (e.g., Japan and Mongolia, columns 4 and 44, respectively), and Africans (e.g., Kenya and Tanzania, columns 46 and 70, respectively) have genomes that are closest to Pashtuns. Further, the average match count to a Pashtun genome over this chart is [...], so these are plainly meaningful, nearly identical matches. Finally, these Pashtun genomes that are turning up as nearest neighbors are heterogeneous. That is, it's not the case that a single Pashtun genome is popping up globally, and instead, multiple distinct Pashtun genomes are popping up globally as nearest
neighbors.

One not-so-plausible explanation that I think should be addressed is the Greco-Bactrian Kingdom, which overlaps quite a bit with the geography of the Pashtuns. The hypothesis would be that ancient Greeks brought European mtDNA to the Pashtuns. Maybe, but I don't think Alexander the Great made it to Japan, so we need a different hypothesis to explain the global distribution of Pashtun mtDNA.

All of this is instead consistent with what I've called the Migration Back Hypothesis, which is that humanity begins in Africa, migrates to Asia, and then migrates back to Africa and Europe, and further into East Asia. This is a more general hypothesis that many populations, including the Pashtuns, migrated back from Asia to Africa and Europe, and extended their presence into East Asia. The question is, can we also establish that humanity began in Africa using these and other similar methods? Astonishingly, the answer is yes, and this is discussed at some length in a summary on mtDNA that I've written.

Local Alignment Algorithm for mtDNA
January 6, 2026, erdosfan

The vast majority of my work in mtDNA has focused on global alignments, because of the unbelievable efficiency this creates, even when working on consumer devices. However, I've basically exhausted the topic of global alignments, so I've started to focus on local alignments to further support my research. That is, I'm going to second-guess my work using local alignments, to see if I get the same results. So far, that is exactly the case, using the attached local alignment algorithm, which is very straightforward.

Specifically, it takes a given input genome and compares it to a comparison genome by taking 500 bases at a time and searching, one by one, for the index of the comparison genome where the match count between those 500 bases from the input genome is maximized when compared to the comparison genome. It does so for all 500-base segments of the input genome, producing a starting index for each such 500-base
segment of the input genome that is mapped to the comparison genome (i.e., an alignment), and a total match count using that alignment.

It is incomparably slower than my global alignment algorithm, which takes just 0.02 seconds to find the nearest neighbor of an input genome over a dataset of approximately 650 whole mtDNA genomes, which is obviously really useful, since it's so fast. In contrast, the local alignment algorithm takes 1 hour to find the nearest neighbor of an input genome over the same dataset. This is obviously much less useful for discovery purposes, but my plan is to use it as further evidence for the histories I uncovered using mostly autonomous, extremely fast global alignment methods. Upon reflection, the gist is that global alignment methods are so fast that they allow for high-volume, autonomous discovery, which can then be more carefully considered using local alignments. Here's the code, more to come soon.

local_alignment (download)

A Fundamental Problem with the Double Slit Experiment
December 29, 2025...
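
Returning to the mtDNA alignment posts above: the global alignment (fix one genome, rotate the other one base at a time), the Nearest Neighbor lookup, and the 500-base local alignment search can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the author's local_alignment code, which is not reproduced here. All function names and the toy genomes are hypothetical. The sketch assumes equal-length genomes for the global rotation, non-overlapping segments for the local search, and windows that do not wrap around the end of the comparison genome.

```python
# Illustrative sketch only: function names and toy genomes are hypothetical
# and are not taken from the author's local_alignment download.

def match_count(a, b):
    """Number of positions at which two sequences agree (up to the shorter length)."""
    return sum(x == y for x, y in zip(a, b))


def best_global_alignment(fixed, other):
    """Fix one circular genome and rotate the other one base at a time.

    Returns (shift, count) for the rotation with the highest match count.
    Assumes both genomes have the same length.
    """
    best_shift, best_count = 0, -1
    for shift in range(len(other)):
        rotated = other[shift:] + other[:shift]  # loop back around the circle
        count = match_count(fixed, rotated)
        if count > best_count:
            best_shift, best_count = shift, count
    return best_shift, best_count


def local_alignment(input_genome, comparison, seg_len=500):
    """Map each seg_len-base segment of the input genome onto the comparison genome.

    For every segment, every starting index of the comparison genome is tried
    one by one, and the index with the highest match count is kept.
    Returns (alignment, total), where alignment is a list of
    (segment_start, best_index, best_count) tuples.
    Assumes the comparison genome is at least as long as one segment.
    """
    alignment, total = [], 0
    for start in range(0, len(input_genome), seg_len):
        segment = input_genome[start:start + seg_len]
        best_index, best_count = 0, -1
        for index in range(len(comparison) - len(segment) + 1):
            window = comparison[index:index + len(segment)]
            count = match_count(segment, window)
            if count > best_count:
                best_index, best_count = index, count
        alignment.append((start, best_index, best_count))
        total += best_count
    return alignment, total


def nearest_neighbor(query, dataset):
    """Label of the genome in `dataset` (label -> sequence) sharing the most bases with `query`."""
    return max(dataset, key=lambda label: match_count(query, dataset[label]))


# Toy genomes (invented for illustration): zero matches at the first index,
# but a perfect match after rotating the second genome by one base.
print(match_count("GATC", "CGAT"))            # 0
print(best_global_alignment("GATC", "CGAT"))  # (1, 4)
print(local_alignment("AACGGT", "TTGGTAAC", seg_len=3))  # ([(0, 5, 3), (3, 2, 3)], 6)
```

The nested loop in `local_alignment` does work proportional to the product of the two genome lengths for each pair of genomes, whereas a single fixed global alignment is one pass over the genome, which is consistent with the large gap in running time reported above.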