
Showing posts with label nonlinearity. Show all posts

Wednesday, March 07, 2018

Better to be Lucky than Good?

The arXiv paper below looks at stochastic dynamical models that can transform initial (e.g., Gaussian) talent distributions into power law outcomes (e.g., observed wealth distributions in modern societies). While the models themselves may not be entirely realistic, they illustrate the potentially large role of luck relative to ability in real life outcomes.

We're used to seeing correlations reported, often between variables that have been standardized so that both are normally distributed. I've written about this many times in the past: Success, Ability, and All That, and Success vs Ability.





But wealth typically follows a power law distribution:


Of course, it might be the case that better measurements would uncover a power law distribution of individual talents. But it's far more plausible to me that random fluctuations + nonlinear amplifications transform, over time, normally distributed talents into power law outcomes.

Talent vs Luck: the role of randomness in success and failure
https://arxiv.org/pdf/1802.07068.pdf

The largely dominant meritocratic paradigm of highly competitive Western cultures is rooted on the belief that success is due mainly, if not exclusively, to personal qualities such as talent, intelligence, skills, smartness, efforts, willfulness, hard work or risk taking. Sometimes, we are willing to admit that a certain degree of luck could also play a role in achieving significant material success. But, as a matter of fact, it is rather common to underestimate the importance of external forces in individual successful stories. It is very well known that intelligence (or, more in general, talent and personal qualities) exhibits a Gaussian distribution among the population, whereas the distribution of wealth - often considered a proxy of success - follows typically a power law (Pareto law), with a large majority of poor people and a very small number of billionaires. Such a discrepancy between a Normal distribution of inputs, with a typical scale (the average talent or intelligence), and the scale invariant distribution of outputs, suggests that some hidden ingredient is at work behind the scenes. In this paper, with the help of a very simple agent-based toy model, we suggest that such an ingredient is just randomness. In particular, we show that, if it is true that some degree of talent is necessary to be successful in life, almost never the most talented people reach the highest peaks of success, being overtaken by mediocre but sensibly luckier individuals. As to our knowledge, this counterintuitive result - although implicitly suggested between the lines in a vast literature - is quantified here for the first time. It sheds new light on the effectiveness of assessing merit on the basis of the reached level of success and underlines the risks of distributing excessive honors or resources to people who, at the end of the day, could have been simply luckier than others. With the help of this model, several policy hypotheses are also addressed and compared to show the most efficient strategies for public funding of research in order to improve meritocracy, diversity and innovation.
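To make this concrete, here is a minimal simulation sketch in the spirit of the paper's toy model, not its exact implementation: talent is Gaussian, capital evolves through random multiplicative events, and only lucky events are gated by talent. The event rates, number of time steps, and starting capital below are illustrative assumptions.

```python
# A minimal sketch (not the paper's exact implementation) of a
# talent-vs-luck style simulation: Gaussian talent, multiplicative
# random events, heavy-tailed capital distribution at the end.
# Event probabilities and parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N, T_STEPS = 10_000, 80            # agents, time steps
talent = np.clip(rng.normal(0.6, 0.1, N), 0, 1)   # Gaussian talent in [0, 1]
capital = np.full(N, 10.0)                         # everyone starts equal

P_EVENT, P_LUCKY = 0.1, 0.5        # assumed chance of an event, and that it is lucky

for _ in range(T_STEPS):
    hit = rng.random(N) < P_EVENT                  # who experiences an event this step
    lucky = hit & (rng.random(N) < P_LUCKY)
    unlucky = hit & ~lucky
    # a lucky event pays off only if talent "captures" it; an unlucky one always costs
    capital[lucky & (rng.random(N) < talent)] *= 2.0
    capital[unlucky] *= 0.5

# Gaussian inputs, heavy-tailed outputs: compare the spreads
print("talent:  mean %.2f, max/median %.1f" % (talent.mean(), talent.max() / np.median(talent)))
print("capital: mean %.1f, max/median %.1f" % (capital.mean(), capital.max() / np.median(capital)))
top = np.sort(capital)[-N // 10:]
print("top 10%% capital share: %.0f%%" % (100 * top.sum() / capital.sum()))
print("talent of richest agent: %.2f (max talent in population: %.2f)"
      % (talent[np.argmax(capital)], talent.max()))
```

Even in this stripped-down version, the richest agent is typically not the most talented one, and the top decile usually ends up holding most of the capital despite the narrow, symmetric talent distribution.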
Here is a specific example of random fluctuations and nonlinear amplification:
Nonlinearity and Noisy Outcomes: ... The researchers placed a number of songs online and asked volunteers to rate them. One group rated them without seeing others' opinions. In a number of "worlds" the raters were allowed to see the opinions of others in their world. Unsurprisingly, the interactive worlds exhibited large fluctuations, in which songs judged as mediocre by isolated listeners rose on the basis of small initial fluctuations in their ratings (e.g., in a particular world, the first 10 raters may have all liked an otherwise mediocre song, and subsequent listeners were influenced by this, leading to a positive feedback loop).

It isn't hard to think of a number of other contexts where this effect plays out. Think of the careers of two otherwise identical competitors (e.g., in science, business, academia). The one who enjoys an initial positive fluctuation may be carried along far beyond their competitor, for no reason of superior merit. The effect also appears in competing technologies or brands or fashion trends.

If outcomes are so noisy, then successful prediction is more a matter of luck than skill. The successful predictor is not necessarily a better judge of intrinsic quality, since quality is swamped by random fluctuations that are amplified nonlinearly. This picture undermines the rationale for the high compensation awarded to certain CEOs, studio and recording executives, even portfolio managers. ...

Sunday, October 11, 2015

Additivity in yeast quantitative traits



A new paper from the Kruglyak lab at UCLA shows yet again (this time in yeast) that population variation in quantitative traits tends to be dominated by additive effects. There are deep evolutionary reasons for this to be the case -- see excerpt below (at bottom of this post). For other examples, including humans, mice, chickens, cows, plants, see links here.
Genetic interactions contribute less than additive effects to quantitative trait variation in yeast (http://dx.doi.org/10.1101/019513)

Genetic mapping studies of quantitative traits typically focus on detecting loci that contribute additively to trait variation. Genetic interactions are often proposed as a contributing factor to trait variation, but the relative contribution of interactions to trait variation is a subject of debate. Here, we use a very large cross between two yeast strains to accurately estimate the fraction of phenotypic variance due to pairwise QTL-QTL interactions for 20 quantitative traits. We find that this fraction is 9% on average, substantially less than the contribution of additive QTL (43%). Statistically significant QTL-QTL pairs typically have small individual effect sizes, but collectively explain 40% of the pairwise interaction variance. We show that pairwise interaction variance is largely explained by pairs of loci at least one of which has a significant additive effect. These results refine our understanding of the genetic architecture of quantitative traits and help guide future mapping studies.


Genetic interactions arise when the joint effect of alleles at two or more loci on a phenotype departs from simply adding up the effects of the alleles at each locus. Many examples of such interactions are known, but the relative contribution of interactions to trait variation is a subject of debate [1-5]. We previously generated a panel of 1,008 recombinant offspring (“segregants”) from a cross between two strains of yeast: a widely used laboratory strain (BY) and an isolate from a vineyard (RM) [6]. Using this panel, we estimated the contribution of additive genetic factors to phenotypic variation (narrow-sense or additive heritability) for 46 traits and resolved nearly all of this contribution (on average 87%) to specific genome-wide-significant quantitative trait loci (QTL). ...

We detected nearly 800 significant additive QTL. We were able to refine the location of the QTL explaining at least 1% of trait variance to approximately 10 kb, and we resolved 31 QTL to single genes. We also detected over 200 significant QTL-QTL interactions; in most cases, one or both of the loci also had significant additive effects. For most traits studied, we detected one or a few additive QTL of large effect, plus many QTL and QTL-QTL interactions of small effect. We find that the contribution of QTL-QTL interactions to phenotypic variance is typically less than a quarter of the contribution of additive effects. These results provide a picture of the genetic contributions to quantitative traits at an unprecedented resolution.

... One can test for interactions either between all pairs of markers (full scan), or only between pairs where one marker corresponds to a significant additive QTL (marginal scan). In principle, the former can detect a wider range of interactions, but the latter can have higher power due to a reduced search space. Here, the two approaches yielded similar results, detecting 205 and 266 QTL-QTL interactions, respectively, at an FDR of 10%, with 172 interactions detected by both approaches. In the full scan, 153 of the QTL-QTL interactions correspond to cases where both interacting loci are also significant additive QTL, 36 correspond to cases where one of the loci is a significant additive QTL, and only 16 correspond to cases where neither locus is a significant additive QTL.
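For readers who want to see the logic of this decomposition, here is a hedged sketch (not the authors' pipeline): simulate a yeast-like cross with a few additive QTL and a few pairwise interactions, estimate the additive variance explained, then add interaction terms among the loci with the largest additive effects, roughly mimicking the "marginal scan" idea. All sizes and effect magnitudes are assumptions.

```python
# A hedged sketch (not the authors' pipeline) of additive vs interaction
# variance in a simulated yeast-like cross. Loci are independent 0/1 markers;
# effect sizes and counts are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_seg, n_loci = 1000, 50                        # segregants, markers
G = rng.integers(0, 2, size=(n_seg, n_loci)).astype(float)  # BY/RM alleles as 0/1

beta = np.zeros(n_loci)
beta[:8] = rng.normal(0, 1, 8)                  # a handful of additive QTL
pairs = [(0, 1), (2, 9), (30, 31)]              # a few pairwise interactions
gammas = rng.normal(0, 0.5, len(pairs))

y = G @ beta
for (i, j), g in zip(pairs, gammas):
    y += g * G[:, i] * G[:, j]
y += rng.normal(0, 1.0, n_seg)                  # environmental noise

def frac_var_explained(X, y):
    """Fraction of phenotypic variance explained by a least-squares fit."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return 1 - (y - X1 @ coef).var() / y.var()

r2_add = frac_var_explained(G, y)               # additive model only

# "marginal scan": interaction terms only among loci with the largest
# additive effects from the additive fit
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n_seg), G]), y, rcond=None)
top = np.argsort(-np.abs(coef[1:]))[:10]
inter = np.column_stack([G[:, i] * G[:, j]
                         for k, i in enumerate(top) for j in top[k + 1:]])
r2_both = frac_var_explained(np.column_stack([G, inter]), y)

print("additive variance explained:                  %.2f" % r2_add)
print("added by interactions among top additive QTL: %.2f" % (r2_both - r2_add))
```

The printed numbers depend entirely on the assumed effect sizes; the sketch is only meant to show how the additive and interaction contributions are separated in this kind of analysis.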
For related discussion of nonlinear genetic models, see here:
It is a common belief in genomics that nonlinear interactions (epistasis) in complex traits make the task of reconstructing genetic models extremely difficult, if not impossible. In fact, it is often suggested that overcoming nonlinearity will require much larger data sets and significantly more computing power. Our results show that in broad classes of plausibly realistic models, this is not the case.
Determination of Nonlinear Genetic Architecture using Compressed Sensing (arXiv:1408.6583)
Chiu Man Ho, Stephen D.H. Hsu
Subjects: Genomics (q-bio.GN); Applications (stat.AP)

We introduce a statistical method that can reconstruct nonlinear genetic models (i.e., including epistasis, or gene-gene interactions) from phenotype-genotype (GWAS) data. The computational and data resource requirements are similar to those necessary for reconstruction of linear genetic models (or identification of gene-trait associations), assuming a condition of generalized sparsity, which limits the total number of gene-gene interactions. An example of a sparse nonlinear model is one in which a typical locus interacts with several or even many others, but only a small subset of all possible interactions exist. It seems plausible that most genetic architectures fall in this category. Our method uses a generalization of compressed sensing (L1-penalized regression) applied to nonlinear functions of the sensing matrix. We give theoretical arguments suggesting that the method is nearly optimal in performance, and demonstrate its effectiveness on broad classes of nonlinear genetic models using both real and simulated human genomes.
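The core idea can be sketched with off-the-shelf tools: apply L1-penalized regression (here scikit-learn's Lasso, standing in for the compressed-sensing machinery) to a feature matrix containing SNP main effects plus all pairwise products. This is a toy illustration under assumed sizes and penalty, not the paper's exact algorithm.

```python
# A toy illustration of the general idea (not the paper's exact algorithm):
# L1-penalized regression applied to SNP main effects plus all pairwise
# products. Sample sizes, effect sizes, and the penalty are assumptions.
import numpy as np
from itertools import combinations
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p = 2000, 50                                    # samples, SNPs (toy scale)
X = rng.integers(0, 3, size=(n, p)).astype(float)  # 0/1/2 minor-allele counts

# sparse truth: a few additive effects plus two pairwise interactions
add_idx = rng.choice(p, 10, replace=False)
true_pairs = [(3, 17), (42, 48)]
y = X[:, add_idx] @ rng.normal(0, 1, 10)
for i, j in true_pairs:
    y += 1.5 * X[:, i] * X[:, j]
y += rng.normal(0, 1, n)

# "sensing matrix": main effects followed by all pairwise products
pairs = list(combinations(range(p), 2))
Z = np.hstack([X, np.column_stack([X[:, i] * X[:, j] for i, j in pairs])])

fit = Lasso(alpha=0.1, max_iter=10_000).fit(Z, y)  # alpha would be cross-validated in practice
selected = np.flatnonzero(fit.coef_)
selected_pairs = [pairs[k - p] for k in selected if k >= p]
print("nonzero features: %d of %d" % (len(selected), Z.shape[1]))
print("true interactions recovered:", [pr for pr in true_pairs if pr in selected_pairs])
```

At genome scale the pairwise feature matrix would need to be handled far more carefully; the point of the sketch is just that sparse nonlinear terms can be recovered with essentially linear machinery.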
I've discussed additivity many times previously, so I'll just quote below from Additivity and complex traits in mice:
You may have noticed that I am gradually collecting copious evidence for (approximate) additivity. Far too many scientists and quasi-scientists are infected by the epistasis or epigenetics meme, which is appealing to those who "revel in complexity" and would like to believe that biology is too complex to succumb to equations. ...

I sometimes explain things this way:

There is a deep evolutionary reason behind additivity: nonlinear mechanisms are fragile and often "break" due to DNA recombination in sexual reproduction. Effects which are only controlled by a single locus are more robustly passed on to offspring. ...

Many people confuse the following statements:

"The brain is complex and nonlinear and many genes interact in its construction and operation."

"Differences in brain performance between two individuals of the same species must be due to nonlinear (non-additive) effects of genes."

The first statement is true, but the second does not appear to be true across a range of species and quantitative traits. On the genetic architecture of intelligence and other quantitative traits (p.16):
... The preceding discussion is not intended to convey an overly simplistic view of genetics or systems biology. Complex nonlinear genetic systems certainly exist and are realized in every organism. However, quantitative differences between individuals within a species may be largely due to independent linear effects of specific genetic variants. As noted, linear effects are the most readily evolvable in response to selection, whereas nonlinear gadgets are more likely to be fragile to small changes. (Evolutionary adaptations requiring significant changes to nonlinear gadgets are improbable and therefore require exponentially more time than simple adjustment of frequencies of alleles of linear effect.) One might say that, to first approximation, Biology = linear combinations of nonlinear gadgets, and most of the variation between individuals is in the (linear) way gadgets are combined, rather than in the realization of different gadgets in different individuals.

Linear models work well in practice, allowing, for example, SNP-based prediction of quantitative traits (milk yield, fat and protein content, productive life, etc.) in dairy cattle. ...
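As a concrete, hedged illustration of what such a linear SNP predictor looks like, the sketch below fits a ridge-penalized linear model to simulated allele counts and checks out-of-sample prediction; the ridge penalty stands in for GBLUP-style shrinkage, and the data are simulated rather than real dairy-cattle records.

```python
# A hedged sketch of a linear SNP predictor: phenotype predicted as a
# penalized linear combination of SNP genotypes. Ridge shrinkage stands in
# for GBLUP-style methods; genotypes and effect sizes are simulated.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n_train, n_test, p = 3000, 500, 1000
X = rng.binomial(2, 0.3, size=(n_train + n_test, p)).astype(float)  # allele counts
beta = rng.normal(0, 0.05, p)                   # many small additive effects
y = X @ beta + rng.normal(0, 1, n_train + n_test)

model = Ridge(alpha=50.0).fit(X[:n_train], y[:n_train])   # alpha is an illustrative choice
r = np.corrcoef(model.predict(X[n_train:]), y[n_train:])[0, 1]
print("out-of-sample prediction correlation: %.2f" % r)
```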
See also Explain it to me like I'm five years old.

Tuesday, December 21, 2010

Scaling laws for cities



Shorter Geoff West on cities (isn't math great?): resource consumption scales sub-linearly with population, whereas economic output scales super-linearly.

Uncompressed version follows below and at link ;-)

NYTimes: ... After two years of analysis, West and Bettencourt discovered that all of these urban variables could be described by a few exquisitely simple equations. For example, if they know the population of a metropolitan area in a given country, they can estimate, with approximately 85 percent accuracy, its average income and the dimensions of its sewer system. These are the laws, they say, that automatically emerge whenever people “agglomerate,” cramming themselves into apartment buildings and subway cars. It doesn’t matter if the place is Manhattan or Manhattan, Kan.: the urban patterns remain the same. West isn’t shy about describing the magnitude of this accomplishment. “What we found are the constants that describe every city,” he says. “I can take these laws and make precise predictions about the number of violent crimes and the surface area of roads in a city in Japan with 200,000 people. I don’t know anything about this city or even where it is or its history, but I can tell you all about it. And the reason I can do that is because every city is really the same.” After a pause, as if reflecting on his hyperbole, West adds: “Look, we all know that every city is unique. That’s all we talk about when we talk about cities, those things that make New York different from L.A., or Tokyo different from Albuquerque. But focusing on those differences misses the point. Sure, there are differences, but different from what? We’ve found the what.”

There is something deeply strange about thinking of the metropolis in such abstract terms. [REALLY?!?] We usually describe cities, after all, as local entities defined by geography and history. New Orleans isn’t a generic place of 336,644 people. It’s the bayou and Katrina and Cajun cuisine. New York isn’t just another city. It’s a former Dutch fur-trading settlement, the center of the finance industry and home to the Yankees. And yet, West insists, those facts are mere details, interesting anecdotes that don’t explain very much. The only way to really understand the city, West says, is to understand its deep structure, its defining patterns, which will show us whether a metropolis will flourish or fall apart. We can’t make our cities work better until we know how they work. And, West says, he knows how they work.

West has been drawn to different fields before. In 1997, less than five years after he transitioned away from high-energy physics, he published one of the most contentious and influential papers in modern biology. (The research, which appeared in Science, has been cited more than 1,500 times.) The last line of the paper summarizes the sweep of its ambition, as West and his co-authors assert that they have just solved “the single most pervasive theme underlying all biological diversity,” showing how the most vital facts about animals — heart rate, size, caloric needs — are interrelated in unexpected ways.

... In city after city, the indicators of urban “metabolism,” like the number of gas stations or the total surface area of roads, showed that when a city doubles in size, it requires an increase in resources of only 85 percent.

This straightforward observation has some surprising implications. It suggests, for instance, that modern cities are the real centers of sustainability. According to the data, people who live in densely populated places require less heat in the winter and need fewer miles of asphalt per capita. (A recent analysis by economists at Harvard and U.C.L.A. demonstrated that the average Manhattanite emits 14,127 fewer pounds of carbon dioxide annually than someone living in the New York suburbs.) Small communities might look green, but they consume a disproportionate amount of everything. As a result, West argues, creating a more sustainable society will require our big cities to get even bigger. We need more megalopolises.

But a city is not just a frugal elephant; biological equations can’t entirely explain the growth of urban areas. While the first settlements in Mesopotamia might have helped people conserve scarce resources — irrigation networks meant more water for everyone — the concept of the city spread for an entirely different reason. “In retrospect, I was quite stupid,” West says. He was so excited by the parallels between cities and living things that he “didn’t pay enough attention to the ways in which urban areas and organisms are completely different.”

What Bettencourt and West failed to appreciate, at least at first, was that the value of modern cities has little to do with energy efficiency. As West puts it, “Nobody moves to New York to save money on their gas bill.” Why, then, do we put up with the indignities of the city? Why do we accept the failing schools and overpriced apartments, the bedbugs and the traffic?

In essence, they arrive at the sensible conclusion that cities are valuable because they facilitate human interactions, as people crammed into a few square miles exchange ideas and start collaborations. “If you ask people why they move to the city, they always give the same reasons,” West says. “They’ve come to get a job or follow their friends or to be at the center of a scene. That’s why we pay the high rent. Cities are all about the people, not the infrastructure.”

It’s when West switches the conversation from infrastructure to people that he brings up the work of Jane Jacobs, the urban activist and author of “The Death and Life of Great American Cities.” Jacobs was a fierce advocate for the preservation of small-scale neighborhoods, like Greenwich Village and the North End in Boston. The value of such urban areas, she said, is that they facilitate the free flow of information between city dwellers. To illustrate her point, Jacobs described her local stretch of Hudson Street in the Village. She compared the crowded sidewalk to a spontaneous “ballet,” filled with people from different walks of life. School kids on the stoops, gossiping homemakers, “business lunchers” on their way back to the office. While urban planners had long derided such neighborhoods for their inefficiencies — that’s why Robert Moses, the “master builder” of New York, wanted to build an eight-lane elevated highway through SoHo and the Village — Jacobs insisted that these casual exchanges were essential. She saw the city not as a mass of buildings but rather as a vessel of empty spaces, in which people interacted with other people. The city wasn’t a skyline — it was a dance.

If West’s basic idea was familiar, however, the evidence he provided for it was anything but. The challenge for Bettencourt and West was finding a way to quantify urban interactions. As usual, they began with reams of statistics. The first data set they analyzed was on the economic productivity of American cities, and it quickly became clear that their working hypothesis — like elephants, cities become more efficient as they get bigger — was profoundly incomplete. According to the data, whenever a city doubles in size, every measure of economic activity, from construction spending to the amount of bank deposits, increases by approximately 15 percent per capita. It doesn’t matter how big the city is; the law remains the same. “This remarkable equation is why people move to the big city,” West says. “Because you can take the same person, and if you just move them to a city that’s twice as big, then all of a sudden they’ll do 15 percent more of everything that we can measure.” While Jacobs could only speculate on the value of our urban interactions, West insists that he has found a way to “scientifically confirm” her conjectures. “One of my favorite compliments is when people come up to me and say, ‘You have done what Jane Jacobs would have done, if only she could do mathematics,’ ” West says. “What the data clearly shows, and what she was clever enough to anticipate, is that when people come together, they become much more productive.” ...
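Taking the two "doubling" statements in the excerpt at face value, and assuming the simple power-law form Y = Y0 * N^beta that West and Bettencourt fit, the implied exponents are easy to back out:

```python
# Back-of-the-envelope check of the exponents implied by the two "doubling"
# statements in the excerpt, assuming the simple power-law form Y = Y0 * N**beta
# (so doubling the population multiplies Y by 2**beta).
import math

# "when a city doubles in size, it requires an increase in resources of only 85 percent"
beta_infrastructure = math.log2(1.85)
# "every measure of economic activity ... increases by approximately 15 percent
#  per capita" when the city doubles, i.e. the total grows by a factor 2 * 1.15
beta_socioeconomic = math.log2(2 * 1.15)

print("implied infrastructure exponent: %.2f (sublinear)" % beta_infrastructure)   # ~0.89
print("implied socioeconomic exponent:  %.2f (superlinear)" % beta_socioeconomic)  # ~1.20
```

Both numbers are in the ballpark of the exponents usually quoted for these urban scaling laws: roughly 0.85 for infrastructure and 1.15 for socioeconomic outputs.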

Tuesday, February 03, 2009

Power laws, baby

Thanks to reader GS for the link to this excellent article by statistical physicist Eugene Stanley. There's much more to the article, but I decided to excerpt this discussion of security price fluctuations and the observation that they are far from log normal.

Stanley: ...

When they analyzed these data–200 million of them–in exactly the same fashion that Bachelier had analyzed data almost a century earlier, they made a startling discovery. The pdf of price changes was not Gaussian plus outliers, as previously believed. Rather, all the data–including data previously termed outliers–conformed to a single pdf encompassing both everyday fluctuations and “once in a century” fluctuations. Instead of a Gaussian or some correction to a Gaussian, they found a power law pdf with exponent -4, a sufficiently large exponent that the difference from a Gaussian is not huge; however, the probability of a “once in a century” event of, say, 100 standard deviations is exp(-10,000) for the Gaussian, but simply 10^-8 for an inverse quartic law. If one analyzes a data set containing 200 million data in two years, this means there are only two such events–in two years!

Now which is better, the concept of “everyday fluctuations” which can be modeled with a drunkard’s walk, complemented by a few “once in a century” outliers? Or a single empirical law with no outliers but for which a complete theory does not exist despite promising progress by Xavier Gabaix of NYU’s Stern School of Management and his collaborators? Here we come to one of the most salient differences between traditional economics and the econophysicists: economists are hesitant to put much stock in laws that have no coherent and complete theory supporting them, while physicists cannot afford this reluctance. There are so many phenomena we do not understand. Indeed, many physics “laws” have proved useful long before any theoretical underpinning was developed . . . Newton’s laws and Coulomb’s law to name but two.

And all of us are loathe to accept even a well-documented empirical law that seems to go against our own everyday experience. For stock price fluctuations, we all experience calm periods of everyday fluctuations, punctuated by highly volatile periods that seem to cluster. So we would expect the pdf of stock price fluctuations to be bimodal, with a broad maximum centered around, say, 1-3 standard deviations and then a narrow peak centered around, say, 50 standard deviations. And it is easy to show that if we do not have access to “all the data” but instead sample only a small fraction of the 200 million data recently analyzed, then this everyday experience is perfectly correct, since the rare events are indeed rare and we barely recall those that are “large but not that large”.

The same is true for earthquakes: our everyday experience teaches us that small quakes are going on all the time but are barely noticeable except by those who work at seismic detection stations. And every so often occurs a “once in a century” truly horrific event, such as the famous San Francisco earthquake. Yet when seismic stations analyze all the data, they find not the bimodal distribution of everyday experience but rather a power law, the Gutenberg-Richter law, describing the number of earthquakes of a given magnitude.
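To see how different the two tails are numerically, here is a small comparison of a standard Gaussian tail with a power-law tail of cumulative form ~ s^-3 (what an inverse-quartic pdf implies). The power-law prefactor depends on how the distribution is normalized, so it is anchored, arbitrarily, to match the Gaussian at 2 standard deviations; that is enough to show the qualitative contrast.

```python
# A small numerical version of the tail comparison in the excerpt. A standard
# Gaussian tail falls off like exp(-s**2 / 2); an inverse-quartic pdf implies a
# cumulative tail proportional to s**-3. The power-law prefactor depends on
# normalization, so here it is anchored (arbitrarily) to match the Gaussian
# tail at 2 standard deviations.
import math

def gaussian_tail(s):
    """P(X > s) for a standard normal variable."""
    return 0.5 * math.erfc(s / math.sqrt(2))

ANCHOR = 2.0
C = gaussian_tail(ANCHOR) * ANCHOR**3           # match the power law at 2 sigma

for s in (2, 5, 10, 30, 100):
    g = gaussian_tail(s)                        # underflows to 0.0 well before 100 sigma
    pl = C * s**-3
    print(f"{s:>4} sigma   Gaussian {g:.3e}   power-law {pl:.3e}")
```

At 100 standard deviations the Gaussian tail is far below anything double precision can even represent, while the power-law tail (whatever the exact prefactor) corresponds to a number of events you could actually see in a 200-million-point sample.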

Sunday, April 15, 2007

Nonlinearity and noisy outcomes

The Times magazine has a great little summary of some recent social science research, which studied the effects of social influence on judgements of quality. The researchers placed a number of songs online and asked volunteers to rate them. One group rated them without seeing others' opinions. In a number of "worlds" the raters were allowed to see the opinions of others in their world. Unsurprisingly, the interactive worlds exhibited large fluctuations, in which songs judged as mediocre by isolated listeners rose on the basis of small initial fluctuations in their ratings (e.g., in a particular world, the first 10 raters may have all liked an otherwise mediocre song, and subsequent listeners were influenced by this, leading to a positive feedback loop).

It isn't hard to think of a number of other contexts where this effect plays out. Think of the careers of two otherwise identical competitors (e.g., in science, business, academia). The one who enjoys an initial positive fluctuation may be carried along far beyond their competitor, for no reason of superior merit. The effect also appears in competing technologies or brands or fashion trends.

If outcomes are so noisy, then successful prediction is more a matter of luck than skill. The successful predictor is not necessarily a better judge of intrinsic quality, since quality is swamped by random fluctuations that are amplified nonlinearly. This picture undermines the rationale for the high compensation awarded to certain CEOs, studio and recording executives, even portfolio managers. In recent years I've often heard the argument that these people deserve their compensation because they generate tremendous value for society by making correct decisions about resource allocation (especially if they sit at the cash nexus of finance). However, the argument depends heavily on the assumption that the people in question are really adding value, rather than just throwing darts. If the system is sufficiently noisy it may be almost impossible to tell one way or the other. We may be rewarding the lucky, rather than the good, and a society with fewer incentives for these people may be equally or nearly equally efficient.

See related discussion of studio executives here, and another related discussion here.

As anyone who follows the business of culture is aware, the profits of cultural industries depend disproportionately on the occasional outsize success — a blockbuster movie, a best-selling book or a superstar artist — to offset the many investments that fail dismally. What may be less clear to casual observers is why professional editors, studio executives and talent managers, many of whom have a lifetime of experience in their businesses, are so bad at predicting which of their many potential projects will make it big. How could it be that industry executives rejected, passed over or even disparaged smash hits like “Star Wars,” “Harry Potter” and the Beatles, even as many of their most confident bets turned out to be flops? It may be true, in other words, that “nobody knows anything,” as the screenwriter William Goldman once said about Hollywood. But why? Of course, the experts may simply not be as smart as they would like us to believe. Recent research, however, suggests that reliable hit prediction is impossible no matter how much you know — a result that has implications not only for our understanding of best-seller lists but for business and politics as well.

...But people almost never make decisions independently — in part because the world abounds with so many choices that we have little hope of ever finding what we want on our own; in part because we are never really sure what we want anyway; and in part because what we often want is not so much to experience the “best” of everything as it is to experience the same things as other people and thereby also experience the benefits of sharing.

There’s nothing wrong with these tendencies. Ultimately, we’re all social beings, and without one another to rely on, life would be not only intolerable but meaningless. Yet our mutual dependence has unexpected consequences, one of which is that if people do not make decisions independently — if even in part they like things because other people like them — then predicting hits is not only difficult but actually impossible, no matter how much you know about individual tastes.

The reason is that when people tend to like what other people like, differences in popularity are subject to what is called “cumulative advantage,” or the “rich get richer” effect. This means that if one object happens to be slightly more popular than another at just the right point, it will tend to become more popular still. As a result, even tiny, random fluctuations can blow up, generating potentially enormous long-run differences among even indistinguishable competitors — a phenomenon that is similar in some ways to the famous “butterfly effect” from chaos theory. Thus, if history were to be somehow rerun many times, seemingly identical universes with the same set of competitors and the same overall market tastes would quickly generate different winners: Madonna would have been popular in this world, but in some other version of history, she would be a nobody, and someone we have never heard of would be in her place.

...Fortunately, the explosive growth of the Internet has made it possible to study human activity in a controlled manner for thousands or even millions of people at the same time. Recently, my collaborators, Matthew Salganik and Peter Dodds, and I conducted just such a Web-based experiment. In our study, published last year in Science, more than 14,000 participants registered at our Web site, Music Lab (www.musiclab.columbia.edu), and were asked to listen to, rate and, if they chose, download songs by bands they had never heard of. Some of the participants saw only the names of the songs and bands, while others also saw how many times the songs had been downloaded by previous participants. This second group — in what we called the “social influence” condition — was further split into eight parallel “worlds” such that participants could see the prior downloads of people only in their own world. We didn’t manipulate any of these rankings — all the artists in all the worlds started out identically, with zero downloads — but because the different worlds were kept separate, they subsequently evolved independently of one another.

This setup let us test the possibility of prediction in two very direct ways. First, if people know what they like regardless of what they think other people like, the most successful songs should draw about the same amount of the total market share in both the independent and social-influence conditions — that is, hits shouldn’t be any bigger just because the people downloading them know what other people downloaded. And second, the very same songs — the “best” ones — should become hits in all social-influence worlds.

What we found, however, was exactly the opposite. In all the social-influence worlds, the most popular songs were much more popular (and the least popular songs were less popular) than in the independent condition. At the same time, however, the particular songs that became hits were different in different worlds, just as cumulative-advantage theory would predict. Introducing social influence into human decision making, in other words, didn’t just make the hits bigger; it also made them more unpredictable. ...
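A minimal cumulative-advantage simulation, in the spirit of (but much simpler than) the Music Lab setup, shows the same qualitative behavior: identical songs, independent worlds, and a choice rule that weights prior downloads produce lopsided and world-dependent winners. The influence strength and sizes below are illustrative assumptions.

```python
# A minimal cumulative-advantage ("rich get richer") sketch in the spirit of
# the experiment described above, not the actual Music Lab protocol: identical
# songs, independent worlds, and a choice rule that weights prior downloads.
# The influence strength and sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
N_SONGS, N_WORLDS, N_LISTENERS = 48, 8, 5000
ALPHA = 1.0                                     # strength of social influence

winners = []
for w in range(N_WORLDS):
    downloads = np.zeros(N_SONGS)
    for _ in range(N_LISTENERS):
        # every song has identical intrinsic appeal; popularity adds a bonus
        weights = 1.0 + ALPHA * downloads
        song = rng.choice(N_SONGS, p=weights / weights.sum())
        downloads[song] += 1
    winners.append(int(np.argmax(downloads)))
    print(f"world {w}: top song #{winners[-1]:2d} with "
          f"{downloads.max() / downloads.sum():.0%} of downloads")

print("distinct winners across %d worlds: %d" % (N_WORLDS, len(set(winners))))
```

Across runs, the top song usually differs from world to world even though every song has identical intrinsic appeal, which is exactly the unpredictability the excerpt describes.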
