Pessimism of the Intellect, Optimism of the Will Favorite posts | Manifold podcast | Twitter: @hsu_steve
Tuesday, January 15, 2019
Monday, September 03, 2018
PanOpticon in my Pocket: 0.35GB/month of surveillance, no charge!
Some quantities which can be easily calculated using this data: How many people visited a specific Toyota dealership last month? How many times did someone test drive a car? Who were those people who test drove a car? How many people stopped / started a typical 9-5 job commute pattern? (BLS only dreams of knowing this number.) What was the occupancy of a specific hotel or rental property last month? How many people were on the 1:30 PM flight from LAX to Laguardia last Friday? Who were they? ...
Of course, absolute numbers may be noisy, but diffs from month to month or year to year, with reasonable normalization / averaging, can yield insights at the micro, macro, and individual firm level.
If your quant team is not looking at this data, it should be ;-)
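As a rough illustration of the "diffs with reasonable normalization" idea above, here is a minimal sketch. Every number in it is made up, and the function names and the per-1,000-devices normalization are assumptions chosen for illustration, not anything taken from a real dataset.

```python
# Hypothetical example: all counts below are invented for illustration.

def normalized_rate(visits, panel_size):
    """Visits per 1,000 devices in the location panel."""
    return 1000.0 * visits / panel_size


def pct_change(current, previous):
    """Percentage change from previous to current."""
    return 100.0 * (current - previous) / previous


# Raw dealership visit counts and panel sizes for the same month in two years.
visits_last_year, panel_last_year = 420, 1_200_000
visits_this_year, panel_this_year = 540, 1_500_000

raw_yoy = pct_change(visits_this_year, visits_last_year)
normalized_yoy = pct_change(
    normalized_rate(visits_this_year, panel_this_year),
    normalized_rate(visits_last_year, panel_last_year),
)

print(f"raw change: {raw_yoy:.1f}%")
print(f"normalized change: {normalized_yoy:.1f}%")
```

In this toy case most of the raw increase comes from the panel itself growing, which is why the normalized diff is the number worth tracking.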
Google Data Collection
Professor Douglas C. Schmidt, Vanderbilt University
August 15, 2018
... Both Android and Chrome send data to Google even in the absence of any user interaction. Our experiments show that a dormant, stationary Android phone (with Chrome active in the background) communicated location information to Google 340 times during a 24-hour period, or at an average of 14 data communications per hour. In fact, location information constituted 35% of all the data samples sent to Google. In contrast, a similar experiment showed that on an iOS Apple device with Safari (where neither Android nor Chrome were used), Google could not collect any appreciable data (location or otherwise) in the absence of a user interaction with the device.
e. After a user starts interacting with an Android phone (e.g. moves around, visits webpages, uses apps), passive communications to Google server domains increase significantly, even in cases where the user did not use any prominent Google applications (i.e. no Google Search, no YouTube, no Gmail, and no Google Maps). This increase is driven largely by data activity from Google’s publisher and advertiser products (e.g. Google Analytics, DoubleClick, AdWords). Such data constituted 46% of all requests to Google servers from the Android phone. Google collected location at a 1.4x higher rate compared to the stationary phone experiment with no user interaction. Magnitude wise, Google’s servers communicated 11.6 MB of data per day (or 0.35 GB/month) with the Android device. This experiment suggests that even if a user does not interact with any key Google applications, Google is still able to collect considerable information through its advertiser and publisher products.
f. While using an iOS device, if a user decides to forgo the use of any Google product (i.e. no Android, no Chrome, no Google applications), and visits only non-Google webpages, the number of times data is communicated to Google servers still remains surprisingly high. This communication is driven purely by advertiser/publisher services. The number of times such Google services are called from an iOS device is similar to an Android device. In this experiment, the total magnitude of data communicated to Google servers from an iOS device is found to be approximately half of that from the Android device.
g. Advertising identifiers (which are purportedly “user anonymous” and collect activity data on apps and 3rd-party webpage visits) can get connected with a user’s Google identity. This happens via passing of device-level identification information to Google servers by an Android device. Likewise, the DoubleClick cookie ID (which tracks a user’s activity on the 3rd-party webpages) is another purportedly “user anonymous” identifier that Google can connect to a user’s Google Account if a user accesses a Google application in the same browser in which a 3rd-party webpage was previously accessed. Overall, our findings indicate that Google has the ability to connect the anonymous data collected through passive means with the personal information of the user.
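The headline figures in the excerpt can be checked with a few lines of arithmetic. This is only a sanity check of the quoted numbers; the 30-day month and 1 GB = 1000 MB conventions are my assumptions, since the report excerpt does not state them.

```python
# Figures quoted in the report excerpt above.
location_reports_per_day = 340
mb_per_day = 11.6

# Assumptions (not stated in the excerpt): 30-day month, 1 GB = 1000 MB.
DAYS_PER_MONTH = 30
MB_PER_GB = 1000

reports_per_hour = location_reports_per_day / 24
gb_per_month = mb_per_day * DAYS_PER_MONTH / MB_PER_GB

print(f"{reports_per_hour:.1f} location reports per hour")
print(f"{gb_per_month:.2f} GB per month")
```

Both results round to the values given in the text (14 per hour and 0.35 GB/month).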
Wednesday, August 30, 2017
Normies Lament
This interview with the Irish Times (not Ezra Klein) is much better than the one I originally linked to below.
###############
Ezra Klein talks to Angela Nagle. It's still normie normative, but Nagle has at least done some homework.
Click the link below to hear the podcast.
From 4Chan to Charlottesville: where the alt-right came from, and where it's going
Angela Nagle spent the better part of the past decade in the darkest corners of the internet, learning how online subcultures emerge and thrive on forums like 4chan and Tumblr.
The result is her fantastic new book, Kill All the Normies: Online Culture Wars From 4Chan And Tumblr to Trump and the Alt-Right, a comprehensive exploration of the origins of our current political moment.
We talk about the origins of the alt-right, and how the movement morphed from transgressive aesthetics on the internet to the violence in Charlottesville, but we also discuss PC culture on the left, demographic change in America, and the toxicity of online politics in general. Nagle is particularly interested in how the left's policing of language radicalizes its victims and creates space for alt-right groups to find eager recruits, and so we dive deep into that.
Books:
Civilization and Its Discontents by Sigmund Freud
This Is Why We Can't Have Nice Things: Mapping the Relationship between Online Trolling and Mainstream Culture by Whitney Phillips
The Net Delusion: The Dark Side of Internet Freedom by Evgeny Morozov
Monday, September 05, 2016
A secret map of the world (Venkatesh Rao / Ribbonfarm)
In case you can't make out all the features on the map, here is a hi-res version. See also this other map.
Some places of note:
Isle of Deep Learning
Isle of Physics
Moldbug's Lair
Alt-Right Hills
Dark Enlightenment Volcano
Paleo Crossing
Satoshi Mines
Secret Cloud Empire of Amazon
Fjords of Sisu
Algomonopolia (Google, Facebook, ...)
a16z Unicorn Hunting Ground
Lean Startup Town
SJW Cathedral
Manosphere Tar Pit
Global Bro-Science Laboratory
NSA
Academia
Efficient Market Temple
Graveyard of Boomer Dreams
Ghost of Industrial Past
If these memes are unfamiliar, you need to spend more time on the internet or in the bay area :-)
Sunday, August 07, 2016
Podcast: Clay Shirky on tech and the internet in China
In this episode of Sinica, Clay Shirky, the author of Here Comes Everybody who has written about the internet and its effects on society since the 1990s, joins Kaiser and Jeremy to discuss the strengths and weaknesses of China’s tech industry and the extraordinary advances the nation has made in the online world.
The hour-long conversation delves into the details and big-picture phenomena driving the globe’s largest internet market, and includes an analysis of Xiaomi’s innovation, the struggles that successful Chinese companies face when taking their brands abroad and the nation’s robust ecommerce offerings.
Clay has written numerous books, including Little Rice: Smartphones, Xiaomi, and the Chinese Dream in addition to the aforementioned Here Comes Everybody: The Power of Organizing Without Organizations. He is also a Shanghai-based associate professor with New York University’s Arthur L. Carter Journalism Institute and the school’s Interactive Telecommunications Program.
Related: NYTimes video explaining WeChat. (The future is here, it's just not evenly distributed.)
Friday, May 30, 2014
Sunday, April 14, 2013
Why blog? A professor responds
What I had in mind was a university-wide platform that would aggregate the output of participating faculty. This kind of branded expert channel might have a place amid the economic collapse in journalism we are currently experiencing. If Huffington Post is worth $315 million (OK, not really, just another dumb move by AOL), what might a platform showcasing 100 clever faculty from a major research university be worth? 100 bloggers (say, each posting once every 10 days or so = 10 new posts per day) out of 2000 MSU faculty doesn't sound too crazy, does it?
Hi Steve, I certainly view blogging as a means of recording and organizing my thoughts. Sometimes I get really thoughtful and insightful feedback in the comments (although sometimes not). There's also the pleasure of self-expression! As James Salter wrote
I liked reading your "Blogging Professors" post, since I've thought several times, "Should I write a blog?" But I've also thought, "Why does anyone bother to write a blog?" The reasons to write are, as you note, to propagate one's "fabulous ideas and opinions worthy of wider attention and discussion" and to create dialogs and conversations. My own reasons not to write have been (1) that it would take time, and I have too little time as it is, and (2) that I doubt I'd be likely to make even the slightest ripple in the vast pool of the internet.
Reason (1) is, I'm sure, obvious. It's hard to find "work time" between experiments, meetings, classes, seminars, journal clubs, staring at data, writing analysis code, talking to students, planning classes, teaching classes, reading papers, reading books, and probably several other things I'm forgetting. And "free time" has its own constraints, and any new activities would have to compete with things I'm very fond of, like wandering the public library with the kids, or playing games with them in random taquerias, or painting pictures myself (which, sadly, has been steadily dwindling in frequency).
Of course, I'm sure most commenters will point out that it's all a matter of incentives: I have no incentive, as a faculty member, to blog. This is true, but not very explanatory in itself. We all do plenty of things that don't have concrete incentives. This past week, I've spent about two hours reviewing a paper. Next week I'll spend at least half an hour with a postdoc (not from my lab) starting a faculty position (elsewhere) giving advice on grants. Later this term, I'll probably put a lot of work into a talk on [ geeky science topic involving microscopy; unspecified to preserve anonymity ] for a journal club I don't usually attend -- it's a fascinating topic I've gotten increasingly involved with. I certainly don't get any reward from the University (or even the department) for doing these sorts of things. So why do them? In all these cases, there's some combination of reciprocity (I publish articles in journal X, so I should review papers for journal X), or personal interactions (I like to have conversations with colleagues), or both. Is any of this the case for blogging?
I'd guess -- though I have no data on this -- that most blogs, especially new ones, have very little readership. Certainly one often stumbles on blogs with a total absence of comments. (Not that blog comments in general are often worth reading…) And even if posts are read, is there likely to be much interaction or dialog, compared to the other activities noted above?
As you note, one way out of this would be group blogs, which might expand readership and reduce writing effort. Another would be if the university actively promoted blogs. (I'm constantly amazed at how little work the university puts into describing to the public what faculty do, and how ineptly what little they do is done.)
And, of course, another solution is to simply look at blogging as a way of recording and refining one's thoughts -- regardless of whether they're read or not. I've toyed with this; maybe I'll take it up…
There comes a time when you realize that everything is a dream, and only those things preserved in writing have any possibility of being real.
Wednesday, August 22, 2012
Beating down hash functions
Ars technica: ... An even more powerful technique is a hybrid attack. It combines a word list, like the one used by Redman, with rules to greatly expand the number of passwords those lists can crack. Rather than brute-forcing the five letters in Julia1984, hackers simply compile a list of first names for every single Facebook user and add them to a medium-sized dictionary of, say, 100 million words. While the attack requires more combinations than the mask attack above—specifically about 1 trillion (100 million * 10^4) possible strings—it's still a manageable number that takes only about two minutes using the same AMD 7970 card. The payoff, however, is more than worth the additional effort, since it will quickly crack Christopher2000, thomas1964, and scores of others.
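To make the hybrid idea concrete, here is a minimal sketch of the candidate generation it describes: every base word followed by every four-digit suffix. The quoted figure of about 1 trillion strings corresponds to 100 million base words times 10,000 suffixes. The function names and the tiny word list are hypothetical, and the sketch only enumerates strings; it does no hashing and is not how any particular cracking tool is implemented.

```python
def hybrid_candidates(words, digits=4):
    """Yield each base word followed by every zero-padded numeric suffix."""
    for word in words:
        for n in range(10 ** digits):
            yield f"{word}{n:0{digits}d}"


def keyspace(num_words, digits=4):
    """Number of candidates a word list of this size produces."""
    return num_words * 10 ** digits


# Hypothetical three-word list standing in for a 100-million-word dictionary.
words = ["Julia", "thomas", "Christopher"]
candidates = set(hybrid_candidates(words))
print("Julia1984" in candidates, "thomas1964" in candidates)

total = keyspace(100_000_000)
# Rate implied by the quoted "about two minutes" for the full keyspace.
implied_rate = total / 120
print(f"{total:.0e} candidates, roughly {implied_rate:.1e} guesses per second")
```

The implied rate of several billion guesses per second is derived only from the numbers in the quote, not from any benchmark of my own.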
"The hybrid is my favorite attack," said Atom, the pseudonymous developer of Hashcat, whose team won this year's Crack Me if You Can contest at Defcon. "It's the most efficient. If I get a new hash list, let's say 500,000 hashes, I can crack 50 percent just with hybrid."
With half the passwords in a given breach recovered, cracking experts like Atom can use Passpal and other programs to isolate patterns that are unique to the website from which they came. They then write new rules to crack the remaining unknown passwords. More often than not, however, no amount of sophistication and high-end hardware is enough to quickly crack some hashes exposed in a server breach. To ensure they keep up with changing password choices, crackers will regularly brute-force crack some percentage of the unknown passwords, even when they contain as many as nine or more characters.
"It's very expensive, but you do it to improve your model and keep up with passwords people are choosing," said Moxie Marlinspike, another cracking expert. "Then, given that knowledge, you can go back and build rules and word lists to effectively crack lists without having to brute force all of them. When you feed your successes back into your process, you just keep learning more and more and more and it does snowball."
Monday, March 12, 2012
Back in the day: startup CEO
I wonder how many people have spoken at Def Con and also given technical briefings in the bowels of Langley ;-)
When I visited China after starting (and exiting) SafeWeb, I thought I might have an unpleasant surprise waiting for me when entering the country. But luckily this cloak and dagger stuff is overblown.
SafeWeb's Triangle Boy: IP spoofing and strong encryption in service of a free Internet
SafeWeb is an encrypted (SSL) anonymous proxy service, used approximately 100 million times per month by hundreds of thousands of people worldwide. Triangle Boy is an Open Source program that lets volunteers turn their PCs into entry points into the SafeWeb network, thereby foiling censorship in countries like China and Iran. Triangle Boy uses IP spoofing and innovative packet routing to minimize the load on volunteer machines. I discuss SafeWeb's goals and technologies, its involvement with the CIA through In-Q-Tel (the agency's venture fund) and the Internet as a catalyst for social transformation in China.
Stephen Hsu is the CEO and co-founder of SafeWeb.
Wednesday, September 28, 2011
Amazon Silk
I wonder what Apple's response will be to this. Perhaps we'll see a "split-browser" update of (mobile) Safari soon. On the desktop I switched over to Chrome 1-2 years ago because it feels faster and it runs Google apps flawlessly. If Silk tries to do things too aggressively it might break a few applications or web pages (very tough to QA stuff like that). But probably there are speedups (e.g., smart pre-caching of popular content) that can be achieved without risk of breaking functionality and which can be exploited within a more conservative approach. Users will probably be forgiving because it's running on a $199 device with a 7" screen (Amazon Fire). The Silk team blog is here.
Saturday, April 09, 2011
Update on NYTimes paywall
If I try to read, for example, the Sidney Lumet obituary (btw, I highly recommend Dog Day Afternoon :-), the browser url bar shows the following when the subscription page has finally loaded:
http://www.nytimes.com/2011/04/10/movies/sidney-lumet-director-of-american-classics-dies-at-86.html?hp&gwh=A8B811D09B0F452C9A3F74E25512D060
If I eliminate all the cruft after "html" so that the url bar reads
http://www.nytimes.com/2011/04/10/movies/sidney-lumet-director-of-american-classics-dies-at-86.html
then reloading lets me read the article for free. This has worked for every article I've tried -- probably 20 or so by now.
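The manual edit described above (dropping everything after "html") is easy to express in code. A minimal sketch using the standard library; `strip_query` is a name I made up, and the snippet only shows the URL manipulation, with no claim about how the site responds to the cleaned URL.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


paywalled = (
    "http://www.nytimes.com/2011/04/10/movies/"
    "sidney-lumet-director-of-american-classics-dies-at-86.html"
    "?hp&gwh=A8B811D09B0F452C9A3F74E25512D060"
)
clean = strip_query(paywalled)
print(clean)
```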
Tuesday, March 29, 2011
Misers' methods for reading the NYTimes
The Times wants to charge me $35/month for unlimited digital access (that means on multiple devices, like mobile, tablet, computer). Now, I'm all for supporting journalism, and the Times in particular, but it seems kind of high to me. Let's see how it all works out for the Grey Lady. Perhaps a micropayment scheme would be better? (Has Google rolled their version out yet?)
Apparently they won't limit access to articles reached via link (i.e., from blogs, Twitter, search engine; see below for more details). This is strategic: they want their articles to be read, and to be influential, so don't want to frustrate potential readers who arrive via search or social network.
Therefore, I think you can just type the following into Google to get (free) access to daily NYTimes content (up to 5 articles per day; see note at bottom):
site:nytimes.com < today's date > < keywords >
i.e.,
site:nytimes.com march 29 2011 japan reactor
or
site:nytimes.com 2011/03/29 japan reactor
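The query pattern above can be generated for any date. A minimal sketch: `nyt_search_query` and `search_url` are hypothetical helper names, and the `https://www.google.com/search?q=...` URL form is an assumption about how one would hand the query to a search engine.

```python
from datetime import date
from urllib.parse import urlencode


def nyt_search_query(day, keywords):
    """Build a query of the form: site:nytimes.com YYYY/MM/DD keyword ..."""
    return f"site:nytimes.com {day:%Y/%m/%d} {' '.join(keywords)}"


def search_url(query):
    """Wrap the query in a search URL (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = nyt_search_query(date(2011, 3, 29), ["japan", "reactor"])
url = search_url(query)
print(query)
print(url)
```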
Soon someone will write a little web or mobile app to do exactly this kind of thing, mashing a nice graphical display with links that connect via Google or Twitter or whatever. Hmm ...
Here is a Twitter feed someone has already put up for this purpose. See also links in comments below.
*** It looks like search engine links are only good for 5 articles a day:
9. Can I still access NYTimes.com articles through Facebook, Twitter, search engines or my blog?
Yes. We encourage links from Facebook, Twitter, search engines, blogs and social media. When you visit NYTimes.com through a link from one of these channels, that article (or video, slide show, etc.) will count toward your monthly limit of 20 free articles, but you will still be able to view it even if you've already read your 20 free articles.
Like other external links, links from search engine results will count toward your monthly limit. If you have reached your monthly limit, you'll have a daily limit of 5 free articles through a given search engine. This limit applies to the majority of search engines.
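My reading of the FAQ rules above, written out as a small decision function. This is a sketch of the policy as quoted, not the Times's actual implementation; the source labels, the function name, and the assumption that direct visits are blocked once the monthly limit is used up are mine.

```python
MONTHLY_FREE = 20      # free articles per month, per the FAQ
DAILY_SEARCH_FREE = 5  # per search engine, once the monthly limit is reached


def can_view(source, monthly_count, daily_search_count=0):
    """Decide whether an article is viewable under the quoted rules.

    source: "social" (Facebook, Twitter, blogs), "search", or "direct".
    monthly_count: articles already read this month.
    daily_search_count: articles already read today via this search engine.
    """
    if source == "social":
        return True  # still viewable even after the monthly limit
    if monthly_count < MONTHLY_FREE:
        return True
    if source == "search":
        return daily_search_count < DAILY_SEARCH_FREE
    return False  # assumption: direct visits stop at the monthly limit


print(can_view("search", monthly_count=20, daily_search_count=4))
print(can_view("search", monthly_count=20, daily_search_count=5))
```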
Tuesday, February 08, 2011
You say you want a revolution
NYTimes: ... some new demonstrators said they had joined the protests after watching an emotional television interview on Monday night with Wael Ghonim, a Google marketing executive who was snatched off the street nearly two weeks ago, for his role in helping to organize the revolt as the administrator of a popular Facebook page.
One protester in Tahrir Square on Tuesday, Ahmed Meyer El Shamy, an executive with an international pharmaceutical company, told The Times, “many, many people” had resolved to join the demonstration “because of what they saw on TV last night.”
During that interview, Mr. Ghonim acknowledged that he had been the anonymous administrator of the Facebook page We Are All Khaled Said, dedicated to the memory of a 28-year-old Egyptian man beaten to death by the police in Alexandria on June 6, 2010, which helped spark the protests.
More video at the NYTimes link above. (Sorry, I just realized the version below doesn't have subtitles. Unless you speak Arabic you have to click through to the Times; the last video shows Ghonim's emotional reaction when shown pictures of protestors who died.)
Wednesday, January 20, 2010
Aurora uses Chinese error-checking algorithm?
... "Operation Aurora" is the latest in a series of attacks originating out of Mainland China. Previous attacks have been known as – "GhostNet" and "Titan Rain." Operation Aurora takes its name directly from the hackers this time – the name was coined after virus analysts found unique strings in some of the malware involved in the attack. These strings are debug symbol file paths in source code that has apparently been custom-written for these attacks.
... The compiler often offers other clues to a malware sample’s origin. For instance, if the binary uses a PE resource section, the resource’s headers will often provide a language code. The Hydraq component does use a resource section, but in this case, the author was careful to either compile the code on an English-language system, or they edited the language code in the binary after-the-fact. So outside of the fact that PRC IP addresses have been used as control servers in the attacks, there is no "hard evidence" of involvement of the PRC or any agents thereof.
There is one interesting clue in the Hydraq binary that points back to mainland China, however. While analyzing the samples, I noticed a CRC (cyclic redundancy check) algorithm that seemed somewhat unusual. CRCs are used to check for errors that might have been introduced into stored or transferred data. There are many different CRC algorithms and implementations of those algorithms, but this is one I had not previously seen in any of my reverse-engineering efforts.
... The CRC algorithm used in Hydraq uses a table of only 16 constants; basically a truncated version of the typical 256-value table. By decompiling the algorithm and searching the Internet for source code with similar constants, operations and a 16-value CRC table size, I was able to locate one instance of source code that fully matched the structural code implementation in Hydraq and also produced the same output when given the same input ...
... This source code was created to implement a 16-bit CRC algorithm compatible with the implementation known as "CRC-16 XMODEM", while requiring only a 16-value CRC table. It is actually a clever optimization of the standard CRC-16 reference code that allows the CRC-16 algorithm to be used in applications where memory is at a premium, such as hobby microcontrollers. Because the author used the C "int" type to store the CRC value, the number of bits in the output is dependent on the platform on which the code is compiled. In the case of Hydraq, which is a 32-bit Windows DLL, this CRC-16 implementation actually outputs a 32-bit value, which makes it compatible with neither existing CRC-16 nor CRC-32 implementations.
Perhaps the most interesting aspect of this source code sample is that it is of Chinese origin, released as part of a Chinese-language paper on optimizing CRC algorithms for use in microcontrollers. The full paper was published in simplified Chinese characters, and all existing references and publications of the sample source code seem to be exclusively on Chinese websites. This CRC-16 implementation seems to be virtually unknown outside of China, as shown by a Google search for one of the key variables, "crc_ta[16]". At the time of this writing, almost every page with meaningful content concerning the algorithm is Chinese ...
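For readers who want to see what a 16-entry-table CRC looks like, here is a minimal sketch of a nibble-at-a-time CRC-16/XMODEM. It is my own illustration of the technique the excerpt describes, not a reconstruction of the Hydraq code or of the Chinese paper's source. The `width` parameter is an assumption I added to mimic what happens when the accumulator is wider than 16 bits, as with a 32-bit C int.

```python
import binascii

POLY = 0x1021  # generator polynomial used by CRC-16/XMODEM


def make_nibble_table(poly=POLY):
    """Build the 16-entry table: the CRC contribution of each 4-bit value."""
    table = []
    for nibble in range(16):
        crc = nibble << 12
        for _ in range(4):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
        table.append(crc)
    return table


TABLE = make_nibble_table()


def crc_nibblewise(data, width=16):
    """Process each byte as two 4-bit halves, high nibble first.

    width is the accumulator size in bits; 16 gives CRC-16/XMODEM.
    """
    mask = (1 << width) - 1
    crc = 0
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            index = ((crc >> 12) ^ nibble) & 0x0F
            crc = ((crc << 4) & mask) ^ TABLE[index]
    return crc


msg = b"123456789"
crc16 = crc_nibblewise(msg)            # 16-bit accumulator
crc_wide = crc_nibblewise(msg, 32)     # hypothetical 32-bit accumulator
reference = binascii.crc_hqx(msg, 0)   # standard library CRC-CCITT, initial value 0

print(hex(crc16), hex(reference), hex(crc_wide))
```

With a 16-bit accumulator the result agrees with the standard library's `binascii.crc_hqx`. With the wider accumulator the low 16 bits still agree, but bits shifted out of the low half linger in the upper half, which is consistent with the excerpt's remark that the 32-bit output matches neither standard CRC-16 nor CRC-32.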
Thursday, January 14, 2010
"Aurora" doesn't sound very Chinese
If I were a Chinese hacker, wouldn't the filepaths on my development machine have non-English (unicode) characters? I'm sure some readers of this blog would know -- if you develop software in a Chinese language environment, do you use English words or Chinese characters for path and directory names?
Of course, it's possible the attackers just bought the malware from a black hat developer somewhere or have deliberately obfuscated the origin of their code. We need some more forensic information...
McAfee Security Insights Blog: ... the intruders gained access to an organization by sending a tailored attack to one or a few targeted individuals. We suspect these individuals were targeted because they likely had access to valuable intellectual property. These attacks will look like they come from a trusted source, leading the target to fall for the trap and clicking a link or file. That’s when the exploitation takes place, using the vulnerability in Microsoft’s Internet Explorer.
Once the malware is downloaded and installed, it opens a back door that allows the attacker to perform reconnaissance and gain complete control over the compromised system. The attacker can now identify high value targets and start to siphon off valuable data from the company. ...
Operation “Aurora”
I am sure you are wondering about the name “Aurora.” Based on our analysis, “Aurora” was part of the filepath on the attacker’s machine that was included in two of the malware binaries that we have confirmed are associated with the attack. That filepath is typically inserted by code compilers to indicate where debug symbols and source code are located on the machine of the developer. We believe the name was the internal name the attacker(s) gave to this operation. ...
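The kind of clue described above, a debug-symbol path left behind in a compiled binary, can be looked for with a simple byte scan. A minimal sketch, assuming ASCII Windows-style paths ending in ".pdb"; the regular expression, the function name, and the sample bytes (including the path in them) are all invented for illustration.

```python
import re

# Drive letter, colon, backslash, then printable ASCII up to ".pdb".
PDB_PATH = re.compile(rb"[A-Za-z]:\\[\x20-\x7e]{1,260}?\.pdb")


def find_pdb_paths(blob):
    """Return ASCII debug-symbol paths embedded in a byte string."""
    return [match.decode("ascii") for match in PDB_PATH.findall(blob)]


# Hypothetical stand-in for the contents of a binary file.
sample = (
    b"\x00\x01\x02\xff"
    + rb"C:\build\DemoProject\Release\demo.pdb"
    + b"\x00\x00\x10"
)
print(find_pdb_paths(sample))
```

Because the pattern only accepts printable ASCII, a path containing Chinese characters would be missed or truncated by this scan; a real analysis would need to consider other encodings as well.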
Google dead in China?
WSJ reports on the decision process at Google. There are still a number of open questions:
1. What were Google's prospects in China? Are they really hopelessly behind Baidu? I've seen market share estimates ranging from 15-30% (no agreement even on the sign of the derivative!), and also the claim that the most sophisticated users (i.e., the ones with the most disposable income in the long run) tended to use Google. Perhaps no reason to throw in the towel -- but then why did Kai Fu Lee resign in September? Was it just the opportunity to run his own investment fund? (Here is an earlier post on Baidu, with a talk given by founder Robin Li.)
2. How serious is the state-sponsored security threat to companies operating in China? Did this play a big role in Google's decision? Coordinated attacks by state-run intelligence are significantly harder to deal with than ordinary hackers or even corporate espionage. An intelligence agency only has to turn a few key employees to get at important source code that necessarily would have to be available to researchers and operations people at Google China. It would be difficult to justify the risks of operating in an environment that hostile. (Needless to say, it would be long-run detrimental for China to create an environment that hostile to foreign companies.) On the other hand, snooping around for information about a few email users is hardly a threat of the same proportions.
WSJ: Google Inc.'s startling threat to withdraw from China was an intensely personal decision, drawing its celebrated founders and other top executives into a debate over the right way to confront the issues of censorship and cyber security.
The blog post Tuesday that revealed Google's very public response to what it called a "highly sophisticated and targeted attack on our corporate infrastructure originating from China" was crafted over a period of weeks, with heavy involvement from Google's co-founders, Larry Page and Sergey Brin.
For the two men, China has always been a sensitive topic. Mr. Brin has long confided in friends and Google colleagues of his ambivalence in doing business in China, noting that his early childhood in Russia exacerbated the moral dilemma of cooperating with government censorship, people who have spoken to him said. Over the years, Mr. Brin has served as Google's unofficial corporate conscience, the protector of its motto "Don't be Evil."
The investigation into the cyber intrusion began weeks ago, although how Google detected it remains unclear. As Google employees gathered more evidence they believed linked the attack to China and Chinese authorities, Chief Executive Eric Schmidt, along with Messrs. Page and Brin, began discussing how they should respond, entering into an intense debate over whether it was better to stay in China and do what they can to change the regime from within, or whether to leave, according to people familiar with the discussions. A Google spokesman said Messrs. Page, Brin and Schmidt wouldn't comment.
Mr. Schmidt made the argument he long has, according to these people, namely that it is moral to do business in China in an effort to try to open up the regime. Mr. Brin strenuously argued the other side, namely that the company had done enough trying and that it could no longer justify censoring its search results.
How the debate ultimately resolved itself remains unclear. The three ultimately agreed they should disclose the attack publicly, trying to break with what they saw as a conspiratorial culture of companies keeping silent about attacks of this nature, according to one person familiar with the matter.
Soon, Google's vice president of public policy and communications, Rachel Whetstone, began crafting and revising a number of versions of a possible statement the company planned to release publicly, these people said, sharing it with the three.
The top three agreed that in addition to discussing the attack, the blog post should contain some language about human rights, the strongest statement of which is a clause in the penultimate paragraph of the post.
The section said they had reached the decision to re-evaluate their business in China after considering the attacks "combined with the attempts over the past year to further limit free speech on the web."
Concerned about potential retribution against Google employees in China, the founders and their advisors agreed to include a line saying that the move was "driven by our executives in the United States, without the knowledge or involvement of our employees in China."
Some expressions of support for Google's position flowed in from around the world, including from consumers in China as well as some U.S. companies—including rival Yahoo Inc.—and politicians. Secretary of State Hillary Clinton Tuesday issued a statement saying Google's allegations "raise very serious concerns and questions," and that "we look to the Chinese government for an explanation."
Odds are high Google could be left largely on its own in taking concrete steps to confront the Chinese government. Veteran observers of trade between the countries suggest that Google, and the U.S. generally, has very little leverage to press China to back down on Internet censorship or other issues.
Besides the Google.cn Web site, Google has a range of other business initiatives and partnerships in China that could be affected by its decision. By snubbing Chinese authorities so publicly, the company risks government retaliation against itself or its partners. The decision also affects local competitors who could benefit from any retreat. Shares of Google's biggest Chinese rival, Baidu Inc., surged following the news.
Google's blog post Tuesday said cyber-attacks on its infrastructure resulted in "the theft of intellectual property," stating that it found evidence to suggest that a primary goal of the attackers was accessing the Gmail accounts of Chinese human-rights activists.
Again, snooping on activists and stealing core IP are two very different activities. Which was more disturbing to Google? Previous discussion here.
“Don’t Be Evil” always did sound a bit to me like tikkun olam, or repairing the world (see this profile of Brin and Page). Not sure whether CEO Schmidt is down with that ;-)
Wednesday, January 13, 2010
What's up with Google and China?
Some good comments at TechCrunch:
1. Google’s business was not doing well in China. Does anyone really think Google would be doing this if it had top market share in the country? For one thing, I’d guess that would open them up to shareholder lawsuits. Google is a for-profit, publicly-held company at the end of the day. When I met with Google’s former head of China Kai-fu Lee in Beijing last October, he noted that one reason he left Google was that it was clear the company was never going to substantially increase its market share or beat Baidu. Google has clearly decided doing business in China isn’t worth it, and are turning what would be a negative into a marketing positive for its business in the rest of the world.
2. Google is ready to burn bridges. This is not how negotiations are done in China, and Google has done well enough there to know that. You don’t get results by pressuring the government in a public, English-language blog post. If Google were indeed still working with the government this letter would not have been posted because it has likely slammed every door shut, as a long-time entrepreneur in China Marc van der Chijs and many others said on Twitter. This was a scorched earth move, aimed at buying Google some good will in the rest of the world; Chinese customers and staff were essentially just thrown under the bus.
Actually, recent reports estimate Google's 2009 search market share in China at around 30 percent, which is nothing to sneeze at. They could have had a good business in China, although I agree with Kai-fu Lee that the government would never let them dominate the market there the way they do in the rest of the world.
The hacking used trojans injected via a zero-day vulnerability in Adobe (PDF file attachments) [Edit: or was it an IE browser problem? And were the hackers really Chinese -- why codename Aurora?]. The claim that these attacks on multiple companies were coordinated by Chinese intelligence services is plausible but far from proven.
It's important to emphasize that the Chinese government is not monolithic. The parts of the government concerned with economic growth and technology development will be asking some hard questions of the intelligence apparatus about this. No economic planner wants high tech companies like Google or Adobe to stop operating in China as a consequence of security risks.
Thursday, January 07, 2010
Wikipedia: emergent phenomenon?
Aaron Swartz: I first met Jimbo Wales, the face of Wikipedia, when he came to speak at Stanford. Wales told us about Wikipedia’s history, technology, and culture, but one thing he said stands out. “The idea that a lot of people have of Wikipedia,” he noted, “is that it’s some emergent phenomenon — the wisdom of mobs, swarm intelligence, that sort of thing — thousands and thousands of individual users each adding a little bit of content and out of this emerges a coherent body of work.” But, he insisted, the truth was rather different: Wikipedia was actually written by “a community … a dedicated group of a few hundred volunteers” where “I know all of them and they all know each other”. Really, “it’s much like any traditional organization.”
The difference, of course, is crucial. Not just for the public, who wants to know how a grand thing like Wikipedia actually gets written, but also for Wales, who wants to know how to run the site. “For me this is really important, because I spend a lot of time listening to those four or five hundred and if … those people were just a bunch of people talking … maybe I can just safely ignore them when setting policy” and instead worry about “the million people writing a sentence each”.
So did the Gang of 500 actually write Wikipedia? Wales decided to run a simple study to find out: he counted who made the most edits to the site. “I expected to find something like an 80-20 rule: 80% of the work being done by 20% of the users, just because that seems to come up a lot. But it’s actually much, much tighter than that: it turns out over 50% of all the edits are done by just .7% of the users … 524 people. … And in fact the most active 2%, which is 1400 people, have done 73.4% of all the edits.” The remaining 25% of edits, he said, were from “people who [are] contributing … a minor change of a fact or a minor spelling fix … or something like that.” ...
[But what if we analyze the amount of text contributed by each person, not just the number of edits? See original for analysis of edit patterns of specific articles, including amount of text added.]
... When you put it all together, the story becomes clear: an outsider makes one edit to add a chunk of information, then insiders make several edits tweaking and reformatting it. In addition, insiders rack up thousands of edits doing things like changing the name of a category across the entire site — the kind of thing only insiders deeply care about. As a result, insiders account for the vast majority of the edits. But it’s the outsiders who provide nearly all of the content.
And when you think about it, this makes perfect sense. Writing an encyclopedia is hard. To do anywhere near a decent job, you have to know a great deal of information about an incredibly wide variety of subjects. Writing so much text is difficult, but doing all the background research seems impossible.
On the other hand, everyone has a bunch of obscure things that, for one reason or another, they’ve come to know well. So they share them, clicking the edit link and adding a paragraph or two to Wikipedia. At the same time, a small number of people have become particularly involved in Wikipedia itself, learning its policies and special syntax, and spending their time tweaking the contributions of everybody else.
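To make the difference between counting edits and counting contributed text concrete, here is a minimal Python sketch. The edit log, user names, and character counts are all invented for illustration; they are not data from Wales's study or from Swartz's analysis, and real measurements of "text contributed" would need to track which characters survive into the current version of an article.

```python
from collections import Counter

# Hypothetical edit log for a single article: (user, characters_added).
# Every name and number here is made up for illustration only.
edits = [
    ("outsider_a", 1200),  # one large contribution of new material
    ("insider_x", 15),     # small formatting tweaks follow
    ("insider_x", 8),
    ("insider_y", 20),
    ("outsider_b", 900),   # another one-off chunk of content
    ("insider_x", 5),
    ("insider_y", 12),
    ("insider_x", 10),
]


def shares(edit_log):
    """Map each user to (share of edit count, share of characters added)."""
    edit_counts = Counter(user for user, _ in edit_log)
    char_counts = Counter()
    for user, chars in edit_log:
        char_counts[user] += chars
    total_edits = sum(edit_counts.values())
    total_chars = sum(char_counts.values())
    return {
        user: (edit_counts[user] / total_edits, char_counts[user] / total_chars)
        for user in edit_counts
    }


def top_by(result, index):
    """Return the user with the largest share; index 0 = edits, 1 = text."""
    return max(result, key=lambda user: result[user][index])


result = shares(edits)
for user, (edit_share, text_share) in sorted(result.items()):
    print(f"{user}: {edit_share:.0%} of edits, {text_share:.0%} of text")
print("Most edits:", top_by(result, 0))
print("Most text:", top_by(result, 1))
```

In this toy log the two rankings disagree: the user with the most edits is not the user who added the most text, which is the pattern the excerpt describes.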
Monday, June 16, 2008
Neal Stephenson on wiring the world
Mother Earth Mother Board
The hacker tourist ventures forth across the wide and wondrous meatspace of three continents, chronicling the laying of the longest wire on Earth.
By Neal Stephenson
Many of the themes from the WIRED article also appear in my favorite Stephenson novel, Cryptonomicon (Google Books version). I have a particularly soft spot for the novel, since its plot parallels some spooky, crypto aspects of my own startup experience.
Friday, March 04, 2005
Times on China Internet censorship
The PRC government is expending a lot of resources on this, and is in many ways quite successful. But, around the edges, there is no stopping the flow of information. While there is no effective political organization in China beyond the government, ordinary people (or, at least, the few hundred million people with direct or indirect access to the Internet) have greater and greater access to uncensored information.
Fairly soon the expectations of the average person in China for democracy and personal freedom will be no different than in other parts of the world. There will be a consensus view that it is "normal" for the government to implement democratic reforms, if only in a gradual way.
Expectations for better governance are increasing everywhere (well, perhaps not in the US ;-) As shown in Georgia, Ukraine and Lebanon, fewer and fewer soldiers are willing to shoot peaceful demonstrators in support of an unpopular government, and the demonstrators know this. Perhaps satellite TV deserves as much credit for this as the Internet, but both are playing an important role.
An interesting (and optimistic) quote from the article: "All of the big mistakes made in China since 1949 have had to do with a lack of information," said Guo Liang, an Internet expert at the Chinese Academy of Social Sciences in Beijing. "Lower levels of government have come to understand this, and I believe that since the SARS epidemic, upper levels may be beginning to understand this, too."