Saturday, February 28, 2015

Some long-form articles worth your time

Nothing connects these articles, other than that they're all interesting (to me, that is).

And they're all long.

  • Invasion of the Hedge Fund Almonds
    Our increasing fondness for nuts—along with a $28-million-a-year marketing campaign by the Almond Board of California—is part of what has prompted the almond boom. But the main driver comes from abroad. Nearly 70 percent of California's almond crop is exported, with China the leading customer: Between 2007 and 2013, US almond exports to China and Hong Kong more than quadrupled, feeding a growing middle class' appetite for high-protein, healthy food. Almonds now rank as the No. 1 US specialty crop export, beating wine by a count of $3.4 billion to $1.3 billion in 2012. (Walnuts and pistachios hold the third and fourth spots, each bringing in more than $1 billion in foreign sales.) As a result, wholesale almond prices jumped 78 percent between 2008 and 2012, even as production expanded 16 percent.

    According to UC-Davis' Howitt, the shift to almonds and other tree nuts is part of a long-term trend in California, the nation's top agricultural state. Farmers in the Central Valley once grew mostly wheat and cattle. But over time, they have gravitated toward more-lucrative crops that take advantage of the region's rare climate. "It's a normal, natural process driven by market demand," Howitt says. "We grow the stuff that people buy more of when they have more money." Like nuts, which can replace low-margin products such as cotton, corn, or beef.

  • How crazy am I to think I actually know where that Malaysia Airlines plane is?
    Meanwhile, a core of engineers and scientists had split off via group email and included me. We called ourselves the Independent Group, or IG. If you found yourself wondering how a satellite with geosynchronous orbit responds to a shortage of hydrazine, all you had to do was ask. The IG’s first big break came in late May, when the Malaysians finally released the raw Inmarsat data. By combining the data with other reliable information, we were able to put together a time line of the plane’s final hours: Forty minutes after the plane took off from Kuala Lumpur, MH370 went electronically dark. For about an hour after that, the plane was tracked on radar following a zigzag course and traveling fast. Then it disappeared from military radar. Three minutes later, the communications system logged back onto the satellite. This was a major revelation. It hadn’t stayed connected, as we’d always assumed. This event corresponded with the first satellite ping. Over the course of the next six hours, the plane generated six more handshakes as it moved away from the satellite.
  • Proving that Android’s, Java’s and Python’s sorting algorithm is broken (and showing how to fix it)
    After we had successfully verified Counting and Radix sort implementations in Java (J. Autom. Reasoning 53(2), 129-139) with a formal verification tool called KeY, we were looking for a new challenge. TimSort seemed to fit the bill, as it is rather complex and widely used. Unfortunately, we weren’t able to prove its correctness. A closer analysis showed that this was, quite simply, because TimSort was broken and our theoretical considerations finally led us to a path towards finding the bug (interestingly, that bug appears already in the Python implementation). This blog post shows how we did it.
  • Mastering Git submodules
    Submodules are hair-pulling for sure, what with their host of pitfalls and traps lurking around most use cases. Still, they are not without merits, if you know how to handle them.

    In this post, we’ll dive deep into Git submodules, starting by making sure they’re the right tool for the job, then going through every standard use case, step by step, so as to illustrate best practices.

  • Mastering Git subtrees
    A month ago we were exploring Git submodules; I told you then our next in-depth article would be about subtrees, which are the main alternative.

    As before, we’ll dive deep and perform every common use-case step by step to illustrate best practices.

Friday, February 27, 2015

Rough news for the Derby project

Nearly 20 years ago, when Java was just emerging as an exciting new programming language, a small software company named "Cloudscape" was started up to build database software in Java.

Although I never worked at Cloudscape, their offices were only a couple blocks from my office, and I knew a number of the principal engineers very well.

Cloudscape assembled a superb engineering team and built a powerful product, but struggled to find commercial success as the "Dot Com Bubble" burst around the end of the 1990s. In 1999, Cloudscape was acquired by Informix, and in 2001 Informix was acquired by IBM.

Ten years ago, in the summer of 2004, IBM contributed the code to the Apache Software Foundation as Derby.

For many years, Derby was one of the most active and most successful projects at Apache, with dozens of committers and contributors building new features and fixing bugs, and the project produced release after release after release of new software. Both IBM and Sun Microsystems made substantial commitments to the project, providing material resources such as testing labs and equipment, but more importantly employing some of the most talented engineers I've ever had the pleasure of working with, and enabling those engineers to work on Derby.

It was an open source nirvana.

But in recent years, the community has struggled.

Sun Microsystems, of course, collapsed during the Great Recession of 2008, and in 2009 was sold to Oracle Corporation. IBM remains an independent corporation but is suffering greatly as well.

The end result is that, over the last year, both Oracle and IBM have essentially halted their support of the Derby project. Certainly in both cases this was done for valid and undoubtedly necessary business reasons, but the impact on the Derby project is severe.

It's hard for a non-programmer to understand the attachment that a programmer feels to their code. It's just an inanimate thing, code, but when you spend 20 years devoting almost every waking minute to thinking about it, and concentrating on it, and giving it your best, you grow powerfully attached to that code.

I feel bad for all the friends that I've made over the years, and wish them well. Such a collection of brilliant Java programmers has rarely been assembled, and I am sure that they are all going to move on to much better and brighter prospects.

And I feel bad for the Derby project, which was, at one time, a poster child for what an open source project could be, and for what the open source development process could produce, but is now a codebase whose future, frankly, must be considered to be in doubt.

Personally, I continue to enjoy working with the Derby codebase, and it is a professional interest of mine, so I hope to remain involved with the project as long as Apache will allow it to continue.

I'm not sure why I felt the need to post this, but I didn't want Derby to just quietly fade away without somebody taking a minute to salute it, and praise it, and record what was, what is, and (perhaps) what will be.

To close, let me share what is (I think) the last picture of the remaining Derby development team, taken last fall, just around the time that the people in question were learning the fate that their corporate masters had in mind for the work they devoted the greater portion of their professional lives to.

Tuesday, February 24, 2015

What I'm reading, late February edition

Working hard, reading a lot.

  • 13th USENIX Conference on File and Storage Technologies
    The full Proceedings published by USENIX for the conference are available for download below. Individual papers can also be downloaded from the presentation page. Copyright to the individual works is retained by the author[s].
  • http2 explained
    http2 explained describes the protocol HTTP/2 at a technical and protocol level. Background, the protocol, the implementations and the future.
  • You Had One Job, Lenovo
    When Lenovo preinstalled Superfish adware on its laptops, it betrayed its customers and sold out their security. It did it for no good reason, and it may not even have known what it was doing. I’m not sure which is scarier.
  • Lenovo PCs ship with man-in-the-middle adware that breaks HTTPS connections
    It installs a self-signed root HTTPS certificate that can intercept encrypted traffic for every website a user visits. When a user visits an HTTPS site, the site certificate is signed and controlled by Superfish and falsely represents itself as the official website certificate.
  • Superfish, Komodia, PrivDog vulnerability test
    Check the box below. If you see a "YES", you have a problem.
  • Extracting the SuperFish certificate
    I extracted the certificate from the SuperFish adware and cracked the password ("komodia") that encrypted it. I discuss how down below.
  • Exploiting the Superfish certificate
    As discussed in my previous blogpost, it took about 3 hours to reverse engineer the Lenovo/Superfish certificate and crack the password. In this blog post, I describe how I used that certificate to pwn victims using a rogue WiFi hotspot.
  • How to target XP with VC2012 or VC2013 and continue to use the Windows 8.x SDK
    One of the limitations of the Microsoft provided solution for targeting XP while using Visual Studio 2012 (Update 1 and above), or Visual Studio 2013, is that you must use a special “Platform toolset” in project properties that forces usage of the Windows SDK 7.1 (instead of Windows 8.x SDK which is the default). The other function the platform toolset provides is that it sets the Linker’s “Minimum Required Version” setting to 5.01 (instead of 6 which is the default). But that function can just as easily be done manually by setting it in project properties.
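
    For the record, the "Minimum Required Version" tweak is just a linker flag; here is a sketch of the command-line equivalent (the property-page path varies a bit between VS versions):

```
REM Equivalent of setting Linker > System > Minimum Required Version to 5.01
REM (5.01 = 32-bit XP; use 5.02 for 64-bit XP / Server 2003)
link /SUBSYSTEM:CONSOLE,5.01 main.obj
```
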
  • There are too many shiny objects and it is killing me
    The rest of the day is then used renting a VPS server, installing Linux (for the cool-factor) and going through the mandatory list of essential stuff I need, like version managers, package managers, vim bundles, custom prompts, terminal colors, and so forth. Somewhere along the way I get sidetracked and I dump the Linux installation and install Windows Server.
  • Shipping Culture Is Hurting Us
    Quickly getting something in front of the people that will actually use it is a great idea. It means you waste less time building something they don’t actually want. But I look around the industry today and I get worried. Don’t get me wrong – I see brilliant people shipping brilliant, innovative software. But I also see a lot of us using half-baked technologies to shove half-assed software out the door.
  • Programming Achievements: How to Level Up as a Developer
    We've all had specific experiences that clearly advanced our skills as developers. We've learned a new language that exposed us to a new way of thinking. Or we crafted the perfect design, only to watch it unveil its gross imperfections in the harsh realities of a production environment. And we became better programmers because of it. Some experiences equip you with new techniques. Others expose you to anti-patterns...and allow you to understand why they are anti-patterns. It's these experiences that teach you, that influence your thought process, that influence your approach to problems, that improve your designs.
    Musicians get better by practice and tackling harder and harder pieces, not by switching instruments or genres, nor by learning more and varied easy pieces. Ditto almost every other specialty inhabited by experts or masters.
  • An Ideal Conversation
    This is an article about basic conversation mechanics. It’s not about what motivates the person sitting across from you, it’s about some of the quirks you’ll encounter as the conversation occurs.
  • Procedural City Generation
    Three layers of simplex noise were combined to define the population density map. The resulting map has two purposes. One is to guide the forward extension of existing road segments; if a random deviation will reach a higher population than extending the original segment straight ahead, the extension will match that deviation. The second purpose of the population map is to determine when normal road segments should branch off from a highway - when the population along the highway meets a defined threshold.
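
    The extension rule described above is easy to sketch. This is only an illustrative approximation (all names are invented, and a few sinusoids stand in for the article's three layers of simplex noise):

```python
import math
import random

random.seed(42)

def population(x, y):
    # Stand-in for layered simplex noise: a smooth scalar "density" field.
    return (math.sin(0.11 * x) + math.sin(0.07 * y)
            + math.sin(0.05 * (x + y))) / 3.0

def extend_segment(x, y, heading, step=10.0, max_dev=math.radians(15)):
    # Try the straight continuation plus a few random deviations, and
    # keep whichever endpoint reaches the highest population.
    candidates = [heading] + [heading + random.uniform(-max_dev, max_dev)
                              for _ in range(4)]
    best = max(candidates,
               key=lambda h: population(x + step * math.cos(h),
                                        y + step * math.sin(h)))
    return x + step * math.cos(best), y + step * math.sin(best), best
```

The branching rule the article mentions would then be a second check: when population at the new endpoint exceeds a threshold, spawn a branch segment as well.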
  • MAS.S66: Indistinguishable From…Magic as Interface, Technology, and Tradition
    With a focus on the creation of functional prototypes and practicing real magical crafts, this class combines theatrical illusion, game design, sleight of hand, machine learning, camouflage, and neuroscience to explore how ideas from ancient magic and modern stage illusion can inform cutting edge technology.

    Guest lecturers and representatives of Member companies will contribute to select project critiques. Requires regular reading, discussion, practicing magic tricks, design exercises, a midterm project and final project.

  • Iseb: The Worm Maiden
    If you are a true tunnel fan, maybe a true fanatic, then when you hear about boring machines (the mothers of all tunnels) it makes you want to see them. Good. So that’s what happened to us. Like grimy servants we followed every new trace that could lead us to her; the aim of our two-year quest was always to see the toughest of all the machines. A dormant juggernaut that lies underground.

Monday, February 23, 2015

Five Thoughts on Software Testing

I felt like sharing some thoughts on testing, not necessarily related to any particular incident, just some things that were on my mind.

  • The easiest time to add the tests is now.

    It always seems to be the case that the time to write the tests is sometime later.

    • "I'm pretty much done; I just have to write some tests."
    • "If I write a lot of tests now, I'll just end up changing them all later."
    • "We're still in the design phase; how can we write any tests now?"

    I'm here to tell you that you can, and should, write those tests now. The main reason is that the sooner you write tests, the sooner you can start running those tests, and thus the sooner you will start benefiting from your tests.

    And I often write my tests when I'm still in the design phase; let me explain what I mean by that. While I'm noodling along, thinking about the problem at hand, toying with different ways to solve it, starting to sketch out the framework of the code, I keep a pad of paper and a pen at hand.

    Each time I think of an interesting situation that the code will have to handle, I have trained myself to immediately make a note of that on my "tests" pad.

    And as I'm coding, as I'm writing flow-of-control code, like if tests, while loops, etc., I note down additional details, so that I'm keeping track of things that I'll need to test (both sides of an if, variables that will affect my loop counters, error conditions I'll want to provoke in my testing).

    And, lastly, because I'm thinking about the tests while I design and write the code, I'm also remembering to build in testability, making sure that I have adequate scaffolding around the code to enable me to stimulate the test conditions of interest.
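
    To make the note-pad habit concrete, here is a toy sketch (the function is invented for illustration): a function with two branch points means the "tests" pad gets entries for both sides of each if, plus the boundaries, and those notes become assertions immediately:

```python
def clamp(value, low, high):
    # Two branch points -> four or five entries on the "tests" pad.
    if value < low:
        return low
    if value > high:
        return high
    return value

# From the pad: below range, above range, inside range, both boundaries.
assert clamp(-5, 0, 10) == 0
assert clamp(15, 0, 10) == 10
assert clamp(7, 0, 10) == 7
assert clamp(0, 0, 10) == 0
assert clamp(10, 0, 10) == 10
```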

  • Tests have to be designed, implemented, and maintained.

    Tests aren't just written once. (Well, good ones aren't.) Rather, tests are written, run, revised, augmented, adjusted, over and over and over, almost as frequently as the code itself is modified.

    All those habits that you've built around your coding, like

    • document what you're doing, choose good variable names, favor understandable code whenever possible
    • modularize your work, don't repeat yourself, design for change

    All of those same considerations apply to your tests.

    Don't be afraid to ask for a design review for your tests.

    And when you see a test with a problem, take the time to refactor it and improve it.

  • Tests make your other tools more valuable.

    Having a rich and complete set of tests brings all sorts of payback indirectly.

    You can, of course, run your tests on dozens of different platforms, under dozens of different configurations.

    But there are much more interesting ways to run your tests.

    • Run your tests with a code-coverage tool, to see what parts of your codebase are not being exercised.
    • Run your tests with analyzers like Valgrind or the Windows Application Verifier.

    Dynamic analysis tools like Valgrind are incredibly powerful, but they become even more powerful when you have an extensive set of tests. You can start to think of each test that you write as actually enabling multiple tests: your test itself, your test on different platforms and configurations, your test under Valgrind's leak detector, your test under Valgrind's buffer overrun detector, etc.
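
    As a sketch of what those runs look like in practice (the flags shown are the common ones, not specific to any particular project):

```sh
# Coverage: run the suite under a coverage tool, then report.
coverage run -m pytest && coverage report

# Dynamic analysis: the same test binary, now under Valgrind.
valgrind --leak-check=full --error-exitcode=1 ./run_tests
```
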
  • Keep historical records about running your tests

    As you're setting up your CI system to execute your test bed, ensure that you arrange to keep historical records about running your tests.

    At a minimum, try to record which tests failed, on what platforms and configurations, on what dates.

    A better historical record will preserve the output from the failed tests.

    A still better historical record will also record at least some information about the successful tests, too. Most test execution software (e.g., JUnit) can produce a simple output report which lists the tests that were run, how long each test took, and whether or not it succeeded. These textual reports, whether in HTML or raw text format, are generally not large (and even if they are, they compress really well), so you can easily keep the records of many thousands of test runs.
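
    Those reports are simple enough to mine with a few lines of code. A sketch (the XML below is an invented JUnit-style fragment; the tag names follow the common JUnit report format):

```python
import xml.etree.ElementTree as ET

# Hypothetical JUnit-style report for one run (names invented).
report = """<testsuite name="nightly" tests="3">
  <testcase classname="db.InsertTest" name="basic" time="0.12"/>
  <testcase classname="db.InsertTest" name="rollback" time="0.31">
    <failure message="rows mismatch"/>
  </testcase>
  <testcase classname="db.SelectTest" name="scan" time="1.05"/>
</testsuite>"""

def summarize(xml_text):
    # Pull out the failed test names and the total run time,
    # the two things you most want in a historical record.
    suite = ET.fromstring(xml_text)
    failed = [f"{tc.get('classname')}.{tc.get('name')}"
              for tc in suite.iter("testcase")
              if tc.find("failure") is not None]
    total_time = sum(float(tc.get("time")) for tc in suite.iter("testcase"))
    return failed, round(total_time, 2)

failed, total = summarize(report)
```

Accumulate one such summary per run, per platform, per date, and the intermittent-failure and regression-hunting uses below fall out almost for free.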

    Over time, you'll discover all sorts of uses for the historical records of your test runs:

    • Looking for patterns in tests that only fail intermittently
    • Detecting precisely when a regression was introduced, so you can tie that back to specific changes in the SCM database and quickly find and repair the cause
    • Watching for performance regressions by tracking the performance of certain tests designed to reveal performance behaviors
    • Monitoring the overall growth of your test base, and relating that to the overall growth of your code base
  • You still need professional testers.

    All too often, I see people try to treat automated regression testing as an "either-or" choice versus having professional test engineers on staff.

    You need both.

    The real tradeoff is this: by investing in automated regression testing, by having your developers cultivate the habit and discipline of always writing and running thorough basic functional and regression tests, you free up the resources of your professional test engineers to do the really hard stuff:

    • Identifying, isolating, and reproducing those extremely tough bug reports from the field.
    • Building custom test harnesses to enable cross-platform interoperability testing, upgrade and downgrade testing, multi-machine distributed system testing, fault injection testing, etc.
    • Exploratory testing
    • Usability testing

    All of those topics that never seem to find enough time are within your reach, if you just build a firm and solid foundation that enables you to reach out for the wonderful.

Oh, and by the way: Leadership is not the problem!

Saturday, February 21, 2015

It has that name for a reason

Over the last six weeks or so, we've been bothered by a smell, an odor, in our pantry.

Several times, we've hunted through it, digging around, trying to figure out what was causing it.

The leading candidates were the two packages of salmon-liver and bison dog training treats.

Indeed, they are detectable even to our poor human noses, and of course are many hundreds of times more interesting to our dear old black Lab.

But we sealed them up in ziplocs and moved them elsewhere, and they were not the issue.

It seemed rather sulfurous, so much so that we even called the Gas Company and asked them if we somehow had a leak, though it was nowhere close to the supply lines.

Dutifully, they sent a nice man. He agreed that there was a definite smell, and that it was not that far from the mercaptan that they deliberately introduce to natural gas for exactly this reason.

But it was not a gas leak.

Infuriated, I finally unloaded the entire pantry into our living room, spreading things out everywhere (it's rather a large pantry).

And, just as I thought might happen, suddenly Donna exclaimed:

Oh, Ugh! Yes, this is it!

And once we both saw the culprit, I instantly understood why.

Let's pick up the story from the delightful article in a 2009 issue of Saudi Aramco World: "Devil's Dung": The World's Smelliest Spice.

I bought a fist-sized lump of brown-gray resin. Slightly sticky to the touch, it was as dense as a block of wood. Mostly, though, it was remarkable for its terrible, aggressive smell—a sulfurous blend of manure and overcooked cabbage, all with the nose-wrinkling pungency of a summer dumpster. The stench leached into everything nearby, too, which meant I had to double-wrap it and seal it in a plastic tub if I wanted to keep it in the kitchen.

About six months ago, we were trying to cook some recipe, and it called for Asafoetida.

Which we didn't have.

So, we did without (and the recipe was fine).

But then we happened to be in an Indian grocery sometime around the holidays, and I pointed out a jar on the shelf: "Look, dear, they have asafoetida! Shall we buy some, so in case we ever cook a recipe which has it, we'll have it at hand?"

Now, I'm not so sure that was a wise idea.

But at least the Great Mystery of the Pantry Odor is solved.

Americanah: A very short review

I happened to be taking several cross-country plane trips recently, and I brought along Chimamanda Ngozi Adichie's Americanah.

It was a perfect book for reading in wild intense bursts while confined to a too-small seat for too many hours.

Adichie writes superbly: reading her book is both effortless and enthralling. Events feel real and immediate; the characters seem as though they are speaking directly to you. Americanah gives you that wonderful sense that somehow you are sitting on the shoulder of the protagonist, like Jiminy Cricket to Pinocchio, seeing, hearing, touching, even thinking everything right along with Our Hero.

Now, I should say: this is a book about what it's like to be a completely different person than I am. So while I really appreciated Adichie's sharing those emotions and experiences with me, I am (I hope) humble enough to understand that the pain and the sorrow and the trauma that she discusses will never be even the remotest part of my life. At times it makes me uncomfortable; perhaps she is aware of that; I don't think she means to cause that discomfort, at least not directly, but I suspect she would be satisfied to know that it does in fact result.

I wasn't even the slightest bit disappointed in Americanah. I hope it finds many readers; I hope she finds many readers; I hope she writes many more wonderful books.

Monday, February 9, 2015

More thoughts on CVE-2014-9390

I'm still muddling away, trying to understand CVE-2014-9390.

One thing that's weird is that if you go to the MITRE CVE database for CVE-2014-9390, you get:

** RESERVED ** This candidate has been reserved by an organization or individual that will use it when announcing a new security problem. When the candidate has been publicized, the details for this candidate will be provided.

And if you follow the link to the National Vulnerability Database, you get: National Vulnerability Database: Unable to find vuln CVE-2014-9390

Is this unusual? I thought that, two months after the disclosure, I'd be able to find this vulnerability in the public databases by now.

How long does it usually take?

Meanwhile, I've been amusing myself by trying to understand the vulnerability in more detail.

One very interesting article in this area is on the Metasploit blog: 12 Days of HaXmas: Exploiting CVE-2014-9390 in Git and Mercurial.

Both Git and Mercurial clients have had code for a long time that ensures that no commits are made to anything in the .git or .hg directories. Because these directories control client side behavior of a Git or Mercurial repository, if they were not protected, a Git or Mercurial server could potentially manipulate the contents of certain sensitive files in the repository that could cause unexpected behavior when a client performs certain operations on the repository.

Another Metasploit-related article is on the Packet Storm Security blog: Malicious Git And Mercurial HTTP Server For CVE-2014-9390

This Metasploit module exploits CVE-2014-9390, which affects Git (versions less than 1.9.5, 2.0.5, 2.1.4 and 2.2.1) and Mercurial (versions less than 3.2.3) and describes three vulnerabilities.

I'm particularly interested in the HFS+ aspects of the vulnerability.

Although this vulnerability received massive attention as a git issue, it was actually discovered by the author of Mercurial (Matt Mackall), so I tried digging into the Mercurial-related info for more clues.

In the fix notes for this problem for Mercurial, it seems that the bug arises when manipulating a UTF8-encoded filename string.

Meanwhile, a close reading of the git patch indicates that for git, the bug also involved UTF8-encoded filenames.

And Augie Fackler's patch includes reference to an ancient Apple Technote 1150 which might give some more clues about when the HFS+ filesystem has this vulnerability.

I found a copy of that technote here: Technical Note TN1150, and it contains this little bit:

In HFS Plus and case-insensitive HFSX, strings must be compared in a case-insensitive fashion. The Unicode standard does not strictly define upper and lower case equivalence, although it does suggest some equivalences. The HFS Plus string comparison algorithm (defined below) includes a concrete case equivalence definition. An implementation must use the equivalence expressed by this algorithm.

Furthermore, Unicode requires that certain formatting characters be ignored (skipped over) during string comparisons. The algorithm and tables used for case equivalence also arrange to ignore these characters. An implementation must ignore the characters that are ignored by this algorithm.

A more relevant section of that document is probably the Unicode Subtleties section, where we read:

To reduce complexity in the B-tree key comparison routines (which have to compare Unicode strings), HFS Plus defines that Unicode strings will be stored in fully decomposed form, with composing characters stored in canonical order. The other equivalent forms are illegal in HFS Plus strings. An implementation must convert these equivalent forms to the fully decomposed form before storing the string on disk.

I'm not sure what any of this means (I worry that I'm approaching the state of "fully decomposed" myself, nowadays), nor do I know whether to trust a random copy of a 10-year-old Apple document. But perhaps what it's saying is that this vulnerability is specific to a code path where you might pass UTF8-encoded filenames to the operating system?
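
If I'm reading the technote right, a name reaches the on-disk catalog only after full decomposition, and comparisons additionally fold case and skip certain "ignorable" characters. A rough Python sketch of why that matters for a name like ".git" (this is an approximation: the real TN1150 tables are much larger, and .lower() is not exactly its case folding):

```python
import unicodedata

# A few of the characters TN1150's tables treat as ignorable during
# comparison (illustrative subset, not the full list).
IGNORABLE = {"\u200c", "\u200d", "\u200e", "\u200f", "\ufeff"}

def hfs_plus_key(name):
    # 1. Names are stored fully decomposed (roughly Unicode NFD).
    decomposed = unicodedata.normalize("NFD", name)
    # 2. Comparison skips the ignorable characters and folds case.
    return "".join(ch for ch in decomposed if ch not in IGNORABLE).lower()

# Distinct byte strings that collide with ".git" under these rules:
assert hfs_plus_key(".g\u200cit") == hfs_plus_key(".git")
assert hfs_plus_key(".GIT") == hfs_plus_key(".git")
```

Which is exactly the shape of the git/Mercurial problem: a tree entry the client's ".git" check doesn't recognize can still land on the real .git directory when checked out onto a case-insensitive HFS+ volume.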

On a relatively modern Mac, running 'man 2 open' just says:

int open(const char *path, int oflag, ...);

The file name specified by path is opened for reading and/or writing, as specified by the argument oflag; the file descriptor is returned to the calling process.

Nowhere obvious in the 'man 2 open' output is anything about the encoding of the filename that you provide to open().

Filename encoding on HFS+ is obviously confusing to many, as witnessed by any number of StackOverflow questions on the subject.

And the Mac filesystem in general is an object of fascination to many: here's a great old article by John Siracusa on Ars Technica: The state of the file system.

I suspect that the bottom line is that certain aspects of CVE-2014-9390 are going to remain opaque to me.

There are just too many things to learn, and not enough time.

Sunday, February 8, 2015

In which people discuss things I don't understand

Is it important? People think so. The market thinks so. So I guess it is.

It's at least interesting.

  • Replacing Middle Management with APIs
    What’s bizarre here is that these lines of code directly control real humans. The Uber API dispatches a human to drive from point A to point B. And the 99designs Tasks API dispatches a human to convert an image into a vector logo (black, white and color). Humans are on the verge of becoming literal cogs in a machine, completely anonymized behind an API. And the companies that control those APIs have strong incentives to drive down the cost of executing those API methods.

    In the long run there’s always something for people to work on and improve, but the introduction of this software layer makes me worry about mid-term employment 5-20 years out. Drivers are opting into a dichotomous workforce: the worker bees below the software layer have no opportunity for on-the-job training that advances their career, and compassionate social connections don’t pierce the software layer either. The skills they develop in driving are not an investment in their future. Once you introduce the software layer between “management” (Uber’s full-time employees building the app and computer systems) and the human workers below the software layer (Uber’s drivers, Instacart’s delivery people), there’s no obvious path upwards. In fact, there’s a massive gap and no systems in place to bridge it.

  • Uber Wants to Replace Its Drivers With Robots. So Much for That “New Economy” It Was Building.
    As Danny Vinik pointed out at the New Republic when Uber’s study initially came out, the evidence that contract jobs lead to unstable employment situations with poor benefits is, so far, weak. It’s also unclear how much longer Uber will be able to legally sustain its contractor-dependent model. Late last week, a federal judge said that Uber’s drivers might have to be treated as employees instead of independent contractors, which would change a lot. But even taking all that uncertainty into account, you know what’s a quick way to undermine your claims about building a new and sustainable work model? Announcing two weeks later that you are building a research facility to develop the technologies that will eventually render all of the people employed in that new work model obsolete.
  • Uber Drivers Love Uber, Says Uber Survey. Here's Why You Should Believe It.
    Seventy-eight percent of respondents said they are either very satisfied or somewhat satisfied with Uber, and 69 percent say their opinion of the company has improved since they started. Seventy-one percent said that Uber boosts their income, compared to just 11 percent who said it worsened it. By a 74-5 margin, drivers say that Uber has made their lives better by giving them more flexibility with their schedule. It's unlikely the study's potential biases account for such enthusiastic results.
  • Uber and Lyft Drivers May Have Employee Status, Judge Says
    “The idea that Uber is simply a software platform, I don’t find that a very persuasive argument,” U.S. District Judge Edward Chen said in court Friday.
  • South Korea rejects Uber registration proposal, vows to shut firm down
    "Transporting customers with private or rented cars and accepting compensation is clearly illegal. The company is ignoring local laws by stating its intention to continue such operations," the ministry said in a statement.
  • Disrupt and dismay: why Cape Town’s metered taxi operators are uber-upset
    Transport in Cape Town, especially when it comes to the taxi industries, has always been a sensitive issue. It’s a fragmented industry with tensions climaxing in events such as the minibus taxi violence in 2013 — a different sector but a stark reminder of how intense things can get in the transport space.
  • After handing over license plate info, Uber re-opens its NYC bases
    Uber initially cited trade secrets for not wanting to give up its data. It may not have wanted local governments to be able to send their taxis to popular Uber areas at peak times, for example. But as evidenced by the blog post on giving data to Boston and other cities, the company had a change of heart, realizing that it could offer data to cities as an olive branch and potentially ease Uber’s local regulatory conflicts as a result.
  • Crib Sheet: Saturn's Children
    A society that runs on robot slaves who are, nevertheless, intelligent by virtue of having a human neural connectome for a brain, is a slave society: deeply unhealthy if not totally diseased. I decided to shove the slider all the way over towards terminally diseased (I am not a fan of slavery).
  • Meditations on Moloch
    The implicit question is – if everyone hates the current system, who perpetuates it? And Ginsberg answers: “Moloch”. It’s powerful not because it’s correct – nobody literally thinks an ancient Carthaginian demon causes everything – but because thinking of the system as an agent throws into relief the degree to which the system isn’t an agent.

    Bostrom makes an offhanded reference of the possibility of a dictatorless dystopia, one that every single citizen including the leadership hates but which nevertheless endures unconquered. It’s easy enough to imagine such a state.

Saturday, February 7, 2015

Map art

Two fun articles about two of my favorite things: maps, and art.

  • 100 Years of National Geographic Maps: The Art and Science of Where
    At this writing (the count is obsolete as soon as it is tallied), National Geographic cartographers have produced 438 supplement maps, ten world atlases, dozens of globes, about 3,000 maps for the magazine, and many maps in digital form.
  • Beautiful hand painted ski trail maps
    If you've ever noticed that most ski trail maps look kinda the same, the reason is that many of them were painted by a single individual: James Niehues.

    Each view is hand painted by brush and airbrush using opaque watercolor to capture the detail and variations of nature's beauty. In many instances, distortions are necessary to bring everything into a single view. The trick is to do this without the viewer realizing that anything has been altered from the actual perspective.

Some great construction porn

You know, for people who like to look at pictures of giant construction projects.

Stuff I'm reading, blustery day edition

I think the New Normal is that it will rain only about 2 days a month.

But when it rains, it rains HARD.

  • An Introduction to Computer Networks
    Welcome to the website for An Introduction to Computer Networks, a free and open general-purpose computer-networking textbook, complete with diagrams and exercises. It covers the LAN, internetworking and transport layers, focusing primarily on TCP/IP. Particular attention is paid to congestion; other special topics include queuing, real-time traffic, network management, security and the ns simulator.
  • Surviving Data Science "at the Speed of Hype"
    There is this idea endemic to the marketing of data science that big data analysis can happen quickly, supporting an innovative and rapidly changing company. But in my experience and in the experience of many of the analysts I know, this marketing idea bears little resemblance to reality.
  • Why Robot?
    All the hoopla about “robots stealing our jobs” has led people to assume that if some new technology performs a task traditionally performed by humans, it must be a robot. But the term carries a lot of pop-cultural baggage that risks clouding our understanding of what we’re really talking about when we talk about automation.
  • Robot-writing increased AP’s earnings stories by tenfold
    AP managing editor Lou Ferrara told Automated Insights that the news cooperative’s customers are happy to be receiving more stories, and that automation has freed up reporters to work on more difficult stories, according to the release.
  • Why Learning to Code is So Damn Hard
    for the most part, these introductory tools do a great job of guiding you like a child in a crosswalk past the big scary variables and conditional statements and through the early phases of programming syntax. As you conquer one after another of their gamified challenges, your confidence rises. Maybe you can do this after all! How hard can it be? You're basically a developer already!
  • Screw motivation, what you need is discipline.
    How do you cultivate discipline? By building habits – starting as small as you can manage, even microscopic, and gathering momentum, reinvesting it in progressively bigger changes to your routine, and building a positive feedback loop.
  • DeathHacks
    There’s a lot of grunt work that needs to be done after a death. Magazine subscriptions need to be stopped. Home services like lawn care need to be curtailed or amended. Maintenance schedules need to be deduced from calendar entries or handwritten notes. A lot of this is just simple phone calls or emails. Some of it is not.

    The cable company, when called, wouldn’t let me downgrade my dad’s service without receiving a faxed copy of the death certificate and a letter outlining my executrixship. They insisted on me changing the account into my name, pronto. I hung up. Their online chat service, however, would let “Tom” do anything he wanted as long as he had the account number and the eminently guessable-by-me password that was the same as the alarm system’s. This was a reproducible result. Using my dad’s Google document with the usernames and passwords of all of his major accounts, I got a lot of things accomplished without having to speak to or see a real person.

  • Stibitz & Wilson Honorees
    The George R. Stibitz Computer & Communications Pioneer Awards honor individuals who have made significant contributions to the fields of computing and communications. The Edward O. Wilson Biodiversity Technology Pioneer Awards honor individuals who have made significant contributions to the preservation of biodiversity on Earth.

The Book of Strange New Things: a very short review

One of my holiday gifts was Michel Faber's The Book of Strange New Things, and I finished reading it the other day.

Is it possible for a book to be both compelling and repellent? Both bizarre and familiar? Both pedestrian and exotic?

I suppose so, since all those words seem to come to mind as I think about The Book of Strange New Things.

Faber's story tells of Peter Leigh, a young Londoner in some near-future time, who has somehow volunteered for, or applied at, or been chosen by, but at any rate is now associated with, an anonymous corporate entity named USIC, which has put Peter into a spaceship and sent him somewhere else.

It's not really clear where the else is, or why USIC wants to be there, or what's actually happening on that place, except that Peter is now there, along with some other people.

Oh, and Peter, you see, is a minister.

You might think that Peter is there to minister to the other USIC staff on this faraway place. But no, Peter is actually there to minister to the indigenous population: the aliens.

These aliens are not human beings, although they are sufficiently similar to allow for comparisons, and for communication, and for interactions. Happily, by the time Peter gets there the aliens (whom Peter comes to call Oasans) have learned a small amount of English, and during the course of the book their English improves as Peter talks to them and reads to them from The Book of Strange New Things (which is the term the aliens use for Peter's King James Bible). Peter even learns a few words of the alien language.

So that's what the book is about. That, and Peter talking with the other USIC staff, and exchanging emails with his wife, Beatrice, who has remained back in England.

Yes, Peter has left his wife in England and has climbed on a spaceship to some other place. No, I don't buy that for one moment. That's just one of many confounding things about this book.

Another very disturbing thing about The Book of Strange New Things is that it has no ending.

Well, of course, the book ends; there comes a point when there are no more pages to turn and you have finished reading it.

But the story doesn't conclude; it doesn't resolve; it doesn't answer. You just go reading along in the book and then, even though the book has raised all sorts of disturbing and infuriating questions and problems, it just stops. You don't find out what happens; you don't learn the outcome. Infuriatingly, near the end, when Peter gives a speech that seems certain to explain What He Has Learned, he gives the speech to the aliens, in the alien tongue, which is not explained at all, so you have no idea what he has said (although the aliens seem very satisfied by what he tells them).

It's like a sonata which breaks with form and doesn't return to the tonic at the end: it's unsettling and it leaves you all anxious and perturbed.

Usually, when I finish a book, if it was a book worth reading, and if I have paid a fairly reasonable amount of attention, I have some idea what the book was "about": what issues was the author concerned about, what questions are under debate here, where does the author stand on the key topics.

Not so with The Book of Strange New Things.

Of course, it is broadly clear what Faber is interested in: his main character is a minister, the title of the book is the aliens' description of the King James Bible, and the book, overall, is divided into four parts:

  1. Thy Will Be Done
  2. On Earth
  3. As It Is
  4. In Heaven

So obviously the book is a reflection on faith, on belief (Peter's wife, after all, is named "Bea Leigh"), on humanity and its relationship with religion, and on our purpose in life.

But does Faber actually have an opinion on these matters? If he does, he keeps it well-hidden. It seems he is content to tell a tale, to raise questions and pose problems, but he is not trying to take us in any particular direction in these discussions.

Early in the book, Beatrice says to Peter, "I have a vision," and she proceeds to tell him what it is:

I see you standing on the shore of a huge lake. It's night and the sky is full of stars. On the water, there's hundreds of small fishing boats, bobbing up and down. Each boat has at least one person in it. None of the boats are going anywhere, they've all dropped anchor, because everyone is listening. The air is so calm you don't even have to shout. Your voice just carries over the water.

And then, at the end, Peter reflects on his experience. A bit.

he looked around several times at his church silhouetted against the brilliant sky. No one had emerged from it but him. Belief was a place that people didn't leave until they absolutely must. The Oasans had been keen to follow him to the kingdom of Heaven, but they weren't keen to follow him into the valley of doubt. He knew that one day -- maybe very soon -- they would have another pastor. They'd take from him what they needed, and their search for salvation would go on when he was long gone. After all, their souls dreamt so ardently of a longer stay in the flesh, a longer spell of consciousness. It was natural: they were only human.

Is Faber trying to speak directly to us here? Is he frustrated that we are not keen to follow him into the valley of doubt? Is he deliberately unwilling to tell us anything that might pass for an answer, since he knows that our experience with his book is just transitory, and our search will go on when he is long gone?

Is it OK if our boat isn't going anywhere, as long as we are listening?

I'm not sure, and I'm not sure, in the end, what to tell you about The Book of Strange New Things. I'm quite glad I read it, but I think I'd find it very hard to know whether or not to recommend it to anybody else.

Maybe it's just the sort of book that you have to happen onto on your own, and then on your own you have to decide what it is, and what to make of it, and how you will feel about that when you are done.

So on we go.

Wednesday, February 4, 2015

Wikipedia minutiae

There's a great article on Medium about Giraffedata: One Man’s Quest to Rid Wikipedia of Exactly One Grammatical Mistake.

Giraffedata—a 51-year-old software engineer named Bryan Henderson—is among the most prolific contributors, ranking in the top 1,000 most active editors. While some Wikipedia editors focus on adding content or vetting its accuracy, and others work to streamline the site’s grammar and style, generally few, if any, adopt Giraffedata’s approach to editing: an unrelenting, multi-year project to fix exactly one grammatical error.

What, precisely, is that grammatical issue? Giraffedata explains it on a personal Wikipedia page (I wasn't even aware there were such things): User:Giraffedata/comprised of

Many people do not accept "comprised of" as a valid English phrase for any meaning. The argument goes that "to comprise" means to include, as in "The 9th district comprises all of Centerville and parts of Easton and Weston." And thus, "the 9th district is comprised of ..." is gibberish.

The phrase apparently originated as a confusion of "comprise" and "composed of", which mean about the same thing, as in "the 9th district is composed of ..." There is a traditional saying to help people remember these two sound-alike words: "The whole comprises the parts; the parts compose the whole."

But "comprised of" is in common use and some people defend it as a fully valid additional definition of "to comprise". Even dictionaries acknowledge this usage, though they all tell you it's disputed and typically discourage writers from using it. See for example Wiktionary.

Here is my view of why "comprised of" is poor writing
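It's easy to see how a project like this invites a bit of automation: rather than stumbling across each instance by chance, you could scan article text for the offending phrase. Here's a minimal sketch of that idea (my own illustration, not Henderson's actual tooling; the function name and sample text are made up):

```python
import re

# Match "comprised of", tolerating extra whitespace and any capitalization.
_PATTERN = re.compile(r"\bcomprised\s+of\b", re.IGNORECASE)

def find_comprised_of(text):
    """Return (line_number, line) pairs where 'comprised of' appears."""
    return [
        (lineno, line)
        for lineno, line in enumerate(text.splitlines(), start=1)
        if _PATTERN.search(line)
    ]

if __name__ == "__main__":
    sample = (
        "The 9th district is comprised of three towns.\n"
        "The whole comprises the parts; the parts compose the whole.\n"
    )
    for lineno, line in find_comprised_of(sample):
        print(f"line {lineno}: {line}")
```

Of course, finding the phrase is the easy part; as Henderson's page makes clear, choosing the right replacement ("composed of", "comprises", "consists of") depends on the sentence, which is presumably why the fixing is done by a human.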

As the Medium essay notes, Giraffedata's work isn't always welcomed.

He was surprised, however, to find in the first three months that some people disagreed with his edit, sometimes vehemently. “When the first few people said, ‘Why did you do this?’ I said, ‘Well, it’s not grammatical. It’s not English at all.’ And then finally somebody came and said, ‘You jerk, it’s a matter of opinion! It’s completely valid, I looked it up in my dictionary! You have no right to mess with my article!’” Henderson laughs. “That came as quite a surprise.” He stopped checking the ‘minor edit’ box, to acknowledge that some users might find the change controversial.

Indeed, he acknowledges the controversy on his personal Wikipedia page:

A usage note in the Merriam-Webster dictionary notes that a writer "may be subject to criticism" for using "comprised of" and suggests alternative wording for that reason. It notes that in spite of being in use for over a hundred years, it is still attacked as wrong, but says it isn't clear why the attackers have singled out this usage. Opposition has long been declining; in the 1960s, 53 percent of American Heritage Dictionary's expert Usage Panel found the wording unacceptable; in 1996, only 35 percent objected; by 2011, it had fallen a bit more, to 32 percent (quoted here). OED usage note calls it "part of standard English".

Still, doing this brings him joy:

When asked what motivates him, Henderson says he views his pursuit as similar to that of people who choose to spend their Saturdays picking up litter from the side of the road. “I really do think I’m doing a public service, but at the same time, I get something out of it myself. It’s hard to imagine doing it for the rest of my life,” he says with a laugh. “I don’t have any plans to quit, but I guess eventually, I’ll have to find a way. It’s hard to walk away, especially when I’ve actually accomplished something.”

I understand the compulsion, and I expect many have felt it; after all, Duty Calls.

Monday, February 2, 2015

Risky business

Really, when you get right down to it, I don't know very much about football.

So take this for what it's worth (nothing).

But I just don't get why everybody is jumping on the bandwagon to label the Seahawks' last play call "risky" or "stupid".

As I recall, Jerry Rice scored, what, 40 touchdowns on slant passes from inside the 10 yard line?

And Rice probably made 5 times that many first downs on that play, and perhaps 10 times that many total "quick slant" receptions in his career.

That was a perfectly fine play call, even a good one.

Malcolm Butler just did something completely unexpected and amazing.

Which is why, even though I didn't really care about the game, nor who won, and barely even watched a game of football all year, I still really enjoyed almost all of yesterday's game.

Go Patriots, go Seahawks. Well played, and good entertainment, start to finish.

Sunday, February 1, 2015

Goodbye to January

Yesterday, it was 75 degrees in Marin County, clear blue skies, a gentle light wind, so we drove out to the Pt Reyes Lighthouse to see if we could see any Gray Whales. According to the recorded message at the Pt Reyes National Seashore Ranger Station, 18 whales had been seen on Friday.

Visiting the lighthouse to look for the whales in January is extremely popular, so at this time of year you can't drive all the way to the lighthouse. Instead, you have to park at Drake's Beach, and then take the shuttle bus, which is a nice ride and very comfortable.

There may be no better viewpoint over the Pacific Ocean than the observation deck above the Pt Reyes Lighthouse. With our binoculars, we could see miles out to sea, though the views got hazier the farther out we looked. I'm pretty sure that we could see the Farallon Islands; it looked just like this.

But we didn't see any whales.

Sometimes that's what happens. I think they all were there on Friday.

So we rode the shuttle bus back to Drake's Beach, and went for a walk on the beach.

There were about a half dozen elephant seals sunning themselves on the beach, with docents keeping us tourists safely away. In many ways, seeing an elephant seal yawning, just 30 feet away, is a more impressive sight than a pod of gray whales 5 miles out to sea.

But they're both pretty great, and I'm glad we can make such a wonderful day trip on a Saturday in January. Yay California!

January was a month of glorious mild weather; I rode my bike to work every day, and barely needed my sweatshirt on many mornings.

But January was dry, reports the Merc: Driest January in history: Bay Area swings from boom to bust after wettest December.

For the first time ever, San Francisco, Oakland and Sacramento have recorded no rainfall for the month of January -- nada drop.

The state's monthly snowpack survey Thursday near Lake Tahoe's Echo Summit summed up the bad news: 12 percent of normal, with the equivalent of 2.3 inches of water content in the "meager" snowpack. Other spots around the state were worse, some slightly better -- but the outlook is equally dire.

"Unfortunately, today's manual snow survey makes it likely that California's drought will run through a fourth consecutive year," according to a news release from the Department of Water Resources.

Surprisingly, people don't seem to care.

The winter snowpack's water content typically supplies about 30 percent of California's water needs and is considered water in the bank to melt through the dry summer months. Reservoirs rely on ample runoff from snowmelt to meet demand from summer through fall. After three years of drought and dwindling reservoir levels, the demand is growing.

And drought-weary Californians show signs of conservation fatigue. A poll from the Public Policy Institute of California, released this week, showed 59 percent of residents say the water supply is a big problem in their region, but that's down from 68 percent in October.

Indeed, you barely hear it discussed on the news; nobody seems to be talking about it at all.

There will be new data, though: A New Satellite Will Watch the Western Drought from Space.

SMAP will spend three years taking the most accurate readings ever of soil moisture around the world. That’s right: It will measure how wet the dirt is. From space.

As Bob Dylan says, though, you don't need a weatherman to know which way the wind blows. It is dry, dry, dry, and there are too many people in California using too much water, and nobody is working on ways to address the problem.

But it was a beautiful day to visit Pt Reyes yesterday.

Goodbye to January, happy February!