We’ve all heard of Bullshit Jobs: “a form of paid employment that is so completely pointless, unnecessary, or pernicious that even the employee cannot justify its existence even though, as part of the conditions of employment, the employee feels obliged to pretend that this is not the case.”
What I didn’t know (I learned it here) is that the book mentions data science by name as one BS job. Here’s the full passage:
Another common theme was the way many of those laboring in financial institutions—to a much larger degree than those in most large corporations—had little or no idea how their work contributed to the bank as a whole. Irene, for example, worked for several major investment banks in “Onboarding”—that is, monitoring whether the bank’s clients (in this case, various hedge funds and private equity funds) were in compliance with government regulations. In theory, every transaction the bank engaged in had to be assessed. The process was self-evidently corrupt, since the real work was outsourced to shady outfits in Bermuda, Mauritius, or the Cayman Islands (“where bribes are cheap”), and they invariably found everything to be in order. Nonetheless, since a 100 percent approval rate would hardly do, an elaborate edifice had to be erected so as to make it look as if sometimes, they did indeed find problems. So Irene would report that the outside reviewers had okayed the transaction, and a Quality Control board would review Irene’s paperwork and duly locate typos and other minor errors. Then the total number of “fails” in each department would be turned over to be tabulated by a metrics division, thus allowing everyone involved to spend hours every week in meetings arguing over whether any particular “fail” was real.
Irene: (direct quote in text) “There was an even higher caste of bullshit, propped atop the metrics bullshit, which were the data scientists. Their job was to collect the fail metrics and apply complex software to make pretty pictures out of the data. The bosses would then take these pretty pictures to their bosses, which helped ease the awkwardness inherent in the fact that they had no idea what they were talking about or what any of their teams actually did.”
A metric is just a count or a mean. It’s a count if it’s profits; it’s a mean if it’s customer satisfaction. It can be a weighted mean, it can be a weighted count, it can be a conditional mean, maybe, if you fit a linear regression model, but it’s still a count or a mean or a difference between two numbers, or something like that. It’s a description. If you are into causal inference, you’ll call this difference an effect.
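To make that concrete, here is a minimal sketch in plain Python. The data are made up for illustration (two hypothetical groups of satisfaction scores), and `ols_slope` is a helper written for this example, not something from a library. It computes a count, a mean, a weighted mean, and a difference in means, and then checks that regressing the score on a 0/1 group indicator gives back that same difference as the slope.

```python
from statistics import mean

# Hypothetical toy data: satisfaction scores (1-5) for two made-up groups.
control = [3, 4, 2, 5, 3]
treated = [4, 5, 4, 5, 3]

# A count-style metric: how many responses came in.
n_responses = len(control) + len(treated)

# A mean-style metric: average satisfaction in each group.
mean_control = mean(control)
mean_treated = mean(treated)

# A weighted mean: group means weighted by group size.
# This recovers the overall mean across all responses.
weighted = (
    len(control) * mean_control + len(treated) * mean_treated
) / n_responses

# A difference between two numbers. If you are into causal
# inference, this is the thing you would call an "effect".
diff = mean_treated - mean_control


def ols_slope(x, y):
    """Slope of a simple least-squares line of y on x (illustrative helper)."""
    x_bar, y_bar = mean(x), mean(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den


# Regress the score on a 0/1 group indicator. The fitted slope is
# the difference in conditional means, up to floating-point rounding.
x = [0] * len(control) + [1] * len(treated)
y = control + treated
slope = ols_slope(x, y)

print("count:", n_responses)
print("means:", mean_control, mean_treated)
print("weighted mean:", weighted)
print("difference in means:", diff)
print("regression slope:", slope)
```

The regression here is not doing anything the subtraction didn't already do; it is the same description, computed a different way.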
As we all know, Marx has it that “the philosophers have only interpreted the world, in various ways; the point, however, is to change it.” A data scientist wants to understand the world, which, from a minimalist perspective, could be narrowed down to figuring out the popularity of a product or a service. Here, understanding means describing. Even fitting a complicated Gaussian process is still essentially examining a conditional average. Apparently, the value of data science and analytics lies in transforming raw data into actionable insights, enabling informed decision-making, optimizing processes, and uncovering opportunities for innovation and growth. That is, taking these means and doing something with them. It’s boring.