Why Crying for More Science Won’t Help

He Blinded Us With Science

At the XP2011 conference in Madrid, Laurent Bossavit gave a talk that I found very interesting, outlining some historic perspective on the decade-long agile movement. As one example, Laurent’s report that the proceedings from the NATO conference on software in Garmisch-Partenkirchen (in 1968) were censored was simply astonishing. This was the historic place where the term “software crisis” was coined, which in turn led to the creation of the so-called Software Engineering research field. The censored paper was a brilliant satire of the whole idea, and apparently it was simply removed from the resulting conference proceedings. But why? Well, gradually it was revealed that the hidden motive of the conference organisers was to convince NATO to fund research in this new field of science, which they obviously should lead. In their glorious small-mindedness, they must have felt threatened by having a paper in the proceedings mocking their beautiful idea. I guess I should not be surprised, given human nature, but scientists? Censoring!? Since then, Software Engineering has entrenched itself at the universities of the world, producing massive amounts of papers but very few useful results.

At the end of the presentation, Laurent presented a call to action, where his focus was on science as a means to strengthen the agile movement. As I understood it, Laurent is convinced that science is needed to give us the hard evidence we need to make progress. For example, would it not be very nice to just say that research has proven, beyond doubt, that pair programming actually is better? Then we would not have to debate anymore. As much as I sympathise with Laurent’s goals, I remain unconvinced as to the utility of science for this. My first concern is about repeatability.

Science Requires Repeatability

Science is about theory building. We construct more and more refined theories and models, based on observation and experiments. A theory is not valid until it has been corroborated by researchers in other places, which means that repeatability of experiments is a must. The important parameters must remain equal and produce equal or similar results. I will argue that this is simply not possible in the field of software development.

In software development, the following can be considered an axiom: You cannot complete the same work twice. You really can’t. If you use the same people but replace the assignment with a similar assignment, you have changed the nature of the work. If you use toy projects, intentionally created to be of approximately the same nature, then your tasks will be unrealistic. If you instead replace the people doing the work, you will have a completely different set of experience, attitudes, skills, etc. in action. OK, but what if you use exactly the same people and develop exactly the same software? Well, you can’t. The people who were in that first project are not the same when they take on the same work again. They have learned an incredible amount of useful stuff while developing the first project.

Software development is about learning, and that inescapable fact prevents us from repeating experiments with the necessary scientific rigor. Dan North and Elizabeth Keogh have talked a lot recently about what they call deliberate discovery. In their reasoning, all indications point towards our own ignorance being the single biggest bottleneck in all development work. In fact, we often suffer from second-order ignorance, i.e. we don’t even know what we are ignorant of. Increasing the speed with which we learn about the problem, and about how to design the most useful solution to that problem within the boundaries set forth, must therefore be vital. One common symptom of this is that developers, when asked how long it would take them to redo the assignment they just did, typically answer somewhere between 20 and 40% of the original time. The rest of the time was “spent” on learning. You cannot step into the same river twice, as the saying goes, and people are about the same. You cannot use the same developers on the same assignment twice.

Agile has made a very valid and important point for our field: that the people on the teams are the most important factor for success. Alistair Cockburn has a lovely phrase for this: “People are a parameter of the first order in software development”. Since we will never have the same people twice, we cannot keep that parameter constant. This will introduce a strong element of randomness into any study. My second concern here is context.

Context Is King

I believe that there is simply too much that is contextual in software development to enable us to draw strong, general conclusions. Our field is characterised by having surprisingly few general laws. Certainly, this is true for the people and social issues. For example, all teams have different problems. But even the “hard” practices, things that agile evangelists preach like refactoring or TDD, are absolutely not free from context. If you develop a one-off for a few days, for example a campaign site, would you bother doing strict TDD? For a product prototype? This proves that you can certainly do without TDD in some instances.

TDD is not “good” in itself, even though I find it incredibly useful in most situations. To really prove the point, even the projects that claim to use TDD exclusively don’t TDD everything. Behaviours like logging, transactions, thread-safety and randomness are almost never designed using TDD, since the test-driving does not affect the design. How could science then “prove” that “doing TDD” is beneficial?

Science or Evolution

So what could science give us? I believe that we should never go looking for hard data, nor believe it when it is presented to us. As one example, what if a research study showed this: “Using planning poker increases accuracy of estimates by 23%”. Could you use that study to strengthen your agile adoption? I find that conclusion laughable. A similar study in some other place may come up with the exact opposite numbers (and such studies often do). I once looked at research on TDD and found a very fractured picture. One study claimed it increased quality but productivity went down. Another study claimed productivity went up but more errors were introduced. A third one showed that productivity was about the same, but developers ran their tests more often. This is clearly ridiculous. This is research of little utility to anyone but the graduate student.

We could perhaps have more human-oriented studies, like the ones in social sciences. We could have studies that observe and reflect on human behaviours, telling stories about human behaviour and outcomes. This would probably be interesting but I believe this is not what Laurent means when he calls for more research. Why? Because that kind of material will never convince the pointy-haired manager saying “Show me the research or I won’t condone agile practices here”.

But that is exactly the thing. I believe that trying to indulge these people, the laggards, by giving them what they want, is truly just a waste of time. Is convincing laggards really what we want to spend time doing? What would be the point? I mean, if we truly believe that agile and lean ways are more successful then, in time, the companies that use them will prevail and the other companies will eventually die off. If you don’t have the patience to wait for your management to “get it”, then change your job.

If we have enough trust and patience, time will tell who is right. I am afraid science will not help much but I have high hopes for evolution.

11 thoughts on “Why Crying for More Science Won’t Help”

    • Hi chrstopher! I think I simply understand it as I wrote about it in the post, namely that it is a toy example, with a person with a particular background and bias, doing several passes over the same kata. It is certainly not without value but I do not think it would qualify as evidence that TDD is better.

  1. Thanks for a very interesting post. I also find it nice to finally have disproved my theory that I will agree with everything you say – because in this case I don’t agree 100%. I really like your end point though, that if we spend our time trying to prove ourselves right by finding more numbers or articles pointing to our case – then we are probably wasting our time.

    On the other hand I see a place and a need for science and I don’t agree with your definition of science – that it needs to produce repeatable experiments. You are touching on this subject in the end – but there are a lot of sciences that work with non-repeatable things. Social sciences like psychology are one good example, but also economics, geology, parts of astronomy, etc. Yet these areas all produce scientific findings that are not only interesting, but also very useful. The problem they face is often to find enough material to get statistically valid data, and to find which parameters are actually correlated and whether that correlation actually points towards causation, which is an even harder problem.

    And I do find numbers like “teams that are using planning poker have 23% more accurate estimates” fairly interesting – but from that you can of course not draw the conclusion that your own team’s estimates will become better by using planning poker, because you simply don’t know that.

    • Hi Anders! Glad to hear we do not always agree. (What would be the point of that?) :)

      You have certainly found a weak spot in my reasoning, namely that there are many fields counted as “science”, which do not require repeatability and context. However, I think their lack of predictability makes them irrelevant for our purposes.

      Like you say in your comment, even if one study shows positive results, we cannot deduce that we would have the same result with our team, for our project, with our managers, in our culture, for our business. There are simply too many pertinent variables to be certain. Therefore, the “science” in question, may be interesting but not persuasive to a person with a critical mind. Hence, I believe we will never see scientific “proof” that any practice, principle or value is better than another. I lament this deeply, but that’s just the way it is in a human-oriented activity.

    • Thanks, Ulrika! I don’t think there will be a certain moment, of course, but over time companies that are governed by an inferior DNA will succumb to companies that are superior. At any given time, this may seem implausible, but history has shown, time and time again, that companies that lose their touch, however great today, in a decade or two, lose their life.

  2. Also – I’d rather see the world start being efficient today, instead of waiting for what seems like an eternity. Imagine how much would happen if all waste disappeared! :)

    • I am with you 100%. Experiment, measure and find what works for you. Just don’t rely on science to give you the proof that it will work. Researchers will just find what can be measured and ignore what is valuable.

  3. Well, there’s never gonna be one single diet or way of living that will work well for all humans in all contexts – but I’m still happy that there are people that do research on these things (usually by trying and noting down what happened – knowing that they can never repeat exactly the same study) so that I can learn how to eat and live more healthy. I’m never gonna know which way will work best for me unless I try – but knowing which one to try is a lot easier when others have done part of the work! :)
