Questions around TDD familiarity among Rustaceans


#21

Those are exactly the problems where TDD tries to, and really can, help in practice :D.

I just can’t see this. The more I read about it, TDD looks like some agile development technique, deliberately promoting the creation of subtly broken software on the basis of YAGNI.

Edit: Sorry, the above came out sounding very dismissive. That wasn’t the intent! All I really mean is, I can’t see how TDD helps identify edge cases, because from what I’m reading, it sounds like it does the opposite.


I’ll describe my own testing methodology by relating to a thing I wrote a year ago. (warning: because this is from memory, the examples are embellished)

I needed a library to facilitate writing wrappers around programs with getopt(1) style arguments. For instance, to write a wrapper around cp where -i (interactive) is the default and -f (force) can override it. The library would be initialized with all arguments accepted by the wrapped program (and possibly additional args defined by the user), and then it would parse the argument stream into a data structure that can be modified by the user before being serialized back into arguments.

Before writing the code, a variety of examples were clear to me, such as the following. These tests test the result of serializing the parsed arguments without modification, so that they basically just canonicalize the argument stream in the same way getopt(1) does:

Defined options:  -v,--verbose
                  -o,--output OUTPUT

  Input:  "prog", "a", "-v", "b"
 Output:  "prog", "-v", "--", "a", "b"

  Input:  "prog", "-vv"
 Output:  "prog", "-v", "-v", "--"

  Input:  "prog", "--", "-a"
 Output:  "prog", "--", "-a"

  Input:  "prog", "-a"
 Output:  Unknown option 'a'.

  Input:  "prog", "-o", "-a"
 Output:  "prog", "-o", "-a", "--"

  Input:  "prog", "-o-v"
 Output:  "prog", "-o", "-v", "--"

Had I only coded to the examples that I was capable of dreaming up prior to the implementation, the result would have been, in my opinion, a wonderfully broken piece of software. But this is not what I do.

Instead, during implementation, I play the role of the world’s most terrible pessimist. I constantly ask myself how might things go wrong under the current implementation? Or if I changed this statement to do X instead? Or if I made simplification Y to the input? As soon as I think of a counter example, I write it down.

Long story short, what I discovered is that it is pretty much impossible to do any form of preprocessing on a getopt argument stream without full knowledge of the list of valid options. Virtually any attempt to add a lexing stage to a getopt-style parser ends in total disaster. It’s amazing really just how much context-sensitivity can actually exist in what would appear to be such a simple syntax!

Here are some examples that could only have been dreamed up by my inner pessimist during the implementation:

          Input:  "prog", "-"
   Naive output:  (crash)
 Correct output:  "prog", "--", "-"

          Input:  "prog", "-o-o"
   Naive output:  Option '-o' requires an argument.
 Correct output:  "prog", "-o", "-o", "--"

          Input:  "prog", "-vovv"
   Naive output:  Option '-o' requires an argument.
 Correct output:  "prog", "-v", "-o", "vv", "--"

          Input:  "prog", "-v-n"
   Naive output:  "prog", "-v", "--", "-n"
 Correct output:  Unknown option '-'. (or some other error)

          Input:  "prog", "-o", "--", "-v"
   Naive output:  (probably crash)
 Correct output:  "prog", "-o", "--", "-v", "--"

          Input:  "prog", "-o", "-o", "--", "-v"
   Naive output:  "prog", "-o", "-o", "-v", "--"
 Correct output:  "prog", "-o", "-o", "--", "-v"

          Input:  "prog", "--output=a=a"
   Naive output:  "prog", "--output=a", "a", "--"
 Correct output:  "prog", "--output", "a=a", "--"
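
A minimal Rust sketch of a canonicalizer that handles the cases above. This is not the library described in this post: the names `OptDef`, `DEFS`, `ParseError` and `canonicalize` are hypothetical, and the sketch assumes GNU-style permutation and always emits the `--` separator, as getopt(1) does. What it tries to show is that every argument is interpreted only in light of the option table and of what came before it, with no separate lexing pass.

```rust
/// One option definition: (short name, long name, takes an argument?).
pub type OptDef = (char, &'static str, bool);

/// Hypothetical table for the two options used in the examples.
pub const DEFS: [OptDef; 2] = [('v', "verbose", false), ('o', "output", true)];

#[derive(Debug, PartialEq)]
pub enum ParseError {
    UnknownOption(String),
    MissingArgument(String),
    UnexpectedArgument(String),
}

/// Returns `args` in canonical order: program name, options, `--`, positionals.
pub fn canonicalize(defs: &[OptDef], args: &[&str]) -> Result<Vec<String>, ParseError> {
    let (prog, rest) = match args.split_first() {
        Some(parts) => parts,
        None => return Ok(Vec::new()),
    };
    let mut opts: Vec<String> = Vec::new();
    let mut positional: Vec<String> = Vec::new();
    let mut it = rest.iter().copied();

    while let Some(arg) = it.next() {
        if arg == "--" {
            // Terminator: everything after it is positional, verbatim.
            positional.extend(it.by_ref().map(String::from));
        } else if let Some(body) = arg.strip_prefix("--") {
            // Long option; only the first `=` separates name from value.
            let (name, inline) = match body.split_once('=') {
                Some((n, v)) => (n, Some(v)),
                None => (body, None),
            };
            let def = defs
                .iter()
                .find(|d| d.1 == name)
                .ok_or_else(|| ParseError::UnknownOption(name.to_string()))?;
            opts.push(format!("--{}", name));
            match (def.2, inline) {
                (true, Some(v)) => opts.push(v.to_string()),
                (true, None) => {
                    let v = it
                        .next()
                        .ok_or_else(|| ParseError::MissingArgument(name.to_string()))?;
                    opts.push(v.to_string());
                }
                (false, Some(_)) => {
                    return Err(ParseError::UnexpectedArgument(name.to_string()))
                }
                (false, None) => {}
            }
        } else if arg.len() > 1 && arg.starts_with('-') {
            // Short cluster: walk it one character at a time, because the
            // meaning of each character depends on the option before it.
            let cluster = &arg[1..];
            for (i, c) in cluster.char_indices() {
                let def = defs
                    .iter()
                    .find(|d| d.0 == c)
                    .ok_or_else(|| ParseError::UnknownOption(c.to_string()))?;
                opts.push(format!("-{}", c));
                if def.2 {
                    // The rest of the cluster, or else the next argument
                    // (whatever it looks like), is this option's value.
                    let tail = &cluster[i + c.len_utf8()..];
                    let value = if tail.is_empty() {
                        it.next()
                            .ok_or_else(|| ParseError::MissingArgument(c.to_string()))?
                    } else {
                        tail
                    };
                    opts.push(value.to_string());
                    break;
                }
            }
        } else {
            // Plain positional argument, including a lone "-".
            positional.push(arg.to_string());
        }
    }

    let mut out = vec![prog.to_string()];
    out.extend(opts);
    out.push("--".to_string());
    out.extend(positional);
    Ok(out)
}
```

With `DEFS`, `canonicalize(&DEFS, &["prog", "-vovv"])` should yield `prog -v -o vv --`, and `"-v-n"` should come back as `UnknownOption("-")`.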

#22

I’d never heard of YAGNI, but it perfectly explains the InfluxDB escaping car-crash. So it’s basically: hack something together that gets the job done for our present testing data-set. Now ship! Then discover that someone wants to put a comma or a space into a tag value. Bodge a fix for that. Tests pass. Ship! Now what about putting in a backslash before a separator? Uh-oh, didn’t plan for that. Turns out \\, doesn’t do what you want, as that now escapes both the backslash and the following separator. So some valid printable string values are unrepresentable in this format, making extra work for anyone feeding data in who values reliability. So 99.99% of cases are handled! That only leaves the 0.01% that hackers will use in their exploit, or that will cause that intermittent hard-to-find bug that takes days to track down.
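
To make the complaint concrete, here is a minimal Rust sketch of an escaping scheme in which every string survives a round trip. It is a hypothetical format, not InfluxDB’s actual line protocol; `escape` and `unescape` are made-up names, and the set of separators (comma, space, equals sign) is an assumption. The only trick is that the escape character itself is escaped too, so a backslash in front of a separator stays representable.

```rust
/// Escape the separators `,`, ` ` and `=`, plus the escape character
/// itself, by prefixing each with a backslash.
pub fn escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        if matches!(c, '\\' | ',' | ' ' | '=') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Inverse of `escape` for a single field. A backslash makes the next
/// character literal; a dangling trailing backslash yields `None`.
pub fn unescape(escaped: &str) -> Option<String> {
    let mut out = String::with_capacity(escaped.len());
    let mut chars = escaped.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            out.push(chars.next()?);
        } else {
            out.push(c);
        }
    }
    Some(out)
}
```

A value such as `a\,b` (a backslash followed by a comma) escapes to `a\\\,b` and unescapes back to the original, which is exactly the case the post says cannot be represented.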

Well, rant over. I actually have to deal with this at work. Now I have a name for the devil that causes so much time wasted debugging avoidable problems: YAGNI.


#23

I don’t really get it. To me, your description sounds exactly like how I work when doing TDD, because most of the time during implementation, I write tests like:

the world’s most terrible pessimist

I think you may have missed the point: you don’t write all the tests before starting to implement. It works exactly the opposite way: you write a test for one (edge) case and then get it to pass. I don’t see a problem with combining your workflow with TDD, because it’s basically the same; the only difference is that the iteration of thinking about how your implementation should work is broken down into smaller steps. You can get the same result by writing your tests after your implementation, but the chance that you forget a case (error handling, for example) is much bigger, because you have to keep them all in your mental model until the whole thing is done. That’s why I wrote that TDD helps me stay focused.

If you only read about it until now, then I would suggest that you try it for yourself.


#24

Mhh, there are two sides to every coin. The problem isn’t YAGNI, TDD or any other principle or method at all. You can find many examples where things you do not need cause bad design, bugs and a lot of wasted time (because you spent time implementing things no one will ever need). Most of these principles try to help you get around known traps while developing software. But none of them is the one and only solution, and they aren’t dogmas; following them dogmatically can (and most often will) lead to more problems than not following them.

The real problem while developing software is that you can’t foresee every use/edge case and that you can’t test everything. Unfortunately that’s backed up by the current state of theoretical computer science, so we have to deal with it. That’s where all these things try to help, no more and no less, but you have to carefully choose the right ‘tool’ for the job. And the right choice will depend on personal preferences, knowledge, the way of thinking about problems and their solutions and so on.

You often hear a similar thing when talking about the advantages of software testing. The first argument from a couple of people is always the same and is something like:

But you can’t test everything, and the tested software will still have bugs like any other, so why test at all?

My English isn’t the best and I hope you will understand what I try to say here.

Edit:
A good example of YAGNI is optimization (from a performance or memory point of view) before profiling your application. If you do that, chances are high that you will spend hours optimizing an algorithm that’s called only a couple of times, while missing another simple-looking piece of code that is called tens of thousands of times and is what really slows your app down.


#25

Nice writeup, ExpHP. My experience parallels @kunerd’s, where using TDD helps me to uncover the diabolical as I develop. Your description helps me to understand how your experience with TDD was so different.

Were you writing one test and following that test with the minimal implementation that would cause the test to pass? TDD gets easier when we start with “the right tests”, but I’ve found I could start just about anywhere and end up in a very good place with TDD.

I’m not sure at what point you were expecting to have the “grand design” revealed, but TDD involves heavy refactoring, so “that next feature” may cause an evolution in your thinking. Loose coupling is essential to limiting the scope of the refactorings.

In the end, I agree with you that one couldn’t have known how tough it was to parse all the options that might have been thrown at your wrapper library. In my mind, TDD would/should have driven you to that realization fairly quickly. Refactoring to a more general and ultimately substitutable–as per Liskov Substitution Principle–parsing module, with a well-defined interface would have been a great line of defense, separating the problems of parsing (decoding) from the problems of encapsulating the shell command (encoding).

I assume you ended up in a good place with your project, but I just wanted to take a moment to thank you for explaining what you experienced, and to share how TDD might work for you in just such a situation.

What do you think?


#26

I assume you want to cover more ground, so, by all means, use a library, but as a footnote, adding “here’s how you can roll a mock by hand” and leaving the application of that technique to the larger story you’ve covered as an exercise for the reader would work for me. I’d love to see your thoughts on hand-rolling mocks!
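
For what it’s worth, a hand-rolled mock in Rust can be as small as the sketch below. Everything in it is hypothetical and only for illustration (the `Mailer` trait, `MockMailer`, and the `welcome` function under test); it is not taken from the article. The mock simply records each call it receives and can be told to fail, which covers most of what a mocking library would generate.

```rust
use std::cell::RefCell;

/// The collaborator the code under test depends on.
pub trait Mailer {
    fn send(&self, to: &str, body: &str) -> Result<(), String>;
}

/// Hand-rolled mock: records every call and optionally reports failure.
#[derive(Default)]
pub struct MockMailer {
    pub calls: RefCell<Vec<(String, String)>>,
    pub fail: bool,
}

impl Mailer for MockMailer {
    fn send(&self, to: &str, body: &str) -> Result<(), String> {
        // Interior mutability lets the mock record calls through `&self`.
        self.calls
            .borrow_mut()
            .push((to.to_string(), body.to_string()));
        if self.fail {
            Err("mock failure".to_string())
        } else {
            Ok(())
        }
    }
}

/// Code under test: it only knows about the trait, not the mock.
pub fn welcome(mailer: &dyn Mailer, user: &str) -> bool {
    mailer.send(user, &format!("Welcome, {}!", user)).is_ok()
}
```

A test then builds `MockMailer::default()`, calls `welcome(&mock, "ferris")`, and inspects `mock.calls.borrow()` to check who was mailed and with what body.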

I didn’t have time to click on the link you provided, but, like TDD, there’s no substitute for putting something out there for seeing what might have worked even better! :wink: Or maybe that’s just my selfish way of inviting you to go for it. Although I’m sure it will evolve, from your writings here, I don’t think the style or flow will be a problem. :ok_hand:t5:


#27

The point is that if you’re going to implement a subset of the solution on the basis of YAGNI, at least make the boundaries well-defined. Add asserts or error logging if some condition comes along that you know you can’t handle. So completely solve a subset of the problem. Don’t solve a rough middle of the problem space and leave the behaviour for the surrounding space undefined.

That seems to me to be a weakness of driving from tests. It doesn’t force you to consider the cases you’ve chosen NOT to handle yet. Whereas working from code and considering all the possibilities at every operation (i.e. input bytes are 0-255, have we handled all those cases?) does force you to consider error paths for stuff you’ve chosen not to handle. But I guess a good programmer will do both.
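
In Rust, the compiler can do part of that "have we handled all those cases?" check. The sketch below is a hypothetical byte classifier (`Token`, `Unsupported` and `classify` are made-up names) that handles only digits, ASCII letters and spaces, and names every other byte range explicitly as an error. Because the `match` has no wildcard arm, the code stops compiling if any value in 0-255 is left uncovered.

```rust
#[derive(Debug, PartialEq)]
pub enum Token {
    Digit(u8),
    Letter(char),
    Space,
}

/// Input that this partial solution deliberately does not handle yet.
#[derive(Debug, PartialEq)]
pub enum Unsupported {
    Control(u8),
    Punctuation(u8),
    NonAscii(u8),
}

pub fn classify(byte: u8) -> Result<Token, Unsupported> {
    // No `_` arm: every one of the 256 values must be named somewhere.
    match byte {
        b'0'..=b'9' => Ok(Token::Digit(byte - b'0')),
        b'a'..=b'z' | b'A'..=b'Z' => Ok(Token::Letter(byte as char)),
        b' ' => Ok(Token::Space),
        0x00..=0x1F | 0x7F => Err(Unsupported::Control(byte)),
        0x21..=0x2F | 0x3A..=0x40 | 0x5B..=0x60 | 0x7B..=0x7E => {
            Err(Unsupported::Punctuation(byte))
        }
        0x80..=0xFF => Err(Unsupported::NonAscii(byte)),
    }
}
```

The handled subset is small, but the boundary is explicit: anything outside it returns a named error instead of falling into undefined behaviour.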

I am certainly in favour of solving subsets of the problem, and planning to implement other parts later, but only in a way that leaves nothing undefined, i.e. a complete partial solution. I am also strongly in favour of as thorough testing as is possible. I agree with the pessimist/paranoid approach to testing and coding, because it just saves so much time in the long run.


#28

My hope is that with the article I’ll be able to shed some light on one style of doing everything test-first, which is outside-in. I hope that having concrete examples with the full codebase available can be a basis for further discussion.

I think that I’m going to create a forum topic right here for the discussion on the article and post a link in this topic when ready.

Again, thank you all for all the suggestions :slight_smile:


#29

Awesome discussions here!

To put in my two cents to “where does the grand design come from”:
It is important that TDD has a three step cycle: Red->Green->Refactor (-> repeat)

The refactoring step is often overlooked, and this is, for me, the step where you think about “grand design” and “nice API”.

Another point often overlooked is that this three-step cycle is supposed to be insanely fast (sometimes less than a minute for a complete three-step cycle) if you’re at a simpler part of the spec, such as “make sure output is the same with/without ending newlines”, “throw error on empty input”, or “ignore lines starting with ‘#’”.

I still stagger myself at how tiny tests should be, because each of those three examples would be a single test (or even multiple single tests per example if you’re trying a few permutations).
The point is to break down the problem space into chunks that are so tiny that your first instinct is “that’s SO trivial, why would I even bother writing tests for that!?!”.
Where the answer is: “so that the cheap computer can test 500 times per second if it still works (at each commit/Ctrl+s), instead of you, the expensive human, checking it once a month (i.e. never)”.
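
To give a feel for the size, here is a minimal Rust sketch with one tiny test per spec item mentioned above. The function `entries` and the `EmptyInput` error are hypothetical names chosen for the illustration; in practice each test would be written first, fail, and then be made to pass in its own short cycle.

```rust
#[derive(Debug, PartialEq)]
pub struct EmptyInput;

/// Returns the non-comment lines of `input`; blank lines are kept.
pub fn entries(input: &str) -> Result<Vec<&str>, EmptyInput> {
    if input.is_empty() {
        return Err(EmptyInput);
    }
    Ok(input
        .lines()
        .filter(|line| !line.starts_with('#'))
        .collect())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn errors_on_empty_input() {
        assert_eq!(entries(""), Err(EmptyInput));
    }

    #[test]
    fn ignores_lines_starting_with_hash() {
        assert_eq!(entries("a\n# comment\nb"), Ok(vec!["a", "b"]));
    }

    #[test]
    fn trailing_newline_does_not_change_output() {
        assert_eq!(entries("a\nb\n"), entries("a\nb"));
    }
}
```

Each test is a couple of lines, and the whole suite runs in well under a second, which is what makes running it on every save practical.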

This is exactly what the refactor step is for: step back and see how the terrible hack you just wrote to obtain the “green” can be made to look nice. In its purest form, TDD wouldn’t be looking back “now and then”, but basically every time you finish a new test, so more like a dozen times per hour.

Which is the point where you pause for a few seconds and briefly write yourself "#todo test for edge-case X I just thought of."
The trick of TDD is to rapidly switch hats between:

  1. the world’s greatest pessimist: writing test-cases you’re “sure those idiots would mess up”
  2. the laziest programmer you’ve ever met: "do the bare minimum to meet (1)'s spec"
  3. the experienced architect: whip the crap produced by (1)+(2) into something that you will understand two hours, two months and two years from now.

And all of that rotating by the minute(s), in one brain.

I vehemently disagree; there are very few people who have the discipline to write tests afterwards, and those who do will always overlook things, so you end up with 95% or 80% or 70% coverage of “only the important parts”. Only with test-first do you force yourself to stick to 100%/99% coverage. I have personally sensed the difference in trust/relaxation in my code with full coverage vs. “mostly” full coverage. It is really quite the shock how much anxiety we carry along on a daily basis, it is so common we don’t even realise it most of the time.
The moment that you hit full coverage, suddenly the anxiety (“but there could be a bug in the bits I didn’t test”) drops away, and you reach this zen-like “I can do anything” level, because you know, and feel in your bones, that the tests will catch you if you overlook something*. And because the tests can run in the background and give immediate feedback, correcting your mistake is often just a few Ctrl+Z, instead of an hour of debugging after somebody points out an error two weeks later.

However, I will admit that I have since “fallen off the horse” again, and don’t write many tests for my python/bash/perl things. Doing TDD properly requires quite the investment, and most of the benefits only show up very late in the investment-curve, so it is hard to keep it up (especially if your social/team/boss environment doesn’t enforce it.)

* Edit: In that respect, a good test suite gives much the same feeling as the rust compiler gives compared to using C-style raw pointers.