Tetris Is Hard To Test
New submitter JackDW writes: Tetris is one of the best-known computer games ever made. It's easy to play but hard to master, and it's based on an NP-hard problem. But that's not all that's difficult about it. Though it's simple enough to be implemented in one line of BBC BASIC, it's complex enough to be really hard to thoroughly test.
It may seem like you can test everything in Tetris just by playing it for a few minutes, but this is very unlikely! As I explain in this article, the game is filled with special cases that rarely occur in normal play, and these can only be easily found with the help of a coverage tool.
One line? (Score:5, Funny)
Re: (Score:2)
Really, who couldn't love code like this:
0d=d:IFdVDUd:a=POINT(32*POS,31-VPOS<<5):RETURNELSEMODE9:GCOL-9:CLG:O FF:d=9:REPEATVDU30:REPEATGOSUBFALSE:IFPOS=28VDUPOS,15,VPOS,24;11,26:IF0E LSEIFa=0PRINT:UNTIL0ELSEUNTILVPOS=25:v=ABSRNDMOD7:i=0:VDU4895;3:REPEATm= 9-INKEY6MOD3:FORr=TRUETO1:t=rANDSGNt:IFt=rCOLOURv-15:VDUrEORm:i+=m=7AND9 -6*r:IF0ELSEFORn=0TO11:d=n/3OR2EORd:GOSUBFALSE:IF1<<(n+i)MOD12AND975AND& C2590EC/8^vVDU2080*AB
Re: (Score:3, Informative)
Really, who couldn't love code like this:
Except that is not "one line". It is six lines. Any program can be a "one-liner" if there is no limit on the line length. Well, unless you are writing it in Python.
Also, as long as I am on a rant, Tetris is NOT NP-Hard, since the arrival of the blocks is probabilistic. It is only if the entire sequence of blocks is known in advance that it becomes NP-Hard. But that doesn't happen in actual play.
Re:One line? (Score:5, Informative)
Except that is not "one line". It is six lines. Any program can be a "one-liner" if there is no limit on the line length. Well, unless you are writing it in Python.
The line length limit is 256 bytes, of course. And these hacks are the basic-equivalent of the C obfuscation contest.
As the authors say: "I'd like to think it is self documenting. The code speaks for itself; even if what it has to say is not very nice."
Re: (Score:2)
The line length limit is 256 bytes, of course.
The program is 430 bytes.
Re:One line? (Score:5, Informative)
In ASCII, but many BASICs will reduce keywords down to a single byte.
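To make the ASCII-versus-tokenised distinction concrete, here is a rough Python sketch. It assumes every recognised keyword is stored as exactly one byte and uses a small illustrative keyword list; the real BBC BASIC tokeniser knows many more keywords and has extra rules, so the numbers are only indicative.

```python
import re

# Illustrative subset only; a real BBC BASIC tokeniser knows many more keywords.
KEYWORDS = ["REPEAT", "RETURN", "UNTIL", "GOSUB", "FALSE", "PRINT", "ELSE", "VDU", "IF"]

# Try longer keywords first so a short keyword never splits a longer one.
_PATTERN = re.compile("|".join(sorted(map(re.escape, KEYWORDS), key=len, reverse=True)))


def tokenised_length(source: str) -> int:
    """Estimate the stored size, assuming one byte per recognised keyword."""
    # Each keyword match is replaced by a single placeholder character.
    return len(_PATTERN.sub("\x80", source))


fragment = "REPEATGOSUBFALSE:UNTIL0"
print(len(fragment), tokenised_length(fragment))  # prints "23 6"
```

Under these assumptions, the fragment shrinks from 23 characters to 6 stored bytes, which is how a listing far longer than 256 characters can still be a single legal line.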
Re: (Score:2)
Re: (Score:2, Informative)
Except that is not "one line". It is six lines. Any program can be a "one-liner" if there is no limit on the line length. Well, unless you are writing it in Python.
The line length limit is 256 bytes, of course. And these hacks are the basic-equivalent of the C obfuscation contest.
As the authors say: "I'd like to think it is self documenting. The code speaks for itself; even if what it has to say is not very nice."
And here we get to the core of the problem. The presenter passes off information in an incorrect light, so only the audience that cares to accept it will continue accepting his later statements.
One of those statements is that code coverage tells you what to test. It doesn't. It tells you what you haven't bothered to test. What you need to test is driven by the user stories, customer requirements, and other bits of development documentation. Testers writing tests for permutations of non-important functio
Re: (Score:3, Insightful)
perhaps. I wonder if it NP-hard (Score:5, Interesting)
The fact that probability is involved doesn't mean there's not an optimal strategy, of course, where optimal is defined as "highest expected score" (score X probability). So figuring out an optimal strategy is a hard problem - how hard is it?
If the probability of a certain series of shapes coming next were 100%, we'd have an NP-hard problem, agreed? Does another probability make it easier or harder? Harder, if anything. That's provable because the probability version can be solved by solving each of the potential series as if each were known. What's harder than NP-hard? It may well still be NP-hard. It can't be of any more solvable complexity class.
Re:perhaps. I wonder if it NP-hard (Score:5, Funny)
What's harder than NP-hard?
Intractable.
also, Smiling Bob (Score:2)
What's harder than NP-hard?
Intractable.
True. Smiling Bob is also harder.
Re:perhaps. I wonder if it NP-hard (Score:4, Informative)
I disagree.
For a stochastic process, your greedy "take the option with the best chance" algorithm may work or it may fail completely, depending on the random numbers. If you have a stochastic polynomial algorithm, you have a chance of getting the same or a better expectation value than your "optimize globally, then choose greedily" algorithm. Both approaches may win or fail, but in the deterministic game the NP-complete version always wins, while the "shortcut" version cannot compete. In the stochastic version, the shortcut may be as good as the optimal solution: you cannot reach the global optimum anyway, so choosing a local one may be a good choice.
defined as expected value (Score:2)
As I mentioned, the test of fitness is expected value - the average score of 100,000,000,000 games played with that strategy, not the result from one specific, randomly chosen game.
In Hold'em folding preflop with pocket aces may turn out for the best occasionally, but it's still a bad policy because it will lose more than it will win. The best strategy is the one that does well long term.
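A small Python sketch of "fitness as expected value": estimate a policy's long-run average by playing many simulated games. The 0.8 win probability and the +1/-1/0 payoffs are made-up numbers for illustration, not real poker odds, and `play_hand`, `fold_hand` and `average_score` are hypothetical names.

```python
import random

WIN_PROBABILITY = 0.8  # made-up figure for illustration, not real poker odds


def play_hand(rng):
    """Play the strong hand: win 1 unit with WIN_PROBABILITY, otherwise lose 1."""
    return 1 if rng.random() < WIN_PROBABILITY else -1


def fold_hand(rng):
    """Fold immediately: nothing won, nothing lost (blinds ignored)."""
    return 0


def average_score(policy, games, seed=0):
    """Estimate a policy's expected value as its mean payoff over many games."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    return sum(policy(rng) for _ in range(games)) / games


print(average_score(play_hand, 100_000))
print(average_score(fold_hand, 100_000))
```

Any single game can go either way, but with these assumed payoffs the average for playing settles near 0.6 per hand while folding stays at exactly 0.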
Re: (Score:3)
Yep, and your strategy gets worse, while another strategy may stay average.
Assume you flip a coin: heads or tails.
Your NP-complete algorithm knows the sequence.
The probabilistic one just guesses "heads".
Now the expectation value of both is zero in the stochastic case. In the deterministic one, it's an infinite win for the NP-complete one and zero for the probabilistic one.
So this is an example where an algorithm that is perfect for deterministic data is only as good as another one on stochastic data.
This does not mean, you can
if I understand your point (Score:3)
Let me see if I correctly understand your point. Are you saying:
The best algorithm for a deterministic sequence may be / is NP-hard.
The best algorithm for the stochastic sequence may be different.
Therefore, the best algorithm for the stochastic sequence may be easier than NP-hard.
That seems to make sense. Until you realize the deterministic sequence IS one case of the stochastic - where the probability of a certain sequence happens to be 1.00. If you had a polynomial algorithm for probability X, you could
Re: (Score:3)
No, you only got it halfway.
deterministic:
best = np-hard, perfect
other: polynomial, good average
stochastic:
best: np-hard, not perfect, quality unknown
other: polynomial, good average
the point is not the runtime complexity, but the result. while the best algorithm cannot be beaten on the det. sequence, it may fail completely (in terms of quality) on a sequence without full information. If you got a good polynomial one with an average result, it may be better for many sequences.
one example may be an perfe
so just don't solve it (Score:2)
So in other words, you're pointing out that you could just not solve it, not come up with the optimum move each time based on expected value. Instead, you could settle for a "good enough" move and sometimes you'd get lucky. This is true.
you stated:
stochastic:
best: np-hard, not perfect, quality unknown
other: polynomial, good average
You called the first algorithm "best", acknowledging that the best (best long-term average) is NP-hard. The other can't be better than the best (by definition), so the probl
Re: (Score:2)
Best is only with respect to the deterministic case.
disregard runtime complexity.
deterministic:
there is a best strategy (A).
all other strategies (B) are worse or equal.
A stochastic strategy (C) may have a good average quality.
stochastic:
the deterministic best strategy (A) cannot be perfect anymore, just as no other strategy can.
(A) has now unknown quality for a random sequence.
(C) still has the same average quality.
So there is now a possibility, that (C) may beat (A) in an average over a lot of games. (as a
Re: (Score:2)
Note that this limit is on the tokenised form stored in memory, not the ASCII representation. This is why the code e.g. uses "GOSUB FALSE" rather than "GOSUB 0": the FALSE token is shorter than the encoding of a line number.
Programs for the unexpanded (1K) ZX81 frequently used that type of memory-saving. All numbers were stored as floating point and took up 5 bytes of memory, and saying (e.g.)
LET A = CODE("$")
(where CODE is the equivalent of the ASC function for the ZX81's non-ASCII character set and $ is character 13) instead of
LET A = 13
actually saved you memory.
Re: (Score:2, Flamebait)
You're no fun. If I worked for you, I'd quit as soon as possible.
Really, who couldn't love code like this:
0d=d:IFdVDUd:a=POINT(32*POS,31-VPOS<<5):RETURNELSEMODE9:GCOL-9:CLG:O
FF:d=9:REPEATVDU30:REPEATGOSUBFALSE:IFPOS=28VDUPOS,15,VPOS,24;11,26:IF0E
LSEIFa=0PRINT:UNTIL0ELSEUNTILVPOS=25:v=ABSRNDMOD7:i=0:VDU4895;3:REPEATm=
9-INKEY6MOD3:FORr=TRUETO1:t=rANDSGNt:IFt=rCOLOURv-15:VDUrEORm:i+=m=7AND9
-6*r:IF0ELSEFORn=0TO11:d=n/3OR2EORd:GOSUBFALSE:IF1<<(n+i)MOD12AND975AND&
C2590EC/8^vVDU2080*ABSr;:t+=a:IF0ELSENEXT,:VDU20:UNTILt*LOGm:UNTILVPOS=3
Anyone who had to read it, update it, or debug it?
Anyone who had to play the fucking game (it's full of game-breaking bugs - http://survex.com/~olly/rheoli... [survex.com] )?
Re: (Score:2)
DMCA incoming (Score:3)
Re: (Score:3)
Re: (Score:2)
If anybody wrote code like that for me, they'd be made to sit on the naughty step and think very, very hard about what they'd done.
Unless, of course, you were developing for embedded hardware, where you are trying to do way too many things with way too few resources***. Then you'd give that programmer a promotion.
***Although those days are gradually coming to an end, as even the tiniest systems are getting more and more resources, and eventually they'll all join the rest of us, where readability, verifiability, and maintainability take top priority. But for now, they're not all quite there yet.
Re: (Score:3)
Actually, it is written for resource efficiency...specifically program size, which uses memory. The goal was to write a 1 line program, and in BBC Basic, that meant they were limited to 256 characters. Yes, maybe they could have written things with more verbose naming and had it compile to the same size, but the particular goal there was to write something big with little code. I think they accomplished it fairly well, and probably 95% (at least) of programmers would be hard pressed to replicate their results
Re: (Score:2)
BBC basic was interpreted, not compiled (though there may have been compilers written for it since).
Re: (Score:2)
BBC basic was interpreted, not compiled (though there may have been compilers written for it since).
It was my original instinct to say the same (since nearly all basic languages are), but I looked it up on wikipedia before posting and found that there was indeed a compiler for it:
http://en.wikipedia.org/wiki/B... [wikipedia.org]
A Compiler for BBC BASIC V was produced by Paul Fellows, team leader of the Arthur OS development, and published initially by DABS Press.[citation needed] This was able to implement almost all of the language, with the obvious exception of the EVAL function – which inevitably required run-time programmatic interpretation. As evidence of its completeness, it was able to support in-line assembler syntax. The compiler itself was written in BBC BASIC. The compiler (running under the interpreter in the early development stages) was able to compile itself, and versions that were distributed were self-compiled object code.[original research?] Many applications initially written to run under the interpreter benefitted from the performance boost that this gave, putting BBC BASIC on a par with other languages for serious application development.
There's not a whole lot of info about it on wikipedia, and it doesn't even say when it was written (and there are no citations), so I have no idea if it was something recent or very old.
Re: (Score:2)
You've never written code in perl, have you?
Re: (Score:2)
I'd hire the person in the blink of an eye. That kind of discipline is sorely missing among younger programmers these days.
Blockheads (Score:1)
I'm sure there is a joke in there somewhere.
Re: (Score:2)
9-INKEY6MOD3:FORr=TRUETO1 --- lol
Re: (Score:1)
Re: Blockheads (Score:2)
RTFM.
in Soviet Russia (Score:5, Funny)
It may seem like you can test everything in Tetris just by playing it for a few minutes, but this is very unlikely! As I explain in this article, the game is filled with special cases that rarely occur in normal play, and these can only be easily found with the help of a coverage tool.
Tetris doesn't need coverage tool to test you. Everything about you.
Code-coverage tool is crutch for weak capitalist engineer. Tetris is Soviet technology, forged by people's will.
Re: (Score:3)
Tetris doesn't need coverage tool to test you. Everything about you.
So what you're saying is...
In Soviet Russia, Tetris game tests you!
Nice advertisement (Score:5, Insightful)
From a company promoting automated WCET analysis. Hah!
Re:Nice advertisement (Score:5, Insightful)
Normal users don't test all cases of a game.
Maybe not, but as soon as you tell yourself, "I don't need to test this code, a normal user will never get to it," you can be certain that a user will find a way to break it. The Gods of Eternity will laugh at you.
Re: (Score:2)
This is just more marketing spam that's found its way onto Slashdot.
Re: (Score:3)
Submitter here. It's "marketing spam" in the sense that it's based on something I did at work. I don't see why this is a problem. Many articles linked from this site involve something that someone did at work.
I thought it was interesting that, though this is a really simple game, you can't test it effectively just by playing it. You have to deliberately seek out all of special cases. That's a fact about virtually all software, but it's not an intuitive one, and that's what the article is about.
Perl-standard line length (Score:2)
Any language that doesn't require carriage return + linefeed can do anything in one line.
And Basic comes with a ton of library functions that make things easier to do. No need to initialize memory or the display, set up graphics or keyboard interrupts, etc.
Re: (Score:2)
And let's not even get started talking about line numbers.
Re: (Score:2)
I've seen C64 basic. One line of code can be two lines on the screen. Maybe more than two lines when you realize you can compress names like POKE into the two-character acronym (second being shifted) and using it in list would happily decompress to something that can't be typed within the 2 screen-line limit.
BASIC on the Atari 800 and its descendants exhibited the same behaviour with respect to abbreviations and its three-screen-line limit on a single BASIC line.
Atari User magazine had a feature called "five liners" for very short programs. Many of the more elaborate ones pushed this as far as it would go by *requiring* them to be entered using abbreviations in order to fit this three-screen-line limit. IIRC most of these would be expanded upon processing, often taking them over the limit.
Re:Perl-standard line length (Score:4, Informative)
Re: (Score:2)
Well - it does use quite a nifty trick to implement a subroutine, given that you can only GOSUB a line number, and there's only one line number.
Re: (Score:2)
Any language that doesn't require carriage return + linefeed can do anything in one line.
Exactly... In fact there are a lot of very complicated one-line JavaScript libraries; just download one of those .min.js files :)
Seriously, a readable 30 line implementation would have been more impressive...
Re: (Score:2)
Maybe you'd prefer a binary version at 256 bytes?
http://256bytes.untergrund.net... [untergrund.net]
Re: (Score:2)
This is no time to read or to drink, sir.
replacing line feeds with terminators is not a 1-l (Score:5, Informative)
In BBC BASIC, a colon is a statement terminator, much like a semicolon in languages with C-style syntax. The linked code is therefore not a one-liner by any meaningful definition of the term. One could replace all of the linefeeds in Linux kernel source with semicolons and other appropriate terminators. That wouldn't make the kernel a one-liner.
Re: (Score:3)
In BBC BASIC, a colon is a statement terminator, much like a semicolon in languages with C-style syntax. The linked code is therefore not a one-liner by any meaningful definition of the term. One could replace all of the linefeeds in Linux kernel source with semicolons and other appropriate terminators. That wouldn't make the kernel a one-liner.
If it's a one-line program, why is it more than one line?
A line of code is not the same as a line on a screen. The program won't fit on one line on most screens, but it will fit on one line of BBC BASIC, which fortunately has a well defined, but short maximum length of 256 characters. The whole program is 257 bytes as there is an extra byte to mark the end of the program.
from: http://survex.com/~olly/rheoli... [survex.com]
Re: (Score:3)
cats are mammals, not all mammals are cats (Score:3)
A line can be no more than 256 characters. That doesn't mean that the following is one line:
foreach mammal in pets {
print mammal " is a mammal"
if (is_cat(mammal)) {
print " and also a cat"
}
}
Just because all cats are mammals doesn't mean that all mammals are cats.
Just because all one-liners are less than 257 characters doesn't mean that all programs less than 257 chara
Re: (Score:2)
Re: (Score:2)
> Yes, because line breaks in BASIC are significant. That means, of course, that they actually *do* something;
Specifically, they *do* approximately the same thing as colons, they are generally synonymous with colons, of which this program has plenty.
Re: cats are mammals, not all mammals are cats (Score:2)
Yes, I see how the two are similar enough to be interchangeable. Maybe, if you don't want your program to run.
Re: (Score:2)
Except they don't.
Lines are an important concept of the language - they are referenced by gotos and gosubs. That program is one line. It is more than one statement, but the claim wasn't a "one statement program".
Re: (Score:2)
Re: (Score:2)
While I totally agree with the one-line bullshit, I'd just like to point out that, in fact, you can't collapse the Linux kernel into a one-line statement this way: parts of the code use macros, and they will fail if you put them on the same line.
Re: (Score:2)
So you run cc -E first then
Re: (Score:3)
Re: (Score:2)
While I totally agree with the one-line bullshit, I'd just like to point out that, in fact, you can't collapse the Linux kernel into a one-line statement this way: parts of the code use macros, and they will fail if you put them on the same line.
#include "kernel.c"
That's a one liner!
Re: (Score:2)
What do you expect from a one-liner? Tetris()? A Perl one-liner has semicolons as well.
Re: (Score:2)
Ignoratio elenchi
Re: (Score:3)
Re: (Score:3)
46 lines (statements), actually
No, statements are not the same as lines. Lines have real semantic significance in BBC Basic, in a few different ways: for one, GOSUB-type subroutines can only start at the start of a line (because that's where the line number is), and you also can't terminate an "if" without starting a new line. That (plus the 256-byte limit) makes writing one-liners in the language more of a challenge than in other languages where line breaks genuinely aren't significant.
One Line (Score:1)
Perlers are so jealous right now; they need 2 lines.
Re:One Line (Score:4, Informative)
It's simple enough to implement in a shell script. At least three or four of us have done it over the years.
Re: (Score:1)
I guess this ain't the kind of joke that works on Slashdot.
Re: (Score:2)
It's simple enough to implement in a shell script. At least three or four of us have done it over the years.
True. Here is one example: http://miria.homelinuxserver.o... [homelinuxserver.org]
Infomercial for a code coverage tool? (Score:5, Interesting)
So at some point you reach a point of diminishing returns. It might not be worth making sure every line got tested when there are procedures that have a bug that happens in one in a billion calls. My philosophy is, "Perfection is the goal. Doing better than the last release is the shipping criterion".
Re: (Score:2)
While everything you just said makes sense, nothing beats good testing, and like any tool, this is another one. All that code coverage does is let you focus on what has not been touched, then you'll be able to test it somehow. Also, I could create a similar problem, just like the one you wrote about above and would happen more often. I'm thinking of traffic management in the air. Or maybe even traffic management on land.
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
All that code coverage does is let you focus on what has not been touched, then you'll be able to test it somehow.
The trouble is that what you really need to test isn't how much coverage of the code you've got, but how much coverage of the possible input space. More specifically, you ideally want to know that each distinct combination of inputs that will cause a different type of behaviour in the code has been considered.
Of course, this is typically an implausibly difficult problem to solve in real world projects. To see why, consider that this article proudly claimed that finding the special case of clearing 4 lines t
Re: (Score:2)
Re: (Score:2)
You're right, this sort of testing should really be about covering the range of possible inputs. But that is typically impossible. There are too many possible scenarios. You need a practical substitute.
I agree that statement coverage is quite crude, it tells you very little about the data being processed. There is more detailed information being produced here - "MC/DC coverage" - which does tell you whether conditional statements have been thoroughly exercised, because each possible reason for the "true" o
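For readers unfamiliar with MC/DC, the Python sketch below shows the core idea for a two-condition decision: each condition needs a pair of tests that differ only in that condition and flip the outcome. It is a simplified "unique cause" check written for illustration; `decision`, `has_independence_pair` and `satisfies_mcdc` are hypothetical names, not part of any coverage tool.

```python
from itertools import combinations


def decision(a, b):
    """The decision under test: a single two-condition expression."""
    return a and b


def has_independence_pair(tests, index):
    """True if two tests differ only in condition `index` and flip the outcome."""
    for t1, t2 in combinations(tests, 2):
        differs_only_here = all(
            (x != y) == (i == index) for i, (x, y) in enumerate(zip(t1, t2))
        )
        if differs_only_here and decision(*t1) != decision(*t2):
            return True
    return False


def satisfies_mcdc(tests, n_conditions=2):
    """Check that every condition is shown to affect the outcome on its own."""
    return all(has_independence_pair(tests, i) for i in range(n_conditions))


# Both outcomes are exercised, so branch coverage is complete, but MC/DC is not met.
branch_only = [(True, True), (False, False)]
# Three tests are enough to show each condition matters independently.
mcdc_set = [(True, True), (True, False), (False, True)]

print(satisfies_mcdc(branch_only), satisfies_mcdc(mcdc_set))  # prints "False True"
```

The first test set already takes the decision both ways, which is why a branch-level report alone would look complete.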
Re: (Score:2)
FWIW, I agree with almost everything you wrote. I have nothing against coverage tools, and I use them occasionally myself. I just think it's important to have a realistic view of the benefits you do and don't receive.
The only thing I disagree with is your final paragraph, where you talk about safety-critical code. If you really were working on systems where a failure would have catastrophic consequences, I would hope you had a QA process a lot more sophisticated than running a test suite and this kind of co
Re: (Score:2)
If you really were working on systems where a failure would have catastrophic consequences, I would hope you had a QA process a lot more sophisticated than running a test suite and this kind of coverage tool to check for problems!
Oh, certainly! The good news here is that the avionics industry knows this, and in any case, the FAA won't let them cut corners. I don't know exactly how the industry uses our tools, but it's typically in conjunction with lots of manual testing, with the coverage tool capturing d
Re: (Score:2)
You know, I like what you wrote, since it brought up a safety issue I once read about. It was about a plane making a crash landing, and the pilots heard a "GONG" sound. They were never trained for that sound, but they were able to find it in the manual. It seemed that the gong was the sound of everything failing, including the redundant systems. Now that gong sound is in all the simulations.
So I look at it as a tool, a tool to test all the code and see if it works in general for most situations, then test agai
Re: Infomercial for a code coverage tool? (Score:2)
Find better programmers, pay them better, manage the project better to allow time to fix bugs after they've run through QA.
Nonsense -- make your own test suite (Score:5, Insightful)
Has slashdot really become a means for tech companies to inject free advertisement by a simple blog post made to look like real journalism?
Re: (Score:3)
Defining all of the possible scenarios is often a lot harder than it looks. There aren't too many UI coders out there that haven't said "yeah, we need to fix it, but what made the user decide to do that?" at one time or another.
Speaking of UI's (Score:2)
Defining all of the possible scenarios is often a lot harder than it looks. There aren't too many UI coders out there that haven't said "yeah, we need to fix it, but what made the user decide to do that?" at one time or another.
This reminded me of a UI bug I discovered in Steam - if you have 2 monitors, one rotated 90 degrees, but not the primary, and try to maximize steam on that window, bad things happen.
Seeing as how I've seen only like 3 people doing that, most not in a home setting, I don't think it comes up much.
Re: (Score:2)
Add me to the list of those using this configuration in a home setting, unless you need to exclude me for actually having three monitors -- one is a TV that is usually off or being used for other purposes. (Having both AGP and integrated graphics active at the same time is... interesting. I get lots of odd behavior out of it.)
Re: (Score:2)
Why, did you not get enough CueCats and i-Openers? This is hardly the first Slashvertisement, and it's the only one from this company that I've seen.
Re: (Score:2)
Thing is, you need both your own test suite and a coverage test tool. The two work together. The coverage tool tells you if your tests are incomplete, helping you to fix them.
If I were actually testing Tetris I would definitely do it the way you suggest: a pre-arranged sequence of blocks and a pre-programmed series of moves. I'd run the game with that sequence, then look at the coverage data to see if I needed to add anything. Some of the process can be automated, but the test cases themselves have to be m
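The scripted approach described above might look like this for one small piece of game logic. This is a Python sketch rather than the article's code: the board representation (a list of rows, 1 for a filled cell) and the `clear_full_rows` function are assumptions made for illustration.

```python
def clear_full_rows(board):
    """Remove completely filled rows and pad the top with empty rows.

    Returns (new_board, rows_cleared).
    """
    width = len(board[0])
    kept = [row for row in board if not all(row)]
    cleared = len(board) - len(kept)
    padding = [[0] * width for _ in range(cleared)]
    return padding + kept, cleared


def test_four_row_clear():
    # Pre-arranged position: the bottom four rows are already full,
    # a situation that rarely comes up in a few minutes of casual play.
    board = [[0, 0, 0, 0], [0, 1, 0, 0]] + [[1, 1, 1, 1] for _ in range(4)]
    new_board, cleared = clear_full_rows(board)
    assert cleared == 4
    assert len(new_board) == 6          # board height is preserved
    assert new_board[-1] == [0, 1, 0, 0]  # the partial row drops to the bottom


test_four_row_clear()
```

Setting up the position directly, instead of playing until it happens, is what makes the rare case cheap to reach; the coverage report then only has to confirm that the four-row path actually ran.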
Re: (Score:2)
Beware coverage tools (Score:3, Insightful)
The article makes it sound like coverage tools help! If you're not familiar with them, they tell you which bits of code have been run, not how many of the N cases of that code have been executed.
So the code might fail with a particular combination of inputs, but the coverage tool is more interested in which bits of the code have been executed.
It's one of these tools and metrics that non-technical managers use to substitute for an ability to read code.
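A minimal Python illustration of that gap between code coverage and input coverage; the `mean` function is a made-up example, not taken from the article.

```python
def mean(values):
    """Arithmetic mean; every line here runs on any call at all."""
    return sum(values) / len(values)


# This single check executes 100% of the lines in mean()...
assert mean([2, 4]) == 3.0

# ...yet an input nobody tried still breaks it.
try:
    mean([])
    failed = False
except ZeroDivisionError:
    failed = True

print(failed)  # prints "True"
```

A line-based report would show `mean` as fully covered after the first assertion, which is exactly the case the comment above warns about.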
Re: (Score:2)
This is quite true, but at least it's something that can help. Programmers already make enough mistakes, so any help is welcome. Whether that help is worth the price tag in dollars and time has to be determined on an individual case by case basis.
Re: (Score:1)
All of my best testers have been people who use the product to do the job it was intended for, and that means they're testing the same common pieces of code through every use case. The coverage tool is simply the wrong metric: it assumes one use case = one piece of code, and treats code as 'covered' if it's been run, because it doesn't know about the use cases.
Worse, the testers end up trying to run obscure code simply to get the right test metric. So all the belt-and-braces checks I put in to prevent future
Re: (Score:2)
Validation is way more important than writing code. Coding is grunt work that literally anyone can do. There is a huge demand for programmers, and very few are "good" programmers, 90% are just grunts who will never get any better, and that's life due to demand. So you need validation. I wrote and managed RTL development for 15 years at Intel and code coverage is simply mission critical. No other way around it.
If you think being able to "read code" is enough to see all the corner cases, you're either very yo
piece of cake (Score:2)
How I know it's an ad (Score:2)
As I explain in this article, the game is filled with special cases that rarely occur in normal play, and these can only be easily found with the help of a coverage tool.
This doesn't seem like news to me! I'm shocked and appalled!
Comment removed (Score:4, Interesting)
Re: (Score:3)
Sort of spammy, also not convincing (Score:2)
So, on the one hand, it's sort of a spammy/advertisey thing to begin with.
On the other hand, I'm also not entirely convinced that the code coverage tool really solves the problem, because a given line of code can have different effects under different circumstances.
If you read in an address from a text stream, and then write to the memory location denoted, that's just one line of code executing that dereferences the pointer, but good luck determining what it does on all future invocations based on watching
Re:Sort of spammy, also not convincing (Score:4, Insightful)
Code coverage tools will not tell you if your tests are sufficient. They simply tell you what lines of code were hit. They don't tell you whether or not the line of code was hit while doing a meaningful test. In fact, it is trivially easy to write "tests" that exercise 100% of the code but have no expectations at all.
What code coverage tools tell you is what code you definitely haven't tested. If you haven't run that line of code in your tests, you definitely haven't tested it. This is useful information, but not essential if you have a good team. My current team is quite comfortable writing tests. We do most things TDD and without trying hard our average code coverage is 96%. I occasionally wander through the other 4% to see if it is worth testing and most of the time it isn't. Occasionally I will find the odd piece of logic that was jammed in hurriedly without tests, but on our team it is quite rare. On the other hand, I have worked on teams that were not comfortable writing tests and mostly wrote them after writing production code. On those teams we would get about 75% test coverage with holes you could drive a bus through. A code coverage tool was very useful for educating people on the need to improve the way they wrote tests.
I feel very confident I could TDD my way through a tetris implementation and get 100% code coverage without undue effort. I don't think I would find all of the corner cases without help, though. A code coverage tool wouldn't help me in that instance.
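The point about "tests" with no expectations can be shown in a few lines of Python. The scoring rule and its bug are invented for the example; the function names are hypothetical.

```python
def score_for_lines(lines_cleared):
    """Hypothetical scoring rule with a deliberate bug in the four-line case."""
    if lines_cleared == 4:
        return 80  # bug: the intended value is 800
    return lines_cleared * 100


def coverage_only_test():
    # Executes both branches, so line coverage is 100%, but checks nothing.
    score_for_lines(1)
    score_for_lines(4)


def real_test():
    assert score_for_lines(1) == 100
    assert score_for_lines(4) == 800  # fails, exposing the bug


coverage_only_test()  # passes silently despite the bug

try:
    real_test()
    bug_found = False
except AssertionError:
    bug_found = True

print(bug_found)  # prints "True"
```

Both test functions produce the same coverage figure; only the one with assertions says anything about correctness.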
Don't see much BBC BASIC these days! (Score:2)
My dad and I wrote a BBC BASIC interpreter for PC-DOS. I'll have to dig it out and see if I can get this working in it.
If it's hard to test (Score:2)
Re: (Score:2)
External interactions (DBs, UI, etc) and highly performant code (embedded systems, kernels, etc) are where I wouldn't instinctively look to use TDD.
The benefits around testing are massive in themselves - you can set up automated unit tests that assure you not just the code coverage, but also the broad range of inputs that might cause different behaviours within that code.
The design benefits however are significant too, and worthwhile in their own right. I find that TDD leads to code that's easier to read, u