> To make things even worse, the community that has most thoroughly embraced them are compiler authors, who many programmers think of as being an impossibly skilled elite
The article's approach seems super ad-hoc, leaving you to have to think hard, do all the work, and make all the mistakes.
If you were to go down the other path, you might try dividing and conquering the problem. An arbitrary Pair<A,B> is trivially constructed from an arbitrary A and an arbitrary B. So if you can generate a string, and a number, you could generate a User full of number and string fields. If your generate function accepts a number describing how complex a string to make, then you can also choose how complicated to make your User. That's all shrinking needs to be: repeatedly trying smaller Ns while the problem still happens (the problem being one of your unit tests, not an additional "interestingness" test you need to write).
You're probably way more likely to hit boundary cases by using the structure of the input and making interesting variations that way, rather than hoping you can permute the right bytes from the CLI.
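A rough sketch of that size-driven idea in Python, to make it concrete. Everything here is hypothetical and for illustration only: the `User` fields, the generator names, `shrink_by_size`, and the `name_is_short` property standing in for "one of your unit tests". It assumes a failure at size N usually also shows up at size N-1 until the real threshold is reached, which is not guaranteed in general.

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


def gen_string(rng, size):
    # `size` bounds how long (how "complex") the string may be.
    return "".join(rng.choice("abc ") for _ in range(rng.randint(0, size)))


def gen_int(rng, size):
    return rng.randint(-size, size)


def gen_user(rng, size):
    # A User is just a product of smaller generators sharing one size budget.
    return User(name=gen_string(rng, size), age=gen_int(rng, size))


def first_failure(prop, gen, size, tries=200, seed=0):
    """Return the first generated value that violates `prop`, or None."""
    rng = random.Random(seed)
    for _ in range(tries):
        value = gen(rng, size)
        if not prop(value):
            return value
    return None


def shrink_by_size(prop, gen, start_size, tries=200, seed=0):
    """Keep lowering the size while some generated value still fails."""
    best = None
    size = start_size
    while size >= 0:
        failure = first_failure(prop, gen, size, tries, seed)
        if failure is None:
            break  # no failure found at this size; stop at the last one
        best = (size, failure)
        size -= 1
    return best


def name_is_short(user):
    # Stand-in for an ordinary unit-test assertion (assumed, not from the article).
    return len(user.name) < 3


result = shrink_by_size(name_is_short, gen_user, start_size=20)
print(result)
```

The same test predicate is used for finding the failure and for shrinking it, which is the point being made: no separate interestingness script is needed.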
For more than 20 years I've been doing automatic test input reduction as part of testing Common Lisp compilers. The reduction is on randomly generated inputs, but they are structured in such a way that reduction always gives a valid program that should (in the absence of compiler errors) not signal an error.
It's a tremendously economical way to test compilers. For a modest and finite investment in testing infrastructure I get an unlimited number of tests. Over the years I've run many billions of test inputs on various Common Lisp implementations, although I'm mostly focusing on sbcl these days. When a bug is found, the input quickly reduces to something small that usually immediately tells the developers where the problem is (usually but not always something introduced recently).
I also have a testing harness that cobbles together usually erroneous Lisp code and sees if the compiler blows up (the sbcl compiler as designed must never throw an error condition even on erroneous input.) This exploits a corpus of public Common Lisp code, combining and mutating the code in various ways.
Dustmite is a fantastic tool for finding a bug in your program, by removing parts of the code until the result is the bug.
https://dlang.org/blog/2020/04/13/dustmite-the-general-purpo...
Created by Vladimir Panteleev
Property-based testing frameworks will often do test case reduction as well (called shrinking).
Shrink Ray, described in the article, is developed by D.R. MacIver, who also developed Hypothesis. I remember when it was announced a while back but had forgotten about it. I guess I have something to play with tonight.
These days, he’s also working on Hegel - bringing test case reduction and PBT to more languages.
https://hegel.dev
Brilliant tools, well worth investigating for any system-critical applications. They don't seem to get enough attention outside of the FP community.
I read the first part of this article, then gave up and Googled "Test-case Reducers".
I'm not sure if that's an article failure (that I didn't want to read a whole ton of text and C code details), or a success (as it got me interested in the topic). I guess both?
I read the whole article, and I am still confused. I get that test case reducers find the smallest error-causing string. I don't understand why that is particularly valuable.
Also, do test case reducers work on integers or other numbers? What about reducing some other complexity? Is this for developing unit tests or just debugging?
Making the smallest test case that reproduces some bug is hugely valuable when debugging complex systems, especially if you have a weird heisenbug that is hard to manifest reliably. Having a small reproducible case massively narrows the scope of the search for the bug.
Similarly, narrowing each test to the smallest case that reproduces a particular behaviour means you're only testing one very targeted thing. That makes the test suite faster, and it makes broken tests easier to fix because each one exercises a very narrow path.
A reduced test case means you run less code to process the test case, which means your breakpoints trigger less frequently (and the remaining breakpoint triggers are more likely to be relevant to the actual bug). It also means all your debugging steps are likely to run faster and produce less data to sort through. Your log files will be shorter and easier to read/grep, etc.
Imagine being handed a sheet of 10 equations and being told "1 of these equations is wrong." Now imagine that someone came in and erased 8 of the correct equations - they just saved you a bunch of time.
> I didn't want to read a whole ton of text and C code details
There's no C in there? It seems to be Python and shell scripts.
> I read the first part of this article, then gave up and Googled "Test-case Reducers".
It's answered pretty early on:
>> Test-case reducers try to reduce the length of an input
If that still doesn't answer the question, try this extension:
>> Test-case reducers try to reduce the length of an [error causing or interesting] input
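To make that definition concrete, here is a minimal sketch of a line-based reducer in Python. It is a toy greedy chunk-deletion loop, not how Shrink Ray, C-Reduce, or any other real tool is implemented, and the names `reduce_lines` and `is_interesting` are made up for this example. The "interestingness test" is just a function that returns True when the candidate still shows the behaviour you care about.

```python
def reduce_lines(lines, is_interesting):
    """Greedily delete chunks of lines while the input stays interesting."""
    assert is_interesting(lines), "the starting input must be interesting"
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if is_interesting(candidate):
                lines = candidate  # keep the deletion and retry at the same index
            else:
                i += chunk  # this chunk is needed; move on to the next one
        chunk //= 2  # try finer-grained deletions
    return lines


program = ["a = 1", "b = 2", "print(a)", "c = a / 0", "d = 4"]


def is_interesting(lines):
    # Hypothetical check: the bug needs both the definition of `a`
    # and the division by zero to still be present.
    return "a = 1" in lines and "c = a / 0" in lines


reduced = reduce_lines(program, is_interesting)
print(reduced)
```

In practice the interestingness test is usually an external script that runs the program under test and reports whether the bug still reproduces, and real reducers use far smarter passes than deleting runs of lines.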
I have a similar tool to Shrink Ray, called [bonsai](https://github.com/nnunley/bonsai). I designed it to let me inline and reduce code, both for simplifying single-file examples and for working across multiple files. It uses Tree-Sitter for syntax awareness, and the [Perses algorithm](https://doi.org/10.1109/ICSE.2018.00046) as the methodology for simplification.
I'd love to get some feedback if anyone's interested.
This looks interesting, and definitely useful for non-C/Python languages, which existing reducers I know of mostly don't have explicit support for! I can't get it to build though (I've filed an issue).
I was also wondering: is there a UI while reduction happens? I've found Shrink Ray's UI improvements in the last year to be much more useful than I first expected: not just because it gives me something to look at, but because it really helps me understand if reduction is on the right path or not. [Some of the new Shrink Ray extras like being able to rewind reduction to a past point and to skip passes are also really useful too.]
I've only ever known about these through compilers, very cool.
On one project, through a variety of circumstances, dead code elimination was straight up not working, and we couldn't figure out why at the time, but we still wanted to show the theoretical improvement of some approach (we did spend a whole week chasing down the root cause afterwards, which was maybe worth it in hindsight...).
We were doing it by hand at one point, but someone suggested using CReduce for shrinking the code. Definitely was an interesting test-iterate loop...
Nice share. Increasingly I am thinking about ways to improve verification ("interestingness tests"), ever since reading https://www.jasonwei.net/blog/asymmetry-of-verification-and-...