What are the ideas for a new standard test SRFI? #3
Comments
Here are some thoughts by mdhughes at https://mdhughes.tech/2020/02/27/scheme-test-unit/ |
There are two situations:
My concern is mainly the latter. Gauche supports srfi-64 and srfi-78 (and We can have a sophisticated test framework for the former, but I'd like it to be noted that it should not be used for the latter kind of tests. |
Thanks for the links.
The most important thing would be to separate the test runner from the test definition framework, so that tests defined using any framework can be run using any runner. Test runners can be quite complex, and people don't agree on which one is best. Definition frameworks are much simpler, and can be easily ported to new Schemes.
The currently dominant frameworks are SRFI 64 (A Scheme API for test suites, originally from Kawa) and Chicken's test egg, which was also adapted by Chibi. Chicken test has an almost identical API to SRFI 64. So lots of Scheme tests are using almost the same syntax already.
I'd like to publish a SRFI that contains:
This definition framework could be supported by any existing test runner. |
For SRFI testing, it would be nice to get the files in this I think being able to copy the files with no changes is important. Then it's so easy that people will actually do it. If small changes are needed here and there, it will become a burden. |
On Tue, Aug 4, 2020 at 12:05 AM amirouche ***@***.***> wrote:
I read here and there many people have ideas about a new test SRFI.
What are your ideas?
ref: https://srfi.schemers.org/srfi-64/
ref: https://srfi.schemers.org/srfi-78/
* I use my own test framework, attached, which was inspired by the
original version of JUnit. Its important features are:
* Every test is lexically enclosed.
* Every test or group of tests has a name.
* Tests are first-class objects.
* One can run tests in a mode where only failures are reported. This
way, one doesn't have to wade through output in order to figure out
whether everything passed, or what failed.
* It's possible to run individual tests or test groups or all defined
tests.
* Test groups can be defined concisely.
* Tests only pass if they return the symbol passed. That makes it
harder for buggy tests to appear to pass when they actually never
ran.
* The assert macro uses simple heuristics to display the values that
were passed to it. This makes it less necessary to have a family of
assert macros for different purposes.
* There is an assert-signals-condition macro to test that an
expression causes a particular condition to be raised.
* Failure reports show the captured continuation of the failing test.
This continuation can be used with MIT Scheme's debug to walk the
stack of the failure, examining variables, etc. This is
particularly useful when an unexpected condition is raised during
the test.
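As a rough illustration of two points in the list above (tests as first-class objects, and a test passing only when it returns the symbol `passed`), here is a minimal R7RS sketch. It is not the attached framework: the names `make-unit-test`, `run-unit-test`, and `failing-tests` are hypothetical, and the failure report is reduced to a plain list rather than a captured continuation.

```scheme
(import (scheme base))

;; All names here are hypothetical; this is not the attached unit-test code.
;; A test is a first-class record holding a name and a thunk.
(define-record-type <unit-test>
  (make-unit-test name thunk)
  unit-test?
  (name unit-test-name)
  (thunk unit-test-thunk))

;; Run one test. The result is the symbol `passed` only when the thunk
;; itself returns `passed`. Any other return value, or any raised
;; condition, is reported as (failed <value-or-condition>).
(define (run-unit-test test)
  (guard (condition (else (list 'failed condition)))
    (let ((result ((unit-test-thunk test))))
      (if (eq? result 'passed)
          'passed
          (list 'failed result)))))

;; Run every test but return only the failures, as (name failed <detail>).
(define (failing-tests tests)
  (let loop ((tests tests) (failures '()))
    (if (null? tests)
        (reverse failures)
        (let ((result (run-unit-test (car tests))))
          (loop (cdr tests)
                (if (eq? result 'passed)
                    failures
                    (cons (cons (unit-test-name (car tests)) result)
                          failures)))))))

(define example-tests
  (list (make-unit-test 'ok (lambda () 'passed))
        ;; Returns #t instead of `passed`, so it must not count as a pass.
        (make-unit-test 'never-asserted (lambda () #t))
        (make-unit-test 'raises (lambda () (raise 'boom)))))

(failing-tests example-tests)
```

In this sketch the last expression reports only `never-asserted` and `raises`. Because the tests are ordinary values in a list, running a single one is just `(run-unit-test (car example-tests))`.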
Here's a transcript of using the `assert` macro. Note how the arguments
to assert are displayed in the failure message. That way, it's easy to
read most test failure reports to see exactly what went wrong.
1 ]=> (let ((x '(a b c))) (assert (equal? x '(a b c))))
;Value: passed
1 ]=> (let ((x '(a b c))) (assert (equal? x '(a b c d))))
;Assertion failed: (equal? x (quote ...)) (equal? (a b c) (a b c d))
;To continue, call RESTART with an option number:
; (RESTART 1) => Return to read-eval-print level 1.
2 error> C-c C-c
Interrupt option (? for help): C-c C-c
;Quit!
1 ]=>
Here's an example of using the `assert-signals-condition` macro:
(define (assert-singleton list)
  (assert (and (pair? list)
               (null? (cdr list)))
          "List must contain exactly one element."))
(define-test (assert-singleton)
  (assert-singleton '(x))
  (let ((c condition-type:simple-error))
    (assert-signals-condition c (assert-singleton '()))
    (assert-signals-condition c (assert-singleton '(x y)))
    (assert-signals-condition c (assert-singleton 5))))
Here's a procedure definition and a single named test:
(define (singleton-list? value)
  "Return true iff value is a list of length one."
  (and (pair? value)
       (null? (cdr value))))
(define-test (singleton-list?)
  (assert (not (singleton-list? '())))
  (assert (singleton-list? '(1)))
  (assert (not (singleton-list? '(1 2)))))
Here's a procedure definition and a group of tests defined together.
Note how the `define-test-group` macro takes a name, a procedure to be
run on each set of data to be tested, and a list of rows of data to pass
to the procedure.
(define (length= lst size)
  "Return true iff `lst' has length `size', otherwise #f."
  (let next ((count size)
             (elements lst))
    (cond ((null? elements) (zero? count))
          ((zero? count) #f)
          (else (next (-1+ count) (cdr elements))))))
(define-test-group (length=)
  (lambda (expected lst size) (assert (eq? expected (length= lst size))))
  '(#t () 0)
  '(#t (a b c) 3)
  '(#f (a b c) 4)
  '(#t (a b c) x)) ; This will fail.
Below is a transcript of running these tests. Note how `run-single-test`
shows the results, pass or fail, of the test whose name was passed to
it, whereas `show-failing-tests` runs all defined tests and only shows
the results of the failing ones. So `show-failing-tests` is most useful
for finding out whether some test is failing and which test it is,
whereas `run-single-test` is most useful for repeatedly running a
specific test while debugging it or while refactoring the code that it
tests.
1 ]=> (run-single-test '(length=))
(length= #t () 0) PASSED.
(length= #t (a b c) 3) PASSED.
(length= #f (a b c) 4) PASSED.
#[unit-test 5586 (length= #t (a b c) x)] FAILED
#[condition 5587 "wrong-type-argument"]
(type #[condition-type 5588 "wrong-type-argument"])
(continuation #[continuation 5589])
(restarts (#[restart 5590 abort]))
(field-values #(x #f integer-zero? 0))
(properties #[|1d-table| 5591])
3 of 4 tests passed.
;Unspecified return value
1 ]=> (show-failing-tests)
#[unit-test 5586 (length= #t (a b c) x)] FAILED
#[condition 5592 "wrong-type-argument"]
(type #[condition-type 5588 "wrong-type-argument"])
(continuation #[continuation 5593])
(restarts (#[restart 5594 abort]))
(field-values #(x #f integer-zero? 0))
(properties #[|1d-table| 5595])
;Unspecified return value
1 ]=> (debug #@5593)
There are 18 subproblems on the stack.
Subproblem level: 0 (this is the lowest subproblem level)
Expression (from stack):
(integer-zero? 'x)
There is no current environment.
There is no execution history for this subproblem.
You are now in the debugger. Type q to quit, ? for commands.
2 debug> u
Subproblem level: 1
Compiled code expression unknown
#[compiled-return-address 5596 ("list" #x3a) #x14f #x1f34e94]
Environment created by the procedure: NEXT
applied to: (x (a b c))
There is no execution history for this subproblem.
2 debug> q
;Unspecified return value
1 ]=>
[unit-test.zip](https://github.com/srfi-explorations/srfi-test/files/5105331/unit-test.zip)
|
I want to mention something off-topic but which I think will be of interest to readers of this thread. Every Lisper, and probably every programmer who uses a REPL, does informal testing using the REPL. However, such testing does not generate a reusable artifact: you can't use it to find regressions, for example, except by accident. The idea here is to enhance Scheme REPLs that support REPL commands to help generate such artifacts. A REPL command is something you can input that by convention is not just another Scheme expression to evaluate. For example, in Chicken the REPL commands take the form The idea is that as you noodle around in the REPL, a test script is being created. You start making such a script with a ,testscript command that specifies the pathname of the test script. Then you go along doing your informal tests like this:
This causes a test case to be written to the file which tests that To this we can add |
+1
IMO some further re-engineering is needed. Most of the I think the Chicken and Chibi test systems' use of a parameter `current-test-comparator` is the Right Thing. The equality predicate of this comparator is used to decide if the expected and actual values are the same. The default comparator's equality predicate is exposed as That said, all we really need is And that's all. |
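To make the parameter idea in the comment above concrete, here is a minimal R7RS sketch. It is an assumption-laden simplification: the hypothetical parameter `current-test-equal` holds a bare equality predicate instead of a full comparator object, and `test-values-match?` is an invented helper, not part of the Chicken or Chibi libraries.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical stand-in for the parameter discussed above. It holds a
;; plain equality predicate rather than a comparator object.
(define current-test-equal (make-parameter equal?))

;; Decide whether expected and actual agree, using whichever predicate
;; is current at the time of the call.
(define (test-values-match? expected actual)
  ((current-test-equal) expected actual))

;; With the default predicate, case differences make the strings unequal.
(define default-result (test-values-match? "abc" "ABC"))

;; Within parameterize, the same check uses a case-insensitive predicate.
(define relaxed-result
  (parameterize ((current-test-equal string-ci=?))
    (test-values-match? "abc" "ABC")))
```

The appeal of this design is that a whole group of checks can switch equality rules with one `parameterize`, instead of needing a separate assertion form per kind of comparison.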
I like @arthurgleckler's test framework 😃 The problem with SRFI-64 is that it does not allow running a single test or group of tests separately. That may not be necessary for the small-ish test suite of a SRFI, but in a real-world scenario where tests take more than a few seconds or minutes to run, it becomes necessary to run one test at a time. Another painful thing about SRFI-64: since the tests are not first-class, it is not possible to run a test in a REPL. I think we should have something along the lines of @arthurgleckler's test framework. |
It seems to me having two or more ways to define tests is not a good thing. |
The CHICKEN (and probably Chibi) test library has a way to filter tests using an environment variable. However, as far as I know this just filters the output and still runs all the other tests. I don't know if there's an easy way to support this so that it actually runs only the selected tests? |
Making tests first-class would probably solve the filtering problem too |
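A small R7RS sketch of why first-class tests make filtering straightforward: if a test is just a value carrying a name and a thunk, selecting tests happens before anything runs, so unselected thunks are never called. Everything here (`make-test`, `select-tests`, `run-tests`, the name-as-list convention, and the `ran` log) is hypothetical and only for illustration.

```scheme
(import (scheme base))

;; Hypothetical representation: a test is a pair of a name (a list of
;; symbols, outermost group first) and a thunk returning #t or #f.
(define (make-test name thunk) (cons name thunk))
(define (test-name test) (car test))
(define (test-thunk test) (cdr test))

;; Log of test names whose thunks were actually called.
(define ran '())
(define (note! name) (set! ran (cons name ran)))

(define all-tests
  (list (make-test '(list length)
                   (lambda ()
                     (note! '(list length))
                     (= 3 (length '(a b c)))))
        (make-test '(string length)
                   (lambda ()
                     (note! '(string length))
                     (= 3 (string-length "abc"))))))

;; Keep only the tests whose name satisfies keep?. No thunk is called here.
(define (select-tests keep? tests)
  (let loop ((tests tests) (selected '()))
    (cond ((null? tests) (reverse selected))
          ((keep? (test-name (car tests)))
           (loop (cdr tests) (cons (car tests) selected)))
          (else (loop (cdr tests) selected)))))

;; Run the given tests and pair each name with its result.
(define (run-tests tests)
  (map (lambda (test) (cons (test-name test) ((test-thunk test))))
       tests))

;; Run only the tests in the `string` group.
(define results
  (run-tests (select-tests (lambda (name) (eq? (car name) 'string))
                           all-tests)))
```

After this runs, `ran` contains only `(string length)`, which shows that the `list` test was skipped rather than run and hidden.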
OK, let's do it. A SRFI for reflection on the available test suites and test cases, and for making new ones. Kind of like a WSGI-style meeting point that sits between test definition frameworks and test runners. All frameworks and runners could be implemented against this interface. |
We should keep all the strictly runner-related concepts out of it though :) That's the stuff where complexity comes from. |
This doesn't filter output - it only runs the selected tests. It can also be controlled from Scheme via parameters, the init values from env vars are for convenience. |