TL;DR
-
rasa test stops mid-test; console dump below.
Details
-
I have a new (not upgraded) small-talk bot with 90 intents, 90 stories, and an nlu.yml file that is 9977 lines long. The model trains with no issues, other than taking about 57 minutes, and rasa shell works: the bot responds properly when I play around with it.
I've created 6 simple tests to get started, all with a single user phrase, intent, and action, like this:
- story: Greeting 1
  steps:
  - user: |
    intent: greet
  - action: utter_greet
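Written out in full, each of these stories looks roughly like the sketch below, with my actual user phrase where the "hello there" placeholder is (the placeholder is not the exact text from my file):
- story: Greeting 1
  steps:
  - user: |
      hello there
    intent: greet
  - action: utter_greet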
When I run my command, I get a few odd errors. The results directory is generated with the expected files, but the test just stops when performing the Entity evaluation:
rasa test --stories tests/small_talk_tests.yml --fail-on-prediction-errors
2021-01-13 08:46:54 INFO rasa.model - Loading model models/20210107-172420.tar.gz...
2021-01-13 08:47:00 INFO rasa.core.test - Evaluating 0 stories
Progress:
0it [00:00, ?it/s]
2021-01-13 08:47:00 INFO rasa.core.test - Finished collecting predictions.
2021-01-13 08:47:00 WARNING rasa.test - No labels to evaluate. Skip evaluation.
2021-01-13 08:47:00 INFO rasa.core.test - Evaluation Results on END-TO-END level:
2021-01-13 08:47:00 INFO rasa.core.test - Correct: 0 / 0
2021-01-13 08:47:00 INFO rasa.core.test - F1-Score: 0.000
2021-01-13 08:47:00 INFO rasa.core.test - Precision: 0.000
2021-01-13 08:47:00 INFO rasa.core.test - Accuracy: 0.000
2021-01-13 08:47:00 INFO rasa.core.test - In-data fraction: 0
2021-01-13 08:47:00 WARNING rasa.test - No labels to evaluate. Skip evaluation.
2021-01-13 08:47:00 INFO rasa.core.test - Stories report saved to results/story_report.json.
2021-01-13 08:47:01 INFO rasa.core.test - Evaluation Results on ACTION level:
2021-01-13 08:47:01 INFO rasa.core.test - Correct: 0 / 0
2021-01-13 08:47:01 INFO rasa.core.test - F1-Score: 0.000
2021-01-13 08:47:01 INFO rasa.core.test - Precision: 0.000
2021-01-13 08:47:01 INFO rasa.core.test - Accuracy: 0.000
2021-01-13 08:47:01 INFO rasa.core.test - In-data fraction: 0
/home/jwheat/anaconda2/envs/rasa2/lib/python3.8/site-packages/rasa/utils/plotting.py:75: UserWarning: Attempting to set identical left == right == -0.5 results in singular transformations; automatically expanding.
plt.imshow(
/home/jwheat/anaconda2/envs/rasa2/lib/python3.8/site-packages/rasa/utils/plotting.py:75: UserWarning: Attempting to set identical bottom == top == -0.5 results in singular transformations; automatically expanding.
plt.imshow(
2021-01-13 08:47:01 INFO rasa.utils.plotting - Confusion matrix, without normalization:
2021-01-13 08:47:02 INFO rasa.model - Loading model models/20210107-172420.tar.gz...
2021-01-13 08:47:05 INFO rasa.nlu.test - Running model for predictions:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9701/9701 [01:02<00:00, 154.45it/s]2021-01-13 08:48:08 INFO rasa.nlu.test - Intent evaluation results:
2021-01-13 08:48:08 INFO rasa.nlu.test - Intent Evaluation: Only considering those 9701 examples that have a defined intent out of 9701 examples.
/home/jwheat/anaconda2/envs/rasa2/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1221: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/jwheat/anaconda2/envs/rasa2/lib/python3.8/site-packages/sklearn/metrics/_classification.py:1221: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
2021-01-13 08:48:08 INFO rasa.nlu.test - Classification report saved to results/intent_report.json.
2021-01-13 08:48:08 INFO rasa.nlu.test - Incorrect intent predictions saved to results/intent_errors.json.
2021-01-13 08:48:08 INFO rasa.utils.plotting - Confusion matrix, without normalization:
[[223 1 0 ... 0 0 0]
[ 0 44 0 ... 0 0 0]
[ 0 0 116 ... 0 0 0]
[ 0 0 0 ... 150 0 0]
[ 0 0 0 ... 0 13 0]
[ 0 0 0 ... 0 0 294]]
2021-01-13 08:48:55 INFO rasa.nlu.test - Entity evaluation results:
It literally stops right there and gives me my cursor back. It never actually runs my story tests. I know this because I set a bogus intent that should make the story fail:
- story: Greeting 1
  steps:
  - user: |
    intent: xxxgreet
  - action: utter_greet
I get the same output with the incorrect xxxgreet intent.
I see a couple of warnings about 'No labels to evaluate', a Python UserWarning, and an UndefinedMetricWarning. I've never seen any of those errors before. This is the first time I've tried testing a 2.x bot outside the normal out-of-the-box mood bot, so maybe I'm missing something.
Here's my version information:
Rasa Version : 2.1.3
Rasa SDK Version : 2.1.2
Rasa X Version : None
Python Version : 3.8.0
Operating System : Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.10
Python Path : /home/jwheat/anaconda2/envs/rasa2/bin/python
Any ideas?
That does look pretty weird. One thing that jumps out at me is this line, though it wouldn't be related to entity extraction:
Evaluating 0 stories
Have you tried running rasa test core?
Alright, since I've never run rasa test core before, I'm not sure whether this completed or not. Here's the output:
rasa test core
2021-01-14 13:51:03.555429: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-01-14 13:51:03.556933: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.2021-01-14 13:51:06.052208: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-01-14 13:51:06.053596: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-01-14 13:51:06.067316: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (LAPTOP-UMLLMSC2): /proc/driver/nvidia/version does not exist
2021-01-14 13:51:06 INFO rasa.model - Loading model models/20210114-134750.tar.gz...
2021-01-14 13:51:07.020210: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library
(oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-01-14 13:51:07.044486: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2208000000 Hz
2021-01-14 13:51:07.055550: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fffeed2c090 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-01-14 13:51:07.057060: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Processed story blocks: 100%|████████████████████████████████████████████████████████████████████████████████| 90/90 [00:00<00:00, 2555.41it/s, # trackers=1]2021-01-14 13:51:13 INFO rasa.core.test - Evaluating 90 stories
Progress:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [00:00<00:00, 106.82it/s]2021-01-14 13:51:14 INFO rasa.core.test - Finished collecting predictions.
2021-01-14 13:51:14 INFO rasa.core.test - Evaluation Results on CONVERSATION level:
2021-01-14 13:51:14 INFO rasa.core.test - Correct: 90 / 90
2021-01-14 13:51:14 INFO rasa.core.test - F1-Score: 1.000
2021-01-14 13:51:14 INFO rasa.core.test - Precision: 1.000
2021-01-14 13:51:14 INFO rasa.core.test - Accuracy: 1.000
2021-01-14 13:51:14 INFO rasa.core.test - In-data fraction: 1
2021-01-14 13:51:14 INFO rasa.core.test - Stories report saved to results/story_report.json.
2021-01-14 13:51:15 INFO rasa.core.test - Evaluation Results on ACTION level:
2021-01-14 13:51:15 INFO rasa.core.test - Correct: 180 / 180
2021-01-14 13:51:15 INFO rasa.core.test - F1-Score: 1.000
2021-01-14 13:51:15 INFO rasa.core.test - Precision: 1.000
2021-01-14 13:51:15 INFO rasa.core.test - Accuracy: 1.000
2021-01-14 13:51:15 INFO rasa.core.test - In-data fraction: 1
2021-01-14 13:51:15 INFO rasa.utils.plotting - Confusion matrix, without normalization:
[[90 0 0 ... 0 0 0]
[ 0 1 0 ... 0 0 0]
[ 0 0 1 ... 0 0 0]
[ 0 0 0 ... 1 0 0]
[ 0 0 0 ... 0 1 0]
[ 0 0 0 ... 0 0 1]]
Failed stories written to 'results/failed_test_stories.yml'
It stops right there and I get my cursor back. My results/failed_test_stories.yml file is empty. Is that good?
You shouldn't have to retrain, but I guess now you've got the new model. It looks like you've just tested on your training stories, which is why you got 90/90 and results/failed_test_stories.yml is empty. Sorry for the confusion; I should have specified that you need to run rasa test core and point it to your test stories, like so:
rasa test core --stories tests/small_talk_tests.yml
Do you have a GPU?
Thank you for looking over this for me; I really appreciate it.
Actually, I did have to retrain. Our models directory is gitignored, and I had been working in another branch, so it was overwritten. No biggie.
I've been fighting that GPU/CUDA warning for a while. I'm unfortunately on Windows 10 in a WSL/Ubuntu terminal and can't seem to get the drivers happy. That's what that mess is about.
I had read that rasa test would automatically pick up the test file(s) if they were named test_*, so I had renamed my file to test_small_talk.yml and figured plain old rasa test core would have nabbed it, but as you pointed out, it didn't.
This time around I specified the file and it looks better, maybe? Does it still stop short? I'm not sure if it is supposed to say X number of stories passed or just stop like it does.
rasa test core --stories tests/test_small_talk.yml
2021-01-14 16:13:55.950223: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-01-14 16:13:55.951730: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.2021-01-14 16:14:03.793422: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-01-14 16:14:03.794344: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-01-14 16:14:03.795136: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (LAPTOP-UMLLMSC2): /proc/driver/nvidia/version does not exist
2021-01-14 16:14:03 INFO rasa.model - Loading model models/20210114-134750.tar.gz...
2021-01-14 16:14:05.080095: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library
(oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-01-14 16:14:05.092811: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2208000000 Hz
2021-01-14 16:14:05.096182: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fffce836600 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-01-14 16:14:05.097404: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-01-14 16:14:11 INFO rasa.shared.utils.validation - The 'version' key is missing in the training data file /tmp/tmp2tv7qp2o/f1a58d47d7c142698ed310bba16bf3a8_test_small_talk.yml. Rasa Open Source will read the file as a version '2.0' file. See https://rasa.com/docs/rasa/training-data-format.
/home/jwheat/anaconda2/envs/rasa2/lib/python3.8/site-packages/rasa/shared/utils/io.py:93: UserWarning: Issue found in '/tmp/tmp2tv7qp2o/f1a58d47d7c142698ed310bba16bf3a8_test_small_talk.yml':
Found intent 'xxxgreet' in stories which is not part of the domain.
More info at https://rasa.com/docs/rasa/stories
Processed story blocks: 100%|██████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 2470.63it/s, # trackers=1]2021-01-14 16:14:11 INFO rasa.core.test - Evaluating 6 stories
Progress:
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 90.73it/s]2021-01-14 16:14:11 INFO rasa.core.test - Finished collecting predictions.
2021-01-14 16:14:11 INFO rasa.core.test - Evaluation Results on CONVERSATION level:
2021-01-14 16:14:11 INFO rasa.core.test - Correct: 5 / 6
2021-01-14 16:14:11 INFO rasa.core.test - F1-Score: 0.909
2021-01-14 16:14:11 INFO rasa.core.test - Precision: 1.000
2021-01-14 16:14:11 INFO rasa.core.test - Accuracy: 0.833
2021-01-14 16:14:11 INFO rasa.core.test - In-data fraction: 0.917
2021-01-14 16:14:11 INFO rasa.core.test - Stories report saved to results/story_report.json.
2021-01-14 16:14:12 INFO rasa.core.test - Evaluation Results on ACTION level:
2021-01-14 16:14:12 INFO rasa.core.test - Correct: 11 / 12
2021-01-14 16:14:12 INFO rasa.core.test - F1-Score: 0.950
2021-01-14 16:14:12 INFO rasa.core.test - Precision: 1.000
2021-01-14 16:14:12 INFO rasa.core.test - Accuracy: 0.917
2021-01-14 16:14:12 INFO rasa.core.test - In-data fraction: 0.917
2021-01-14 16:14:12 INFO rasa.utils.plotting - Confusion matrix, without normalization:
[[0 0 0 0 0]
[0 6 0 0 0]
[1 0 2 0 0]
[0 0 0 2 0]
[0 0 0 0 1]]
Failed stories written to 'results/failed_test_stories.yml'
I actually get my failed story with xxxgreet to show up now, so that's progress. Here's my failed_test_stories.yml:
version: "2.0"
stories:
- story: Greeting 1 (/tmp/tmp2tv7qp2o/f1a58d47d7c142698ed310bba16bf3a8_test_small_talk.yml)
steps:
- intent: xxxgreet
- action: utter_greet # predicted: action_default_fallback
You're welcome!
I see what you mean about your setup. CUDA can be difficult to please. Without belaboring what may be obvious to you, these warnings mean training is defaulting to the CPU, which is one reason your training may be taking longer than you'd like.
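If you want to double-check that TensorFlow really can't see a GPU from inside WSL, a quick sanity check (just a sketch, run inside the same environment you use for Rasa) is:
# Prints [] if TensorFlow cannot see any GPU; the CUDA warnings above are then expected
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
An empty list means everything is running on the CPU, which matches the warnings in your logs.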
I think that output looks much better. You've failed only 1/6 conversations, which is why you have only one story in results/failed_test_stories.yml. The following line from your logs was to be expected, since you added a dummy intent:
Found intent 'xxxgreet' in stories which is not part of the domain.
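(For completeness: the way to make that warning go away for a real intent is simply to list it under intents: in the domain. A minimal sketch; your actual domain will of course have many more entries:)
version: "2.0"
intents:
- greet          # listed in the domain, so no warning
# xxxgreet is deliberately not listed, which is exactly what triggers the warning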
If you haven’t checked it out already, this page of the docs might help you interpret your output.
Okay, so it looks like core is doing fine. It could be that there was some issue picking up the stories in the first place (when you ran rasa test without core). That would explain why it looks like nothing was happening for entity extraction: there was nothing to extract. I'm not sure why this would have happened, though, because you did point to them. The stories don't seem to be malformed either, because rasa test core tested them once you pointed to them.
I'll do some digging to see what's going on. You could try running the command below again; it could be that something was up the first time, or that something is wrong with the --fail-on-prediction-errors flag (there shouldn't be, but it's possible there's a bug):
rasa test --stories tests/test_small_talk.yml
Otherwise, you can do a sanity check and test the NLU portion. The bot should ace this, since it's seen all of the nlu.yml data before:
rasa test --nlu nlu.yml
I received the same abrupt stop.
On a whim, I spun up a new rasa init bot and ran rasa test, because that bot doesn't have any entities either and I wanted to see what happens. I got the same thing: it stops at the 'Entity evaluation results:' line.
So maybe it's a bug in the test code: if there are no entities it just stops, when it should say something like 'No entities found' and/or 'Testing complete'.
To play some more, I added an entity to the generic mood bot by annotating it in the nlu data, then trained and tested again. This time it made it past the Entity evaluation results section.
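For reference, the annotation I added was along these lines; the name entity and the example phrases are just placeholders I made up for this experiment, not part of the stock project:
nlu:
- intent: greet
  examples: |
    - hello, I'm [Bob](name)
    - hey there, my name is [Alisha](name)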
[other test results snipped]
2021-01-15 14:39:44 INFO rasa.nlu.test - Entity evaluation results:
2021-01-15 14:39:44 INFO rasa.nlu.test - Evaluation for entity extractor: DIETClassifier
2021-01-15 14:39:44 INFO rasa.nlu.test - Classification report saved to results/DIETClassifier_report.json.
2021-01-15 14:39:44 INFO rasa.nlu.test - Incorrect entity predictions saved to results/DIETClassifier_errors.json.
2021-01-15 14:39:44 INFO rasa.utils.plotting - Confusion matrix, without normalization:
[[ 5 1]
[ 1 157]]
So I guess there's no real problem with my bot, other than that I was expecting something to come after the 'Entity evaluation results:' line.
Sorry to waste your time.
So maybe it's a bug in the test code: if there are no entities it just stops, when it should say something like 'No entities found' and/or 'Testing complete'.
This is very true! Thanks to your question, I've raised a similar issue, which states that testing should stop early if no stories are found, rather than produce confusing logs and empty files. I'll update it to include this feedback as well.
You didn't waste my time! I think this will result in an improvement to our output during testing. It's always good for us to learn from users where things are confusing or misleading, so that we can try to make them better.