On 13.0, the network connection drops when installing or uninstalling plugins, or when stopping a jail

Description

I was able to reproduce this manually: the UI becomes unreachable for about 5 to 15 seconds.

After that, it returns to the login screen.

I have seen similar behavior on the API side: the API connection drops after stopping a jail.

api2/test_270_jail.py::test_17_verify_exec_job PASSED
api2/test_270_jail.py::test_18_stop_jail PASSED
api2/test_270_jail.py::test_19_verify_jail_stopped FAILED
api2/test_270_jail.py::test_20_export_jail FAILED
api2/test_270_jail.py::test_21_verify_export_job_succeed FAILED
api2/test_270_jail.py::test_22_start_jail FAILED
api2/test_270_jail.py::test_23_wait_for_jail_to_be_up FAILED
api2/test_270_jail.py::test_24_rc_action FAILED
api2/test_270_jail.py::test_25_delete_jail FAILED

The subsequent tests fail with this error:

Error Message
requests.exceptions.ConnectionError: HTTPConnectionPool(host='10.238.238.218', port=80): Max retries exceeded with url: /api/v2.0/core/get_jobs/?id=388 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x8061ba340>: Failed to establish a new connection: [Errno 61] Connection refused'))

Here is the full output of test 19: https://builds.ixsystems.com/job/TrueNAS%2013.0%20Nightlies%20API%20Tests/984/testReport/api2/test_270_jail/test_19_verify_jail_stopped/

After a couple of seconds, the connection comes back. Unfortunately, this does not happen on every API test run, but it has been reproduced twice with the API tests so far.
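To put a number on how long the connection stays down after stopping a jail, a small probe loop could be run right after the stop call. This is only a sketch: `port_is_open` and `measure_outage` are hypothetical helpers, not part of the API test suite, and the probe only checks that TCP port 80 accepts connections, not that the middleware is fully ready.

```python
import socket
import time


def port_is_open(host, port=80, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (Errno 61 in the report) and timeouts.
        return False


def measure_outage(probe, max_wait=60.0, interval=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the number of seconds waited, or None if max_wait elapsed
    without a successful probe.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)
```

For example, `measure_outage(lambda: port_is_open("10.238.238.218"))` called immediately after the jail stop request would return roughly how long the host refused connections, which could be logged alongside the test result.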

The API test also hung once while deleting a jail, and the job timed out after 2 hours.

api2/test_270_jail.py::test_25_delete_jail Cancelling nested steps due to timeout
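One way to keep a hang like this from consuming the whole 2-hour job limit would be to bound the job polling with its own deadline and to tolerate a brief connection drop while polling. The sketch below is an assumption about how that could look, not the existing test code: `wait_for_job`, `JobTimeout`, and the `fetch_state` callable are hypothetical, and the terminal state names are assumed.

```python
import time


class JobTimeout(Exception):
    """Raised when the job does not finish before the deadline."""


def wait_for_job(fetch_state, deadline=600.0, interval=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_state() until it returns a terminal state.

    fetch_state is a hypothetical callable that returns the job state as a
    string. It may raise OSError (for example a refused connection) while
    the API is briefly unreachable; that is treated as "not finished yet".
    """
    terminal = {"SUCCESS", "FAILED", "ABORTED"}  # assumed state names
    start = clock()
    while clock() - start < deadline:
        try:
            state = fetch_state()
        except OSError:
            state = None  # connection dropped; retry on the next iteration
        if state in terminal:
            return state
        sleep(interval)
    raise JobTimeout(f"job not finished after {deadline} seconds")
```

With a deadline of a few minutes, a stuck delete would surface as a single failed test with a clear message rather than a cancelled build.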

Problem/Justification

None

Impact

None

Activity

Automation for Jira 
October 26, 2022 at 2:06 PM

This issue has now been closed. Comments made after this point may not be viewed by the TrueNAS Teams. Please open a new issue if you have found a problem or need to re-engage with the TrueNAS Engineering Teams.

Eric Turgeon 
October 26, 2022 at 2:06 PM

Since the code freeze of 13.0-U3, I can't reproduce that issue anymore.

Eric Turgeon 
June 10, 2022 at 7:49 PM

It is the default setting for any plugin.

I tried with both vtnet0 and em0, and the behavior is the same. As for other machines, if I can get real hardware running 13.0, I could try to reproduce it there.

William Gryzbowski 
June 10, 2022 at 6:57 PM

Worth asking what kind of interface the VM is using: is it emulating igb0 or using vtnet0?

If it only happens with vtnet0, it may not be worth spending the time.

Ryan Moeller 
June 10, 2022 at 6:40 PM

Can you share the network config of the jails? I don't see it in the attached debug. I assume it's a default NAT config, but I haven't been able to reproduce this drop in my testing so far.

Cannot Reproduce

Created January 5, 2022 at 6:46 PM
Updated October 26, 2022 at 2:10 PM
Resolved October 26, 2022 at 2:06 PM