Kubernetes service is not running - RC2.0 or Hardware Upgrade Issue

Description

Hello guys,

I upgraded my NAS to new hardware and ran into an error during the migration: apps can't be installed and VMs run unstably. I'm getting "Kubernetes service is not running".

What happened before the issue (prerequisites):

  • Hardware upgrade from an old Intel Xeon v3 to a new Ryzen 5800X:
    MB: Supermicro X10SLL-F ---> ASRock Rack X570D4U
    CPU: Xeon E3-1271 v3 ---> Ryzen 5800X
    RAM: 32 GB ECC ---> 64 GB non-ECC
    Added 1 TB NVMe

  • Upgrade from RC1.2 to RC2 at the same time

Unfortunately, the OS upgrade happened at the same time as the hardware upgrade, so I still don't know the root cause of my problem...

Problem description:
I can't install any apps because I get a "Kubernetes service is not running" error.

Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 409, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 445, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1137, in nf
    res = await f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1269, in nf
    return await func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/chart_releases_linux/chart_release.py", line 478, in do_create
    await self.middleware.call('kubernetes.validate_k8s_setup')
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1324, in call
    return await self._call(
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1281, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/kubernetes_linux/update.py", line 322, in validate_k8s_setup
    raise CallError('Kubernetes service is not running.')
middlewared.service_exception.CallError: [EFAULT] Kubernetes service is not running.

[EFAULT] Kubernetes service is not running.
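
When reading a traceback like the one above, the last frame and the final line usually carry the most information. Here the last frame is validate_k8s_setup, which appears to be a check that runs before an app is created, so the traceback says that Kubernetes is down but not why. Below is a minimal sketch for pulling those two pieces out of a traceback saved as plain text. The helper name summarize_traceback is hypothetical and is not part of middlewared.

```python
import re

# Matches the standard CPython frame line: File "<path>", line <n>, in <func>
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)'
)


def summarize_traceback(text):
    """Return (last_frame, final_line) for a Python traceback string.

    last_frame is a dict with 'path', 'line' and 'func' keys (all strings),
    or None if no frame line was found. final_line is the last non-empty
    line, which normally holds the exception type and message.
    """
    frames = [m.groupdict() for m in FRAME_RE.finditer(text)]
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    final_line = lines[-1] if lines else ""
    last_frame = frames[-1] if frames else None
    return last_frame, final_line
```

Applied to the traceback in this ticket, the last frame points at update.py, line 322, in validate_k8s_setup, and the final line is the CallError. The actual cause has to be looked for elsewhere, for example in the k3s service logs.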

Also, VMs run very unstably: a VM runs fine after the initial configuration, but gets stuck somewhere after a reboot, with no VNC at all, and doesn't work properly. I think this may be connected to the apps issue.

What I tried to solve the problem:

  • Import the old config into the newly installed TrueNAS;

  • Move the application pool assignment to another existing pool;

  • Completely destroy the app pool and create a new one from scratch;

  • Do everything from scratch (a freshly installed TrueNAS).

I noticed a few similar threads, but "reboot + destroy and recreate the app pool" doesn't work in my case.

Similar issues, for example:

Thread on iXsystems forum

Maybe someone has an idea or a solution? If anyone has had similar issues with AMD builds, that would be helpful as well.

P.S. I have properly configured "Route v4 Interface" and "Route v4 Gateway" in the apps settings.

Thank you in advance!

Problem/Justification

None

Impact

None

Activity

Anton Terekhov 
January 3, 2022 at 3:59 AM

Thank you.

The issue went away as soon as I removed one of the two memory sticks (I used Crucial 32GBx2 DDR4 3200 MT/s CL22 DIMM modules). The system loaded without any problems. Also, I was able to successfully import my config on top of the freshly installed TrueNAS Scale.

I'll create a suggestion ticket, thanks for the hint!

Have a great day!

Regards,

Anton

Waqar 
January 2, 2022 at 9:53 PM

Thank you for confirming. From the k3s logs we can see various errors that are likely related to the memory issue you described. In any case, please let us know if this issue persists once you are using different hardware.

Also, please feel free to create a suggestion ticket if you think it would be nice to have some sort of hardware checks. Please clarify what kind of tests you have in mind, and we can take it from there. Thanks!
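
As a rough illustration of what a log-based hardware check could look like, here is a minimal sketch that counts log lines containing strings often seen on Linux when memory is faulty. The pattern list is an assumption, not an exhaustive or authoritative set, and count_signatures is a hypothetical helper rather than anything TrueNAS provides.

```python
import re

# Assumed signatures: fragments of common Linux kernel messages.
# This list is illustrative and deliberately small.
SIGNATURES = {
    "machine_check": re.compile(r"machine check|\bmce:", re.IGNORECASE),
    "segfault": re.compile(r"\bsegfault at\b", re.IGNORECASE),
    "protection_fault": re.compile(r"general protection fault", re.IGNORECASE),
    "edac": re.compile(r"\bEDAC\b"),
}


def count_signatures(lines):
    """Count how many lines match each signature.

    A line is counted at most once per signature, even if the pattern
    occurs several times within it.
    """
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

The input could be the lines of saved kernel or service log output, for instance from journalctl -k or journalctl -u k3s (the unit name k3s is an assumption here). A clean result does not prove the RAM is healthy, especially with non-ECC modules, which generally do not report corrected errors. A dedicated memory tester, or removing modules one at a time as was done in this ticket, is more conclusive.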

Anton Terekhov 
January 2, 2022 at 8:40 PM

Hi, let me repeat myself: I had the same issue when I installed TrueNAS Scale from scratch and created a new pool (no imports, an empty pool that does not have the ix-applications dataset).

Thanks

Waqar 
January 2, 2022 at 8:07 AM

Thank you for investigating the above. However, I am curious whether it works if you use a new pool that does not have the ix-applications dataset?

Anton Terekhov 
January 1, 2022 at 11:56 PM

Hi,

I hope you're doing well.

I figured out the root cause of my issue: a defective RAM module. More details are in the forum post: https://www.truenas.com/community/threads/kubernetes-service-is-not-running-rc2-or-hardware-upgrade-issue.97728/post-675332

You can close this ticket if you don't have any other questions or don't want to investigate more deeply.

Thank you!

Best Regards,

Anton

Cannot Reproduce

Details

Created December 27, 2021 at 8:58 AM
Updated July 6, 2022 at 8:58 PM
Resolved January 2, 2022 at 9:54 PM