DNS lookup failure
Activity
Heracles31 April 1, 2022 at 7:03 PM
I re-opened it as TC-2027...
Heracles31 March 31, 2022 at 1:02 PM
Bad news: it is back in TrueCommand 2.1.1...
Heracles31 September 16, 2021 at 3:32 AM (edited)
Ok! I fixed that one for you. The root cause is: TrueCommand needs to learn about patience and redundancy...
The complete path to replicate and fix the problem is:
Deploy a Docker host.
In that Docker host, deploy a DNS server as a container (I am using Pi-hole, but any DNS server will do) and have port 53 forwarded to that container.
Configure a second DNS somewhere else in the network but not in that very Docker host.
Configure the Docker host to use its own IP as first DNS and the other external DNS as a second one. Be sure that both work from the Docker host.
Deploy TrueCommand as a container in that Docker host.
Without any extra settings, TrueCommand inherits its DNS servers from its host, so its first DNS points to the "external" IP address of that host and the second DNS is configured as well.
When doing a DNS query, TrueCommand sends the query to that "external" IP. The thing is, because of the routing done by Docker, the reply comes back directly from the DNS server's internal Docker IP address to TrueCommand's internal Docker IP address.
TrueCommand is not expecting a reply from that IP, so it discards it.
TrueCommand does not know about patience or DNS redundancy, so it gives up right there.
No DNS reply: no connection to the target.
OpenSSL does know about patience, so when the first DNS reply ends up discarded, it keeps waiting until it receives the second DNS reply from the secondary DNS server.
Thanks to patience and that second DNS reply, OpenSSL can connect by DNS name where TrueCommand cannot.
To get that fixed, re-deploy TrueCommand, but this time force its primary DNS using Docker's networking options (for example, the `--dns` option of `docker run`) and point it directly to the local DNS server's internal Docker IP address. Configure the secondary DNS with its regular IP.
Now the first DNS reply comes from the expected IP, so TrueCommand accepts it and uses it properly.
I now have my TrueNAS servers pointed to by DNS names and TrueCommand sees each one of them.
Still, you never know when you will need to rely on your secondary DNS server. It is important for TrueCommand to turn to its secondary server when the primary server does not work for any reason.
I figured this out while working with a FreeRadius container in the same host. That one gave me an explicit log entry about the DNS reply not coming from the expected IP. I figured that this would be the same case for TrueCommand.
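The behaviour asked for above (discard a reply that comes from an unexpected IP, then move on to the next DNS server instead of giving up) can be sketched as follows. This is only an illustration, not TrueCommand's actual resolver code: the function names, the injected `send` callback and every IP address in it are hypothetical, and the network is faked so the logic can be read in isolation.

```python
from typing import Callable, Optional, Sequence, Tuple

# A "send" callback takes (server_ip, hostname) and returns
# (source_ip_of_reply, resolved_address), or None if the query timed out.
Reply = Optional[Tuple[str, str]]
SendFn = Callable[[str, str], Reply]


def resolve_with_fallback(
    hostname: str, servers: Sequence[str], send: SendFn
) -> Optional[str]:
    """Try each DNS server in order and return the first acceptable answer."""
    for server in servers:
        reply = send(server, hostname)
        if reply is None:
            continue  # timed out: be patient and try the next server
        source_ip, address = reply
        if source_ip != server:
            continue  # reply came from an unexpected IP: discard it, try next
        return address
    return None  # every configured server failed


def fake_send(server: str, hostname: str) -> Reply:
    """Hypothetical network that mimics the Docker routing described above."""
    if server == "192.168.1.10":  # assumed "external" IP of the Docker host
        # The reply is sourced from the DNS container's internal Docker IP.
        return ("172.17.0.2", "192.168.1.50")
    if server == "192.168.1.1":  # assumed secondary DNS elsewhere on the LAN
        return ("192.168.1.1", "192.168.1.50")
    return None  # unknown server: simulate a timeout
```

With both servers configured, `resolve_with_fallback("atlas.jb.lan", ["192.168.1.10", "192.168.1.1"], fake_send)` returns the address supplied by the secondary server. With only the first server configured it returns `None`, which is the failure described in this ticket.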
Heracles31 September 1, 2021 at 3:44 PM
Hey Ken,
As for the presence of curl, what is the full path you put it in? Or did you change its name?
Here, from the Docker host itself, I ran "find /var/lib/docker -type f -name curl -print" and the only instance of curl it finds is in the /usr/bin directory of another container. I double-checked inside TC and there is definitely not a single file named curl in that one, neither in /usr/bin nor anywhere else. As soon as you tell me where it is hidden, I will do the test with curl. Is it present only in the nightly image and removed in the latest?
About the DNS resolution problem, I will look at it, but again, why would OpenSSL be able to do it and not TC?
As for working the case live, I invite you to post an SSH public key here and I will give you secured remote access to my instance. You will be able to SSH into a dedicated host that will have access only to TrueCommand.
Ken Moore September 1, 2021 at 2:40 PM
I have been digging into this and still cannot reproduce this on any other systems/setups.
So far, it appears to be something in your specific Portainer/DNS setup that is interfering with the container, rather than something generic to TrueCommand.
In particular, I found this [GitHub issue about Portainer|https://github.com/portainer/portainer/issues/2726] not exposing DNS server settings to containers. This appears to be resolved in newer releases, but requires an extra configuration step within Portainer itself. Could you try configuring Portainer with your custom DNS provider?
If that still does not work, then I would recommend asking on the Portainer support system, as this issue appears to be with Portainer rather than TrueCommand.
Regarding CURL:
`curl` has been available inside every TrueCommand Docker container we have ever published (starting with version 1.2). If your Portainer-opened shell says that it is not available, then I strongly doubt that Portainer is actually opening the shell inside the TC container.
Regarding the latest release:
The "ixsystems/truecommand:latest" tag on DockerHub currently points to the 2.0.2 release image (August 16th).
If you want to try the rolling nightly images, then use the "ixsystems/truecommand:nightly" tag and manually re-pull to update between nightlies.
Issue not resolved after 2.0.2 release.
Important pieces of info:
When I open a shell in the container, I can connect using "openssl s_client -connect atlas.jb.lan:443". Also, although there is a delay between the command and the connection, the connection is established in less than 6 seconds. Here, in your log, the timeout is mentioned after 10 seconds. OpenSSL connects within that window.
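To see where the delay before the connection comes from, the name lookup and the TCP connection can be timed separately. The sketch below is one possible way to do that from any host with Python available. The helper names `timed` and `probe` are made up for this example, and the 10-second default only mirrors the timeout mentioned in the log.

```python
import socket
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[float, T]:
    """Run fn and return (elapsed_seconds, result)."""
    start = time.monotonic()
    result = fn()
    return time.monotonic() - start, result


def probe(host: str, port: int = 443, timeout: float = 10.0) -> Tuple[float, float]:
    """Return (dns_seconds, connect_seconds) for one connection attempt."""
    # Step 1: name resolution only, using the system resolver.
    dns_s, infos = timed(
        lambda: socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    )
    sockaddr = infos[0][4]  # first resolved address, as (ip, port, ...)

    # Step 2: TCP connection to the already-resolved IP, so DNS is excluded.
    def connect() -> None:
        with socket.create_connection(sockaddr[:2], timeout=timeout):
            pass

    tcp_s, _ = timed(connect)
    return dns_s, tcp_s
```

Calling `probe("atlas.jb.lan")` from inside the container would show whether the seconds are spent in the lookup (for instance while waiting for the secondary DNS) or in the connection itself.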
Heracles31: Please comment here with some additional information:
Is the TrueCommand container running behind a Proxy, or is the host system using a proxy of any kind?
Could you please verify how you are running openssl within the container? Which docker command did you use to get to the shell within the container?
Could you try using `curl` to test the connection instead of `openssl`?
Finally, could you check which SSL encryption type is supported on that NAS? TrueCommand needs TLS 1.2+ specifically. SSLv3 support was removed from our toolkit in January of 2020: https://github.com/golang/go/issues/32716
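One way to check the TLS question without relying on `curl` or `openssl` is a short probe that refuses anything older than TLS 1.2 and reports what was negotiated. This is a sketch using Python's standard `ssl` module, not part of TrueCommand. The function names are made up for this example, and certificate verification is disabled on the assumption that the NAS may present a self-signed certificate.

```python
import socket
import ssl
from typing import Optional


def make_tls12_context() -> ssl.SSLContext:
    """Client context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_tls_version(
    host: str, port: int = 443, timeout: float = 10.0
) -> Optional[str]:
    """Connect to host:port and return the negotiated protocol, e.g. 'TLSv1.3'."""
    ctx = make_tls12_context()
    # Assumption: the NAS may use a self-signed certificate, so certificate
    # checks are disabled for this probe only. Do not reuse this for real traffic.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

If the NAS only offers protocols older than TLS 1.2, `negotiated_tls_version("atlas.jb.lan")` raises `ssl.SSLError` during the handshake. Otherwise it returns the version string.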