If someone add ssh_args = " " to his .ansible.cfg, it will result into
strange failure later :
<server.example.org> ESTABLISH CONNECTION FOR USER: misc
<server.example.org> REMOTE_MODULE ping
<server.example.org> EXEC ['ssh', '-C', '-tt', '-q', ' ', '-o', 'KbdInteractiveAuthentication=no',
'-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no',
'-o', 'ConnectTimeout=10', 'server.example.org', "/bin/sh -c 'mkdir -p /tmp/ansible-tmp-1397947711.21-5932460998838
&& chmod a+rx /tmp/ansible-tmp-1397947711.21-5932460998838 && echo /tmp/ansible-tmp-1397947711.21-5932460998838'"]
server.example.org | FAILED => SSH encountered an unknown error during the connection. We recommend you re-run the
command using -vvvv, which will enable SSH debugging output to help diagnose the issue
The root cause is the empty string between -q and -o, who kinda break mkdir.
Any other module is able to detect a dark host, but raw was treating 255
as a return code from the module execution, rather from the connection
attempt. This change allows 255 to be treated as a connection failure
when using the raw module.
We break the read while loop after waiting "the end of the process" and
the pipes are empty, otherwise we do another select that waits all the
timeout.
In particular, do not rely on the $USER environment variable always existing.
tmux for example seems to clear it, causing lots of invalid messages:
"previous known host file not found"
This broke in commit 80fd22dc, but instead of reverting that commit, we now
fall back to expanding just ~ when $USER is not set.
Resolves issue #5082
Code as it was would hit a scenario where one of the FDs was not ready for
reading the first time through -- but p.poll() would show the process as
complete. This would cause ansible to continue on, while leaving some content
left in a pipe.
The other scenario -- the one that causes the unclosed quote, is if we go
through select.select() and we do get stdout in the ready for reading -- we
read from it (9000 bytes), but that's not all that is there. Again we'd get to
the p.poll() check and it would be indeed not none, but we would have left some
of stdout on the FD and thus the json blob would be malformed.
Tested with and without full ssh debugging.
Tested with and without ControlPersist
Tested with and without ControlPersist sockets already created
This patch also checks specifically for a return code of 255, which
indicates an unknown SSH error of some kind. When that happens, ansible
will now recommend running with -vvvv (if not enabled) or show the
output from 'ssh -vvv' (when it is enabled)
This shouldn't generally be needed unless you're working in an environment
that uses rediculously long FQDNs; if the name is too long, you wind up
hitting unix domain socket filepath limits enforced by ssh.
Files were being created in /tmp, but will now be created in $HOME/.ansible/cp/
Addresses CVE-2013-4259: ansible uses a socket with predictable filename in /tmp