With some earlier changes, continuing to forward failed hosts to the
iterator on each TQM run() call was causing plays with
max_fail_percentage set to fail, as failures from previous plays were
being counted against the percentage calculation for the current play.
Also changed the linear strategy's calculation to use the internal
failed list, rather than the iterator, as this now represents the
hosts failed during the current run only.
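A minimal sketch of the intended check, assuming illustrative names
(check_fail_percentage, batch, and failed_hosts are not the actual
source): only hosts that failed during the current run count toward the
percentage.

    def check_fail_percentage(play, batch, failed_hosts):
        # failed_hosts: names of hosts that failed in this run only,
        # so stale failures from earlier plays cannot trip the limit
        failed_in_batch = [h for h in batch if h.name in failed_hosts]
        threshold = play.max_fail_percentage / 100.0
        return (float(len(failed_in_batch)) / len(batch)) <= threshold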
As noted in the comment, the TQM may be used for more than one play. As such,
after creating the new PlayIterator object it is necessary to mark any failed
hosts from previous calls to run() as failed in the iterator, so they are
properly skipped during any future calls to run().
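A hedged sketch of the described step (the PlayIterator constructor
arguments and mark_host_failed() call shown here follow the description
above and may differ from the actual source):

    iterator = PlayIterator(
        inventory=self._inventory,
        play=new_play,
        play_context=play_context,
        variable_manager=self._variable_manager,
        all_vars=all_vars,
    )
    # re-mark hosts that failed in earlier run() calls so the new
    # iterator skips them for this play as well
    for host_name in self._failed_hosts:
        iterator.mark_host_failed(self._inventory.get_host(host_name))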
Prior to this patch, the retry/until logic would fail any task that
succeeded if it took all of the allotted retries to succeed. This patch
reworks the retry/until logic to make things simpler and clearer.
Fixes #15697
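A minimal sketch of the intended semantics, using illustrative helpers
(run_task and evaluate_until are not real Ansible APIs): a task that
only passes its until condition on the final allotted attempt is not
marked failed.

    retries = task.retries if task.retries and task.retries > 0 else 1
    for attempt in range(1, retries + 1):
        result = run_task(task)
        result['attempts'] = attempt
        if evaluate_until(task.until, result):
            result['failed'] = False   # succeeded, even on the last try
            break
    else:
        # every attempt was used without the until condition passing
        result['failed'] = True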
Previously the changed code was necessary; however, it is now
problematic as we've started using the is_failed() method in other
places in the code. Additional changes at the strategy layer should
make this safe to remove now.
Fixes #15625
* Don't filter hosts remaining based on their failed state. Instead rely
on the PlayIterator to return None/ITERATING_COMPLETE when the host is
failed (see the sketch below).
* In the free strategy, make sure we wait outside the host loop for all
pending results to be processed.
* Use the internal _set_failed_state() instead of manually setting things
when a failed child state is hit.
Fixes #15623
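A rough sketch of the host loop under the new approach (hosts_left and
queue_task are illustrative; get_next_task_for_host mirrors the
PlayIterator behaviour described above): failed hosts are no longer
filtered up front, because the iterator simply stops handing out tasks
for them.

    for host in hosts_left:
        state, task = iterator.get_next_task_for_host(host)
        if task is None:
            # host is complete or failed; nothing more to queue for it
            continue
        queue_task(host, task)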
In `lib/ansible/executor/play_iterator.py`, ansible sets a host's
`_gathered_facts` property to `True` without checking to see if there
are any tasks to be executed. In the event that the entire play is
skipped, `_gathered_facts` will be `True` even though the `setup`
module was never run.
This patch modifies the logic to only set `_gathered_facts` to `True`
when there are tasks to execute.
Closes #15744.
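An illustrative sketch of the corrected check (attribute names follow
the description above rather than the exact source): _gathered_facts is
only set when a setup task will actually be executed.

    if len(setup_block.block) > 0:
        task = setup_block.block[0]
        host._gathered_facts = True   # setup will actually run
    else:
        # the whole play was skipped, so facts were never gathered
        task = None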
This changeset addresses the issue reported here:
ansible/ansible-modules-core#1765
The yum module (at least) includes its task results as strings, rather than
dicts, and the code this changeset replaces assumed that in that instance the
task was skipped. The updated behaviour assumes that the task has been
skipped only if (see the sketch after this list):
* results exist, and
* all results are dicts that include a truthy skipped value.
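A sketch of the described check, assuming a TaskResult-style wrapper
around the result dict (the helper name is_skipped is illustrative):

    def is_skipped(self):
        if 'results' in self._result:
            results = self._result['results']
            # loop results can be plain strings (e.g. from yum); only
            # treat the task as skipped when every result is a dict
            # that reports a truthy 'skipped' value
            if results and all(isinstance(res, dict) and res.get('skipped', False)
                               for res in results):
                return True
        return self._result.get('skipped', False)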
This was previously reinitialized every time we forked, so we weren't
sharing the same Locks. It also was not accounting for modules which
are invoked directly by an action plugin instead of going through the
strategy plugins.
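A hedged sketch of the idea (the container name action_write_locks and
the module list are illustrative): create the Lock objects once at
import time, before any workers fork, so every fork inherits the same
Lock, and pre-seed locks for modules an action plugin may invoke
directly.

    from multiprocessing import Lock

    # a default lock plus one per module that can be invoked outside
    # the strategy plugins (e.g. directly by an action plugin)
    action_write_locks = {None: Lock()}
    for mod_name in ('copy', 'file', 'setup', 'slurp', 'stat'):
        action_write_locks[mod_name] = Lock()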
* Make ziploader's ansible and ansible.module_utils libraries into
namespace packages.
* Move __version__ and __author__ from ansible/__init__ to
ansible/release.py. This is because namespace packages only load one
__init__.py. If that is not the __init__.py with the author and
version info then those won't be available.
* In ziploader, move the version into ANSIBLE_CONSTANTS.
* Change PluginLoader to properly construct the path to the plugins even
when namespace packages are present.
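A sketch of the pkgutil-style namespace layout described above (file
paths and the version string are illustrative placeholders):

    # lib/ansible/__init__.py -- kept minimal, since only one
    # __init__.py is loaded for a namespace package
    __path__ = __import__('pkgutil').extend_path(__path__, __name__)

    # lib/ansible/release.py -- version/author info lives in a normal
    # module so it is importable no matter which __init__.py wins
    __version__ = '2.1.0'
    __author__ = 'Ansible, Inc.'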
Previously we were first checking the fail/run state of the child
state for the tasks/rescue/always portions of the block. Instead, we
now always recursively iterate into the child state and then evaluate
whether the child state is failed or complete before changing the
failed/run state within the current block.
Fixes #14324
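A simplified, self-contained sketch of that ordering (the real
HostState tracks far more than this): recurse into the child state
first, inspect the result, and only then touch the current block's
failed/run state.

    ITERATING_TASKS, ITERATING_COMPLETE = 0, 1

    class HostState(object):
        def __init__(self, tasks, child=None):
            self.tasks = list(tasks)
            self.run_state = ITERATING_TASKS
            self.failed = False
            self.child = child

    def next_task(state):
        if state.child is not None:
            state.child, task = next_task(state.child)   # recurse first
            if state.child.failed:                        # then evaluate
                state.failed = True
                state.run_state = ITERATING_COMPLETE
            elif state.child.run_state == ITERATING_COMPLETE:
                state.child = None
            return state, task
        if state.tasks:
            return state, state.tasks.pop(0)
        state.run_state = ITERATING_COMPLETE
        return state, None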
* Move zipcache temp dir creation into the locked section otherwise it
races with other workers.
* Catch IOError and turn it into an AnsibleError. IOErrors can hang
multiprocessing.
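A hedged sketch of the fix (the lock, cache, and path variable names
are illustrative): the temp dir is created while the lock is held, and
IOError is converted to AnsibleError so a failed read cannot hang
multiprocessing.

    import os
    from ansible.errors import AnsibleError

    with zipcache_lock:
        if module_name not in zipcache:
            if not os.path.exists(cache_dir):
                os.makedirs(cache_dir)   # inside the lock: no race
            try:
                with open(cached_module_path, 'rb') as f:
                    zipcache[module_name] = f.read()
            except IOError as e:
                raise AnsibleError('unable to read the cached module: %s' % e)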
Updated the python module wrapper explode method to drop an 'args' file next to the module.
Both the execute() and excommunicate() debug methods now pass the module args via a file, to enable debuggers that are picky about stdin.
Updated the unit tests to use a context manager for masking/restoring the default streams and argv.
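A minimal sketch of the args-file idea (the file names and the explode
helper shown here are illustrative): the exploded module directory gets
an 'args' file alongside the module source, so a debugger can read the
arguments from disk instead of stdin.

    import json
    import os

    def explode(module_data, module_args, dest_dir):
        if not os.path.exists(dest_dir):
            os.makedirs(dest_dir)
        with open(os.path.join(dest_dir, 'ansible_module.py'), 'wb') as f:
            f.write(module_data)
        with open(os.path.join(dest_dir, 'args'), 'w') as f:
            json.dump(module_args, f)   # debuggers read args from here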
Removed __del__ as it might leak memory.
Renamed the method to a tmp file cleanup.
Added exception handling when traversing the file list, so even if one file fails the rest are still tried.
Added the cleanup to a finally block to ensure removal in most cases.
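A minimal sketch of the cleanup pattern, assuming illustrative names
(clean_tmp_files and run_module are not the actual source):

    import os

    def clean_tmp_files(tmp_files):
        for path in tmp_files:
            try:
                if os.path.exists(path):
                    os.remove(path)
            except (OSError, IOError):
                # keep trying the remaining files even if one fails
                continue

    try:
        run_module(tmp_files)
    finally:
        # finally ensures the temp files are removed in most cases
        clean_tmp_files(tmp_files)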
Previously, when a percentage was calculated to be less than 1.0,
Ansible would run all hosts as the value for a rolling update.
The error is due to the fact that Python truncates a float under 1.0
to 0 on int conversion, which triggers the 0-host case, and the 0-host
case tells Ansible to run all hosts.
The fix checks whether the percentage calculation is 0 after int
conversion and, if so, falls back to 1 host.
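A sketch of the corrected calculation (the helper name is
illustrative): a percentage that truncates to 0 hosts now falls back
to 1 host instead of being treated as "run all hosts".

    def hosts_per_batch(num_hosts, percent):
        serial = int(num_hosts * (percent / 100.0))
        if serial == 0:
            serial = 1   # avoid the 0-host case, which means all hosts
        return serial

    hosts_per_batch(30, 2)   # -> 1; previously 0, i.e. all 30 at once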
This makes our recursive ast.parse performance measurements as fast as
the pre-ziploader baseline.
Since this unittest isn't testing that the returned module data is
correct, we don't need to worry about os.rename not having any module
data. We should devise a separate test for the module and caching code.