* Make is_encrypted_file handle both files opened in text and binary mode
On python3, by default files are opened in text mode. Since we know
the encoding of vault files (and especially the header which is the
first set of bytes) we can decide whether the file is an encrypted
vault file in either case.
* Fix is_encrypted_file not resetting the file position
* Update is_encrypted_file to check that all the data in the file is ascii
* For is_encrypted_file(), add start_pos and count parameters
This allows callers to specify reading vaulttext from the middle of
a file if necessary.
* Combine VaultLib.encrypt() and VaultLib.encrypt_bytestring()
* Change vault's is_encrypted() to take either text or byte strings and to return False if any part of the data is non-ascii.
* Remove unnecessary use of six.b
* Vault Cipher: mark a few methods as private.
* VaultAES256._is_equal throws a TypeError if given non byte strings
* Make VaultAES256 methods that don't need self staticmethods and classmethods
* Mark VaultAES and is_encrypted as deprecated
* Get rid of VaultFile (unused and feature implemented in a different way)
* Normalize variable and parameter names on plaintext, ciphertext, vaulttext
* Normalize variable and parameter names on "b_" prefix when dealing with bytes
* Test changes:
* Remove redundant tests( both checking the same byte string)
* Fix use of format string without format operator
* Enable vault editor tests on python3
* Initialize the vault_cipher for VaultAES256 testing in setUp()
* Make assertTrue and assertFalse take the actual method calls for
better error messages.
* Test that non-ascii byte strings compare correctly.
* Test that unicode strings and ints raise TypeError
* Test-specific:
* Removed test_methods_exist(). We only have one VaultLib so the
implementation is the assurance that the methods exist. (Can use an abc for
this if it changes).
* Add tests for both byte string and text string input where the API takes either.
* Convert "assert" to unittest assert functions or add a custom message where
that will make failures easier to debug.
* Move instantiating the VaultLib into setUp().
We couldn't copy to_unicode, to_bytes, to_str into module_utils because
of licensing. So once created it we had two sets of functions that did
the same things but had different implementations. To remedy that, this
change removes the ansible.utils.unicode versions of those functions.
Make !vault-encrypted create a AnsibleVaultUnicode
yaml object that can be used as a regular string object.
This allows a playbook to include a encrypted vault
blob for the value of a yaml variable. A 'secret_password'
variable can have it's value encrypted instead of having
to vault encrypt an entire vars file.
Add __ENCRYPTED__ to the vault yaml types so
template.Template can treat it similar
to __UNSAFE__ flags.
vault.VaultLib api changes:
- Split VaultLib.encrypt to encrypt and encrypt_bytestring
- VaultLib.encrypt() previously accepted the plaintext data
as either a byte string or a unicode string.
Doing the right thing based on the input type would fail
on py3 if given a arg of type 'bytes'. To simplify the
API, vaultlib.encrypt() now assumes input plaintext is a
py2 unicode or py3 str. It will encode to utf-8 then call
the new encrypt_bytestring(). The new methods are less
ambiguous.
- moved VaultLib.is_encrypted logic to vault module scope
and split to is_encrypted() and is_encrypted_file().
Add a test/unit/mock/yaml_helper.py
It has some helpers for testing parsing/yaml
Integration tests added as roles test_vault and test_vault_embedded
A simple import of cryptography can throw several types of errors. For example,
if `setuptools` is less than cryptography's minimum requirement of 11.3, then
this import of cryptography will throw a VersionConflict here. An earlier case
threw a DistributionNotFound exception.
An optional dependency should not stop ansible. If the error is more than
an ImportError, log a warning, so that errors can be fixed in ansible or
elsewhere.
* Catch DistributionNotFound when pycrypto is absent
On Solaris 11, module `pkg_resources` throws `DistributionNotFound` on import if `cryptography` is installed but `pycrypto` is not. This change causes that situation to be handled gracefully.
I'm not using Paramiko or Vault, so I my understanding is that I don't
need `pycrpto`. I could install `pycrypto` to make the error go away, but:
- The latest released version of `pycrypto` doesn't build cleanly on Solaris (https://github.com/dlitz/pycrypto/issues/184).
- Solaris includes an old version of GMP that triggers warnings every time Ansible runs (https://github.com/ansible/ansible/issues/6941). I notice that I can silence these warnings with `system_warnings` in `ansible.cfg`, but not installing `pycrypto` seems like a safer solution.
* Ignore only `pkg_resources.DistributionNotFound`, not other exceptions.
CLI already provides a pager() method that feeds $PAGER on stdin, so we
just feed that the plaintext from the vault file. We can also eliminate
the redundant and now-unused shell_pager_command method in VaultEditor.
Now we issue a "Reading … from stdin" prompt if our input isatty(), as
gpg does. We also suppress the "x successful" confirmation message at
the end if we're part of a pipeline.
(The latter requires that we not close sys.stdout in VaultEditor, and
for symmetry we do the same for sys.stdin, though it doesn't matter in
that case.)
This allows the following invocations:
# Interactive use, like gpg
ansible-vault encrypt --output x
# Non-interactive, for scripting
echo plaintext|ansible-vault encrypt --output x
# Separate input and output files
ansible-vault encrypt input.yml --output output.yml
# Existing usage (in-place encryption) unchanged
ansible-vault encrypt inout.yml
…and the analogous cases for ansible-vault decrypt as well.
In all cases, the input and output files can be '-' to read from stdin
or write to stdout. This permits sensitive data to be encrypted and
decrypted without ever hitting disk.
Now that VaultLib always decides to use AES256 to encrypt, we don't need
this broken code any more. We need to be able to decrypt this format for
a while longer, but encryption support can be safely dropped.
Now we don't have to recreate VaultEditor objects for each file, and so
on. It also paves the way towards specifying separate input and output
files later.
It's unused and unnecessary; VaultLib can decide for itself what cipher
to use when encrypting. There's no need (and no provision) for the user
to override the cipher via options, so there's no need for code to see
if that has been done either.
When stretching the key for vault files, use PBKDF2HMAC() from the
cryptography package instead of pycrypto. This will speed up the opening
of vault files by ~10x.
The problem is here in lib/ansible/utils/vault.py:
hash_function = SHA256
# make two keys and one iv
pbkdf2_prf = lambda p, s: HMAC.new(p, s, hash_function).digest()
derivedkey = PBKDF2(password, salt, dkLen=(2 * keylength) + ivlength,
count=10000, prf=pbkdf2_prf)
`PBKDF2()` calls a Python callback function (`pbkdf2_pr()`) 10000 times.
If one has several vault files, this will cause excessive start times
with `ansible` or `ansible-playbook` (we experience ~15 second startup
times).
Testing the original implementation in 1.9.2 with a vault file:
In [2]: %timeit v.decrypt(encrypted_data)
1 loops, best of 3: 265 ms per loop
Having a recent OpenSSL version and using the vault.py changes in this commit:
In [2]: %timeit v.decrypt(encrypted_data)
10 loops, best of 3: 23.2 ms per loop