Over the past year, we (the Melbourne node of NeCTAR) have brought two new batches of compute hardware into service. One thing that bugged me was how terribly inefficient our compute hardware testing was: operators manually created instances and volumes, attached volumes to instances, and so on, to make sure each host was working before we moved it into production. On top of being a terrible waste of time, we were also prone to missing test cases (boot from volume? Oops! Boot from a resized volume? Oops!). So when we started a hardware refresh this year (+4000 VCPUs, yay!), we went looking for a better way to test.
Testing is a long-solved problem upstream in OpenStack - whenever a change is proposed in Gerrit, test suites are kicked off. This automated testing ensures that code works before it is merged, which is awesome. With that in mind, I looked at how we could enable the same for our cloud.
Differences in tempest tests
Before we begin, there are some things to note:
OpenStack tempest tests are quite comprehensive - they exercise all the API calls. However, from the Melbourne node's point of view, what we really want to verify is that each host/controller is configured and working correctly. Hence, some filtering is needed so that we only run the subset of tests we are interested in. Also, some tests don't work with cells.
In addition to the above, we also want to limit the tests to run on the Melbourne node (or even one particular host), instead of relying on the default scheduler to assign them to any host. This scenario is useful as a final QA step before adding compute hosts to the production aggregates.
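To illustrate what "one particular host" means here (the AZ and host names below are placeholders, not our real ones): Nova's `--availability-zone` flag accepts an `<az>:<host>` form that pins an instance onto one specific compute host, which is exactly the kind of targeting this per-host QA needs. The command is echoed rather than run, since the names are made up:

```shell
# Placeholder values -- substitute your own availability zone and host.
AZ="melbourne-qh2"
HOST="cc-compute-01"

# Nova accepts '<availability-zone>:<host>' to force the scheduler onto
# a single compute host (admin credentials required):
echo nova boot --flavor m1.small --image ubuntu-14.04 \
    --availability-zone "${AZ}:${HOST}" qa-canary
```

Tempest itself has no such per-host knob out of the box, which is why the patches mentioned below are needed.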
Ideally, NeCTAR Core Services will run a set of high-level API tests to make sure that messages reach the correct cell, while each node (like ours) runs a different set of tempest tests against every host in its cell. This way we cover both the control plane and each individual host.
Here is a rough guide to our setup.
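If you don't already have the two repositories used below, they are the standard upstream projects; cloning them from the OpenStack GitHub mirrors looks like this:

```shell
# Fetch the tempest library and the tempest test suite side by side:
git clone https://github.com/openstack/tempest-lib.git
git clone https://github.com/openstack/tempest.git
```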
Run pip install in each repo
$ cd tempest-lib
$ pip install .
$ cd ../tempest
$ pip install .
By default, tempest uses the default scheduler to launch instances/volumes. If you, like us, need to launch on specific AZs, you might want the following patches.
Set up the configuration file
[auth]
use_dynamic_credentials = false

[compute]
# test.tiny
flavor_ref = '<flavor_id>'
# ubuntu
image_ref = '<image_id>'
fixed_network_name = '<network>'
availability_zone = '<AZ>'
build_interval = 2
volume_device_name = vdc

[identity]
auth_version = v2
uri = '<keystone_url>'
v2_admin_endpoint_type = 'publicURL'
catalog_type = identity
username = '<user>'
password = '<password>'
tenant_name = '<tenant>'

[identity-feature-enabled]
api_v3 = false

[validation]
run_validation = true
network_for_ssh = '<network>'
connect_method = 'fixed'
image_ssh_user = 'ubuntu'

[volume]
availability_zone = '<AZ>'
build_interval = 2
build_timeout = 60
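To fill in placeholders such as `<flavor_id>` and `<image_id>`, the standard clients will list what your cloud offers (this assumes python-novaclient is installed and your credentials are sourced):

```shell
nova flavor-list   # pick the id of your test flavor (test.tiny in our case)
nova image-list    # pick the id of the image to boot (ubuntu in our case)
```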
If you don't want to run all the tests, you can create a whitelist file, e.g. etc/whitelist.txt. An example of ours:
# compute
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_host_name_is_same_as_server_name
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_created_server_vcpus
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers_with_detail
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_server_details
# volume
tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_attach_detach_volume
tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_list_get_volume_attachments
tempest.api.compute.volumes.test_volumes_get.VolumesGetTestJSON.test_volume_create_get_delete
tempest.api.compute.volumes.test_volume_snapshots.VolumesSnapshotsTestJSON.test_volume_snapshot_create_get_list_delete
To run the whitelist tests, do:
$ ostestr --serial -w etc/whitelist.txt
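If maintaining a hand-picked whitelist is overkill, os-testr can also select a whole subtree of tests with its `--regex` filter - for example, just the compute volume tests (the regex here is an example, not our exact invocation):

```shell
$ ostestr --serial --regex tempest.api.compute.volumes
```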
To run a single test, you can do:
$ ostestr --serial --pdb tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_list_get_volume_attachments
NOTE TO SELF: Please tidy up the commits and put the configuration on Github!