legacy-wiki
Ansible Performance
Imported from the legacy wiki archive. The date above is inferred from the strongest available evidence and may represent a known range rather than the exact publication day.
areas of performance
Every layer of the stack, as described in https://tannerjc.net/wiki/index.php?title=Ansible_Hangs_Filament#Tools_By_Layer, is a potential source of performance issues.
strategies
getting started
many tasks
many hosts
many groups
many vars
metrics
duration
[https://en.wikipedia.org/wiki/Duration Duration] is a measurement of time. There are multiple facets of “duration” when observing Ansible.
- total duration of a playbook
- total duration of a task
- total duration of a host within a task
- total duration of an ssh call for a host within a task
- total duration of waiting for a sudo password prompt
- total duration of executing python on the remote host
- total duration of processing the results for a host/worker
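The nested facets listed above can be illustrated with a small timing sketch. This is not how Ansible itself records durations; the `timed` helper, the labels, and the `run_demo` function are hypothetical names used only to show that an outer duration (playbook) always contains the inner ones (task, host).

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, durations):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Store elapsed seconds even if the enclosed block raises.
        durations[label] = time.perf_counter() - start


def run_demo():
    """Time three nested layers; the sleep is a stand-in for real work."""
    durations = {}
    with timed("playbook", durations):
        with timed("task", durations):
            with timed("host", durations):
                time.sleep(0.01)  # placeholder for ssh + remote python
    return durations


if __name__ == "__main__":
    for label, seconds in run_demo().items():
        print(f"{label}: {seconds:.4f}s")
```

Comparing the layers this way shows where time is actually spent: if the host duration is close to the playbook duration, the overhead is on the remote side rather than in the controller.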
cpu utilization
Also known as [https://en.wikipedia.org/wiki/CPU_time CPU time]. Many tools measure CPU time and utilization, and it is important to understand the metrics each one provides.
https://access.redhat.com/solutions/1160343
An important metric for basic Ansible CPU utilization is the “b” (aka “blocked”) column from vmstat, which counts processes in uninterruptible sleep.
https://linux.die.net/man/8/vmstat
http://www.dba-oracle.com/t_linux_oracle_vmstat.htm
https://access.redhat.com/solutions/792683
Multiple factors could cause the number of blocked processes to accumulate.
- not enough cpu cores
- setting ansible’s fork count too high
- not enough memory
- too many hosts returning too much data for the controller to handle
- not enough disk IOPS
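To watch the blocked column over time, the text output of vmstat can be parsed. The following is a minimal sketch, assuming the usual vmstat layout in which a header row contains the `r` and `b` column names; the `blocked_counts` function and the `SAMPLE` text are hypothetical and the sample values are made up for illustration.

```python
def blocked_counts(vmstat_text):
    """Return the values of the "b" column from vmstat text output."""
    b_index = None
    counts = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column header row names both "r" and "b"; remember where "b" is.
        if "r" in fields and "b" in fields:
            b_index = fields.index("b")
            continue
        # Data rows are all digits; the "procs ---memory---" banner is skipped.
        if b_index is not None and len(fields) > b_index and fields[b_index].isdigit():
            counts.append(int(fields[b_index]))
    return counts


# Illustrative sample only, not captured from a real system.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 123456   7890 456789    0    0     5    10  100  200  1  1 98  0  0
 2  3      0 120000   7890 456789    0    0   900  1200  300  800 10  5 60 25  0
"""

if __name__ == "__main__":
    print(blocked_counts(SAMPLE))
```

A blocked count that keeps growing while a playbook runs suggests checking the factors listed above, starting with the fork count.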
memory utilization
disk utilization
network utilization
tools
https://docs.ansible.com/ansible/devel/plugins/callback/cgroup_memory_recap.html
https://github.com/jctanner/ansible-tools/blob/master/ansible_debug_logparser
https://github.com/ansible/qa-scale-lab
https://github.com/jctanner/ansible-tools/tree/master/vagrant/ansible_test_inventory
labs
https://github.com/jctanner/ansible-tools/tree/master/playbooks/slowhost
training
https://www.redhat.com/en/services/training/rh442-red-hat-enterprise-performance-tuning
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/chap-red_hat_enterprise_linux-performance_tuning_guide-performance_monitoring_tools
additional reading
https://gist.github.com/sivel/ef405e70f699ce29c49cfdf6104a0492