v8.16.1 (2024-11-14)
- mysql_legacy: fix set_master_use_gtid() query: its value is part of the syntax, so avoid having pymysql quote it.
- mysql_legacy: fix query formatting in set_replication_parameters().
- mysql_legacy: fix check in replication_lag() that would raise if the lag is 0.0s.
- doc: fix example code bug missing a reference to self.
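The quoting issue fixed in set_master_use_gtid() can be illustrated with a minimal sketch. It assumes the value is validated against an enum and then inlined as a keyword, because a driver's parameter binding would render it as a quoted string. The enum members and the build_use_gtid_query() helper are hypothetical names for illustration, not the module's actual code.

```python
from enum import Enum


class MasterUseGTID(Enum):
    """Values accepted by MariaDB for MASTER_USE_GTID (member names are illustrative)."""

    CURRENT_POS = "current_pos"
    SLAVE_POS = "slave_pos"
    NO = "no"


def build_use_gtid_query(setting: MasterUseGTID) -> str:
    """Return the CHANGE MASTER statement with the value inlined as a keyword.

    Hypothetical helper: the value is part of the SQL syntax, so it is
    validated against the enum and interpolated directly rather than being
    passed as a bound parameter, which would be quoted as a string.
    """
    if not isinstance(setting, MasterUseGTID):
        raise ValueError(f"Invalid MASTER_USE_GTID value: {setting!r}")
    return f"CHANGE MASTER TO MASTER_USE_GTID={setting.value}"


print(build_use_gtid_query(MasterUseGTID.SLAVE_POS))
```

Restricting the input to enum members is what makes the direct interpolation safe in this sketch, since no free-form text ever reaches the statement.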
v8.16.0 (2024-11-13)
- mysql_legacy: add MysqlClient class as a copy of the mysql.Mysql class to later merge those two modules together.
- mysql_legacy: improve pymysql usability adding some new helper methods:
  - execute(): to execute a query that doesn't return anything via pymysql.
  - fetch_one_row(): to execute a query with pymysql that should return one row and return it.
  - check_warnings(): to check if in the last statement there was any warning raised and ask the user what to do.
- mysql_legacy: in the Instance class convert all internal queries to use the new methods to use pymysql instead of executing queries via ssh.
- mysql: remove deprecated call to query() method of pymysql that is for internal use only. Convert it to a cursor().execute() call that is part of the public facing API.
v8.15.2 (2024-10-31)
- elasticsearch: removed ElasticsearchHosts.get_remote_hosts() getter, superseded by the new RemoteHostsAdapter.remote_hosts().
- puppet: removed PuppetServer.server_host() and PuppetMaster.master_host() getters, superseded by the new RemoteHostsAdapter.remote_hosts().
- Because of the very low usage of the above methods this didn't warrant a major release. Reporting it as breaking here for completeness, their usage will be fixed right after releasing this version.
- remote: add remote_hosts getter to the RemoteHostsAdapter to ease the use from clients. This also removes one-off getters from other classes in the puppet and elasticsearch_cluster modules.
- orchestrator: do not retry requests on 500s, as orchestrator tends to reply to requests for non-existing objects with a 500 and a JSON response.
- mysql_legacy: accept any exit code for systemctl status to prevent having RemoteExecutionError exceptions.
- mysql_legacy: add getter for the Instance's socket property.
- mysql_legacy: fix list_host_instances() detection of single and multi-instances independently of the status of the systemd unit.
v8.15.1 (2024-10-23)
- orchestrator: fix bug with older versions of requests that don't have the JSONDecodeError exception.
- service: change depool_threshold field to float following Puppet related change.
v8.15.0 (2024-10-23)
- mysql: refactor this currently unused module to be up to date with the current infrastructure while simplifying it. Because of the unused nature of the module this didn't warrant a major release. Reporting it as breaking here for completeness.
- orchestrator: add a new module to interact with Orchestrator's APIs.
- apiclient: add a generic API client module and related Spicerack accessor.
- redfish: use the new apiclient module.
- redfish: add UEFI functions to check if a host is set up with UEFI and to boot into UEFI HTTP.
- puppet: add format option to hiera_lookup.
- mysql_legacy: add data directory accessor.
- mysql_legacy: re-order the CORE_SECTIONS constant from the least impactful to the most impactful.
- mysql_legacy: get systemd status for instance to easily check if the instance is running or not.
- mysql_legacy: add cursor method to the Instance class to get a mysql client connection to the instance.
- remote: add dry_run getter for RemoteHosts, useful for RemoteHostsAdapter implementations.
- dhcp: Add option to omit sending filename to a vendor, used for the Debian Installer.
- doc: removed deprecated call to sphinx_rtd_theme.
- tox: only install flake8 when running flake8.
- tests: fix issues reported by pylint >3 and pin Prospector.
v8.14.0 (2024-09-30)
- dbctl: add new module to interact with dbctl (T362893).
  - Add a new spicerack accessor to get a Dbctl instance.
  - From the Dbctl instance allow to access the dbctl libraries for Instance, Section and DbConfig (mediawiki config).
  - Dry-run support is ensured via the parent Confctl class that sets the read_only argument to the ConftoolClient instance accordingly.
- confctl: add native support for RO in conftool
  - The spicerack interface to Conftool via the ConftoolEntity class does honor dry-run itself, although conftool did not have dry-run support.
  - With recent conftool development we can now use ConftoolClient to initialize it and this interface allows to set a read_only parameter.
  - The ConftoolClient interface abstracts the setup of the conftool client from the caller, in place of the to-be deprecated kvobject.KVObject.setup method.
  - Use the read_only parameter when in dry-run mode, both for safety reasons and also to enable using more complex conftool operations, such as the ones offered by the dbconfig extension.
- netbox: removed Netbox 3 backward compatibility, all existing Netbox instances are 4+.
v8.13.1 (2024-09-17)
- mysql_legacy: Add a 1 second sleep after start_slave() to ensure that a subsequent call to show_slave_status() would be reliable. Rename master_use_gtid() to set_master_use_gtid() for better clarity of the RW nature of it.
v8.13.0 (2024-09-06)
- doc: add intersphinx_timeout (T367410).
  - The config should allow to have quicker Debian builds when the network is not available.
- redfish: allow 200 responses in chassis_reset (T365372).
  - On Supermicro nodes, chassis_reset's HTTP call gets a HTTP 200 from the BMC, not 204. It seems ok to relax the condition and allow both 204 and 200, without extra logging since the Supermicro's BMC response is not useful.
- redfish: catch no-json-responses in change_user_password (T365372).
  - The Supermicro's Redfish implementation works the same as Dell's in change_user_password, except for the fact that no JSON response is returned.
- redfish: introduce the AccountManager URI for DELL (T365372).
  - From various tests it seems that the /redfish/v1/AccountService URI works on DELL too, but only for "read-only", namely getting accounts' info. Refactor a bit the redfish class and the find_account() method to take this into account.
v8.12.0 (2024-09-02)
- setup.py: update pynetbox to 7.4 (T373794).
  - After T371890#10081172 Spicerack fails to build due to pynetbox, since it was upgraded to 7.4.
v8.11.0 (2024-09-02)
- dhcp: allow empty distro for DHCPConfMac and DHCPConfOpt82 (T365372).
  - Allow "distro" to be empty, so that the corresponding pathprefix config is not rendered. This is useful when we want to add DHCP configs for IP configuration only, like the Supermicro BMC/mgmt interface.
- tox: run less environments on CI (T372485).
v8.10.0 (2024-08-01)
- mysql_legacy: Instance class improvements (T371351).
  - Rename use_gtid() to master_use_gtid() to follow MySQL naming convention. Change its signature to accept a setting parameter to pick which valid value to use.
  - Introduce a MasterUseGTID enum class to represent the valid values that can be used for the MASTER_USE_GTID parameter.
  - Add a run_vertical_query() method to run a query with the vertical output format (\G) and parse its result to a list of dictionaries.
  - Adapt the other methods that would benefit from the above method to use it.
- redfish: add the add_account function (T365372).
  - Supermicro ships their servers with the BMC admin account set to ADMIN, meanwhile we standardized the usage of root inside Wikimedia (basically what Dell does by default). Added a new add_account function that uses Redfish to create a new account.
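As a rough illustration of what parsing the vertical output format into a list of dictionaries involves, here is a minimal sketch. The parse_vertical_output() function and the sample text are hypothetical and are not taken from run_vertical_query(); multi-line values are not handled.

```python
import re

# Matches the row separator printed by the mysql client in vertical mode,
# e.g. "*************************** 1. row ***************************".
ROW_SEPARATOR = re.compile(r"^\*+ \d+\. row \*+$")


def parse_vertical_output(output: str) -> list[dict[str, str]]:
    """Parse vertical-format output into one dictionary per row (hypothetical helper)."""
    rows: list[dict[str, str]] = []
    for line in output.splitlines():
        if ROW_SEPARATOR.match(line.strip()):
            rows.append({})  # a separator starts a new row
            continue
        if not rows or ":" not in line:
            continue  # skip text before the first row and lines without a key
        # Split on the first colon only, so values containing colons survive.
        key, _, value = line.partition(":")
        rows[-1][key.strip()] = value.strip()
    return rows


SAMPLE = """*************************** 1. row ***************************
     Slave_IO_Running: Yes
Seconds_Behind_Master: 0
*************************** 2. row ***************************
     Slave_IO_Running: No
      Master_Log_File: bin:000001
"""

for row in parse_vertical_output(SAMPLE):
    print(row)
```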
v8.9.0 (2024-07-25)
- dhcp: add dhcp_filename and dhcp_options for DHCPConfMac and DHCPConfOpt82 (T363576).
  - The DHCP configuration can now be customized with ad-hoc filename and DHCP option settings.
- mysql_legacy: fix Instance's upgrade path (T367496)
  - The binary that runs the mysql upgrade needs to run other tools within the same directory and when called with a full path it will try to run them from the same path. But because the mysql_upgrade binary has a chain of symlinks, we need to resolve them first before being able to run it with the full path.
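Resolving a chain of symlinks before invoking a binary by its full path could look like the following minimal sketch. It builds a throwaway symlink chain in a temporary directory; the resolve_binary() helper and the directory layout are hypothetical and do not reflect the actual mysql_legacy implementation.

```python
import tempfile
from pathlib import Path


def resolve_binary(path: str) -> Path:
    """Follow every symlink in the chain and return the real path (hypothetical helper).

    strict=True raises FileNotFoundError if the final target does not exist.
    """
    return Path(path).resolve(strict=True)


with tempfile.TemporaryDirectory() as tmp:
    # Illustrative layout: entry -> middle -> opt/bin/mysql_upgrade
    real_dir = Path(tmp) / "opt" / "bin"
    real_dir.mkdir(parents=True)
    real = real_dir / "mysql_upgrade"
    real.touch()
    middle = Path(tmp) / "middle"
    middle.symlink_to(real)
    entry = Path(tmp) / "entry"
    entry.symlink_to(middle)

    expected = real.resolve()
    resolved = resolve_binary(str(entry))
    # The directory of the resolved path is where sibling tools would be looked up.
    print(resolved.parent == expected.parent)
```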
v8.8.0 (2024-07-18)
- netbox: add support for Netbox 4 (T336275).
  - Limited support for cables with multiple terminations per side: the first termination is the only one considered.
- netbox: refactor tests to be more flexible, and adapt them for Netbox 4.
v8.7.0 (2024-07-16)
- redfish: add property for storage manager URI (T365372):
  - Add a new property for RedfishDell and RedfishSupermicro to be used as helper in various cookbooks that require the URI path to get Storage Members info.
- redfish: simplify interface of Redfish classes (T365372):
  - Now that we have two implementations we can see the common parts and simplify a bit the hardcoded bits in both derived classes of the Redfish class.
  - Define only the specific service name, not the whole path in the concrete classes and define the path in the parent class.
  - Define the service names as class properties instead of instance properties to reduce the number of lines and make it more readable. We don't really need the strictness of inheritance to ensure we add all of them when implementing a new vendor, as that is fairly rare.
- mediawiki: update siteinfo URL to use mw-api-int (T367949)
- mysql_legacy: update core sections (T367496):
  - The external storage sections were recently rotated to new ones.
- mariadb: bugfixes mysql_legacy (T367496):
  - We introduced a number of bugs in spicerack 8.6.0 that need to be handled for automation implementations to begin.
  - Refactored and simplified a bit the new APIs.
  - Added full test coverage.
v8.6.0 (2024-06-12)
- redfish: expand support for Supermicro hosts (T365372):
  - Allow RedfishSupermicro to be picked up in __init__.py based on what Netbox returns as manufacturer (and not just default to RedfishDell). Update tests to reflect this new behavior.
  - Move get_power_state() to an abstract method, to be implemented in vendor-specific classes. Update also tests to reflect this.
- mysql_legacy: improve support for MariaDB instances on each host (T343674).
- redfish: fix typo in DellSCP's class description.
v8.5.0 (2024-04-15)
- netbox: add functions to get and set the device name.
- elasticsearch: remove the dependency from elasticsearch-curator making the calls directly via the elasticsearch library (T345337 and T361647).
- alertmanager: add multi-instance and authentication support (T360932):
  - Add support for multiple alertmanager instances based on a configuration file. One of those instances can be marked as default, which is used when the call to the Spicerack.alertmanager() or Spicerack.alertmanager_hosts() API is used without specifying a specific instance or some other API (like Service.downtime()) that does not support multiple instances is used.
  - Add support for per-instance HTTP basic authentication. The metricsinfra Alertmanager instance will be behind HTTP basic authentication to avoid exposing the read-write API to the entire wikiprod network (via the HTTP proxies). This patch adds support for configuring a username and a password to use on a specific Alertmanager instance.
- puppet: make PuppetServer.destroy() have the same behaviour of PuppetMaster.destroy() and do not raise an exception if the host certificate is already missing (T360293).
- setup.py: remove dependency elasticsearch-curator not needed anymore and remove upper bound for black linter that was there for incompatibilities with elasticsearch-curator.
- k8s: Remove use of @staticmethod in tests.
- tests: fix typos in tests that were erroneously calling mock methods with the wrong names.
- utils: remove --apply from isort's call in format-code, now the default in v5.
v8.4.1 (2024-03-06)
- k8s: add getter for the Batch API.
v8.4.0 (2024-02-27)
- netbox: allow to execute a Netbox script and retrieve the results.
- netbox: add getter/setter for primary IPs and access vlan.
- ganeti: pass the v4 and v6 IPs to the VM as fw_cfg in the create command.
v8.3.0 (2024-01-29)
- ganeti: add support for routed Ganeti (T300152).
- alertmanager: fix timezone bug when run from a non-UTC computer (T347490).
- setup.py: add missing classifier for Python 3.11.
v8.2.0 (2023-11-22)
- puppet: add a hiera_lookup() method to the PuppetServer and PuppetMaster classes to perform a hiera lookup of a specific key from the perspective of a specific host.
v8.1.0 (2023-11-20)
- remote: add a new RemoteHost.get_subset() method to return a new RemoteHosts instance with a subset of the hosts. Useful when working with instances that inherit from RemoteHostsAdapter to be able to work on a subset of the hosts.
- service: Add ipip_encapsulation field to ServiceLVS to follow what's in Puppet.
- puppet: Update get_ca_server to also support SRV discovery records.
v8.0.3 (2023-11-16)
- puppet: for the Puppet 7 migration set temporarily the return value of get_puppet_ca_hostname() hardcoded to puppetmaster1001 to allow to migrate the cumin hosts to Puppet 7.
- doc: expand distributed locking docs, add an example of logging when unable to acquire a lock.
- spicerack: log at debug level some stats of each cookbook execution in a machine-readable format. This can be useful to generate some stats of the cookbook executions allowing to split them by exit code too.
v8.0.2 (2023-10-18)
- locking: delete the key on etcd if no locks remain to keep etcd clean and avoid leaving a lot of keys with empty dictionaries as values (T341973).
v8.0.1 (2023-10-18)
- locking: fix path for Spicerack modules locks that was not correctly calculated.
v8.0.0 (2023-10-17)
- dhcp: the spicerack.Spicerack.dhcp() accessor has changed signature and now accepts just a datacenter name instead of a RemoteHosts instance. All cookbooks using this accessor had the same logic implemented to find the specific dhcp hosts in a given datacenter and this logic has been moved inside the accessor. All existing usage will be migrated at deploy time.
- netbox: remove methods fetch_host_status, fetch_host_detail and put_host_status that were deprecated since v0.0.50 and replaced by the spicerack.netbox.NetboxServer class. Some private methods have also been renamed to follow more closely Netbox namings.
- Distributed locking support (T341973):
  - See the dedicated :ref:`Distributed locking<distributed-locking>` section of the documentation for a general overview.
  - Cookbooks class API additions to the spicerack.cookbook.CookbookRunnerBase base class:
    - max_concurrency class property to statically set the maximum number of concurrent runs of a given cookbook, enforced by the distributed lock.
    - lock_ttl class property to statically set the TTL of the distributed lock acquired for each cookbook run.
    - lock_args instance property to dynamically modify the locking arguments, for example based on the CLI arguments (RO vs RW mode of operations).
  - Cookbooks module API additions:
    - MAX_CONCURRENCY module constant to statically set the maximum number of concurrent runs of a given cookbook, enforced by the distributed lock.
    - LOCK_TTL module constant to statically set the TTL of the distributed lock acquired for each cookbook run.
  - Automatically acquire a lock for each cookbook run according to the values defined above.
- spicerack: add a _spicerack_lock private accessor to get a lock instance to be passed to the Spicerack modules that would need to acquire a distributed lock with concurrency and TTL. It is different from the public accessor for the cookbooks because the key prefix is different to keep cookbooks custom locks separate from the spicerack modules ones. It's mentioned here as information for Spicerack developers.
- dhcp: acquire exclusive per-DC lock on write operations:
  - Acquire an exclusive lock on a per-DC basis when performing write operations, both during the creation of a DHCP snippet and its deletion.
  - Always rewrite the DHCP snippet. With the protection of the lock, there is no more need for this check and the library can safely overwrite all the time the DHCP snippet for a given host.
- puppet: add support for puppetserver JSON commands returning non-zero exit code with JSON output (e.g. if a host is missing).
- doc: add new section for the distributed locking support in the Introduction page.
- doc: mark the module interface as deprecated instead of having the class one as preferred, to better describe the current state.
- tox.ini: remove optimization for tox <4. Tox 4 will not re-use the environments because of the different names, so removing this tox <4 optimization as it's making subsequent runs slower with tox 4+.
- dhcp: simplify tests.
- tests: remove obsolete or no longer needed items from the false positive list of unused code caught by vulture.
v7.4.1 (2023-10-10)
- locking: load also ~/.etcdrc for the running user (T341973):
  - We currently save the authentication credential in /root/.etcdrc. Generically load the effective running user's ~/.etcdrc configuration file too and merge it into the one provided in the configuration. This is done best effort, if the ~/.etcdrc file is missing it will be silently ignored.
v7.4.0 (2023-10-09)
- Add distributed locking support (T341973):
  - locking: add new module for distributed locking support via etcd.
  - spicerack: add a new spicerack accessor lock() to get an instance of the locking class to acquire and release cookbook specific custom locks (T341973).
  - cookbook: add --no-locks CLI argument to disable locking acquisition/release on a per-run basis. To be used in case of emergency or if there are issues with etcd that prevent acquiring/releasing locks properly.
  - By default the locking support is disabled unless the etcd_config is set in the configuration file.
- spicerack: add owner property to get a pre-formatted string of the form user@host [pid] useful to identify the owner of a current running process.
- spicerack: add current_hostname property to get the hostname of the host where the cookbook is currently running.
- spicerack: improve cookbooks help message:
  - The default argument parser in the CookbookBase class doesn't provide a prog name as it's a bit tricky to guess it because it depends on how many cookbooks are defined in a single file.
  - As a result the help message was not very clear up to now:

        $ sudo cookbook sre.hosts.decommission -h
        usage: cookbook [-h] -t TASK_ID [--force] query

  - With this release we inject the cookbook real name in the parser with the additional construct to use:

        $ sudo cookbook sre.hosts.decommission -h
        usage: cookbook [GLOBAL_ARGS] sre.hosts.decommission [-h] -t TASK_ID [--force] query

  - This way it should also help to remind the user that there are global arguments for the cookbook binary in addition to the cookbook-specific ones. It was deemed not necessary to add a message to run cookbook -h to get the available GLOBAL_ARGS, but it can be easily added.
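One way to get a usage line like the one described above is to set the prog argument of argparse.ArgumentParser explicitly. The sketch below assumes that approach; the build_parser() helper and the long option name --task-id are hypothetical, reconstructed only from the usage line shown in this entry, and are not Spicerack's actual code.

```python
import argparse


def build_parser(cookbook_name: str) -> argparse.ArgumentParser:
    """Return a parser whose usage line names the cookbook (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog=f"cookbook [GLOBAL_ARGS] {cookbook_name}")
    # Arguments guessed from the usage line above, for illustration only.
    parser.add_argument("-t", "--task-id", required=True, metavar="TASK_ID")
    parser.add_argument("--force", action="store_true")
    parser.add_argument("query")
    return parser


parser = build_parser("sre.hosts.decommission")
# The usage line may be wrapped depending on the terminal width.
print(parser.format_usage())
```

Because prog is only a display string, changing it does not affect how the arguments themselves are parsed.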
v7.3.1 (2023-10-04)
- tests: fix test that was actually querying the DNS making it fail in the Debian package build process.
v7.3.0 (2023-10-04)
- puppet: Add new PuppetServer class and make the PuppetMaster inherit from it as it will be deprecated first and then removed in future releases.
- decorators: fix the set_tries() function (T346134).
  - It is used to dynamically change the number of tries on a @retry-decorated function/method but was not reading the function signature default value when present. Inspect the signature and if the default value is present, is an integer and is either untyped or typed as integer, use it. Add also tests as they were not present and not spotted because the code coverage was considering the function as tested because used in the service module.
- tests: simplify the spicerack._cookbook.main() tests avoiding to mock the Spicerack instance and using instead the configuration file to instantiate a real instance.
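The signature inspection described in the set_tries() entry above can be sketched with the standard inspect module. The default_tries() helper below is a hypothetical illustration of the stated rule (default present, is an integer, and the parameter is untyped or typed as integer). It is not the actual set_tries() code.

```python
import inspect
from typing import Callable, Optional


def default_tries(func: Callable, param: str = "tries") -> Optional[int]:
    """Return the integer default of ``param`` in the signature, or None (hypothetical helper)."""
    parameter = inspect.signature(func).parameters.get(param)
    if parameter is None or parameter.default is inspect.Parameter.empty:
        return None  # no such parameter, or no default value
    # Accept untyped parameters and those typed as int (also as a string
    # annotation, which is what postponed evaluation of annotations produces).
    if parameter.annotation not in (inspect.Parameter.empty, int, "int"):
        return None
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(parameter.default, bool) or not isinstance(parameter.default, int):
        return None
    return parameter.default


def poll(host: str, tries: int = 3) -> None:
    """Example function with a typed integer default."""


print(default_tries(poll))
```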
v7.2.2 (2023-09-11)
- ganeti: add support also for the sandbox VLAN.
- mediawiki: move the calls to noc.wikimedia.org to the kubernetes hosted one.
- puppet: drop deprecated --ignorecache switch.
- Fix some docstring typos.
- spicerack: make all CookbookCollection class arguments keyword-only to avoid mistakes (internal API).
v7.2.1 (2023-06-21)
- service: make the monitors field of the ServiceLVS class optional to adapt it to the recent change in Puppet about it.
v7.2.0 (2023-05-31)
- ganeti: add new GanetiRAPI methods nodes() and groups() to get the related info from the cluster.
- ganeti: specify VM memory size in MB to allow for finer tuning than GB.
- dhcp: when re-generating the DHCP includes and then restarting the DHCP server, in case of a failure make sure to delete the newly created snippet and refresh again to ensure the DHCP is in a good shape.
- dhcp: reword some exception messages.
- .gitignore: add local config files to it.
- Add Python 3.11 support.
v7.1.0 (2023-05-15)
- dhcp: expand support for hostname based match using the manufacturer to adapt to different settings.
- remote: improve usability of RemoteHosts.wait_reboot_since() clarifying the message and making it more DRY-RUN friendly.
v7.0.0 (2023-05-08)
- spicerack: refactor IRC logging:
  - Rename the existing irc_logger to sal_logger as it logs to IRC with the !log prefix and hence to SAL.
  - Add a new irc_logger property to log to IRC on the #wikimedia-operations channel without the !log prefix to just log to IRC and not SAL.
- doc: do not load UI fix when building the manpage.
v6.4.3 (2023-05-08)
- ganeti: enable --no-wait-for-sync by default for the virtual machine creation command.
- decorators: fix dry_run detection that had a bug in the case of a function with a dry_run argument with a default value. The default value was used also in the presence of an explicit value set by the caller (T335855).
- doc: fix search in documentation as jQuery is not automatically loaded by the rtd theme.
- doc: Remove extra preceding space in intro example.
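The dry_run detection issue described above, where a default value wins over an explicit one, can be avoided by binding the caller's arguments to the signature before falling back to defaults. The effective_dry_run() helper below is a hypothetical sketch of that idea using inspect, not the decorator's actual implementation.

```python
import inspect
from typing import Any, Callable


def effective_dry_run(func: Callable, args: tuple, kwargs: dict[str, Any]) -> bool:
    """Return the dry_run value a call would actually use (hypothetical helper)."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    # apply_defaults() only fills in the arguments the caller omitted, so an
    # explicit value, positional or keyword, always takes precedence.
    bound.apply_defaults()
    return bool(bound.arguments.get("dry_run", False))


def reboot(host: str, dry_run: bool = True) -> None:
    """Example function with a dry_run argument that has a default value."""


print(effective_dry_run(reboot, ("example-host",), {"dry_run": False}))
```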
v6.4.2 (2023-04-17)
- kafka: remove setting to avoid checking the hostname in TLS certs as all clusters in production are now running with PKI TLS certs that have the hostname in their CN.
- service: add httpbb_dir field that was added to the Puppet service catalog.
v6.4.1 (2023-03-30)
- redfish: update log entries location for Dell and make it compatible with different iDRAC versions.
v6.4.0 (2023-03-28)
- tox: make config compatible with tox 4.x.
- remote: add results to RemoteExecutionError. While waiting for Cumin to support a more robust result reporting, pass the results also in the case of a failed execution to the RemoteExecutionError exception so that potentially client code could access the partial results on failure using a pattern like:

      try:
          results = remote_hosts.run_sync('some command')
      except RemoteExecutionError as e:
          results = e.results

- setup.py: force dnspython from Bullseye pinning the dependency to the same version of Debian Bullseye as upstream has breaking changes also between minor versions.
- dnsdisc: adapt code and tests to work with dnspython 2.0.0.
- service: improve check_dns_state validation check.
- puppet: make the PuppetMaster class inherit from RemoteHostsAdapter to fix a bug in dry-run mode with a method decorated with @retry.
- service: ensure that dry_run is passed to the Service class to be detected in dry-run mode for methods decorated with @retry.
- tox: use sphinx-build to generate the documentation, this prevents a deprecation warning for using setup.py.
v6.3.0 (2023-03-15)
- apt: add new module with new AptGetHosts class that inherits from RemoteHostsAdapter to handle simple apt-get use cases but setting all the proper options for non-interactive runs of apt-get.
- spicerack: add new spicerack.apt_get() accessor to run apt-get commands on target hosts.
- redfish: add simple supermicro class.
- alertmanager: match also FQDN, not only hostnames in the label.
- decorators: add set_tries() function to be used for the dynamic_params_callbacks argument of the @retry decorator to dynamically modify the number of tries to retry from the client.
- dnsdisc: add a resolve_with_client_ip() method to resolve with EDNS Client Subnet (ECS) support.
- service: extend the discovery capabilities of the service catalog to check the DNS records with ECS support adding a check_service_ips() method and a check_dns_state() one.
- spicerack: add authdns_active_hosts property to get a RemoteHosts instance for the authoritative DNS servers currently active. As it uses the Cumin's direct backend it works also if PuppetDB is not available.
- icinga: handle edge case where status is not optimal but there are no failed services (T330318).
- icinga: uniform code for acked services like failed services to offer the same API in all involved classes.
- k8s: fix existing docstrings.
- tox: disable bandit's request_without_timeout in tests.
- setup.py: bump dependencies minimum version to match those in Debian bullseye.
- setup.py: remove temporary upper limit for prospector as the upstream issue has been fixed.
- doc: dynamically set copyright year to current year.
- Use GenericAlias objects for type hints in the whole code base given that the lowest supported Python is 3.9:
  - Use directly GenericAlias builtin objects for type hints (e.g. dict[] instead of Dict[]).
  - Use directly GenericAlias objects from the collections.abc module instead of the ones from the typing module (i.e. collections.abc.Sequence instead of typing.Sequence).
  - See also PEP 585.
- docstrings: automatically document type hints using sphinx_autodoc_typehints. Now it's not necessary to repeat in the docstrings the type of the variables and return types as those are automatically added reading the type hints present in the signature. The whole code base has been updated accordingly.
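As a small illustration of the PEP 585 style mentioned above, the following sketch uses the builtin dict[] and collections.abc.Sequence directly in type hints, which requires Python 3.9 or later. The count_by_site() function and the host names are made up for the example.

```python
from collections.abc import Sequence


def count_by_site(hosts: Sequence[str]) -> dict[str, int]:
    """Count hosts per site, taking the site from the second FQDN label."""
    counts: dict[str, int] = {}
    for host in hosts:
        labels = host.split(".")
        site = labels[1] if len(labels) > 1 else "unknown"
        counts[site] = counts.get(site, 0) + 1
    return counts


print(count_by_site(["db1001.eqiad.wmnet", "db2001.codfw.wmnet", "db1002.eqiad.wmnet"]))
```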
v6.2.2 (2023-02-23)
- icinga: fix condition that determines if a service status is failed or not (T330318).
- redfish: ensure versions are parsed as packaging.version.Version instances.
v6.2.1 (2023-02-20)
- tests: revert removal of the mocked DNS resolver, as the removal prevented the tests from running without network access.
v6.2.0 (2023-02-20)
- spicerack: get authdns servers from config file (T329773):
  - The list of all authdns servers was retrieved via the cumin alias A:dns-auth, which itself comes from Puppet resources (query P{R:Class = profile::dns::auth}).
  - This leads to cookbooks using dnsdisc or service modules failing whenever an authdns server is unavailable for maintenance.
  - The source of truth for active authdns servers is hiera, so refactor the modules to use a configuration file populated by Puppet instead.
  - Using the configuration file from Puppet also removes the need to query the IP of the DNS servers and allows to use the Discovery class also without a fully working DNS.
  - Use keywords only for most parameters of the touched classes.
  - This change breaks the internal spicerack APIs while the cookbook-facing Spicerack class API has been left untouched.
- alertmanager: add parent Alertmanager class:
  - In some use cases we need to silence alerts in alertmanager that are not attached to any host via the instance label.
  - In order to do so abstract away a higher level Alertmanager class with the generic bits to interact with the Alertmanager APIs and make the existing AlertmanagerHosts class a derived class of that one.
  - Add a new Spicerack accessor alertmanager() to get an instance of a generic Alertmanager without relations to hosts.
- icinga: allow wait_for_optimal to ignore acknowledged alerts (T319277).
- redfish: allow for refreshing the manager info. Some of the iDRAC info such as firmware and BIOS version are more dynamic and as such we gather them every time, however some other data such as the model is fairly static and can benefit from being cached. As such update the interface so that we can refresh the specific data block for functions that need to.
- redfish: add upload/update methods to push firmware upgrades.
- mysql_legacy: remove x2 handling logic as it's read-write in both datacenters, and actively written to. Remove it from the module's logic completely to avoid confusion and desync with cumin's list of core-db.
v6.1.0 (2023-02-10)
- puppet: allow to specify the exact message when disabling/enabling puppet.
- config: expand user's home (~) for logs dir.
- cookbook: improve help message.
- redfish: move Dell specific functionalities to the Dell class.
- redfish: store all OOB info for later use.
- redfish: add system_manager info and properties for bios_version, model, manufacturer.
- Fix incorrect usage of ClusterShell's NodeSet using the Cumin's nodeset and nodeset_fromlist instead.
- reposync: switch from copy_tree to copytree.
- kafka: fix typo in docstring.
- dhcp: fix tests using unnecessary hack.
- setup.py: force a newer sphinx_rtd_theme.
- setup.py: pin elasticsearch-curator ~=5.0.
v6.0.0 (2022-12-14)
- The cookbooks_base_dir config key has been renamed to cookbooks_base_dirs and must be a list of paths.
- Add support for multiple cookbooks paths to be loaded. All the cookbooks paths must have a directory inside named cookbooks/ and this directory must not have an __init__.py file as Namespace Packages are used (see PEP 420) (T325168).
- Add module injection support (T319401):
  - Add an optional configuration key external_modules_dir to define an external modules directory that will be injected in the Python path to allow to use also external modules not present in spicerack.
  - Add a new spicerack.SpicerackExtenderBase class to inherit from in order to define an external accessor class that will be used by Spicerack to allow to use external accessors.
  - Add an optional configuration key extender_class in the instance_params configuration key for specifying the fully qualified name of the Python class to use as the extender class.
- setup.py: Add python_requires metadata. The latest pyroma does check for its presence and it makes sense to add it to prevent from installing the spicerack package on the wrong Python version.
- setup.py: Revert old upper limit for GitPython, there are no more issues with more recent versions.
- setup.py: Set an upper limit for pylint and prospector for upstream issues.
- setup.py: Split the python auto-formatter test dependencies on their own extra group so that they can be installed alone in the already split virtual environment for the tox envs py3-style and py3-format. This way there are no conflicts between other test dependencies and black and isort.
- setup.py: Add specific style tox environments for each Python version to avoid the CI jobs to pick Python 3.7 that has a pip backtracking issue with the latest versions of the dependencies. Keep the py3-{style,format} environments for ease of use locally and to not break compatibility but make the py3-style one not run automatically in CI.
v5.0.2 (2022-11-17)
- redfish: fix the reboot message ID check for new iDRAC versions.
v5.0.1 (2022-11-17)
- redfish: add reboot message ID for new iDRAC versions.
- setup.py: remove support for Python 3.7 and 3.8.
- tox: remove support for Python 3.7 and 3.8.
v5.0.0 (2022-11-10)
- Starting with Spicerack v5.0.0 the support for Python 3.7 and 3.8 is dropped. For now there are no breaking changes but it's not guaranteed to work with those versions anymore.
- constants: remove
CORE_DATACENTERS
constant:- Remove the constant from Spicerack as it's a duplicate of the one already present in
wmflib
. - Convert all Spicerack code to use the same variable from
wmflib
. - All the cookbooks have been already migrated to use the
wmflib
one.
- Remove the constant from Spicerack as it's a duplicate of the one already present in
- ipmi: clarify that the target can also be an IP address. The ipmi module works the same as with a management FQDN.
- netbox: update allowed state transitions:
  - As the way we use Netbox status has changed as part of the work in T320696 and the staged status is no longer used, update the allowed transitions based on the new Server Lifecycle Diagram.
- mypy: remove upper limit and refactor mypy configuration to properly work with newer versions.
v4.0.0 (2022-09-28)
- redfish: use the management IP instead of FQDN to connect to the management console:
  - Some DELL hosts come with the idrac.webserver.HostHeaderCheck setting set to 1, which prevents connecting to the Redfish API unless the hostname is set in the configuration, creating a chicken and egg problem when automating the initial setup of the hosts.
  - To prevent this, switch the whole module to use IPs directly for now. We might want to improve this later by setting the hostname in the iDRAC settings and then switching to use the FQDN once that is configured, but because most of the automation will be already done by that time it's not clear if it would be a real win.
  - [BREAKING API] this changes the spicerack.Spicerack.redfish() signature to require a hostname instead of a management FQDN and also makes the username parameter optional, defaulting to root.
  - [BREAKING API] this changes the spicerack.redfish.Redfish class signature to require a hostname and management IP address instead of a single parameter with the FQDN. Although breaking, no cookbook usage should instantiate this class directly, but always via the above accessor.
- icinga: add explicit support of the DRY-RUN mode (T315537):
  - While the DRY-RUN compatibility of the icinga module was guaranteed by the remote module, there was a usage of the @retry decorator that wasn't able to detect when in DRY-RUN mode and accordingly reduce the number of retries.
- Bump pynetbox dependency to ~= 6.6 (T310745).
- netbox: enable pynetbox threading (T311486).
- doc: fix sphinx_checker script for Python 3.10.
- doc: add an example on how to use the TOX_SKIP_ENV environment variable to run only certain tox environments when in development.
- doc: improve documentation of the CookbookBase classes usage.
v3.2.1 (2022-08-31)
- elasticsearch_cluster: simplify routine to start masters last. Because of the multiple clusters, a host can be a master in one instance and a child of another instance, bringing the process to a halt with the previous logic. The new logic returns first all the hosts that are children in all instances and then the remaining ones that are master for at least one instance.
- peeringdb: minor fixes:
  - Make the Spicerack.peeringdb() accessor more flexible allowing the configuration file to miss non mandatory keys.
  - Add tests for the Spicerack.peeringdb() accessor.
  - Use empty string as default value for the token to avoid the Optional type.
  - Fix mypy ignore for type mismatch.
  - Fix various docstrings.
- CHANGELOG: fix typos and uniform format.
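The ordering rule from the elasticsearch_cluster entry of v3.2.1 above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the host names and the input mapping (host name to the set of instances where that host is master) are hypothetical and are not the module's actual API.

```python
from typing import Dict, List, Set


def order_masters_last(masters_by_host: Dict[str, Set[str]]) -> List[str]:
    """Return the hosts so that those that are master in at least one instance come last.

    masters_by_host is a hypothetical mapping: host name -> names of the
    cluster instances in which that host is currently the master.
    """
    # Hosts that are a child in every instance can be handled first.
    children = [host for host, instances in masters_by_host.items() if not instances]
    # Hosts that are master for at least one instance are handled last.
    masters = [host for host, instances in masters_by_host.items() if instances]
    return children + masters


# Made-up example data: two instances sharing the same four hosts.
hosts = {
    "es1001": {"main"},   # master of one instance, child in the other
    "es1002": set(),      # child in every instance
    "es1003": {"omega"},
    "es1004": set(),
}
ordered = order_masters_last(hosts)
```

Because a host is classified once across all instances, a host that is master in one instance and child in another can no longer block the procedure.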
v3.2.0 (2022-08-18)
- peeringdb: add a new module to interact with the PeeringDB API.
- elasticsearch_cluster: ensure to restart masters one at a time.
- flake8: move flake8's configuration all into setup.cfg.
v3.1.1 (2022-07-26)
- k8s: Increase retry value to prevent timeouts.
- Add support for python 3.10.
v3.1.0 (2022-07-20)
- redfish: add support to check the reboot of the DELL iDRACs:
  - add a most_recent_member() method in the Redfish class to return the most recent message from an API reply with members from Dell.
  - add a last_reboot() method to the Redfish class to get the time of the last DELL iDRAC reboot.
  - add a wait_reboot_since() method to the Redfish class to poll until the DELL iDRAC comes back online after a reboot.
- redfish: add property for the HttpPushURI url, needed for pushing firmware to the DELL iDRACs.
- redfish: add a generation property to the Redfish class to represent the DELL iDRAC generation, i.e. 13 == idrac8, 14 == idrac9, and allow us to implement workarounds for older generations.
- redfish: add a fqdn() getter property and __str__() method to the Redfish class:
  - When passing around a Redfish instance it's useful to know what host it represents, so add a getter for the FQDN property and update the __str__() method to also return the FQDN.
- k8s: Add KubernetesNode.taints property to return the taints of a node.
- k8s: Retry checks for expected pods on drain as in some cases (e.g. pods not catching TERM) it might take a while for pods to actually terminate. Retry the check for expected pods to reduce the chance for errors.
- k8s: Retry pod evictions on HTTP 429 from API server:
  - An HTTP 429 response from the API server means that the eviction is not currently allowed because of a configured PodDisruptionBudget or because an API server rate limit was hit. Retry evict() calls in both cases 3 times with exponential backoff.
- tests: reduce runtime by more than 80%:
  - The logging module setup performed in the spicerack._log.setup_logging() function is not automatically reset by pytest, leading to slowness in some tests, in particular those with a lot of output, for example due to a lot of retries.
  - Add a _reset_logging_module() function in the tests for the _log module that removes all existing filters and handlers from both the root and the IRC loggers.
  - Call the _reset_logging_module() function in the teardown of every test that directly or indirectly calls the spicerack._log.setup_logging() function.
  - This reduces the runtime of the unit tests by more than 80%, in my local environment for example it went from ~150s to ~25s for the 825 tests run.
- redfish: better compare Dell SCP attributes:
  - When comparing Dell SCP attributes for the configuration, consider two comma-separated lists identical whether the separator is just a comma or a comma followed by a space. Some versions of iDRAC return the values comma+space separated when getting the current configuration.
- tests: fix caplog usage:
  - Make sure to use caplog.at_level() every time the pytest caplog fixture is used to ensure the reliability of the test itself and to avoid altering the level for other tests.
  - Rename the argparse.py test cookbook to argparse_ok to prevent any conflict with the stdlib argparse module.
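To illustrate the k8s eviction retry listed in v3.1.0 above, here is a minimal sketch of a retry loop with exponential backoff. Everything in it is an assumption made for the example: the exception class stands in for an HTTP 429 reply, the function name and parameters are hypothetical, and reading "3 times" as three attempts in total is one possible interpretation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TooManyRequestsError(Exception):
    """Hypothetical stand-in for an HTTP 429 reply from the API server."""


def retry_with_backoff(
    func: Callable[[], T],
    tries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func up to `tries` times, doubling the delay after each failure."""
    for attempt in range(1, tries + 1):
        try:
            return func()
        except TooManyRequestsError:
            if attempt == tries:
                raise  # Out of attempts: propagate the last error.
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("tries must be at least 1")


# Usage: the sleep function is injected so the example does not actually wait.
outcomes = iter([TooManyRequestsError("429"), "evicted"])


def evict() -> str:
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = retry_with_backoff(evict, sleep=lambda seconds: None)
```

Injecting the sleep function keeps the sketch easy to test; a real caller would leave the default.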
v3.0.0 (2022-06-28)
- ganeti: refactor the Ganeti module to support the new data model in Netbox:
  - With the new representation of Ganeti data in Netbox, the hardcoded matching between cluster names and Ganeti RAPI FQDN endpoint would not work anymore.
  - Refactor the module to gather the data directly from Netbox.
  - This requires the addition of a custom field ip_address for the virtualization cluster groups model that connects it to the Ganeti RAPI VIP "svc" DNS name that is assigned to the related IP address in Netbox. The custom field has been already added and populated in Netbox in production.
  - The main benefit is the removal of the hardcoded mapping between clusters and their groups (rows/racks).
  - Add new get_cluster() and get_group() methods in the Ganeti class to get new GanetiCluster or GanetiGroup dataclass instances that represent the data required to identify the related resources.
  - Removed the hardcoded magic logic that mapped a row A to a Ganeti group row_A as we're moving away from row-level redundancy at the network layer towards a rack-level redundancy model. This allows renaming the Ganeti groups freely at any time.
- icinga: ensure that the downtime was applied (T309447):
  - Add a wait_for_downtimed() method that polls the Icinga status to ensure that the hosts got downtimed.
  - Do this best effort, just logging a warning for now in case the downtime can't be verified.
- redfish: make task polling work with older models that set the end time to Unix epoch at the task start.
- log: stop suppressing logging exceptions, that were silenced in the logging configuration.
- doc: fix intersphinx links.
v2.6.0 (2022-06-07)
- redfish: Assume all GET and HEAD requests are read-only and anything else is potentially read-write.
- redfish: allow to submit tasks with DELETE as some Redfish REST API DELETE actions do submit jobs. The submit_task() method accepts an HTTP method different than POST now.
- netbox: update netbox to use internal discovery address as it got migrated from a public IP to the discovery infrastructure.
- doc: set default language as Sphinx 5.0+ requires language to not be None when warnings are treated as errors.
- pylint: remove unnecessary comments. The latest pylint has moved the no-self-use reported issue to an optional plugin. We don't need to enable it, hence removing the unnecessary comments.
v2.5.0 (2022-05-26)
- redfish: update signature of the request() method to support dynamic keyword arguments that will be passed directly to the requests library:
  - Although this breaks backward compatibility of the existing API for the request() method, it's not currently used directly anywhere and so it was deemed ok to not justify a new major release for this.
  - In particular the previous data parameter that was passed to requests's json parameter would now be passed to requests's data parameter, so not being automatically converted to JSON. Existing calls have been modified to call requests() with a json parameter instead.
- service: add new module to expose Puppet's service::catalog:
  - Add a new module to load the Puppet service::catalog hieradata structure into Spicerack.
  - Part of the abstractions allow to access in a more programmatic way the properties of a given service.
  - It also allows to depool/pool (and related context manager) a service in the DNS Discovery realm.
  - It also allows to downtime (and related context manager) a service in a given datacenter in Alertmanager.
  - See the service module example usage.
- reposync: improve git push error handling catching more possible git errors.
- ganeti: add a startup() method to startup a Ganeti VM (T306661).
- ganeti: add set_boot_media() method to modify the instance boot media and change it between disk and network (PXE) (T306661).
- ganeti: print the output of a Ganeti VM creation while it's being created so that it gets printed live and not only at the end.
- dhcp: add to the DHCPConfOpt82 and DHCPConfMac classes a media_type parameter:
  - This new media_type parameter will allow us to easily choose PXE boot media other than the default Debian installers. Specifically this will allow us to create cookbooks to test specific point releases as well as rescue and secure-wipe options.
- mediawiki: Mediawiki APIs are now listening only on HTTPS, call the siteinfo API over HTTPS.
- remote: increase the wait for reboot timeout (T307260):
  - In some cases, in particular during reimages, the reboot time can take longer. Increase the limit for now as in most cases this will not change anything as the check will succeed way before the timeout.
- tests: fix yaml file indentation.
- doc: fix typo.
- setup.py: mark the module as typed so that mypy can type check calls in other tools that are importing this library.
v2.4.1 (2022-04-12)
- elasticsearch_cluster: don't wait for green on first node.
- alertmanager: improve downtime:
  - Allow to pass hosts with already a specific port. If the port is present no port-related regex is added, if the port is not present the port-related regex will be automatically added.
  - Optimize the regex adding the port regex just once at the end if none of the hosts has the port specified.
  - Add a matchers parameter to the downtime() and downtimed() methods to allow to perform additional filtering adding additional matchers.
  - Raise an error in case an additional matcher is trying to target the instance property.
- alertmanager: fix downtime:
  - Fix the way the matchers for the silence are created. Because AlertManager and Prometheus will evaluate all matchers in AND, we can only add one single matcher for the instance property, that has to match all given hosts, as opposed to the current implementation that was adding one matcher per host.
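The single instance matcher described in the alertmanager entries of v2.4.1 above could be built roughly as in the following sketch. The function name and the port pattern are assumptions made for this example and are not taken from the module; the host names are made up.

```python
import re
from typing import Sequence

PORT_REGEX = r"(:[0-9]+)?"  # Assumed pattern for an optional port suffix


def instance_regex(hosts: Sequence[str]) -> str:
    """Build one regex that matches the instance value of all the given hosts."""
    if all(":" not in host for host in hosts):
        # No host has a port: add the port regex only once at the end.
        names = "|".join(re.escape(host) for host in hosts)
        return f"({names}){PORT_REGEX}"

    # Mixed case: add the port regex only to the hosts without a port.
    parts = [
        re.escape(host) if ":" in host else re.escape(host) + PORT_REGEX
        for host in hosts
    ]
    return "(" + "|".join(parts) + ")"


no_ports = instance_regex(["host1", "host2"])
mixed = instance_regex(["host1:9100", "host2"])
```

With a single alternation all hosts are matched by one matcher, so the AND evaluation of matchers no longer excludes every alert.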
v2.4.0 (2022-04-04)
- k8s: add a new module with initial support for Kubernetes that supports draining a node (T300879).
- spicerack: add a new Spicerack.thanos() accessor to get an instance of wmflib.prometheus.Thanos.
- ipmi: add a remove_boot_override() method to clear any BIOS boot parameter override because some hosts don't automatically clear that after a reboot.
- ipmi: improve the force_pxe() method changing the way it sets the Force PXE bit in the BIOS boot parameters to force the reset of the valid flag after a reboot and consider the valid flag as harmless anyway (T304434).
- pylint: fix newly reported issue.
v2.3.3 (2022-03-17)
- reposync: don't catch the RepoSyncNoChangeError allowing the calling cookbook to decide what to do in case of no changes in the repository.
- reposync: add a force_sync() method to perform a force push from the local repository to all remotes.
v2.3.2 (2022-03-10)
- alertmanager: add missing support for dry-run mode.
- reposync: make tests run quicker:
  - Some tests were using 192.0.2.1 as a git remote, that doesn't fail immediately, at least on macOS. Replace it with a non-existent local path.
v2.3.1 (2022-03-10)
- spicerack: make http_session more flexible:
  - Instead of updating the signature with the new parameters available in wmflib, relax the signature here in spicerack and delegate to wmflib what are the accepted parameters.
- alertmanager: do not retry on HTTP 500 responses:
  - The Alertmanager API can respond with an HTTP Status Code of 500 on some requests with a valid JSON response, although there was no server error (e.g. trying to delete an already deleted silence).
  - Do not retry on 500 responses, allowing requests to get a proper response and then let the module itself decide what to do based on the content of the response.
v2.3.0 (2022-03-09)
- alertmanager: catch the already deleted silence error (T293209):
  - The Alertmanager API, when trying to delete an existing silence, returns 500 with a JSON string message in the case of an already expired or deleted silence.
  - On delete, catch the exception and just log a warning message in case the silence has been already deleted / is already expired.
  - In order to achieve this, change the AlertmanagerError exception to accept an optional parameter with the API response object.
- elasticsearch_cluster: load the configuration from a yaml file, remove the hardcoded one (T278378).
- spicerack: use the private property for the config dir within the class, for coherence.
v2.2.0 (2022-03-08)
- alertmanager: introduced a new module to manage resources on AlertManager (T293209):
  - It has an AlertmanagerHosts class that currently supports creating a silence (downtime in Icinga terminology) and removing it given its ID. It also provides a context manager to perform the silence similarly to the icinga module.
- alerting: introduced new alerting module with an AlertingHosts class as a wrapper around the IcingaHosts and AlertmanagerHosts classes so that the same actions are performed on both instances.
- spicerack: add accessors for the new AlertmanagerHosts and AlertingHosts classes as alertmanager_hosts and alerting_hosts respectively. The preferred way is to use the alerting_hosts accessor so that actions like the downtime are performed on both systems.
- redfish: fix the default value for the allow_new_attributes parameter of RedfishDell.scp_dump().
v2.1.0 (2022-03-03)
- reposync: add new module to manage syncing of automatically generated repositories.
- redfish: DellSCP, allow creation of new entities:
  - So far the DellSCP class allowed only to modify existing attributes in existing components.
  - When dealing with a DellSCP configuration, there are cases in which it might be necessary to create attributes that do not exist in the current configuration. For example when changing the boot mode between Bios and Uefi a long list of attributes disappear/appear in the configuration.
  - To allow this use case an allow_new_attributes keyword only parameter has been added to the constructor to explicitly allow new attributes, keeping the existing behaviour of typo-protection if that is not passed.
  - Another possible use case is to start from a configuration and create a components section from scratch.
  - To allow this use case an empty_components() method was added that, while keeping the rest of the configuration intact, empties the existing components and from there allows to set new attributes, transparently creating any missing component.
  - Add the allow_new_attributes parameter to RedfishDell.scp_dump() to enable this new feature when dumping a configuration.
- dhcp: fix lowercase serial tag matching.
- setup.py: temporary limit redis library:
  - The latest redis release v4.1.4 creates some dependency issue, for now limit the upper version as we're anyway using v3 in production as that's the version up to Debian Bullseye.
- setup.py: upper limit for black:
  - On Debian bullseye elasticsearch-curator latest release dependencies have a conflict with black's dependencies and it's not possible to put an upper limit to elasticsearch-curator because previous versions don't build properly on Bullseye from pip (the debian package version of it has a patch to override its dependency constraints).
  - To prevent conflicts force an upper limit on the black version for now.
- bandit: ignore hardcoded password in tests:
  - Ignore the B105:hardcoded_password_string and B106:hardcoded_password_funcarg checks in test directories.
  - Removed related #nosec comments unnecessary now.
- prospector: ignore deprecation message:
  - The latest prospector issues a deprecated message for the pep8 and pep257 tools that have been renamed to pycodestyle and pydocstyle respectively. The new names are incompatible with prospector < 1.7.0, so for now keep the old names and disable the deprecation warning.
v2.0.0 (2022-02-15)
- management: removed module, it was deprecated in v1.0.0.
- spicerack: allow to execute another cookbook from within a cookbook:
  - Add the capability from within a cookbook to call another cookbook with custom parameters using the run_cookbook() method in the Spicerack class.
  - The called cookbook will be executed with the same global options with which the current cookbook is running and will log in the same file of the current cookbook run.
- redfish: better support of parsing JSON responses (T299123):
  - In some older Dell servers the Redfish API sometimes replies with different casing for the MessageId key, like MessageID.
  - It's also possible that Oem custom messages are reported in the same replies with a different structure.
  - Skip the Oem messages and try both key casings when parsing the reply.
- redfish: improve support for DRY-RUN mode:
  - In DRY-RUN mode allow read-only requests to be performed (only GET and HEAD) but return a dummy successful response in case of an exception raised by requests (timeout, connection error, etc).
  - In DRY-RUN mode don't allow read-write requests and return a successful dummy response instead.
  - In various methods return a dummy response in DRY-RUN mode.
- dhcp: case-insensitive match of the serial number for the Dell management DHCP requests:
  - When matching the serial number in the DHCP request for the management interfaces of Dell servers, match them in a case-insensitive way because the data sent varies between hosts (idrac-ABC1234 or iDRAC-ABC1234).
- setup.py: the latest v2.2.0 release of dnspython is generating mypy issues, temporarily put an upper limit to it.
- spicerack: adapt type hint to the latest wmflib release.
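The DRY-RUN rules listed in the redfish entry of v2.0.0 above can be summarised with a small sketch. It is not the module's code: the function, the exception class and the shape of the dummy response are all hypothetical and only illustrate the decision flow.

```python
from typing import Callable, Dict

READ_ONLY_METHODS = ("get", "head")


class RequestFailed(Exception):
    """Hypothetical stand-in for the errors raised by the HTTP library."""


def dummy_response() -> Dict[str, object]:
    """Return a placeholder successful response (its shape is an assumption)."""
    return {"status_code": 200, "dummy": True}


def perform(
    method: str, send: Callable[[], Dict[str, object]], dry_run: bool
) -> Dict[str, object]:
    """Perform a request following the DRY-RUN rules described above."""
    read_only = method.lower() in READ_ONLY_METHODS
    if dry_run and not read_only:
        # Read-write requests are never sent in DRY-RUN mode.
        return dummy_response()
    try:
        return send()
    except RequestFailed:
        if dry_run:
            # A read-only request failed in DRY-RUN mode: return a dummy success.
            return dummy_response()
        raise


def failing_send() -> Dict[str, object]:
    raise RequestFailed("timeout")


# In DRY-RUN mode a failing GET still yields a dummy successful response.
dry_run_result = perform("GET", failing_send, dry_run=True)
```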
v1.1.1 (2021-12-22)
- redfish: tell if any change was made in DellSCP instances:
  - When updating a DellSCP configuration with the set() or update() method, return True if the config was actually changed, False if it had already the correct value(s).
- dhcp: fix file removal check in dry-run mode.
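The "tell if any change was made" behaviour from the redfish entry of v1.1.1 above follows a common pattern, sketched here with a hypothetical, much simplified class. The class name, the method signatures and the BootMode attribute name are assumptions for illustration and do not reproduce the real DellSCP API.

```python
from typing import Dict


class ConfigSketch:
    """Hypothetical, simplified stand-in for a configuration object."""

    def __init__(self, attributes: Dict[str, str]) -> None:
        self._attributes = dict(attributes)

    def set(self, name: str, value: str) -> bool:
        """Set an attribute, returning True only if its value actually changed."""
        if self._attributes.get(name) == value:
            return False
        self._attributes[name] = value
        return True

    def update(self, changes: Dict[str, str]) -> bool:
        """Apply many changes, returning True if at least one value changed."""
        # Build the full list first so that every change is applied.
        results = [self.set(name, value) for name, value in changes.items()]
        return any(results)


config = ConfigSketch({"BootMode": "Bios"})
unchanged = config.set("BootMode", "Bios")  # Same value: nothing to do
changed = config.set("BootMode", "Uefi")    # Different value: config modified
```

A caller can use the return value to skip follow-up work, for example pushing the configuration, when nothing changed.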
v1.1.0 (2021-12-16)
- spicerack.redfish: add new module with support for Redfish API:
  - Add a new redfish module that allows to interact with the Redfish API. As Redfish implementation differs significantly between vendors, there are some basic functionalities in the Redfish class and then there is a RedfishDell class for Dell-specific functionalities.
  - At the moment the only supported vendor is Dell (hence the hardcoded RedfishDell call in Spicerack.redfish()).
- spicerack: add a management_password property getter to access the cached management password. If the cache is empty the password will be asked to the user.
- ganeti: add new Ganeti clusters in the new site drmrs.
- ipmi: when running an IPMI command that contains sensitive data, allow to hide the sensitive data from the logs and the outputs.
- ganeti: fix up row configuration for ganeti test cluster.
- dhcp: fix missing semicolon in DHCP config.
- remote: intercept bad uptimes in wait_reboot_since():
  - In some cases the uptime method could fail to parse the host uptime, for example during a shutdown of a system where the login might be prevented to the host.
  - Make sure that the wait_reboot_since() method catches those errors too and retries.
- Adopt pathlib.Path instead of the os and os.path functions across the project to modernize it following current best practices.
- administrative: add examples to the documentation and documentation for the special method __str__.
- pylint: fix newly reported issues.
v1.0.6 (2021-10-21)
- dhcp: add support for MAC address based config (T269855):
  - Add support for MAC address based configuration snippets to be used in the automation for Ganeti VMs instead of using DHCP Option 82 as the MAC address is retrieved from Ganeti API.
  - The MAC address is validated to ensure it has the format accepted by the DHCP server.
  - Consolidate the filename path for both DHCP Option 82 and MAC address based configuration to be in the same directory, dependent only on the TTY settings as there is no other difference between the two and it allows to prevent duplicated snippets for the same hostname in different directories as the library checks that the file doesn't exist before creating it.
  - Consolidate the default string representation implementation of the DHCPConfiguration derived classes into the abstract parent one because they are all the same. Define a class property _template as part of the DHCPConfiguration class API.
- mediawiki: add a get_primary_dc() method that returns the primary/active datacenter.
- kafka: docstrings minor improvements.
- changelog: fix typo in previous entry.
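The MAC address validation mentioned in the dhcp entry of v1.0.6 above might look like the following sketch. The accepted format (six colon-separated pairs of hexadecimal digits) and the function name are assumptions made for this example; the module may accept a different format.

```python
import re

# Assumed accepted format: six colon-separated pairs of hexadecimal digits.
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}")


def validate_mac(mac: str) -> str:
    """Return the MAC address unchanged if valid, raise ValueError otherwise."""
    if MAC_PATTERN.fullmatch(mac) is None:
        raise ValueError(f"Invalid MAC address: {mac!r}")
    return mac


valid = validate_mac("aa:bb:cc:00:11:22")
```

Validating before rendering the snippet means a malformed address fails early instead of producing a configuration that the DHCP server would reject.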
v1.0.5 (2021-10-12)
- kafka: add a new kafka module with the following capabilities (T291681):
  - transferring of offsets between consumer groups and clusters approximating offsets based on timestamp.
  - approximating and seeking offsets based on user provided timestamps.
- icinga: add recheck_failed_services() method to force a recheck of services which are in failed state.
- puppet: get only the last line of output in PuppetHosts.get_ca_servers() to ignore spurious output that might be present in some environments.
v1.0.4 (2021-10-06)
- dhcp: use IP address instead of DNS name:
  - Given that all the required data comes from Netbox there is no point to depend on the DNS when generating the DHCP snippets, require to pass the IPv4 instead of the FQDN.
  - Renamed fqdn parameter to ipv4 in the DHCPConfOpt82 class.
  - Renamed ip_address parameter to ipv4 in the DHCPConfMgmt class.
  - Although technically this is an API change, the whole module is new and still unused except from the experimental reimage cookbook, hence not considering it as a breaking change for the semantic versioning.
- remote: reduce wait time for reboot to 20 minutes.
v1.0.3 (2021-09-28)
- dhcp: fix typo in opt82 file path.
v1.0.2 (2021-09-27)
- dhcp: always require to set the OS version when instantiating a DHCPConfOpt82 instance. Although technically this is an API change, the whole module is new and still unused, hence not considering it as a breaking change.
- remote, puppet: reduce logging verbosity.
- ganeti: use --force option in shutdown method when calling gnt-instance shutdown to work with all states a VM can be in.
- puppet: fix check exception inheritance to the correct SpicerackCheckError.
v1.0.1 (2021-09-23)
- remote: refactor wait_reboot_since():
  - As the check for uptime is currently either returning a value for all hosts or raising an exception, remove the existing logic to check for a partial result as that can't happen.
  - Catch instead the error and re-raise a check exception with a clear message.
  - Also round the printed value of the uptime and the time against which it's checked to 2 decimal places for more readability.
- setup.py: limit elasticsearch max version:
  - The latest 7.15.0 release has started to deprecate things for the upcoming 8.0.0 release, and mypy started complaining about some return types.
  - Instead of fixing the signatures to be compatible with both versions put a max version limit for now, we'll deal with the upgrade when the time will come, Debian most recent version is 7.1.0.
v1.0.0 (2021-09-22)
- remote: remove RemoteHosts.init_system() method:
  - As systemd is used by all hosts and this method is not used in any cookbook, remove it completely as it's no longer needed.
- remote: add support to enable/disable Cumin output:
  - Add support to suppress Cumin's output and progress bars independently to the RemoteHosts and LBRemoteCluster classes.
  - Add print_output and print_progress_bars boolean parameters to the run_sync(), run_async() and run() methods to independently print Cumin's output and progress bars respectively.
  - Add a simplified verbose parameter to the higher level methods restart_services() and reload_services() that when set to False will suppress both output and progress bars at once.
  - Add just the print_progress_bars parameter for the high level methods wait_reboot_since() and uptime().
  - All the new parameters default to True right now to keep the existing behaviour, to be changed to False in a future release.
- icinga: reduce verbosity of Cumin's output, taking advantage of the new parameters to control the output of Cumin's commands.
- puppet: reduce verbosity of Cumin's output, taking advantage of the new parameters to control the output of Cumin's commands.
- dhcp: reduce verbosity of Cumin's output, taking advantage of the new parameters to control the output of Cumin's commands.
- ipmi: improve dry-run mode for force_pxe():
  - When force_pxe() can't verify that the next boot will indeed be via PXE it raises an exception. Convert that into a warning logging message when in DRY-RUN mode to let the cookbooks continue the DRY-RUN.
- versioning: moving Spicerack releases to a semantic versioning schema.
- management: deprecate the Management class:
  - As its only purpose was to get the management FQDN of a host, given that the same functionality is now provided by the netbox module via the NetboxServer class and its mgmt_fqdn and asset_tag_fqdn properties, deprecate the class for a subsequent removal.
- confctl: fix example code in docstring.
- pylint: fix newly reported issues.
- doc: add how to contribute section.
v0.0.59 (2021-09-09)
- ipmi: refactor class signature:
  - API breaking change, but the Spicerack.ipmi() accessor is used only in the sre.hosts.decommission and sre.hosts.ipmi-password-reset cookbooks, so it should be trivial to change both at once.
  - Convert the IPMI class to require the FQDN of the management console to target, to avoid the need to pass that around both from the client and internally in the class.
  - The caching of the management password is done transparently by the Spicerack.ipmi() accessor to avoid the annoyance of being asked the management password for each host.
- dhcp: small refactor (the module is still unused):
  - Rename switch_port to switch_iface to avoid confusions.
  - Rename the context manager from dhcp_push() to config() as it's more natural to use: with dhcp.config(my_config): # do something.
  - Simplify formatting of templates, added ignores to vulture for false positives.
  - Add constructor documentation to the dataclasses.
- icinga: remove the deprecated Icinga class:
  - The Icinga class has been deprecated for a while now and it's time to remove it completely. No cookbook is using it anymore.
- remote: add support for the installer key:
  - When instantiating a remote() instance, allow to pass a new parameter installer, defaulted to False, that when True will use the special installer key for the remote instances that allow to connect to the Debian installer environment or a freshly installed host prior to its first Puppet run.
- ipmi: add status and reboot capabilities:
  - Add a new method power_status() that returns the current power status and is also used by the existing check_connection() method.
  - Add a new method reboot() to issue an IPMI power on or power cycle, based on the current status of the device.
- netbox: add getter asset_tag_fqdn for the asset tag mgmt FQDN property.
- icinga: add downtime_services() and remove_service_downtimes() and also a services_downtimed() context manager to allow to downtime only the host services that match the given regex.
- puppet: minor improvements:
  - Return the results from the Puppet.first_run() method to allow to save it to a file like the current reimage script does.
  - Add an accessor for the master_host property in the PuppetMaster class as this is created and instantiated by Spicerack and was hidden from the user of the API.
- decorators: migrate to the wmflib version of @retry (T257905):
  - Use the wmflib version of @retry while keeping the dry-run awareness and default to catching SpicerackError instead of WmflibError like the pre-existing version was doing.
- code style: migrate all the usage of string format() to f-strings.
- pylint: addressed newly reported pylint issues and removed unnecessary disable comments.
- prospector: disable E203 for pep-8 over black.
for pep-8 over black. - code style: if there are no local modifications check last commit instead of not checking anything.
v0.0.58 (2021-08-25)
- Class API: add
rollback()
method- Add a new
rollback()
method to theCookbookRunnerBase
base class that by default does nothing. - The method is called by Spicerack when a cookbook exits with a non-zero exit code or raises an un-caught exception.
- This allows cookbooks to define their own cleanup strategy in case of errors, for example to restore a previously coherent state.
- Any exception raised by the
rollback()
method will be caught and logged by Spicerack with its original exit code and will then exit with a reserved exit code for a failed rollback.
- Add a new
- mediawiki: remove cron-specific maintenance implementation details, replaced by systemd timers (T289078).
- icinga: use shlex to quote the command string for bash (T288558):
  - This fixes the downtiming that would fail if the admin reason contains an apostrophe, due to lack of escaping.
- mediawiki: ignore php-fpm when stopping cronjobs (T285804):
  - On mwmaint, php-fpm is used to serve noc.wikimedia.org so we want to keep it running even when stopping cronjobs.
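The apostrophe problem addressed by the icinga entry of this release can be illustrated with the standard library's shlex.quote(). This is only a minimal sketch, not Spicerack's actual code: the build_command() helper and the echo command it wraps are hypothetical.

```python
import shlex


def build_command(reason: str) -> str:
    """Return a bash command line that echoes the reason (hypothetical helper)."""
    # The inner quote protects the reason inside the script passed to bash,
    # the outer quote protects the whole script as a single "bash -c" argument.
    script = "echo " + shlex.quote(reason)
    return f"bash -c {shlex.quote(script)}"


reason = "host's disk is failing"
command = build_command(reason)

# Splitting the command the way a POSIX shell would recovers the script intact.
parts = shlex.split(command)
```

Without the quoting, the apostrophe in the reason would open a quoted string that never closes and bash would reject the command.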
v0.0.57 (2021-08-02)
- dnsdisc: improved the logged message, explicitly saying what was checked and what didn't match when checking that a discovery record has been updated (T285706).
- icinga: adapt to the newer API of the icinga-status output.
- icinga: write directly to the Icinga command file instead of calling the icinga-downtime wrapper script where it was used so that the whole module now interacts directly with the Icinga command file. This opens up the route for further improvements (T285803).
- ganeti: add ganeti test cluster to the possible Ganeti locations (T286206).
- mysql_legacy: re-add x2 database section and add support for active/active core sections (T285519):
  - get_core_dbs() now supports excluding sections from its cumin query. All of the functions that call it in the context of setting the database read-only or read-write will now exclude sections listed in ACTIVE_ACTIVE_SECTIONS.
- puppet: when regenerating the client certificate, do not rely on the exit code of the Puppet command as it might be misleading. It already relies on successfully finding the certificate fingerprint.
- tox: remove flake8-import-order plugin as dependency now that the import order is ensured by black and isort.
v0.0.56 (2021-06-26)
- mediawiki: reverted the change of v0.0.55 to make siteinfo API request over HTTPS.
- mediawiki: remove unnecessary and broken disable of systemd timers added in version v0.0.55.
- mysql_legacy: reverted the change of v0.0.49 to add the new x2 database core section (T285519).
v0.0.55 (2021-06-24)
- mediawiki: Update cronjob code now that most are systemd timers:
  - Removed check_cronjobs_enabled().
  - Renamed stop_cronjobs() to stop_periodic_jobs().
  - Added check_periodic_jobs_disabled(), check_periodic_jobs_enabled() and check_systemd_timers_enabled().
- mediawiki: Make siteinfo API request over HTTPS.
v0.0.54 (2021-06-21)
- icinga: rename some IcingaHosts methods:
  - This is an API breaking change, but the newly introduced IcingaHosts API is not yet used widely, just one Cookbook uses it so far.
  - Rename some methods of the IcingaHosts class to be more dry and explicit. Namely:
    - hosts_downtimed -> downtimed (context manager)
    - downtime_hosts -> downtime
    - host_command -> run_icinga_command
v0.0.53 (2021-06-10)
- icinga: use bash wrapper to allow sudo in the IcingaHosts class.
- doc: use add_css_file() instead of add_stylesheet().
- doc: fix parameter type in docstring.
v0.0.52 (2021-05-06)
- dhcp: Add module for manipulating dynamic DHCP entries on target data centers and restarting the DHCP server (T269855).
- icinga: pass verbatim_hosts option to the icinga-status script when using verbatim Icinga hostnames that are not real hosts.
- netbox: fix check for server role:
  - The physical devices and virtual machines objects in Netbox have different names for the role property (device_role vs role). Use the correct property each time.
- icinga: fix typo in docstring.
v0.0.51 (2021-05-04)
- dnsdisc: do not configure DNS resolver. As the module is injecting the nameservers of the authoritative DNS, do not let the DNS module auto-configure itself with /etc/resolv.conf.
- tests: fix mock of the DNS module that in some cases was not properly mocked, so that the tests were relying on a properly configured /etc/resolv.conf.
v0.0.50 (2021-05-04)
- setup.py: relax elasticsearch dependencies:
  - In order to be able to build spicerack for Debian bullseye that ships python3-elasticsearch 7.1.0 and python3-elasticsearch-curator 5.8.1, relax the related dependency constraints in setup.py.
  - Elasticsearch requires to bump the version above the suggested compatibility matrix, we'll test if all works as expected. See the elasticsearch compatibility matrix.
  - Elasticsearch curator matches upstream compatibility matrix, see the elasticsearch curator compatibility matrix.
  - As Spicerack is released via debian packages this will not affect the buster builds.
- netbox: improve as_dict():
  - Instead of calling serialize() for the conversion to dictionary, just calling dict() on the object gives a more useful representation of the object because all the nested properties are converted to string or sub-dictionaries with useful values instead of just the IDs.
  - As a result any usage of as_dict() that relied on the format of specific fields might break. At the moment no cookbook is using it.
  - See also the "Casting the object as a dictionary" example in pynetbox.core.response.Record.
- netbox: add NetboxServer class:
  - Add a NetboxServer class in the netbox module to give a higher level abstraction across physical servers and virtual machines.
  - This is particularly useful to finally have an authoritative way to convert a hostname into a FQDN or get the management FQDN of a host given its hostname (T240176).
  - The class also allows to update the device status only if it's a physical host and the status transition is approved.
  - Those new features will be used by the cookbook that will replace the reimage script and then the current usage of some of the existing methods in the Netbox class should be converted to use this class instead.
- icinga: add new IcingaHosts class (T277740):
  - Implements the TODO that wanted to move the Icinga class into a class that is initialized with the target hosts so that it's not necessary anymore to pass them to each method.
  - Keep the existing Icinga class for now, but mark it as deprecated, both in the documentation of spicerack.Spicerack.icinga() and icinga.Icinga() and emit also a DeprecationWarning when instantiated. It will be removed in the next release once all the cookbooks have been migrated to the new spicerack.Spicerack.icinga_hosts() accessor.
  - Move the detection of the Icinga command file to its own class to allow to cache it across different instances, making the instantiation of multiple IcingaHosts instances free after the first one.
  - Allow to manage also non-servers that are defined as Icinga hosts passing the verbatim_hosts parameter, that will not extract the hostname from the given hosts assuming that they are already FQDNs.
- toolforge.etcdctl: Allow getting the cluster health. This opens up being able to wait/stop if the cluster status is not what's expected when doing operations (T276338).
- icinga: use a bash command wrapper to allow sudo, otherwise the echo command will fail to output to the file.
- icinga: use a sudo-friendly command to detect the Icinga command_file.
- netbox: improve as_dict():
  - Instead of calling serialize() for the conversion to dictionary, just calling dict() on the object gives a more useful representation of the object because all the nested properties are converted to string or sub-dictionaries with useful values instead of just the IDs.
  - See also the "Casting the object as a dictionary" example in pynetbox.core.response.Record.
- remote: fix use_sudo on split().
- netbox: fix object type returned for status. The status should be returned as string and not as a Netbox object.
- doc: add documentation for the toolforge package.
- doc: remove obsolete configuration.
- setup.py: add missing tag for Python 3.9, already supported.
- tests: fix pip backtracking separating the prospector tests into its own virtualenv.
- tests: fix format checking:
  - If no Python files were modified at all, the latest isort would bail out. Skipping the checks if no Python files were modified at all.
- doc: fix documentation checker for sub-packages:
  - The existing checker was assuming a flat space of modules inside spicerack, while now we have also subpackages. Adapt the checker to detect those too.
  - Convert file operations to pathlib.
- doc: move ClusterShell URL to HTTPS.
- netbox: refactor unit tests.
v0.0.49 (2021-03-04)
- icinga: changed the type for the hosts parameter in the get_status() method from spicerack.typing.TypeHosts to cumin.NodeSet.
- icinga: add Icinga.wait_for_optimal() method to pause while hosts converge to an optimal state.
- puppet: add Puppet.get_ca_servers() method to retrieve the configured Puppet ca_server on the target hosts.
- remote: allow prepending every command to execute on the target hosts with sudo. This is a first temporary iteration until Cumin will support it natively.
- toolforge.etcdctl: add new toolforge package with an etcdctl module to run etcdctl commands and retrieve a parsed output. Focused on etcd member management only for now (T267412).
- config: allow to use paths relative to the user's $HOME directory expanding ~.
- logging: improve logging format:
  - Add the DRY-RUN prefix also to file logs to allow to distinguish dry-run executions from the real ones just looking at the logs.
  - Improve the execute cookbook log message including the whole arguments so that it includes also the global args such as verbose and dry-run.
- remote: RemoteHosts.wait_reboot_since() is now using a constant backoff. Previously, a linear backoff with a base delay of 10 seconds was used. Since we do expect the reboot of a server to take some time, by the time the server has rebooted, the retry interval has already grown to multiple minutes. A constant backoff should be appropriate and should increase the reactivity of this check significantly.
- mysql_legacy.py: Add the new x2 database core section (T269324).
- cookbooks: force the title to be one line. When reading the title from the cookbooks, pick only the first line to prevent the UI to be cluttered by a title erroneously set to multi-line.
- tox: fix for when the system setuptools is too old.
- elasticsearch_cluster: Revert the "return the cluster name in ElasticsearchCluster.__str__" change added in v0.0.32.
- remote: fix pylint typing confusion.
- gitignore: add vim swap files.
- tests: temporarily force mypy upper version to avoid a regression in release 0.800.
- tests: tox, enable python 3.9 support.
- code style: introduced black and isort as autoformatters (T211750).
- doc: add a development page to highlight how the code is formatted and how to integrate the code formatters with an editor/IDE or in the git workflow (T211750).
- git: allow excluding the code auto-formatters refactor commit from git blame by adding the .git-blame-ignore-revs file.
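The effect of the backoff change in the remote entry of this release can be sketched with plain arithmetic. The formulas and the number of tries below are illustrative assumptions, only the 10 second base delay comes from the entry, and the real @retry implementation may compute its delays differently.

```python
from typing import List


def linear_delays(base: int, tries: int) -> List[int]:
    """Delay before each retry when it grows linearly: base * attempt number."""
    return [base * attempt for attempt in range(1, tries + 1)]


def constant_delays(base: int, tries: int) -> List[int]:
    """Delay before each retry when it never grows."""
    return [base] * tries


# With a hypothetical 20 tries the last linear delay alone is over three minutes,
# while the constant strategy keeps polling every 10 seconds.
linear = linear_delays(10, 20)
constant = constant_delays(10, 20)
```

Under these assumptions a host that comes back late in the sequence is noticed within 10 seconds by the constant strategy, instead of waiting out a delay of several minutes.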
v0.0.48 (2021-01-18)
- logging: fix base path and name to setup logging.
  - In the recent refactor to the new APIs, the paths passed to the setup_logging function were no longer correct. Now that the cookbook items have a proper Spicerack-formatted path and name, use them directly.
v0.0.47 (2021-01-13)
- Use newly migrated code from wmflib:
  - Some additional functionalities were moved to wmflib (>= 0.0.5), remove the duplicated code from Spicerack and use the wmflib version instead.
  - interactive: convert all imports to use the wmflib version, remove the duplicated code. The module is for now left to hold the get_management_password() function.
  - prometheus: moved entirely to wmflib.
  - _log: use the SAL (!log) IRC handler from wmflib.
  - The @retry decorator will be migrated in a separate patch to keep its dry-run awareness.
- administrative: Add getters for the other Reason fields.
- puppet: update get_certificate_metadata() so the pattern is more specific and prevents it from matching other hosts.
- elasticsearch_cluster: fix call to @retry.
- dnsdisc: improve test coverage.
- tests: fix deprecated pytest argument.
- tox: Remove --skip B322 from Bandit config not supported by newer Bandit versions.
v0.0.46 (2020-12-10)
- icinga: add support for downtimed and notifications_enabled parameters (T269672).
- elasticsearch-cluster: add support for cloudelastic (T268779).
v0.0.45 (2020-11-30)
- Removed config and phabricator modules migrated to wmflib and update imports.
- remote: re-enabled Cumin's output removing its suppression. The work on T212783 will make it more flexible on a per-execution basis, but for now is better to just re-enable it and make the errors surface to the users.
- cookbook API: add class API
  - In addition to the simple cookbooks function API interface add support for a more integrated class-based API.
  - Spicerack will perform auto-detection of the API used by the cookbook and automatically convert the module-based API cookbooks into class-based cookbooks so that only one interface is actually supported internally.
  - The class API defines a CookbookBase class that cookbooks that want to use this API must extend creating a derived class. The derived class can have any name. Multiple cookbooks in the same module are supported.
  - The class-based API allows a more in-depth integration with Spicerack:
    - Allow to perform additional initialization and validation steps in the class constructor before the cookbook execution starts, allowing the cookbook to bail out before execution and any related !log-ging.
    - Allow to define a custom runtime description that will be included, for example, in the START/END logging messages that are also sent to IRC and !log-ed into SAL.
  - Refactor the Cookbook API documentation to be more detailed and following Sphinx standards to document the cookbooks module interfaces.
  - Refactor out from the private _cookbook module some functionalities to a _menu and _module_api modules.
- spicerack: add requests_session accessor to get a requests's Session pre-configured by wmflib with a default timeout, retry logic and User-Agent.
- decorators: Add an optional custom failure message to @retry:
  - The @retry decorator logs the messages from exceptions raised during execution, but when there are chained exceptions ("raise from", etc.) only the top-level error is logged. For example, in MediaWiki._check_siteinfo, we only log "Failed to get siteinfo" and throw away the message from the underlying RequestException. Instead, this traverses the exception chain (using the same logic as the built-in default handler for uncaught exceptions) and includes each exception's message in the log entry.
- Convert all usage of the requests package to use the wmflib.requests.http_session instead to have a nice User-Agent, a default timeout and a retry logic on some failures across Spicerack.
- puppet: suppress deprecation warnings.
- decorators: Log chained exception messages in @retry.
- doc: add missing link to the wmflib package.
- dependencies: remove temporary hacks.
- dependencies: update min version to match the versions in Debian Buster.
- tests: remove require_* decorators.
- Refactoring: renamed internal modules with a leading underscore:
  - Moved cookbook.py to _cookbook.py and log.py to _log.py as all their content is actually internal to spicerack and no client should use any of that. They were already excluded from the generated documentation for the same purpose.
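The chained exception logging described in the decorators entry of this release can be sketched as follows. This is an illustration under assumptions, not the code used by @retry: chain_messages() is a hypothetical name, and the walk follows the standard rule of preferring an explicit __cause__ and falling back to the implicit __context__ unless it was suppressed.

```python
from typing import List


def chain_messages(exc: BaseException) -> List[str]:
    """Collect one message per exception in the chain, outermost first."""
    messages = []
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against reference cycles in the chain
        messages.append(f"{type(current).__name__}: {current}")
        if current.__cause__ is not None:
            current = current.__cause__  # explicit "raise ... from ..."
        elif not current.__suppress_context__:
            current = current.__context__  # implicit chaining
        else:
            current = None
    return messages


try:
    try:
        raise ConnectionError("connection refused")
    except ConnectionError as error:
        raise RuntimeError("Failed to get siteinfo") from error
except RuntimeError as top_level:
    collected = chain_messages(top_level)
```

Joining the collected messages into a single log line keeps the underlying cause visible next to the top-level error.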
v0.0.44 (2020-10-13)
- dns: the dns module has been migrated to wmflib and removed from Spicerack. Its access via the spicerack.dns() accessor is unchanged, but any direct imports from the spicerack.dns module in cookbooks must be replaced with wmflib.dns (T257905).
- Spicerack now depends on the new wmflib package.
- log: adjust the return type of FilterOutCumin.filter() as required by mypy (upstream documentation incorrect).
- doc: refactor and simplify its configuration.
- pylint: allow logger as module-scope name given that is used throughout the project so that there is no need for a pylint disable comment.
v0.0.43 (2020-09-16)
- elasticsearch_cluster: Store which datacenters to query for metrics in Prometheus.
v0.0.42 (2020-08-31)
- elasticsearch_cluster: fix prometheus query syntax.
v0.0.41 (2020-08-31)
- dnsdisc: change retry logic to wait up to 27 seconds with more frequent checks instead of the current 9 seconds.
v0.0.40 (2020-08-27)
- elasticsearch_cluster: verify all write queues are empty querying Prometheus (T261239).
- doc: improved logging documentation.
v0.0.39 (2020-08-18)
- Add native mysql spicerack module.
- mysql_legacy: update Cumin queries for DB selection due to Puppet refactors.
- icinga: fix bug for recheck_all_services(), the signature of the Icinga command requires a check time too.
- Remove support for Python 3.5 and 3.6.
- actions: refactored to take advantage of more recent Python versions.
- Add type hints for variables and attributes since the support for older Python versions has been dropped.
- Pin to a working version of prospector as 1.3.0 was overenthusiastic with updating its dependencies.
- actions: fix test for pytest regression in version 6.0.0.
v0.0.38 (2020-06-09)
- ganeti: update the list of available rows in the eqiad and codfw datacenters.
- Add support for Python 3.8.
v0.0.37 (2020-05-18)
- icinga: fix get_status():
  - The icinga-status script that returns the status can be run also in dry-run mode as it's a read-only tool.
  - The icinga-status script exits with a non-zero exit status on non-optimal and missing hosts, accept any exit code.
v0.0.36 (2020-05-18)
- tests: add @require_caplog to some actions module tests to fix the build on Debian Stretch.
v0.0.35 (2020-05-18)
- Rename mysql module to mysql_legacy:
  - The existing mysql module uses remote execution of the mysql client to interact with mysqld's. Moving this out of the way to allow room for a new mysql module which uses a native mysql client library.
- interactive: add get_secret() function for requesting secrets interactively with optional ask for confirmation.
- icinga: allow to check the status of a host:
  - Add a get_status() method that allows to get the current status of a set of hosts in Icinga.
  - The returned status allows to quickly check if all the hosts are in optimal state, get a list of those that are not and the services that are failing on those hosts.
- actions: new module to track cookbook actions:
  - Add a new actions module that contains an Actions class and an ActionsDict class that is an ordered dictionary with default dictionary functionalities of Actions class instances.
  - The Actions instances allow to keep track of actions performed by a cookbook with the following features:
    - Save the message of the action with different levels (success, warning, failure).
    - Log the message of the action with the associated log level.
    - Keep track of the presence of any warning or failure.
    - Have a nice string representation of the actions, suitable to be used to update a Phabricator task.
  - The ActionsDict class also has a nice string representation of its items.
  - This is a port, with some generalization, of the code present in the sre.hosts.decommission cookbook.
  - Pre-create an ActionsDict instance in spicerack so that it can be accessed in the cookbooks directly as spicerack.actions.
- typing: add a typing module for custom type hints:
  - Add a new typing module to hold all custom types useful across Spicerack.
  - Define a custom type TypeHosts that can be either a NodeSet or a sequence of strings.
  - Use the new type in the icinga module.
- ipmi: fix subprocess.run() calls to raise on failure.
  - The check parameter is by default False, hence not raising an exception if the executed command exits with a non-zero exit code.
  - Forcing the check parameter to be True to ensure an exception is raised on failure.
- icinga: refactor input parsing:
  - The Icinga class needs to use hostnames instead of FQDNs.
  - Move the conversion from FQDNs (or hostnames) to hostnames to a static method so that it can be used across the class without repetition of code.
- tests: fix newly reported flake8 issues.
- tests: relax Prospector dependency:
  - The upstream bug that required to set an upper limit on the version of Prospector has been fixed.
  - Removing the upper bound to get newer features.
  - Fix newly reported issues.
- tests: relax Bandit dependency:
  - The upstream bug that required to set an upper limit on the version of Bandit has now a workaround using a specific syntax for the exclude files.
  - Removing the upper bound to get newer features.
  - Fix newly reported issues.
  - Remove nosec comments not needed anymore and convert some of them into skipped checks in tox.ini. This way the affected lines are still checked for other issues.
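The behaviour behind the ipmi entry of this release can be reproduced with the standard library alone. The sketch below is not the ipmi module's code: it runs a throwaway Python child process that always exits with status 3 and compares the default check=False with check=True.

```python
import subprocess
import sys

# A command that always exits with status 3, portable across platforms.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

# Default check=False: the failure is only visible on the returned object.
result = subprocess.run(failing)
silent_returncode = result.returncode

# check=True: the same failure raises CalledProcessError.
try:
    subprocess.run(failing, check=True)
    raised_returncode = None
except subprocess.CalledProcessError as error:
    raised_returncode = error.returncode
```

With check=False a caller that never inspects returncode silently treats the failed command as successful, which is the class of bug the entry describes.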
v0.0.34 (2020-05-06)
- netbox: removed property device_status_choices of the Netbox class, not currently used and removed from Netbox API starting from version 2.8.0.
- netbox: adapt to new Netbox API:
  - The Netbox API starting with Netbox 2.8.0 has removed the choices API endpoint. Given that it was used only for the status, removing its support completely for now given that it is not directly supported by the pynetbox library yet.
- doc: set min version of sphinx_rtd_theme to 0.1.9 to match Debian Stretch.
- doc: fix documentation generation for Sphinx 3.
- changelog: specify breaking change for v0.0.33.
v0.0.33 (2020-05-04)
- netbox: the default instance returned when calling Spicerack.netbox() uses a read-only token. To have read-write access to Netbox the read_write parameter should be set to True.
- netbox: add support for RW and RO tokens:
  - Use a RO token by default, allow to request a Netbox instance with a RW token.
  - Always use a RO token if in dry-run mode to allow to expose the Netbox API object directly to the clients.
- netbox: expose the pynetbox API object:
  - To allow to perform additional operations not yet abstracted by the Netbox class, expose the pynetbox API object directly.
  - The dry-run mode support is ensured by the RO token.
- include the username in logfiles.
v0.0.32 (2020-03-11)
- spicerack: allow to override Spicerack's instance parameters from the configuration file. See config.yaml.
- spicerack: allow to cache the Ipmi instance so that it can be re-used without re-asking the management password.
- spicerack: expose to cookbooks the _spicerack_config_dir parameter via a getter.
- netbox: fine tune log and exception messages.
- elasticsearch_cluster: return the cluster name in ElasticsearchCluster.__str__.
- mysql: update CORE_SECTIONS for external storage RW instances (T226704).
- elasticsearch_cluster: add https:// to relforge endpoints.
- tests: remove unused mypy type ignore comments.
v0.0.31 (2020-02-26)
- ganeti: add VM creation capability (T231068).
- spicerack: add support for an HTTP proxy.
  - To perform calls to external endpoints it might be necessary to use an HTTP proxy, add support for it.
  - Read the http_proxy config from the main spicerack configuration file and inject it into Spicerack that will also expose it to the cookbooks.
  - Add a getter for the http_proxy property to Spicerack.
  - Add a helper that returns a proxies dictionary to be used by the Python Requests module.
- ganeti: use canonical Ganeti cluster names (T231068).
- ganeti: add logging for GntInstance actions (T231068).
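The proxies dictionary mentioned in the HTTP proxy entry of this release follows the format that the Requests library accepts in its proxies argument: one entry per URL scheme. The helper below is a hypothetical sketch of such a mapping, not the Spicerack helper itself, and returning an empty dictionary when no proxy is configured is an assumption.

```python
from typing import Dict


def requests_proxies(http_proxy: str) -> Dict[str, str]:
    """Build a mapping suitable for the proxies argument of Requests (hypothetical helper)."""
    if not http_proxy:
        # Assumption: no configured proxy means no proxying at all.
        return {}
    # Requests expects one entry per URL scheme.
    return {"http": http_proxy, "https": http_proxy}


proxies = requests_proxies("http://proxy.example.com:8080")
# Usage with Requests would look like: requests.get(url, proxies=proxies)
```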
v0.0.30 (2020-02-11)
- netbox: rename injected property in host details (T231068).
  - When fetching host details from Netbox, Spicerack injects some properties to distinguish between virtual and physical hosts. Renaming the cluster_name property to ganeti_cluster to avoid possible confusions.
- spicerack: add getter for the Netbox master host. In some cases is necessary to execute commands on the Netbox master host, add a getter to resolve its real hostname (T231068).
- ganeti: add cluster to instance() (T231068).
  - Allow to specify the Ganeti cluster name when calling instance(). If set the instance will be searched only in that cluster.
  - Pass the cluster name to the GntInstance constructor and expose it via a getter to remove the necessity to look it up separately when cluster was not passed to instance() for auto-detection.
- ganeti: add initial support for gnt-instance (T231068).
  - Add initial support for gnt-* commands to be executed on the cluster master via remote execution.
  - Add initial support for gnt-instance commands to perform Ganeti VMs decommissioning, in particular:
    - shutdown: to shutdown a Ganeti VM, with its optional timeout parameter.
    - remove: to shutdown and remove a Ganeti VM, with its optional shutdown_timeout parameter.
- mediawiki: use Cumin alias instead of role query (T243935).
- dnsdisc: fix typo in docstring.
v0.0.29 (2020-01-16)
- mediawiki: in stop_cronjobs() adapt for the migration from hhvm to php-fpm in production (T229792).
- dnsdisc: use port 5353 to query the resolvers. The authdns part is answering to port 5353 from now on.
- dns: allow to specify a custom port for the resolver. The authdns part is answering to port 5353 from now on, allow to specify a custom port when instantiating a new Dns recursor.
- ganeti: Add esams, ulsfo and eqsin clusters and rows definitions.
- ipmi: the change introduced via I4d4ade351493a548e9e7a578bf9a7acbb45a5c0 to use subprocess.run() created a regression causing the ipmi calls to no longer capture stdout. Restored normal behaviour (T147074).
- dns: remove unused type hint ignore comments.
- remote: fix docstring return type.
- doc: updated link to the requests module documentation.
- docstrings: fix pep257 reported errors.
- mypy: Get rid of no longer needed # type: ignore annotations that are now detected automatically by mypy.
v0.0.28 (2019-10-10)
- netbox: Transparently support read-only operations for virtual machines (T231068).
- ganeti: Add ability to get ganeti cluster for given instance (T231068).
- ipmi: add support for channel 2.
- ipmi: use subprocess.run() instead of subprocess.check_output().
v0.0.27 (2019-08-25)
- remote: Move splitting of a RemoteHosts instance to a split() method.
- netbox: Make host private and raise exception on not found.
- netbox: Add method to return host information.
v0.0.26 (2019-08-06)
- Add Netbox module.
- Add the LBRemoteCluster class to manage clusters behind a load balancer.
- icinga: Add a function to force a recheck of all services.
- confctl: Add filter_objects and update_objects.
- confctl: add change_and_revert contextmanager.
- elasticsearch_cluster: correct ports for relforge cluster.
- elasticsearch_cluster: fix mypy newly reported bug.
- tests: fix pytest caplog matching.
- tests: fix pep257 newly reported issues.
v0.0.25 (2019-05-10)
- setup.py: fix urllib3 dependency:
  - In order to build on Debian Stretch without backported packages, relax a bit the urllib3 dependency as the only goal of specifying it is to avoid conflicts with the latest version.
- doc: fix Sphinx configuration:
  - In order to avoid issues while building the Debian package on Stretch where Sphinx 1.4.9 is available, change configuration to:
    - Reduce minimum Sphinx version to 1.4.9 in setup.py.
    - Remove the warning-is-error configuration from setup.cfg that is applied to every Sphinx run, and move it directly into tox.ini as a command line -W option, that will be executed only by tox and not during the Debian package build process.
v0.0.24 (2019-05-09)
- prometheus: add timeout support to query() method.
- ganeti: add timeout support.
- cookbook API: drop get_title() support:
  - No current cookbook is using the dynamic way to provide a title through get_title(args).
  - This abstraction has not proven to be useful, and dynamically mangling the title of a cookbook based on the current parameters, while it can then be executed with different ones, doesn't seem very useful. Dropping it completely from the Cookbook API.
- doc: mark Sphinx warnings as error:
  - To make the documentation building process more robust make Sphinx fail on warnings too.
  - This requires Sphinx > 1.5 and will require to use the backport version while building the package on Debian Stretch.
- doc: add checker to ensure modules are documented:
  - It's common when adding a new module to forget to add the few bits required to auto-generate its documentation.
  - Add a check to ensure that all Spicerack modules are listed in the documentation API index and that the linked files exist.
- ganeti: Fix RAPI port.
- prometheus: fix base URL template.
- doc: autodoc missing API modules.
- setup.py: force urllib3 version due to pip bug.
- Add emacs ignores to gitignore.
- tests: temporarily force bandit < 1.6.0:
  - Due to a bug upstream bandit 1.6.0 doesn't honor the excluded directories, causing the failure of the bandit tox environments. Temporarily forcing its version.
v0.0.23 (2019-04-19)
- Add basic Ganeti RAPI support.
- Add basic Prometheus support.
- elasticsearch_cluster: add reset all indices to read/write capability (T219799).
- elasticsearch_cluster: logging during shard allocation was too verbose, some messages lowered to debug level.
- flake8: enforce import order and adopt W504:
  - Add flake8-import-order to enforce the import order using the edited style that corresponds to our styleguide, see: mediawiki.org: Coding_conventions/Python.
  - Mark spicerack as local and do not specify any organization-specific packages to avoid to keep a manually curated list of packages.
  - Fix all out of order imports.
  - For line breaks around binary operators, adopt W504 (breaking before the operator) and ignore W503, following PEP8 suggestion, see: PEP0008#line_break_binary_operator.
  - Fix all line breaks around binary operators to follow W504.
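The line break convention adopted by the flake8 entry of this release is easiest to see on a small expression. The variable names below are made up for illustration. With W503 ignored and W504 enabled, the checker accepts the first form and flags the commented-out one.

```python
first_term = 10
second_term = 20
third_term = 12

# Accepted: each continuation line starts with the binary operator.
total = (
    first_term
    + second_term
    + third_term
)

# Flagged by W504: the operator is left dangling at the end of the line.
# total = (first_term +
#          second_term +
#          third_term)
```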
v0.0.22 (2019-04-04)
- elasticsearch_cluster: use NodesGroup instead of free form JSON.
v0.0.21 (2019-04-03)
- elasticsearch_cluster: Retrieve hostname and fqdn from node attributes.
- elasticsearch_cluster: make unfreezing writes more robust (T219640).
- elasticsearch_cluster: cleanup test by introducing a method to mock API calls.
- elasticsearch_cluster: rename elasticsearchclusters to elasticsearch_clusters.
- tox: fix typo in environment name.
- Add Python type hints and mypy check, not for variables and properties as we're still supporting Python 3.5.
- setup.py: revert commit 3d7ab9b that forced the urllib3 version installed as it's not needed anymore.
- tests/doc: unify usage of example.com domain.
v0.0.20 (2019-03-06)
- ipmi: add password reset functionality.
- elasticsearch_cluster: upgrade rows one after the other.
- remote: suppress Cumin's output as a workaround for a regression in colorama for stretch.
- Expose hostname from Reason.
- elasticsearch_cluster: use the admin Reason to get current hostname.
- debmonitor: fix missing variable for logging line.
- elasticsearch_cluster: fix typo (xarg instead of xargs).
- doc: fix reStructuredText formatting.
- Drop support for Python 3.4.
- Add support for Python 3.7.
- tests: refactor tox environments.
v0.0.19 (2019-02-21)
- elasticsearch_cluster: support cluster names which have - in them.
- elasticsearch_cluster: get_next_clusters_nodes() raises ElasticsearchClusterError.
- elasticsearch_cluster: systemctl iterates explicitly on elasticsearch instances.
- setup.py: add long_description_content_type.
v0.0.18 (2019-02-20)
- elasticsearch_cluster: access production clusters over HTTPS.
v0.0.17 (2019-02-20)
- icinga: add remove_on_error parameter to the hosts_downtimed() context manager to decide whether to remove the downtime or not on error.
- elasticsearch_cluster: raise logging level to ERROR for elasticsearch.
- elasticsearch_cluster: retry on all urllib3 exceptions.
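The semantics of the remove_on_error parameter from the icinga entry of this release can be sketched with contextlib. Everything below is a hypothetical stand-in: FakeIcinga only records calls instead of talking to Icinga, and the False default is an assumption rather than something stated in the entry.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeIcinga:
    """Hypothetical stand-in that records calls instead of talking to Icinga."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    @contextmanager
    def hosts_downtimed(self, *, remove_on_error: bool = False) -> Iterator[None]:
        self.calls.append("downtime")
        try:
            yield
        except BaseException:
            # On error the downtime is removed only when explicitly requested.
            if remove_on_error:
                self.calls.append("remove")
            raise
        else:
            self.calls.append("remove")


def run(remove_on_error: bool) -> List[str]:
    """Fail inside the context and report which calls were made."""
    icinga = FakeIcinga()
    try:
        with icinga.hosts_downtimed(remove_on_error=remove_on_error):
            raise RuntimeError("maintenance failed")
    except RuntimeError:
        pass
    return icinga.calls


kept = run(remove_on_error=False)
removed = run(remove_on_error=True)
```

Keeping the downtime after a failure avoids alerting on a host that was left in a broken state by the interrupted operation.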
v0.0.16 (2019-02-18)
- elasticsearch_cluster: retry on TransportError while waiting for node to be up.
- Change !log formatting to match Stashbot expectations.
v0.0.15 (2019-02-14)
- elasticsearch_cluster: add doc type to delete query.
v0.0.14 (2019-02-13)
- icinga: add context manager for downtimed hosts:
  - Add a context manager that allows executing other commands while the hosts are downtimed, removing the downtime at the end.
- management: add management module:
  - Add a management module with a Management class to interact with the management console names.
  - For now just add a get_fqdn() method to automatically calculate the management FQDN for a given hostname.
- puppet: add check_enabled() and check_disabled() methods.
- decorators: make retry() DRY-RUN aware:
  - When running in DRY-RUN mode no real changes are made, and the @retry decorated methods usually check for some action to be propagated or completed. Hence in DRY-RUN mode they tend to fail and retry until the attempts are exhausted, adding unnecessary time to the DRY-RUN.
  - With this patch the retry() decorator is able to automatically detect DRY-RUN mode when called by any instance method that has a self._dry_run property or, in the special case of RemoteHostsAdapter derived instances, a self._remote_hosts._dry_run property.
- puppet: add delete() method to remove a host from PuppetDB and clean up everything on the Puppet master.
- spicerack: expose the icinga_master_host property.
- administrative: add owner getter to Reason class:
  - Add a public getter for the owner part of a reason, that returns in a standard format the user running the code and the host where it's running.
- decorators: improve tests.
- doc: fine-tune generated documentation.
- dns: remove unused dry_run argument.
- Add missing timeout to requests calls.
- dns: fix logging message.
- elasticsearch_cluster: change is_green() implementation.
- elasticsearch_cluster: fix issues found during live tests.
- spicerack: fix __version__.
- ipmi: fix typos in docstrings.
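
The DRY-RUN detection added to retry() in this release can be illustrated with a short sketch. It is not the library's code: the is_dry_run() helper, the decorator parameters (tries, delay, exceptions) and the choice of making a single attempt in DRY-RUN mode are assumptions for the example. Only the self._dry_run and self._remote_hosts._dry_run lookups come from the notes above.

```python
import functools
import time


def is_dry_run(args):
    """Best-effort detection of DRY-RUN mode from the first positional argument."""
    if not args:
        return False
    instance = args[0]
    # Plain instances expose the flag directly.
    if getattr(instance, "_dry_run", False):
        return True
    # RemoteHostsAdapter-like instances expose it through their remote hosts.
    remote_hosts = getattr(instance, "_remote_hosts", None)
    return bool(getattr(remote_hosts, "_dry_run", False))


def retry(tries=3, delay=0.0, exceptions=(Exception,)):
    """Retry the decorated callable, making a single attempt in DRY-RUN mode."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumption: in DRY-RUN mode retrying is pointless, so try only once.
            attempts = 1 if is_dry_run(args) else tries
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


class Checker:
    """Hypothetical class whose check never succeeds, to show the attempt count."""

    def __init__(self, dry_run):
        self._dry_run = dry_run
        self.calls = 0

    @retry(tries=3)
    def check(self):
        self.calls += 1
        raise RuntimeError("change not propagated yet")
```

With this sketch, Checker(dry_run=True).check() fails after one call, while Checker(dry_run=False).check() fails only after three.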
v0.0.13 (2019-01-14)
- remote: fix logging for reboot().
v0.0.12 (2019-01-10)
- ipmi: add support for DRY RUN mode.
- config: add load_ini_config() function to parse INI files.
- debmonitor: use the existing configuration file:
  - Instead of requiring a new configuration file, use the existing one already set up by Puppet for the debmonitor client.
  - Inject the path of the Debmonitor config into the constructor with a default value.
- puppet: add default batch_size when running puppet:
  - Allow specifying the batch_size when running puppet on a set of hosts.
  - Add a default batch_size to avoid overloading the Puppet master hosts.
- phabricator: remove unneeded pylint ignore.
- mediawiki: update maintenance host Cumin query.
- remote: add workaround for Cumin bug.
  - To avoid unnecessary waiting on the most common use case of reboot(), that is with only one host, unset the default batch_sleep as a workaround for T213296.
- puppet: fix regenerate_certificate().
  - When re-generating the certificate, Puppet will exit with status code 1 both on success and on failure.
  - Restrict the accepted exit codes to 1.
  - Detect errors in the output and raise if any.
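
A function like the load_ini_config() mentioned in this release could be built on the standard library's configparser. The sketch below is only an illustration under assumptions: the signature, the returned type and the error raised for an unreadable file are hypothetical and may differ from the actual function.

```python
import configparser


def load_ini_config(config_file):
    """Parse an INI file and return the parser, raising if the file cannot be read.

    Hypothetical signature, for illustration only.
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips unreadable files and returns the list
    # of files it managed to parse, so an empty list means nothing was loaded.
    if not parser.read(str(config_file)):
        raise FileNotFoundError("Unable to read config file {}".format(config_file))
    return parser
```

A caller would then read values with the usual ConfigParser accessors, for example config.get("section", "option"), where the section and option names depend on the file being parsed.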
v0.0.11 (2019-01-08)
- debmonitor: add debmonitor module.
- phabricator: add phabricator module.
- icinga: fix command_file property.
- puppet: fix subprocess call to check_output().
- dns: include NXDOMAIN in the DnsNotFound exception.
- admin_reason: fix default value for task.
v0.0.10 (2018-12-19)
- cookbook: split main into argument_parser() and run().
- remote: refactor Remote.query() API.
- Add administrative module.
- dns: add dns module.
- Add elasticsearch_cluster module.
- Add Icinga module.
- Add ipmi module.
- Add Puppet module.
- puppet: add additional methods to PuppetHosts.
- puppet: add PuppetMaster class.
- remote: add more host functionalities.
- doc: add documentation and its generation.
- interactive: add ensure_shell_is_durable().
- administrative: fix Reason's signature.
- elasticsearch_cluster: fix tests for Python 3.5.
- icinga: fix typo in test docstring.
- interactive: check TTY in ask_confirmation().
- mediawiki: kill also HHVM on stop_cronjobs.
- Fix typo in README.rst.
- tests: fix randomly failing pylint check.
- setup.py: update curator version to match our current elasticsearch version.
- setup.py: force urllib3 version.
- tests: fix lint ignore.
v0.0.9 (2018-09-12)
- mediawiki: improve siteinfo checks.
- dnsdisc: improve TTL checks.
- exceptions: add SpicerackCheckError.
- tests: improve prospector tests.
- dnsdisc: catch dnspython exceptions.
- setup.py: add missing fields and fix missing comma.
v0.0.8 (2018-09-10)
- mediawiki: ignore exit codes on stop_cronjobs.
- logging: minor improvements and a fix.
v0.0.7 (2018-09-06)
- dnsdisc: fix dry-run in check_if_depoolable().
v0.0.6 (2018-09-06)
- log: remove relic from switchdc.
- mysql: refactor sync check to avoid GTID.
v0.0.5 (2018-09-05)
- mediawiki: improve validation checks.
v0.0.4 (2018-09-04)
- Add redis_cluster module.
- dnsdisc:
  - add methods for checking if a datacenter can be depooled.
  - add pool() and depool() methods.
- mediawiki:
  - improve stop_cronjobs() method.
  - add check_cronjobs_disabled() method.
  - refactor to use confctl's set_and_verify().
  - split set_readonly() and add checks.
- mysql:
  - add get_dbs() method.
  - rename the ensure_core_masters_in_sync() method.
- confctl: add set_and_verify() method.
v0.0.3 (2018-08-30)
- Change PyPI package name and add long description to setup.py.
v0.0.2 (2018-08-28)
- mediawiki: add siteinfo-related methods.
v0.0.1 (2018-08-26)
- Initial version.