Release Note for V9000 Family Block Storage Products


This is the release note for the 8.3 release of the V9000 family of block storage products. It details the issues resolved in all Program Temporary Fixes (PTFs) between 8.3.0.0 and 8.3.0.3. This document will be updated with additional information whenever a PTF is released.

This document was last updated on 10 September 2021.

  1. New Features
  2. Known Issues and Restrictions
  3. Issues Resolved
    1. Security Issues Resolved
    2. APARs and Flashes Resolved
  4. Supported upgrade paths
  5. Useful Links

1. New Features

The following new features have been introduced in the 8.3.0 release:

2. Known Issues and Restrictions

Each entry below describes the issue, followed by the release in which it was introduced.

Customers with systems running v8.3.0.0, or earlier, using deduplication cannot upgrade to v8.3.0.1, or later, due to APAR HU02162.

Please contact IBM Support for an ifix to enable the upgrade.

This is a known issue that will be lifted in a future PTF.

Introduced: 8.3.0.1

SRA does not work after changing the SSH node authentication method from password to key on AWS.

This is a known issue that may be lifted in a future PTF.

Introduced: 8.3.0.0

Customers using iSER-attached hosts, with Mellanox 25G adapters, should be aware that IPv6 sessions will not fail over, for example, during a cluster upgrade.

This is a known issue that may be lifted in a future PTF.

Introduced: 8.3.0.0

Customers using the REST API to list more than 2000 objects may experience a loss of service from the API as it restarts due to memory constraints. A workaround sketch follows this entry.

This is a restriction that may be lifted in a future PTF.

Introduced: 8.1.3.0
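
A minimal workaround sketch, assuming the documented REST API scheme (https://<cluster_ip>:7443/rest/, with authentication via /rest/auth): issue filtered queries so that no single response approaches the 2000-object limit. The cluster address, credentials and pool name below are placeholders, the parameter format may vary by code level, and an IPv4 address must be used, per the restriction in the next entry.

  # Authenticate; the response contains a JSON token for later requests
  curl -k -X POST \
       -H 'X-Auth-Username: superuser' \
       -H 'X-Auth-Password: <password>' \
       https://<cluster_ipv4>:7443/rest/auth

  # List volumes one pool at a time instead of all objects in one call,
  # keeping each response well under the 2000-object limit
  curl -k -X POST \
       -H 'X-Auth-Token: <token>' \
       -H 'Content-Type: application/json' \
       -d '{"filtervalue": "mdisk_grp_name=Pool0"}' \
       https://<cluster_ipv4>:7443/rest/lsvdisk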

It is not possible to access the REST API using a cluster's IPv6 address.

This is a restriction that may be lifted in a future PTF.

Introduced: 8.1.3.0

Customers upgrading systems with more than 64GB of RAM to v8.1 or later will need to run chnodehw to enable access to the extra memory above 64GB.

Under some circumstances it may also be necessary to remove and re-add each node in turn. A sketch of the chnodehw sequence follows this entry.

Introduced: 8.1.0.0
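
A minimal sketch of that per-node sequence, run from the cluster CLI. It assumes a healthy two-node system and is illustrative only; confirm the procedure for your platform in the Knowledge Center before running it on a production system.

  lsnode            # note the node IDs and confirm both nodes are online
  chnodehw 1        # apply the new hardware configuration to node 1
  # wait until lsnode shows node 1 online again before continuing
  chnodehw 2        # then repeat for the partner node
  lsnodevpd 1       # verify that the full installed memory is now reported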

Validation in the Upload Support Package feature will reject the new case number format in the PMR field.

This is a known issue that may be lifted in a future PTF. The fix can be tracked using APAR HU02392.

Introduced: 7.8.1.0

Systems with NPIV enabled, presenting storage to SUSE Linux Enterprise Server (SLES) or Red Hat Enterprise Linux (RHEL) hosts running the ibmvfc driver on IBM Power, can experience path loss or read-only file system events.

This is caused by issues within the ibmvfc driver and VIOS code.

Refer to this troubleshooting page for more information.

Introduced: n/a

If an update stalls or fails, contact IBM Support for further assistance.

Introduced: n/a

3. Issues Resolved

This release contains all of the fixes included in the 8.2.1.4 release, plus the following additional fixes.

A release may contain fixes for security issues, fixes for APARs or both. Consult both tables below to understand the complete set of fixes included in the release.

3.1 Security Issues Resolved

Security issues are documented using a reference number provided by "Common Vulnerabilities and Exposures" (CVE).
CVE Identifier   Link for additional information   Resolved in
CVE-2019-5544    6250889                           8.3.0.2
CVE-2019-2964    6250887                           8.3.0.2
CVE-2019-2989    6250887                           8.3.0.2
CVE-2018-12404   6250885                           8.3.0.2
CVE-2019-11477   1164286                           8.3.0.0
CVE-2019-11478   1164286                           8.3.0.0
CVE-2019-11479   1164286                           8.3.0.0
CVE-2019-2602    1073958                           8.3.0.0

3.2 APARs and Flashes Resolved

Reference Severity Description Resolved in Feature Tags
HU02143 S2 The performance profile, for some enterprise tier drives, may not correctly match the drive's capabilities, leading to that tier being overdriven 8.3.0.3 EasyTier
HU02104 S1 HIPER (Highly Pervasive): An issue in the RAID component, in the presence of very high I/O workload and the exhaustion of cache resources, can result in a deadlock condition that prevents further I/O processing. The system detects this issue and takes the storage pool offline for a six-minute period to clear the problem. The pool is then brought online automatically, and normal operation resumes. For more details refer to the following Flash 8.3.0.2 RAID
HU02237 S1 HIPER (Highly Pervasive): Under a rare and complicated set of conditions, a RAID 1 or RAID 10 array may drop a write, causing undetected data corruption. For more details refer to the following Flash 8.3.0.2 RAID
HU02238 S1 HIPER (Highly Pervasive): Force-stopping a FlashCopy map, where the source volume is a Metro or Global Mirror target volume, may cause other FlashCopy maps to return invalid data if they are not 100% copied, in specific configurations 8.3.0.2 RAID
HU02109 S1 Free extents may not be unmapped after volume deletion, or migration, resulting in out-of-space conditions on backend controllers 8.3.0.2 Backend Storage, SCSI Unmap
HU02115 S1 Attempting to upgrade all drive firmware, with an inadequate drive package, may lead to multiple node warmstarts, with the possibility of a loss of access to data 8.3.0.2 Drives
HU02102 S3 Excessive processing time required for FlashCopy bitmap operations, associated with large (>20TB) Global Mirror change volumes, may lead to a node warmstart 8.3.0.2 Global Mirror With Change Volumes
HU02142 S3 It is possible for a backend unmap process to become stalled, preventing system configuration changes from completing 8.3.0.2 Distributed RAID
HU01998 S1 HIPER (Highly Pervasive): All SCSI command types can set volumes as busy, resulting in I/O timeouts and multiple node warmstarts, with the possibility of an offline I/O group. For more details refer to the following Flash  8.3.0.1 Hosts
HU02014 S1 HIPER (Highly Pervasive): After a loss of power, where an AC3 node has a dead CMOS battery, it will fail to restart correctly. It is possible for both nodes in an I/O group to experience this issue 8.3.0.1 Reliability Availability Serviceability
HU02064 S1 HIPER (Highly Pervasive): An issue in the firmware for compression accelerator cards can cause offline compressed volumes. For more details refer to the following Flash 8.3.0.1 Compression
HU02083 S1 HIPER (Highly Pervasive): During DRAID rebuilds, an issue in the handling of memory buffers can lead to multiple node warmstarts and a loss of access to data. For more details refer to the following Flash 8.3.0.1 Distributed RAID
HU01924 S1 Migrating extents to an MDisk that is not a member of an MDisk group may result in a Tier 2 recovery 8.3.0.1 Thin Provisioning
HU02016 S1 A memory leak in the component that handles thin-provisioned MDisks can lead to an adverse performance impact with the possibility of offline MDisks. For more details refer to the following Flash  8.3.0.1 Backend Storage
HU02036 S1 It is possible for commands that alter pool-level extent reservations (e.g. migratevdisk or rmmdisk) to conflict with an ongoing Easy Tier migration, resulting in a Tier 2 recovery 8.3.0.1 EasyTier
HU02043 S1 Collecting a snap can cause nodes to run out of boot drive space and go offline with node error 565 8.3.0.1 Support Data Collection
HU02044 S1 A deadlock condition, affecting Data Reduction Pool process interaction with DRAID, can cause multiple warmstarts with the possibility of a loss of access to data 8.3.0.1 Distributed RAID, Data Reduction Pools
HU02045 S1 When a node is removed from the cluster, using CLI, it may still be shown as online in the GUI. If an attempt is made to shutdown this node, from the GUI, whilst it appears to be online, then the whole cluster will shutdown 8.3.0.1 Graphical User Interface
HU02077 S1 A node upgrading to v8.3.0.0 will lose access to controllers directly-attached to its FC ports and the upgrade will stall 8.3.0.1 Backend Storage
HU02086 S1 An issue, in IP Quorum, may cause a Tier 2 recovery, during initial connection to a candidate device 8.3.0.1 IP Quorum
HU02089 S1 Due to changes to quorum management, during an upgrade to v8.2.x, or later, there may be multiple warmstarts, with the possibility of a loss of access to data 8.3.0.1 System Update
HU02097 S1 Workloads, with data that is highly suited to deduplication, can provoke high CPU utilisation, as multiple destinations try to deduplicate to one source. This adversely impacts performance with the possibility of offline MDisk groups 8.3.0.1 Data Reduction Pools
IT30595 S1 A resource shortage in the RAID component can cause MDisks to be taken offline 8.3.0.1 RAID
HU02006 S2 Garbage collection behaviour can become overzealous, adversely affecting performance 8.3.0.1 Data Reduction Pools
HU02055 S2 Creating a FlashCopy snapshot, in the GUI, does not set the same preferred node for both source and target volumes. This may adversely impact performance 8.3.0.1 FlashCopy
HU02072 S2 An issue in the handling of email transmission can write a large file to the node boot drive. If this causes the boot drive to become full, the node will go offline with error 565 8.3.0.1 System Monitoring
HU02080 S2 When a Data Reduction Pool is running low on free space, the credit allocation algorithm, for garbage collection, can be exposed to a race condition, adversely affecting performance 8.3.0.1 Data Reduction Pools
IT29975 S2 During Ethernet port configuration, netmask validation will only accept a fourth octet of zero. Non-zero values will cause the interface to remain inactive 8.3.0.1 iSCSI
HU02067 S3 If multiple recipients are specified, for callhome emails, then no callhome emails will be sent 8.3.0.1 System Monitoring
HU02073 S3 Detection of an invalid list entry in the parity handling process can lead to a node warmstart 8.3.0.1 RAID
HU02079 S3 Starting a FlashCopy mapping, within a Data Reduction Pool, a large number of times may cause a node warmstart 8.3.0.1 Data Reduction Pools, FlashCopy
HU02087 S3 LDAP users with SSH keys cannot create volumes after upgrading to 8.3.0.0 8.3.0.1 LDAP
HU02126 S3 There is a low probability that excessive SSH connections may trigger a single node warmstart on the configuration node 8.3.0.1 Command Line Interface
HU02131 S3 When changing DRAID configuration, for an array with an active workload, a deadlock condition can occur resulting in a single node warmstart 8.3.0.1 Distributed RAID
IT30448 S3 If an IP Quorum app is killed, during the commit phase of a code upgrade, then that offline IP Quorum device cannot be removed, post upgrade 8.3.0.1 IP Quorum
HU02007 S1 HIPER (Highly Pervasive): During volume migration, an issue in the handling of old-to-new extent transfers can lead to cluster-wide warmstarts 8.3.0.0 Storage Virtualisation
HU01888 & HU01997 S1 An issue with restore mappings, in the FlashCopy component, can cause an I/O group to warmstart 8.3.0.0 FlashCopy
HU01909 S1 Upgrading a system with Read-Intensive drives to 8.2, or later, may result in node warmstarts 8.3.0.0 System Update, Distributed RAID, Drives
HU01921 S1 Where FlashCopy mapping targets are also in remote copy relationships there may be node warmstarts with a temporary loss of access to data 8.3.0.0 FlashCopy, Global Mirror, Metro Mirror
HU01933 S1 Under rare circumstances the Data Reduction Pool deduplication rehoming process can become truncated. Subsequent detection of inconsistent metadata can lead to offline Data Reduction Pools 8.3.0.0 Data Reduction Pools, Deduplication
HU01985 S1 As a consequence of a Data Reduction Pool recovery, bad metadata may be created. When the region of disk associated with the bad metadata is accessed, there may be I/O group warmstarts 8.3.0.0 Data Reduction Pools
HU01989 S1 For large drives, bitmap scanning, during a rebuild, can timeout resulting in multiple node warmstarts, possibly leading to offline I/O groups 8.3.0.0 Distributed RAID
HU01990 S1 Bad return codes from the partnership compression component can cause multiple node warmstarts taking nodes offline 8.3.0.0 Metro Mirror, Global Mirror, Global Mirror With Change Volumes
HU02003 S1 GUI information presented during node shutdown operations may not refresh in a timely manner, misleading the user as to the state of the system 8.3.0.0 Graphical User Interface
HU02005 S1 An issue in the background copy process prevents grains, above a 128TB limit, from being cleaned properly. As a consequence there may be multiple node warmstarts with the potential for a loss of access to data 8.3.0.0 Global Mirror, Global Mirror With Change Volumes, Metro Mirror
HU02009 S1 Systems which are using Data Reduction Pools, with the maximum possible extent size of 8GB, and which experience a very specific I/O workload, may experience an issue due to garbage collection. This can cause repeated node warmstarts and loss of access to data 8.3.0.0 Data Reduction Pools
HU02121 S1 When the system changes from copyback to rebuild a failure to clear related metadata can cause multiple node warmstarts, with the possibility of a loss of access 8.3.0.0 Distributed RAID
HU02275 S1 Performing any sort of hardware maintenance during an upgrade may cause a cluster to destroy itself, with nodes entering candidate or service state 550 8.3.0.0 System Update
IT25367 S1 A T2 recovery may occur when an attempt is made to upgrade, or downgrade, the firmware for an unsupported drive type 8.3.0.0 Drives
IT26257 S1 Starting a relationship, when the remote volume is offline, may result in a T2 recovery 8.3.0.0 HyperSwap
HU01836 S2 When an auxiliary volume is moved an issue with pausing the master volume can lead to node warmstarts 8.3.0.0 HyperSwap
HU01904 S2 A timing issue can cause a remote copy relationship to become stuck, in a pausing state, resulting in a node warmstart 8.3.0.0 Global Mirror, Global Mirror With Change Volumes, Metro Mirror
HU01969 S2 It is possible, after an rmrcrelationship command is run, that the connection to the remote cluster may be lost 8.3.0.0 Global Mirror, Global Mirror With Change Volumes, Metro Mirror
HU02011 S2 When a node warmstart occurs on a system using Data Reduction Pools, there is a small possibility that the node will not automatically return online. If the partner node is also offline, this can cause temporary loss of access to data 8.3.0.0 Data Reduction Pools
HU02012 S2 Under certain I/O workloads the garbage collection process can adversely impact volume write response times 8.3.0.0 Data Reduction Pools
HU02051 S2 If unexpected actions are taken during node replacement, node warmstarts and temporary loss of access to data may occur. This issue can only occur if a node is replaced, and then the old node is re-added to the cluster 8.3.0.0 Reliability Availability Serviceability
HU02123 S2 For direct-attached hosts, a race condition between the FLOGI and Link UP processes can result in FC ports not coming online 8.3.0.0 Hosts
HU02288 S2 A node might fail to come online after a reboot or warmstart such as during an upgrade 8.3.0.0 Reliability Availability Serviceability
HU02318 S2 An issue in the handling of iSCSI host I/O may cause a node to kernel panic and go into service with error 578 8.3.0.0 iSCSI
HU01777 S3 Where not all I/O groups have NPIV enabled, hosts may be shown as "Degraded" with an incorrect count of node logins 8.3.0.0 Command Line Interface
HU01868 S3 After deleting an encrypted external MDisk, it is possible for the 'encrypted' status of volumes to change to 'no', even though all remaining MDisks are encrypted 8.3.0.0 Encryption
HU01872 S3 An issue with cache partition fairness can favour small IOs over large ones leading to a node warmstart 8.3.0.0 Cache
HU01880 S3 When a write, to a secondary volume, becomes stalled, a node at the primary site may warmstart 8.3.0.0 Global Mirror, Global Mirror With Change Volumes, Metro Mirror
HU01892 S3 LUNs of greater than 2TB, presented by HP XP7 storage controllers, are not supported 8.3.0.0 Backend Storage
HU01917 S3 Chrome browser support requires a self-signed certificate to include subject alternate name 8.3.0.0 Graphical User Interface
HU01936 S3 When shrinking a volume, that has host mappings, there may be recurring node warmstarts 8.3.0.0 Cache
HU01955 S3 The presence of unsupported configurations, in a Spectrum Virtualize environment, can cause a mishandling of unsupported commands leading to a node warmstart 8.3.0.0 Reliability Availability Serviceability
HU01956 S3 The output from a lsdrive command shows the write endurance usage, for SSDs, as blank rather than 0 8.3.0.0 Command Line Interface
HU01963 S3 A deadlock condition in the deduplication component can lead to a node warmstart 8.3.0.0 Deduplication
HU01974 S3 With all Remote Support Assistant connections closed, the GUI may show that a connection is still in progress 8.3.0.0 System Monitoring
HU01978 S3 Unable to create HyperSwap volumes. The mkvolume command fails with CMMVC7050E error 8.3.0.0 HyperSwap
HU01979 S3 The figure for used_virtualization, in the output of a lslicense command, may be unexpectedly large 8.3.0.0 Command Line Interface
HU01982 S3 In an environment, with multiple IP Quorum servers, if the quorum component encounters a duplicate UID then a node may warmstart 8.3.0.0 IP Quorum
HU01983 S3 Improve debug data capture to assist in determining the reason for a Data Reduction Pool to be taken offline 8.3.0.0 Data Reduction Pools
HU01986 S3 An accounting issue in the FlashCopy component may cause node warmstarts 8.3.0.0 FlashCopy
HU01991 S3 An issue in the handling of extent allocation, in the Data Reduction Pool component, can cause a node warmstart 8.3.0.0 Data Reduction Pools
HU02029 S3 An issue with the SSMTP process may result in failed callhome, inventory reporting and user notifications. A testemail command will fail with a CMMVC9051E error 8.3.0.0 System Monitoring
HU02039 S3 An issue in the management steps of Data Reduction Pool recovery may lead to a node warmstart 8.3.0.0 Data Reduction Pools
HU02059 S3 Event Log may display quorum errors even though quorum devices are available 8.3.0.0 Quorum
HU02134 S3 A timing issue, in handling chquorum CLI commands, can result in fewer than three quorum devices being available 8.3.0.0 Quorum
HU02166 S3 A timing window issue, in RAID code that handles recovery after a drive has been taken out of sync, due to a slow I/O, can cause a single node warmstart 8.3.0.0 RAID

4. Supported upgrade paths

Please refer to the Concurrent Compatibility and Code Cross Reference for Spectrum Virtualize page for guidance when planning a system upgrade.
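
When checking that page, the currently installed code level can be confirmed from the lssystem output; a minimal sketch, run from an administrative workstation (the cluster address and user below are placeholders):

  ssh superuser@<cluster_ip> lssystem | grep code_level
  # the code_level field shows the installed release, for example 8.3.0.1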

5. Useful Links

Description                   Link
Support Website               IBM Knowledge Center
IBM FlashSystem Fix Central   V9000
Updating the system           IBM Knowledge Center
IBM Redbooks                  Redbooks
Contacts                      IBM Planetwide