Release Note for V9000 Family Block Storage Products


This release note applies to the V9000 family of block storage products. It covers the 8.1.3 release and details the issues resolved in all Program Temporary Fixes (PTFs) between 8.1.3.0 and 8.1.3.6. This document will be updated with additional information whenever a PTF is released.

This document was last updated on 10 September 2021.

  1. New Features
  2. Known Issues and Restrictions
  3. Issues Resolved
    1. Security Issues Resolved
    2. APARs and Flashes Resolved
  4. Supported upgrade paths
  5. Useful Links

1. New Features

The following new features have been introduced in the 8.1.3 release:

The following new feature has been introduced in the 8.1.3.4 release:

The following new feature has been introduced in the 8.1.3.6 release:

2. Known Issues and Restrictions

Details Introduced

The v8.1.3 code level introduces strict enforcement of the IETF RFC1035 specification.

If unsupported characters are present in the URL used to launch the management GUI, either a blank page or an HTTP 400 error is displayed (depending on the browser used). A minimal host-name check is sketched at the end of this section.

Please see this TechNote for more information.

8.1.3.0

Customers using the REST API to list more than 2000 objects may experience a loss of service from the API as it restarts due to memory constraints.

This is a restriction that may be lifted in a future PTF.

8.1.3.0

It is not possible to access the REST API using a cluster's IPv6 address.

This is a restriction that may be lifted in a future PTF.

8.1.3.0

All fixes will be applied to MTM 9846/8-AE2 enclosures in the V9000 system.

However, for an MTM 9846/8-AE3, in order to get the same updates, please load 1.5.1.2 from Fix Central on the AE3 enclosure. The AE3 will only be updated when firmware is loaded directly on these enclosures.

8.1.1.0

Spectrum Control v5.2.15 is not supported for systems running v8.1.0.2 or later. Spectrum Control v5.2.15.2 is supported.

If a configuration has previously been added, then after upgrading to Spectrum Control v5.2.15 all subsequent probes will fail. This issue can be resolved by upgrading to Spectrum Control v5.2.15.2.

8.1.0.2

When configuring Remote Support Assistance, the connection test will report a fault, and opening a connection will report "Connected", followed shortly by "Connection failed".

Even though "Connection failed" is reported, a connection may still be successfully opened.

This issue will be resolved in a future release.

8.1.0.1

Customers upgrading systems with more than 64GB of RAM to v8.1 or later will need to run chnodehw to enable access to the extra memory above 64GB; a scripted sketch is provided at the end of this section.

Under some circumstances it may also be necessary to remove and re-add each node in turn.

8.1.0.0

RSA is not supported with IPv6 service IP addresses.

This is a temporary restriction that will be lifted in a future PTF.

8.1.0.0

AIX operating systems will not be able to get full benefit from the hot spare node feature unless they have the dynamic tracking feature (dyntrk) enabled; a host-side check is sketched at the end of this section.

8.1.0.0

There is a known issue with 8-node systems and IBM Security Key Lifecycle Manager 3.0 that can cause the status of key server end points, on the system, to occasionally report as degraded or offline. The issue occurs intermittently when the system attempts to validate the key server but the server response times out to some of the nodes. When the issue occurs, Error Code 1785 (A problem occurred with the Key Server) will be visible in the system event log.

This issue will not cause any loss of access to encrypted data.

7.8.0.0

There is an extremely small possibility that, on a system using both Encryption and Transparent Cloud Tiering, the system can enter a state where an encryption re-key operation is stuck in 'prepared' or 'prepare_failed' state, and a cloud account is stuck in 'offline' state.

The user will be unable to cancel or commit the encryption rekey, because the cloud account is offline. The user will be unable to remove the cloud account because an encryption rekey is in progress.

The system can only be recovered from this state using a T4 Recovery procedure.

It is also possible that SAS-attached storage arrays go offline.

7.8.0.0

Some configuration information will be incorrect in Spectrum Control.

This does not have any functional impact and will be resolved in a future release of Spectrum Control.

7.8.0.0

Systems, with NPIV enabled, presenting storage to SUSE Linux Enterprise Server (SLES) or Red Hat Enterprise Linux (RHEL) hosts running the ibmvfc driver on IBM Power can experience path loss or read-only file system events.

This is caused by issues within the ibmvfc driver and VIOS code.

Refer to this troubleshooting page for more information.

n/a
Host Disconnects Using VMware vSphere 5.5.0 Update 2 and vSphere 6.0

Refer to this flash for more information.

n/a
If an update stalls or fails, contact IBM Support for further assistance.

n/a
The following restrictions were valid but have now been lifted

A 9846-AE3 expansion enclosure cannot be entered into Spectrum Control. If a 9846-AE3 expansion enclosure is part of a V9000 configuration then less information will be displayed on certain screens.

8.1.0.2

Customers with attached hosts running zLinux should not upgrade to v8.1.

This is a temporary restriction that will be lifted in a future PTF.

8.1.0.0
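
The RFC 1035 restriction at the start of this section concerns the characters permitted in the host name used to launch the management GUI: each dot-separated label may contain only letters, digits and hyphens, must begin with a letter, must not end with a hyphen, and may be at most 63 characters long. The following minimal Python sketch illustrates such a check; it is an illustration of the specification only, not code taken from the product.

    import re

    # One RFC 1035 label: starts with a letter, ends with a letter or digit,
    # contains only letters, digits and hyphens, maximum 63 characters.
    LABEL = re.compile(r"[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

    def is_rfc1035_hostname(hostname):
        labels = hostname.rstrip(".").split(".")
        return all(LABEL.fullmatch(label) for label in labels)

    print(is_rfc1035_hostname("v9000-cluster.example.com"))   # True
    print(is_rfc1035_hostname("v9000_cluster.example.com"))   # False: "_" is not permitted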
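
For the memory restriction above (systems with more than 64GB of RAM upgraded to v8.1 or later), the chnodehw step could be scripted against the cluster CLI as in the sketch below. This is a minimal sketch only: it assumes SSH access to the cluster and the standard lsnode and chnodehw commands; the host name, credentials and pacing between nodes are placeholders that should be adapted to local change-control procedures.

    import time
    import paramiko

    CLUSTER = "v9000-cluster.example.com"     # placeholder management address
    USER, PASSWORD = "superuser", "passw0rd"  # placeholder credentials

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(CLUSTER, username=USER, password=PASSWORD)

    def run(command):
        # Run a cluster CLI command and return its output as text.
        _, stdout, _ = client.exec_command(command)
        return stdout.read().decode()

    # List the node IDs, then apply the hardware change one node at a time.
    for line in run("lsnode -nohdr -delim :").splitlines():
        node_id = line.split(":")[0]
        print("Applying hardware change to node", node_id)
        run("chnodehw " + node_id)
        # Illustrative pause: confirm the node is back online before continuing.
        time.sleep(1800)

    client.close()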
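
For the AIX dynamic tracking note above, a host-side check could look like the following sketch. It is assumed to run on the AIX host with root authority and uses the standard lsattr and chdev commands; the fscsi device names are placeholders, and the deferred (-P) change only takes effect when the device is next configured or the host is rebooted.

    import subprocess

    FSCSI_DEVICES = ["fscsi0", "fscsi1"]   # placeholders: list the host's own fscsi devices

    def attribute(device, name):
        # "lsattr -El <device> -a <attribute>" prints: <attribute> <value> <description> ...
        out = subprocess.run(["lsattr", "-El", device, "-a", name],
                             capture_output=True, text=True, check=True).stdout
        return out.split()[1]

    for device in FSCSI_DEVICES:
        if attribute(device, "dyntrk") != "yes":
            print(device, "does not have dynamic tracking enabled; to enable it, run:")
            print("    chdev -l", device, "-a dyntrk=yes -P")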

3. Issues Resolved

This release contains all of the fixes included in the 8.1.2.1 release, plus the following additional fixes.

A release may contain fixes for security issues, fixes for APARs or both. Consult both tables below to understand the complete set of fixes included in the release.

3.1 Security Issues Resolved

Security issues are documented using a reference number provided by "Common Vulnerabilities and Exposures" (CVE).
CVE Identifier Link for additional Information Resolved in
CVE-2018-3180 ibm10884526 8.1.3.6
CVE-2018-12547 ibm10884526 8.1.3.6
CVE-2008-5161 ibm10874368 8.1.3.5
CVE-2018-5391 ibm10872368 8.1.3.5
CVE-2017-17833 ibm10872546 8.1.3.4
CVE-2018-11784 ibm10872550 8.1.3.4
CVE-2018-5732 ibm10741135 8.1.3.3
CVE-2018-11776 ibm10741137 8.1.3.3
CVE-2017-17449 ibm10872364 8.1.3.3
CVE-2017-18017 ibm10872364 8.1.3.3
CVE-2018-1517 ibm10872456 8.1.3.3
CVE-2018-2783 ibm10872456 8.1.3.3
CVE-2018-12539 ibm10872456 8.1.3.3
CVE-2018-1775 ibm10872486 8.1.3.3
CVE-2016-10708 ibm10717661 8.1.3.0
CVE-2016-10142 ibm10717931 8.1.3.0
CVE-2017-11176 ibm10717931 8.1.3.0

3.2 APARs and Flashes Resolved

Reference Severity Description Resolved in Feature Tags
HU01617 S1 HIPER (Highly Pervasive): Due to a timing window issue, stopping a FlashCopy mapping, with the -autodelete option, may result in a Tier 2 recovery 8.1.3.6 FlashCopy
HU01865 S1 HIPER (Highly Pervasive): When creating a Hyperswap relationship using addvolumecopy, or similar methods, the system should perform a synchronisation operation to copy the data of the original copy to the new copy. In some cases this synchronisation is skipped, leaving the new copy with bad data (all zeros) 8.1.3.6 HyperSwap
HU01913 S1 HIPER (Highly Pervasive): A timing window issue in the DRAID6 rebuild process can cause node warmstarts with the possibility of a loss of access 8.1.3.6 Distributed RAID
HU01876 S1 Where systems are connected to controllers that have FC ports capable of acting as both initiators and targets, node warmstarts can occur when NPIV is enabled 8.1.3.6 Backend Storage
HU01887 S1 In circumstances where host configuration data becomes inconsistent across nodes, an issue in the CLI policing code may cause multiple warmstarts 8.1.3.6 Command Line Interface, Host Cluster
HU01888 & HU01997 S1 An issue with restore mappings, in the FlashCopy component, can cause an I/O group to warmstart 8.1.3.6 FlashCopy
HU01910 S1 When FlashCopy mappings are created with a grain size of 64KB, it is possible for an overflow condition in the bitmap to occur. This can result in multiple node warmstarts with a possible loss of access to data 8.1.3.6 FlashCopy
HU01928 S1 When two IOs attempt to access the same address, the state of the data may be incorrectly set to invalid causing offline volumes and, possibly, offline pools 8.1.3.6 Data Reduction Pools
HU01957 S1 Due to an issue in Data Reduction Pools, when the system attempts an upgrade, there may be node warmstarts 8.1.3.6 Data Reduction Pools, System Update
HU02013 S1 A race condition between extent invalidation and destruction in the garbage collection process may cause a node warmstart with the possibility of offline volumes 8.1.3.6 Data Reduction Pools
HU02025 S1 An issue with metadata handling, where a pool has been taken offline, may lead to an out of space condition in that pool preventing its return to operation 8.1.3.6 Data Reduction Pools
IT25850 S1 I/O performance may be adversely affected towards the end of DRAID rebuilds. For some systems there may be multiple warmstarts leading to a loss of access 8.1.3.6 Distributed RAID
IT27460 S1 Lease expiry can occur between local nodes when remote connection is lost, due to the mishandling of messaging credits 8.1.3.6 Reliability Availability Serviceability
IT29040 S1 Occasionally a DRAID rebuild, with drives of 8TB or more, can encounter an issue which causes node warmstarts and potential loss of access 8.1.3.6 RAID, Distributed RAID
FLASH-27910 S2 The system may attempt to progress an upgrade, in the presence of a fault, resulting in a failed upgrade. 8.1.3.6 Hosts
FLASH-27920 S2 A failing HBA may cause a node to warmstart 8.1.3.6 Hosts
HU01507 S2 Until the initial synchronisation process completes, high system latency may be experienced when a volume is created with two compressed copies or when a space-efficient copy is added to a volume with an existing compressed copy 8.1.3.6 Volume Mirroring
HU01761 S2 Entering multiple addmdisk commands, in rapid succession, to more than one storage pool, may cause node warmstarts 8.1.3.6 Backend Storage
HU01886 S2 The Unmap function can leave volume extents that have not been freed, preventing managed disk and pool removal 8.1.3.6 SCSI Unmap
HU01972 S2 When an array is in a quiescing state, for example where a member has been deleted, I/O may become pended leading to multiple warmstarts 8.1.3.6 RAID, Distributed RAID
FLASH-27862 S3 A SNMP query returns a "Timeout: No Response" message. 8.1.3.6 System Monitoring
FLASH-27868 S3 Repeated restarting of the "xivagentd" daemon will prevent XIV installation. 8.1.3.6 Hosts
HU00744 S3 Single node warmstart due to an accounting issue within the cache component 8.1.3.6 Cache
HU01485 S3 When an AC3 node is started with only one PSU powered, powering up the other PSU will not extinguish the Power Fault LED.
Note: To apply this fix (in new BMC firmware) each node will need to be power cycled (i.e. remove AC power and battery), one at a time, after the upgrade has completed
8.1.3.6 System Monitoring
HU01659 S3 Node Fault LED can be seen to flash in the absence of an error condition.
Note: To apply this fix (in new BMC firmware) each node will need to be power cycled (i.e. remove AC power and battery), one at a time, after the upgrade has completed
8.1.3.6 System Monitoring
HU01737 S3 On the "Update System" screen, for "Test Only", if a valid code image is selected in the "Run Update Test Utility" dialog, then clicking the "Test" button will initiate a system update 8.1.3.6 System Update
HU01857 S3 Improved validation of user input in GUI 8.1.3.6 Graphical User Interface
HU01860 S3 During garbage collection the flushing of extents may become stuck leading to a timeout and a single node warmstart 8.1.3.6 Data Reduction Pools
HU01869 S3 Volume copy deletion in a Data Reduction Pool, triggered by rmvdiskcopy, rmvolumecopy or addvdiskcopy -autodelete (or similar), may become stalled with the copy being left in "deleting" status 8.1.3.6 Data Reduction Pools
HU01915 & IT28654 S3 Systems, with encryption enabled, that are using key servers to manage encryption keys, may fail to connect to the key servers if the servers' SSL certificates are part of a chain of trust 8.1.3.6 Encryption
HU01916 S3 The GUI Dashboard and the CLI lssystem command report physical capacity incorrectly 8.1.3.6 Graphical User Interface, Command Line Interface
IT28433 S3 Timing window issue in the Data Reduction Pool rehoming component can cause a single node warmstart 8.1.3.6 Data Reduction Pools
HU01918 S1 HIPER (Highly Pervasive): Where Data Reduction Pools have been created on earlier code levels, upgrading the system, to an affected release, can cause an increase in the level of concurrent flushing to disk. This may result in a loss of access to data. For more details refer to the following Flash  8.1.3.5 Data Reduction Pools
HU01920 S1 An issue in the garbage collection process can cause node warmstarts and offline pools 8.1.3.5 Data Reduction Pools
FLASH-27506 S2 Improved RAID error handling for unresponsive flash modules to prevent rare data error 8.1.3.5 RAID
HU01492 S1 HIPER (Highly Pervasive): All ports of a 16Gb HBA can be affected when a single port is congested. This can lead to lease expiries if all ports, used for inter-node communication, are on the same FC adapter 8.1.3.4 Reliability Availability Serviceability
HU01825 S1 Invoking a chrcrelationship command when one of the relationships in a consistency group is running in the opposite direction to the others may cause a node warmstart followed by a T2 recovery 8.1.3.4 FlashCopy
HU01833 S1 If both nodes in an I/O group start up together, a timing window issue may occur that prevents them from running garbage collection, leading to the related Data Reduction Pool running out of space 8.1.3.4 Data Reduction Pools
HU01855 S1 Clusters using Data Reduction Pools can experience multiple warmstarts, on all nodes, putting them in a service state 8.1.3.4 Data Reduction Pools
HU01862 S1 When a Data Reduction Pool is removed and the -force option is specified there may be a temporary loss of access 8.1.3.4 Data Reduction Pools
HU01878 S1 During an upgrade from v7.8.1 or earlier to v8.1.3 or later if an MDisk goes offline then at completion all volumes may go offline 8.1.3.4 System Update
HU01885 S1 As writes are made to a Data Reduction Pool it is necessary to allocate new physical capacity. Under unusual circumstances it is possible for the handling of an expansion request to stall further I/O leading to node warmstarts 8.1.3.4 Data Reduction Pools
HU02042 S1 An issue in the handling of metadata, after a Data Reduction Pool recovery operation, can lead to repeated node warmstarts, putting an I/O group into a service state 8.1.3.4 Data Reduction Pools
FLASH-26391, 26388, 26117 S2 Improved timing to prevent erroneous flash module failures, which in rare cases can lead to an outage. 8.1.3.4 Reliability Availability Serviceability
HU01661 S2 A cache-protection mechanism flag setting can become stuck leading to repeated stops of consistency group synching 8.1.3.4 HyperSwap
HU01733 S2 Canister information, for the High Density Expansion Enclosure, may be incorrectly reported. 8.1.3.4 Reliability Availability Serviceability
HU01797 S2 Hitachi G1500 backend controllers may exhibit higher than expected latency 8.1.3.4 Backend Storage
HU01824 S2 Switching replication direction, for HyperSwap relationships, can lead to long I/O timeouts 8.1.3.4 HyperSwap
HU01839 S2 Where a VMware host is being served volumes from two different controllers, and an issue on one controller causes the related volumes to be taken offline, I/O performance for the volumes from the other controller will be adversely affected 8.1.3.4 Hosts
HU01842 S2 Bursts of I/O to Samsung high capacity flash drives can be interpreted as dropped frames, against the resident slots, leading to redundant drives being incorrectly failed 8.1.3.4 Drives
HU01846 S2 Silent battery discharge condition will unexpectedly take a node offline putting it into a 572 service state 8.1.3.4 Reliability Availability Serviceability
HU01907 S2 An issue in the handling of the power cable sense registers can cause a node to be put into service state with a 560 error 8.1.3.4 Reliability Availability Serviceability
HU01657 S3 The 16Gb FC HBA firmware may experience an issue, with the detection of unresponsive links, leading to a single node warmstart 8.1.3.4 Reliability Availability Serviceability
HU01719 S3 Node warmstart due to a parity error in the HBA driver firmware 8.1.3.4 Reliability Availability Serviceability
HU01760 S3 FlashCopy map progress appears to be stuck at zero percent 8.1.3.4 FlashCopy
HU01778 S3 An issue, in the HBA adapter, is exposed where a switch port keeps the link active but does not respond to link resets resulting in a node warmstart 8.1.3.4 Reliability Availability Serviceability
HU01786 S3 An issue in the monitoring of SSD write endurance can result in false 1215/2560 errors in the Event Log 8.1.3.4 Drives
HU01791 S3 Using the chhost command will remove stored CHAP secrets 8.1.3.4 iSCSI
HU01821 S3 An attempt to upgrade a two-node enhanced stretched cluster fails due to incorrect volume dependencies 8.1.3.4 System Update, Data Reduction Pools
HU01849 S3 An excessive number of SSH sessions may lead to a node warmstart 8.1.3.4 System Monitoring
HU02028 S3 An issue, with timer cancellation, in the Remote Copy component may cause a node warmstart 8.1.3.4 Metro Mirror, Global Mirror, Global Mirror With Change Volumes
IT22591 S3 An issue in the HBA adapter firmware may result in node warmstarts 8.1.3.4 Reliability Availability Serviceability
IT25457 S3 Attempting to remove a copy of a volume which has at least one image mode copy and at least one thin/compressed copy in a Data Reduction Pool will always fail with a CMMVC8971E error 8.1.3.4 Data Reduction Pools
IT26049 S3 An issue with CPU scheduling may cause the GUI to respond slowly 8.1.3.4 Graphical User Interface
HU01828 S1 HIPER (Highly Pervasive): Node warmstarts may occur during deletion of deduplicated volumes, due to a timing-related issue 8.1.3.3 Deduplication
HU01847 S1 FlashCopy handling of medium errors, across a number of drives on backend controllers, may lead to multiple node warmstarts 8.1.3.3 FlashCopy
HU01850 S1 When the last deduplication-enabled volume copy, in a Data Reduction Pool, is deleted the pool may go offline temporarily 8.1.3.3 Data Reduction Pools, Deduplication
HU01852 S2 The garbage collection rate can lead to Data Reduction Pools running out of space even though reclaimable capacity is available 8.1.3.3 Data Reduction Pools
HU01858 S2 Total used capacity of a Data Reduction Pool, within a single I/O group, is limited to 256TB. Garbage collection does not correctly recognise this limit. This may lead to a pool running out of free capacity and going offline 8.1.3.3 Data Reduction Pools
HU01870 S2 LDAP server communication fails with SSL or TLS security configured 8.1.3.3 LDAP
HU01790 S3 On the "Create Volumes" page the "Accessible I/O Groups" selection may not update when the "Caching I/O group" selection is changed 8.1.3.3 Graphical User Interface
HU01815 S3 In Data Reduction Pools, volume size is limited to 96TB 8.1.3.3 Data Reduction Pools
HU01856 S3 A garbage collection process can time out waiting for an event in the partner node resulting in a node warmstart 8.1.3.3 Data Reduction Pools
HU01851 S1 HIPER (Highly Pervasive): When a deduplicated volume is deleted there may be multiple node warmstarts and offline pools 8.1.3.2 Data Reduction Pools, Deduplication
HU01837 S2 In systems, where a VVols metadata volume has been created, an upgrade to v8.1.3 or later will cause a node warmstart, stalling the upgrade 8.1.3.2 VVols, System Update
HU01835 S1 HIPER (Highly Pervasive): Multiple warmstarts may be experienced due to an issue with Data Reduction Pool garbage collection where data for a volume is detected after the volume itself has been removed 8.1.3.1 Data Reduction Pools
HU01840 S1 HIPER (Highly Pervasive): When removing large numbers of volumes each with multiple copies it is possible to hit a timeout condition leading to warmstarts 8.1.3.1 SCSI Unmap
HU01829 S2 An issue in statistical data collection can prevent Easy Tier from working with Data Reduction Pools 8.1.3.1 EasyTier, Data Reduction Pools
HU01708 S1 HIPER (Highly Pervasive): A node removal operation during an array rebuild can cause a loss of parity data leading to bad blocks 8.1.3.0 RAID
HU01867 S1 HIPER (Highly Pervasive): Expansion of a volume may fail due to an issue with accounting of physical capacity. All nodes will warmstart in order to clear the problem. The expansion may be triggered by writing data to a thin-provisioned or compressed volume. 8.1.3.0 Thin Provisioning, Compression
HU01877 S1 HIPER (Highly Pervasive): Where a volume is being expanded, and the additional capacity is to be formatted, the creation of a related volume copy may result in multiple warmstarts and a potential loss of access to data. 8.1.3.0 Volume Mirroring, Cache
HU01774 S1 After a failed mkhost command for an iSCSI host any I/O from that host will cause multiple warmstarts 8.1.3.0 iSCSI
HU01780 S1 Migrating a volume to an image-mode volume on controllers that support SCSI unmap will trigger repeated cluster recoveries 8.1.3.0 SCSI Unmap
HU01781 S1 An issue with workload balancing in the kernel scheduler can deprive some processes of the necessary resource to complete successfully, resulting in node warmstarts that may impact performance, with the possibility of a loss of access to volumes 8.1.3.0
HU01804 S1 During a system upgrade the processing required to upgrade the internal mapping between volumes and volume copies can lead to high latency impacting host I/O 8.1.3.0 System Update, Hosts
HU01809 S1 An issue in the handling of extent allocation in Data Reduction Pools can result in volumes being taken offline 8.1.3.0 Data Reduction Pools
HU01819 S1 During a system upgrade, an attempt to upgrade Flash card firmware in AE2 enclosures can cause an out of memory condition, resulting in a loss of access 8.1.3.0 System Update
HU01853 S1 In a Data Reduction Pool, it is possible for metadata to be assigned incorrect values leading to offline managed disk groups 8.1.3.0 Data Reduction Pools
HU01752 S2 A problem with the way IBM FlashSystem FS900 handles SCSI WRITE SAME commands (without the Unmap bit set) can lead to port exclusions 8.1.3.0 Backend Storage
HU01803 S2 The garbage collection process in Data Reduction Pool may become stalled resulting in no reclamation of free space from removed volumes 8.1.3.0 Data Reduction Pools
HU01818 S2 Excessive debug logging in the Data Reduction Pools component can adversely impact system performance 8.1.3.0 Data Reduction Pools
HU01460 S3 If another drive fails during an array rebuild, the high processing demand in RAID for handling many medium errors during the rebuild can lead to a node warmstart 8.1.3.0 RAID
HU01724 S3 An I/O lock handling issue between nodes can lead to a single node warmstart 8.1.3.0 RAID
HU01751 S3 When RAID attempts to flag a strip as bad, and that strip has already been flagged, a node may warmstart 8.1.3.0 RAID
HU01795 S3 A thread locking issue in the Remote Copy component may cause a node warmstart 8.1.3.0
HU01800 S3 Under some rare circumstances a node warmstart may occur whilst creating volumes in a Data Reduction Pool 8.1.3.0 Data Reduction Pools
HU01801 S3 An issue in the handling of unmaps for MDisks can lead to a node warmstart 8.1.3.0 SCSI Unmap
HU01820 S3 When an unusual I/O request pattern is received it is possible for the handling of Data Reduction Pool metadata to become stuck, leading to a node warmstart 8.1.3.0 Data Reduction Pools
HU01830 S3 Missing security-enhancing HTTP response headers 8.1.3.0 Security

4. Supported upgrade paths

Please refer to the Concurrent Compatibility and Code Cross Reference for Spectrum Virtualize page for guidance when planning a system upgrade.

5. Useful Links

Description Link
Support Website IBM Knowledge Center
IBM FlashSystem Fix Central V9000
Updating the system IBM Knowledge Center
IBM Redbooks Redbooks
Contacts IBM Planetwide