Release Note for V9000 Family Block Storage Products


This release note applies to the V9000 family of block storage products. It covers the 8.1.3 release and details the issues resolved in all Program Temporary Fixes (PTFs) between 8.1.3.0 and 8.1.3.6. This document will be updated with additional information whenever a PTF is released.

This document was last updated on 10 September 2021.

  1. New Features
  2. Known Issues and Restrictions
  3. Issues Resolved
    1. Security Issues Resolved
    2. APARs and Flashes Resolved
  4. Supported upgrade paths
  5. Useful Links

1. New Features

The following new features have been introduced in the 8.1.3 release:

The following new feature has been introduced in the 8.1.3.4 release:

The following new feature has been introduced in the 8.1.3.6 release:

2. Known Issues and Restrictions

Details Introduced

The v8.1.3 code level introduces strict enforcement of the IETF RFC1035 specification.

If unsupported characters are present in the URL used to launch the management GUI, either a blank page or an HTTP 400 error is displayed (depending on the browser used). A minimal host-name check is sketched at the end of this section.

Please see this TechNote for more information.

8.1.3.0

Customers using the REST API to list more than 2000 objects may experience a loss of service from the API as it restarts due to memory constraints.

This is a restriction that may be lifted in a future PTF.

8.1.3.0

It is not possible to access the REST API using a cluster's IPv6 address.

This is a restriction that may be lifted in a future PTF.

8.1.3.0

All fixes will be applied to MTM 9846/8-AE2 enclosures in the V9000 system.

However, for an MTM 9846/8-AE3, in order to get the same updates, please load 1.5.1.2 from Fix Central on the AE3 enclosure. The AE3 will only be updated when firmware is loaded directly on these enclosures.

8.1.1.0

Spectrum Control v5.2.15 is not supported for systems running v8.1.0.2 or later. Spectrum Control v5.2.15.2 is supported.

If a configuration has previously been added, then after upgrading to Spectrum Control v5.2.15 all subsequent probes will fail. This issue can be resolved by upgrading to Spectrum Control v5.2.15.2.

8.1.0.2

When configuring Remote Support Assistance, the connection test will report a fault, and opening a connection will report "Connected", followed shortly by "Connection failed".

Even though "Connection failed" is reported, a connection may still be successfully opened.

This issue will be resolved in a future release.

8.1.0.1

Customers upgrading systems with more than 64GB of RAM to v8.1 or later will need to run chnodehw to enable access to the extra memory above 64GB; a scripted sketch is provided at the end of this section.

Under some circumstances it may also be necessary to remove and re-add each node in turn.

8.1.0.0

RSA is not supported with IPv6 service IP addresses.

This is a temporary restriction that will be lifted in a future PTF.

8.1.0.0

AIX operating systems will not be able to get full benefit from the hot spare node feature unless they have the dynamic tracking feature (dyntrk) enabled; a host-side check is sketched at the end of this section.

8.1.0.0

There is a known issue with 8-node systems and IBM Security Key Lifecycle Manager 3.0 that can cause the status of key server end points, on the system, to occasionally report as degraded or offline. The issue occurs intermittently when the system attempts to validate the key server but the server response times out to some of the nodes. When the issue occurs, Error Code 1785 (A problem occurred with the Key Server) will be visible in the system event log.

This issue will not cause any loss of access to encrypted data.

7.8.0.0

There is an extremely small possibility that, on a system using both Encryption and Transparent Cloud Tiering, the system can enter a state where an encryption re-key operation is stuck in 'prepared' or 'prepare_failed' state, and a cloud account is stuck in 'offline' state.

The user will be unable to cancel or commit the encryption rekey, because the cloud account is offline. The user will be unable to remove the cloud account because an encryption rekey is in progress.

The system can only be recovered from this state using a T4 Recovery procedure.

It is also possible that SAS-attached storage arrays go offline.

7.8.0.0

Some configuration information will be incorrect in Spectrum Control.

This does not have any functional impact and will be resolved in a future release of Spectrum Control.

7.8.0.0

Systems, with NPIV enabled, presenting storage to SUSE Linux Enterprise Server (SLES) or Red Hat Enterprise Linux (RHEL) hosts running the ibmvfc driver on IBM Power can experience path loss or read-only file system events.

This is caused by issues within the ibmvfc driver and VIOS code.

Refer to this troubleshooting page for more information.

n/a
Host Disconnects Using VMware vSphere 5.5.0 Update 2 and vSphere 6.0

Refer to this flash for more information.

n/a
If an update stalls or fails, contact IBM Support for further assistance.

n/a
The following restrictions were valid but have now been lifted

A 9846-AE3 expansion enclosure cannot be entered into Spectrum Control. If a 9846-AE3 expansion enclosure is part of a V9000 configuration then less information will be displayed on certain screens.

8.1.0.2

Customers with attached hosts running zLinux should not upgrade to v8.1.

This is a temporary restriction that will be lifted in a future PTF.

8.1.0.0
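
The RFC 1035 restriction at the start of this section concerns the characters permitted in the host name used to launch the management GUI: each dot-separated label may contain only letters, digits and hyphens, must begin with a letter, must not end with a hyphen, and may be at most 63 characters long. The following minimal Python sketch illustrates such a check; it is an illustration of the specification only, not code taken from the product.

    import re

    # One RFC 1035 label: starts with a letter, ends with a letter or digit,
    # contains only letters, digits and hyphens, maximum 63 characters.
    LABEL = re.compile(r"[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

    def is_rfc1035_hostname(hostname):
        labels = hostname.rstrip(".").split(".")
        return all(LABEL.fullmatch(label) for label in labels)

    print(is_rfc1035_hostname("v9000-cluster.example.com"))   # True
    print(is_rfc1035_hostname("v9000_cluster.example.com"))   # False: "_" is not permitted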
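
For the memory restriction above (systems with more than 64GB of RAM upgraded to v8.1 or later), the chnodehw step could be scripted against the cluster CLI as in the sketch below. This is a minimal sketch only: it assumes SSH access to the cluster and the standard lsnode and chnodehw commands; the host name, credentials and pacing between nodes are placeholders that should be adapted to local change-control procedures.

    import time
    import paramiko

    CLUSTER = "v9000-cluster.example.com"     # placeholder management address
    USER, PASSWORD = "superuser", "passw0rd"  # placeholder credentials

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(CLUSTER, username=USER, password=PASSWORD)

    def run(command):
        # Run a cluster CLI command and return its output as text.
        _, stdout, _ = client.exec_command(command)
        return stdout.read().decode()

    # List the node IDs, then apply the hardware change one node at a time.
    for line in run("lsnode -nohdr -delim :").splitlines():
        node_id = line.split(":")[0]
        print("Applying hardware change to node", node_id)
        run("chnodehw " + node_id)
        # Illustrative pause: confirm the node is back online before continuing.
        time.sleep(1800)

    client.close()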
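
For the AIX dynamic tracking note above, a host-side check could look like the following sketch. It is assumed to run on the AIX host with root authority and uses the standard lsattr and chdev commands; the fscsi device names are placeholders, and the deferred (-P) change only takes effect when the device is next configured or the host is rebooted.

    import subprocess

    FSCSI_DEVICES = ["fscsi0", "fscsi1"]   # placeholders: list the host's own fscsi devices

    def attribute(device, name):
        # "lsattr -El <device> -a <attribute>" prints: <attribute> <value> <description> ...
        out = subprocess.run(["lsattr", "-El", device, "-a", name],
                             capture_output=True, text=True, check=True).stdout
        return out.split()[1]

    for device in FSCSI_DEVICES:
        if attribute(device, "dyntrk") != "yes":
            print(device, "does not have dynamic tracking enabled; to enable it, run:")
            print("    chdev -l", device, "-a dyntrk=yes -P")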

3. Issues Resolved

This release contains all of the fixes included in the 8.1.2.1 release, plus the following additional fixes.

A release may contain fixes for security issues, fixes for APARs or both. Consult both tables below to understand the complete set of fixes included in the release.

3.1 Security Issues Resolved

Security issues are documented using a reference number provided by "Common Vulnerabilities and Exposures" (CVE).
CVE Identifier Link for additional Information Resolved in
CVE-2018-3180 ibm10884526 8.1.3.6
CVE-2018-12547 ibm10884526 8.1.3.6
CVE-2008-5161 ibm10874368 8.1.3.5
CVE-2018-5391 ibm10872368 8.1.3.5
CVE-2017-17833 ibm10872546 8.1.3.4
CVE-2018-11784 ibm10872550 8.1.3.4
CVE-2018-5732 ibm10741135 8.1.3.3
CVE-2018-11776 ibm10741137 8.1.3.3
CVE-2017-17449 ibm10872364 8.1.3.3
CVE-2017-18017 ibm10872364 8.1.3.3
CVE-2018-1517 ibm10872456 8.1.3.3
CVE-2018-2783 ibm10872456 8.1.3.3
CVE-2018-12539 ibm10872456 8.1.3.3
CVE-2018-1775 ibm10872486 8.1.3.3
CVE-2016-10708 ibm10717661 8.1.3.0
CVE-2016-10142 ibm10717931 8.1.3.0
CVE-2017-11176 ibm10717931 8.1.3.0

3.2 APARs and Flashes Resolved

Reference Severity Description Resolved in Feature Tags
HU01617 S1 HIPER (Highly Pervasive): Due to a timing window issue, stopping a FlashCopy mapping, with the -autodelete option, may result in a Tier 2 recovery 8.1.3.6 FlashCopy
HU01865 S1 HIPER (Highly Pervasive): When creating a Hyperswap relationship using addvolumecopy, or similar methods, the system should perform a synchronisation operation to copy the data of the original copy to the new copy. In some cases this synchronisation is skipped, leaving the new copy with bad data (all zeros) 8.1.3.6 HyperSwap
HU01913 S1 HIPER (Highly Pervasive): A timing window issue in the DRAID6 rebuild process can cause node warmstarts with the possibility of a loss of access 8.1.3.6 Distributed RAID
HU01876 S1 Where systems are connected to controllers that have FC ports capable of acting as both initiators and targets, node warmstarts can occur when NPIV is enabled 8.1.3.6 Backend Storage
HU01887 S1 In circumstances where host configuration data becomes inconsistent across nodes, an issue in the CLI policing code may cause multiple warmstarts 8.1.3.6 Command Line Interface, Host Cluster
HU01888 & HU01997 S1 An issue with restore mappings, in the FlashCopy component, can cause an I/O group to warmstart 8.1.3.6 FlashCopy
HU01910 S1 When FlashCopy mappings are created with a grain size of 64KB, it is possible for an overflow condition in the bitmap to occur. This can result in multiple node warmstarts with a possible loss of access to data 8.1.3.6 FlashCopy
HU01928 S1 When two IOs attempt to access the same address, the state of the data may be incorrectly set to invalid causing offline volumes and, possibly, offline pools 8.1.3.6 Data Reduction Pools
HU01957 S1 Due to an issue in Data Reduction Pools, when the system attempts an upgrade, there may be node warmstarts 8.1.3.6 Data Reduction Pools, System Update
HU02013 S1 A race condition between extent invalidation and destruction in the garbage collection process may cause a node warmstart with the possibility of offline volumes 8.1.3.6 Data Reduction Pools
HU02025 S1 An issue with metadata handling, where a pool has been taken offline, may lead to an out of space condition in that pool preventing its return to operation 8.1.3.6 Data Reduction Pools
IT25850 S1 I/O performance may be adversely affected towards the end of DRAID rebuilds. For some systems there may be multiple warmstarts leading to a loss of access 8.1.3.6 Distributed RAID
IT27460 S1 Lease expiry can occur between local nodes when remote connection is lost, due to the mishandling of messaging credits 8.1.3.6 Reliability Availability Serviceability
IT29040 S1 Occasionally a DRAID rebuild, with drives of 8TB or more, can encounter an issue which causes node warmstarts and potential loss of access 8.1.3.6 RAID, Distributed RAID
FLASH-27910 S2 The system may attempt to progress an upgrade, in the presence of a fault, resulting in a failed upgrade. 8.1.3.6 Hosts
FLASH-27920 S2 A failing HBA may cause a node to warmstart 8.1.3.6 Hosts
HU01507 S2 Until the initial synchronisation process completes, high system latency may be experienced when a volume is created with two compressed copies or when a space-efficient copy is added to a volume with an existing compressed copy 8.1.3.6 Volume Mirroring
HU01761 S2 Entering multiple addmdisk commands, in rapid succession, to more than one storage pool, may cause node warmstarts 8.1.3.6 Backend Storage
HU01886 S2 The Unmap function can leave volume extents that have not been freed, preventing managed disk and pool removal 8.1.3.6 SCSI Unmap
HU01972 S2 When an array is in a quiescing state, for example where a member has been deleted, I/O may become pended leading to multiple warmstarts 8.1.3.6 RAID, Distributed RAID
FLASH-27862 S3 A SNMP query returns a "Timeout: No Response" message. 8.1.3.6 System Monitoring
FLASH-27868 S3 Repeated restarting of the "xivagentd" daemon will prevent XIV installation. 8.1.3.6 Hosts
HU00744 S3 Single node warmstart due to an accounting issue within the cache component 8.1.3.6 Cache
HU01485 S3 When an AC3 node is started with only one PSU powered, powering up the other PSU will not extinguish the Power Fault LED.
Note: To apply this fix (in new BMC firmware) each node will need to be power cycled (i.e. remove AC power and battery), one at a time, after the upgrade has completed
8.1.3.6 System Monitoring
HU01659 S3 Node Fault LED can be seen to flash in the absence of an error condition.
Note: To apply this fix (in new BMC firmware) each node will need to be power cycled (i.e. remove AC power and battery), one at a time, after the upgrade has completed
8.1.3.6 System Monitoring
HU01737 S3 On the "Update System" screen, for "Test Only", if a valid code image is selected in the "Run Update Test Utility" dialog, then clicking the "Test" button will initiate a system update 8.1.3.6 System Update
HU01857 S3 Improved validation of user input in GUI 8.1.3.6 Graphical User Interface
HU01860 S3 During garbage collection the flushing of extents may become stuck leading to a timeout and a single node warmstart 8.1.3.6 Data Reduction Pools
HU01869 S3 Volume copy deletion in a Data Reduction Pool, triggered by rmvdiskcopy, rmvolumecopy or addvdiskcopy -autodelete (or similar), may become stalled with the copy being left in "deleting" status 8.1.3.6 Data Reduction Pools
HU01915 & IT28654 S3 Systems, with encryption enabled, that are using key servers to manage encryption keys, may fail to connect to the key servers if the servers' SSL certificates are part of a chain of trust 8.1.3.6 Encryption
HU01916 S3 The GUI Dashboard and the CLI lssystem command report physical capacity incorrectly 8.1.3.6 Graphical User Interface, Command Line Interface
IT28433 S3 Timing window issue in the Data Reduction Pool rehoming component can cause a single node warmstart 8.1.3.6 Data Reduction Pools
HU01918 S1 HIPER (Highly Pervasive): Where Data Reduction Pools have been created on earlier code levels, upgrading the system, to an affected release, can cause an increase in the level of concurrent flushing to disk. This may result in a loss of access to data. For more details refer to the following Flash  8.1.3.5 Data Reduction Pools
HU01920 S1 An issue in the garbage collection process can cause node warmstarts and offline pools 8.1.3.5 Data Reduction Pools
FLASH-27506 S2 Improved RAID error handling for unresponsive flash modules to prevent rare data error 8.1.3.5 RAID
HU01492 S1 HIPER (Highly Pervasive): All ports of a 16Gb HBA can be affected when a single port is congested. This can lead to lease expiries if all ports, used for inter-node communication, are on the same FC adapter 8.1.3.4 Reliability Availability Serviceability
HU01825 S1 Invoking a chrcrelationship command when one of the relationships in a consistency group is running in the opposite direction to the others may cause a node warmstart followed by a T2 recovery 8.1.3.4 FlashCopy
HU01833 S1 If both nodes in an I/O group start up together, a timing window issue may occur that prevents them from running garbage collection, leading to the related Data Reduction Pool running out of space 8.1.3.4 Data Reduction Pools
HU01855 S1 Clusters using Data Reduction Pools can experience multiple warmstarts, on all nodes, putting them in a service state 8.1.3.4 Data Reduction Pools
HU01862 S1 When a Data Reduction Pool is removed and the -force option is specified there may be a temporary loss of access 8.1.3.4 Data Reduction Pools
HU01878 S1 During an upgrade from v7.8.1 or earlier to v8.1.3 or later if an MDisk goes offline then at completion all volumes may go offline 8.1.3.4 System Update
HU01885 S1 As writes are made to a Data Reduction Pool it is necessary to allocate new physical capacity. Under unusual circumstances it is possible for the handling of an expansion request to stall further I/O leading to node warmstarts 8.1.3.4 Data Reduction Pools
HU02042 S1 An issue in the handling of metadata, after a Data Reduction Pool recovery operation, can lead to repeated node warmstarts, putting an I/O group into a service state 8.1.3.4 Data Reduction Pools
FLASH-26391, 26388, 26117 S2 Improved timing to prevent erroneous flash module failures, which in rare cases can lead to an outage. 8.1.3.4 Reliability Availability Serviceability
HU01661 S2 A cache-protection mechanism flag setting can become stuck leading to repeated stops of consistency group synching 8.1.3.4 HyperSwap
HU01733 S2 Canister information, for the High Density Expansion Enclosure, may be incorrectly reported. 8.1.3.4 Reliability Availability Serviceability
HU01797 S2 Hitachi G1500 backend controllers may exhibit higher than expected latency 8.1.3.4 Backend Storage
HU01824 S2 Switching replication direction, for HyperSwap relationships, can lead to long I/O timeouts 8.1.3.4 HyperSwap
HU01839 S2 Where a VMware host is being served volumes from two different controllers, and an issue on one controller causes the related volumes to be taken offline, I/O performance for the volumes from the other controller will be adversely affected 8.1.3.4 Hosts
HU01842 S2 Bursts of I/O to Samsung high capacity flash drives can be interpreted as dropped frames, against the resident slots, leading to redundant drives being incorrectly failed 8.1.3.4 Drives
HU01846 S2 Silent battery discharge condition will unexpectedly take a node offline putting it into a 572 service state 8.1.3.4 Reliability Availability Serviceability
HU01907 S2 An issue in the handling of the power cable sense registers can cause a node to be put into service state with a 560 error 8.1.3.4 Reliability Availability Serviceability
HU01657 S3 The 16Gb FC HBA firmware may experience an issue, with the detection of unresponsive links, leading to a single node warmstart 8.1.3.4 Reliability Availability Serviceability
HU01719 S3 Node warmstart due to a parity error in the HBA driver firmware 8.1.3.4 Reliability Availability Serviceability
HU01760 S3 FlashCopy map progress appears to be stuck at zero percent 8.1.3.4 FlashCopy
HU01778 S3 An issue, in the HBA adapter, is exposed where a switch port keeps the link active but does not respond to link resets resulting in a node warmstart 8.1.3.4 Reliability Availability Serviceability
HU01786 S3 An issue in the monitoring of SSD write endurance can result in false 1215/2560 errors in the Event Log 8.1.3.4 Drives
HU01791 S3 Using the chhost command will remove stored CHAP secrets 8.1.3.4 iSCSI
HU01821 S3 An attempt to upgrade a two-node enhanced stretched cluster fails due to incorrect volume dependencies 8.1.3.4 System Update, Data Reduction Pools
HU01849 S3 An excessive number of SSH sessions may lead to a node warmstart 8.1.3.4 System Monitoring
HU02028 S3 An issue, with timer cancellation, in the Remote Copy component may cause a node warmstart 8.1.3.4 Metro Mirror, Global Mirror, Global Mirror With Change Volumes
IT22591 S3 An issue in the HBA adapter firmware may result in node warmstarts 8.1.3.4 Reliability Availability Serviceability
IT25457 S3 Attempting to remove a copy of a volume which has at least one image mode copy and at least one thin/compressed copy in a Data Reduction Pool will always fail with a CMMVC8971E error 8.1.3.4 Data Reduction Pools
IT26049 S3 An issue with CPU scheduling may cause the GUI to respond slowly 8.1.3.4 Graphical User Interface
HU01828 S1 HIPER (Highly Pervasive): Node warmstarts may occur during deletion of deduplicated volumes, due to a timing-related issue 8.1.3.3 Deduplication
HU01847 S1 FlashCopy handling of medium errors, across a number of drives on backend controllers, may lead to multiple node warmstarts 8.1.3.3 FlashCopy
HU01850 S1 When the last deduplication-enabled volume copy, in a Data Reduction Pool, is deleted the pool may go offline temporarily 8.1.3.3 Data Reduction Pools, Deduplication
HU01852 S2 The garbage collection rate can lead to Data Reduction Pools running out of space even though reclaimable capacity is available 8.1.3.3 Data Reduction Pools
HU01858 S2 Total used capacity of a Data Reduction Pool, within a single I/O group, is limited to 256TB. Garbage collection does not correctly recognise this limit. This may lead to a pool running out of free capacity and going offline 8.1.3.3 Data Reduction Pools
HU01870 S2 LDAP server communication fails with SSL or TLS security configured 8.1.3.3 LDAP
HU01790 S3 On the "Create Volumes" page the "Accessible I/O Groups" selection may not update when the "Caching I/O group" selection is changed 8.1.3.3 Graphical User Interface
HU01815 S3 In Data Reduction Pools, volume size is limited to 96TB 8.1.3.3 Data Reduction Pools
HU01856 S3 A garbage collection process can time out waiting for an event in the partner node resulting in a node warmstart 8.1.3.3 Data Reduction Pools
HU01851 S1 HIPER (Highly Pervasive): When a deduplicated volume is deleted there may be multiple node warmstarts and offline pools 8.1.3.2 Data Reduction Pools, Deduplication
HU01837 S2 In systems, where a VVols metadata volume has been created, an upgrade to v8.1.3 or later will cause a node warmstart, stalling the upgrade 8.1.3.2 VVols, System Update
HU01835 S1 HIPER (Highly Pervasive): Multiple warmstarts may be experienced due to an issue with Data Reduction Pool garbage collection where data for a volume is detected after the volume itself has been removed 8.1.3.1 Data Reduction Pools
HU01840 S1 HIPER (Highly Pervasive): When removing large numbers of volumes each with multiple copies it is possible to hit a timeout condition leading to warmstarts 8.1.3.1 SCSI Unmap
HU01829 S2 An issue in statistical data collection can prevent Easy Tier from working with Data Reduction Pools 8.1.3.1 EasyTier, Data Reduction Pools
HU01708 S1 HIPER (Highly Pervasive): A node removal operation during an array rebuild can cause a loss of parity data leading to bad blocks 8.1.3.0 RAID
HU01867 S1 HIPER (Highly Pervasive): Expansion of a volume may fail due to an issue with accounting of physical capacity. All nodes will warmstart in order to clear the problem. The expansion may be triggered by writing data to a thin-provisioned or compressed volume. 8.1.3.0 Thin Provisioning, Compression
HU01877 S1 HIPER (Highly Pervasive): Where a volume is being expanded, and the additional capacity is to be formatted, the creation of a related volume copy may result in multiple warmstarts and a potential loss of access to data. 8.1.3.0 Volume Mirroring, Cache
HU01774 S1 After a failed mkhost command for an iSCSI host any I/O from that host will cause multiple warmstarts 8.1.3.0 iSCSI
HU01780 S1 Migrating a volume to an image-mode volume on controllers that support SCSI unmap will trigger repeated cluster recoveries 8.1.3.0 SCSI Unmap
HU01781 S1 An issue with workload balancing in the kernel scheduler can deprive some processes of the necessary resource to complete successfully, resulting in node warmstarts that may impact performance, with the possibility of a loss of access to volumes 8.1.3.0
HU01804 S1 During a system upgrade the processing required to upgrade the internal mapping between volumes and volume copies can lead to high latency impacting host I/O 8.1.3.0 System Update, Hosts
HU01809 S1 An issue in the handling of extent allocation in Data Reduction Pools can result in volumes being taken offline 8.1.3.0 Data Reduction Pools
HU01819 S1 During a system upgrade, an attempt to upgrade Flash card firmware in AE2 enclosures can cause an out of memory condition, resulting in a loss of access 8.1.3.0 System Update
HU01853 S1 In a Data Reduction Pool, it is possible for metadata to be assigned incorrect values leading to offline managed disk groups 8.1.3.0 Data Reduction Pools
HU01752 S2 A problem with the way IBM FlashSystem FS900 handles SCSI WRITE SAME commands (without the Unmap bit set) can lead to port exclusions 8.1.3.0 Backend Storage
HU01803 S2 The garbage collection process in Data Reduction Pool may become stalled resulting in no reclamation of free space from removed volumes 8.1.3.0 Data Reduction Pools
HU01818 S2 Excessive debug logging in the Data Reduction Pools component can adversely impact system performance 8.1.3.0 Data Reduction Pools
HU01460 S3 If another drive fails during an array rebuild, the high processing demand in RAID for handling many medium errors during the rebuild can lead to a node warmstart 8.1.3.0 RAID
HU01724 S3 An I/O lock handling issue between nodes can lead to a single node warmstart 8.1.3.0 RAID
HU01751 S3 When RAID attempts to flag a strip as bad, and that strip has already been flagged, a node may warmstart 8.1.3.0 RAID
HU01795 S3 A thread locking issue in the Remote Copy component may cause a node warmstart 8.1.3.0
HU01800 S3 Under some rare circumstances a node warmstart may occur whilst creating volumes in a Data Reduction Pool 8.1.3.0 Data Reduction Pools
HU01801 S3 An issue in the handling of unmaps for MDisks can lead to a node warmstart 8.1.3.0 SCSI Unmap
HU01820 S3 When an unusual I/O request pattern is received it is possible for the handling of Data Reduction Pool metadata to become stuck, leading to a node warmstart 8.1.3.0 Data Reduction Pools
HU01830 S3 Missing security-enhancing HTTP response headers 8.1.3.0 Security

4. Supported upgrade paths

Please refer to the Concurrent Compatibility and Code Cross Reference for Spectrum Virtualize page for guidance when planning a system upgrade.

5. Useful Links

Description Link
Support Website IBM Knowledge Center
IBM FlashSystem Fix Central V9000
Updating the system IBM Knowledge Center
IBM Redbooks Redbooks
Contacts IBM Planetwide