IBM Spectrum Scale APARs Resolved in 5.1.4.x

Fileset df doesn't report correct limit and usage for a fileset.

Symptom	Unexpected results
Environment	Linux
Trigger	Enable fileset df and set quota limit on a block but not on an inode.
Workaround	Upgrade the cluster version to 5.1.1.0.

5.1.4.1

Filesetdf

IJ40567

Restart of the pmsensors process in the container environment failed due to a race condition on a pid file. pmsensors remained down and did not collect perfmon statistics.

Symptom	mmhealth event pmsensors_down in CNSA.
Environment	Linux
Trigger	Failover of perfmon singleton node CLUSTER_PERF_SENSOR to another node requires pmsensors restart on old and new node.
Workaround	Manually start pmsensors process.

5.1.4.1

Performance monitoring, Sysmon

IJ40568

On s390x, the "mmvdisk simulate-dead" and "mmvdisk replace --prepare" commands power off the device even though the power control stanza is not specified. An error 5 can be seen.

Symptom	Error output/message
Environment	Linux (s390x)
Trigger	"mmvdisk simulate-dead", or "mmvdisk replace --prepare"
Workaround	On "Pdisk state is missing" or err 5, power on the device before further processing.

5.1.4.1

ESS, GNR

IJ40569

GPFS fails to process the kmipServerUri field in a remote key manager stanza in the RKM.conf file if provided as an IPv6 address, e.g., kmipServerUri = tls://[fd9a:f0d0:1002:11::31]:5696.

Symptom	Failure to read files from encrypted file systems/sets.
Environment	All
Trigger	None
Workaround	Use the hostname instead.

5.1.4.1

Security
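
The URI form in this entry wraps the IPv6 literal in square brackets so that the colons inside the address are not confused with the port separator. The following is a minimal sketch of splitting such a URI with the Python standard library. It only illustrates the bracketed format and is not how GPFS itself parses RKM.conf; the helper name parse_kmip_uri is hypothetical.

```python
from urllib.parse import urlsplit


def parse_kmip_uri(uri):
    """Split a tls://host:port URI into (host, port).

    Hypothetical helper for illustration only; IPv6 hosts must be bracketed.
    """
    parts = urlsplit(uri)
    if parts.scheme != "tls":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # .hostname strips the square brackets around an IPv6 literal.
    return parts.hostname, parts.port


host, port = parse_kmip_uri("tls://[fd9a:f0d0:1002:11::31]:5696")
print(host, port)
```

A hostname-based URI, as suggested in the workaround, goes through the same helper unchanged.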

IJ40570

Assert or SIGSEGV in writeAllocSumBlock after offline mmfsckx.

Symptom	Abend/Crash
Environment	All
Trigger	Offline mmfsckx
Workaround	None

5.1.4.1

FSCK

IJ37871

The mmlsquota command reports duplicate lines when the -C option is specified.

Symptom	Duplicate output
Environment	All
Trigger	Specify the Device argument that also belongs to the remote cluster in the -C argument.
Workaround	Specify the Device argument that does not belong to the -C ClusterName.

5.1.4.1

Admin Commands

IJ33574

Trace parameters set through the mmtracectl command do not keep the node classes.

Symptom	Unexpected behavior
Environment	All
Trigger	Set trace parameters with mmtracectl command.
Workaround	Explicitly set the trace parameters via mmchconfig command.

5.1.4.1

Admin

IJ40607

When recovery encounters extra entries within a directory that was deleted from the cache, it determines the mode of each remote entry and queues a Remove or Rmdir accordingly. Sometimes it gets the mode wrong and queues an Rmdir on a file instead of a Remove, which causes the queue to be stuck forever.

Symptom	Unexpected Behavior
Environment	Linux (AFM Gateway nodes)
Trigger	Recovery on AFM fileset with large number of removes/rmdirs to be captured by the recovery.
Workaround	None

5.1.4.1

AFM

IJ40608

A node delete in an ECE cluster will cause the declustered array to be stuck in critical rebuild, preventing the system from doing any data rebuild function.

Symptom	Unexpected Results/Behavior
Environment	Linux
Trigger	Remove an ECE node with mmvdisk.
Workaround	None

5.1.4.1

ESS, GNR

IJ40609

The SUID and SGID bits are not cleared after a successful write/truncate to a file by a non-owner.

Symptom	Unexpected Results/Behavior
Environment	Linux
Trigger	Create a file with the SUID and SGID bits set. As a non-owner or non-root user, write to the file with the write() system call or truncate the file with the truncate() system call.
Workaround	Ensure that only owners can write to an executable binary file that has the SUID/SGID bit set.

5.1.4.1

Core GPFS
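
To check whether a file is affected, the mode bits can be inspected after a non-owner write. This is a minimal sketch, assuming a POSIX system; the helper names setid_bits and check_after_write are hypothetical and not part of any GPFS tooling.

```python
import os
import stat


def setid_bits(mode):
    """Return (suid_set, sgid_set) for an st_mode value."""
    return bool(mode & stat.S_ISUID), bool(mode & stat.S_ISGID)


def check_after_write(path):
    """Report whether SUID/SGID are still set on path.

    Intended to be called after a non-owner, non-root write() or truncate();
    on a fixed system both values are expected to be False.
    """
    return setid_bits(os.stat(path).st_mode)
```

Writes by root do not exercise this path, so the check is only meaningful after a write from a non-owner, non-root user, as described in the trigger.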

IJ40573

Deadlock while accessing the data from AFM cascading relationship filesets because of token conflicts if the home fileset is AFM+COS enabled.

Symptom	Unexpected Results
Environment	Linux
Trigger	AFM cascading relationship with AFM+COS fileset.
Workaround	None

5.1.4.1

AFM

IJ40834

mmafmcosconfig options -gcs and -vhb do not work together. -vhb is used for virtual hosting of a bucket.

Symptom	Unexpected Results
Environment	Linux
Trigger	Accessing AFM+COS fileset which was created with both -gcs and -vhb options
Workaround	None

5.1.4.1

AFM

IJ40835

AFM Recovery procedure sometimes fails with error 112.

Symptom	Unexpected Behavior
Environment	Linux (AFM Gateway nodes)
Trigger	Running recovery on a fileset whose .ptrash directory has the local bit reset on it.
Workaround	Manually set the ptrash bit on the .ptrash directory (if it is found to be reset).

5.1.4.1

AFM

IJ40841

mmafmcosconfig options -gcs and -vhb do not work together. -vhb is used for virtual hosting of a bucket.

Symptom	Unexpected Results
Environment	Linux
Trigger	Accessing AFM+COS fileset which was created with both -gcs and -vhb options
Workaround	None

5.1.4.1

AFM

IJ40844

A readdir/read operation on an AFM+COS fileset does not preserve file times, causing a file time mismatch after the download operation.

Symptom	Unexpected Results
Environment	Linux
Trigger	AFM+COS download operation.
Workaround	None

5.1.4.1

AFM

IJ40845

Object readdir messages are not filtered if multiple readdir operations for the same directory come to the gateway node. This causes performance overhead and deadlocks.

Symptom	Long Waiters/Deadlock
Environment	Linux
Trigger	AFM+COS caching mode with multiple readdirs on the same uncached directory.
Workaround	None

5.1.4.1

AFM

IJ40894

The mmauth show command can be slow on a cluster that has authorized a large number of remote accesses to the file systems it owns.

Symptom	Performance
Environment	All
Trigger	Large number of remote accesses.
Workaround	None

5.1.4.1

Admin commands

IJ40947

When a user application uses the Fine Grain Write Sharing hint, GPFS_FINE_GRAIN_WRITE_SHARING, to overwrite an existing file that is also in a snapshot, the file content in the snapshot might not be preserved and might instead be changed to match the file content in the active file system.

Symptom	Data loss in the snapshot file when the GPFS_FINE_GRAIN_WRITE_SHARING hint is used to overwrite the file in the active file system.
Environment	Linux
Trigger	Use the GPFS_FINE_GRAIN_WRITE_SHARING hint to overwrite an existing file which is also in a snapshot.
Workaround	None

5.1.4.1

Core GPFS

IJ40959

Objects are not fully prefetched at the cache on reading the 4th block when afmPrefetchThreshold is set to 0 and the I/O pattern is random.

Symptom	Unexpected Behavior
Environment	Linux (AFM Gateway nodes)
Trigger	In RO/LU/IW/SW mode of operation with AFM COS as the backend, have an uncached file (in SW or IW mode, evict the file from the cache). Read 4 data blocks randomly on the file at the cache (make sure no 2 blocks are read sequentially).
Workaround	Read 4 blocks sequentially as compared to random.

5.1.4.1

AFM
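
The trigger for this entry (four random reads with no two blocks sequential) can be sketched as follows. This is only an illustrative reproduction aid under stated assumptions: the block size is supplied by the caller, the file must span at least seven blocks, and the helper names pick_nonadjacent_blocks and read_blocks are hypothetical.

```python
import os
import random


def pick_nonadjacent_blocks(total_blocks, count=4, seed=None):
    """Pick `count` distinct block indices, in random order, with no two adjacent."""
    # Only even indices are candidates, so any two picks differ by at least 2.
    candidates = range(0, total_blocks, 2)
    if len(candidates) < count:
        raise ValueError("file too small for the requested number of blocks")
    return random.Random(seed).sample(candidates, count)


def read_blocks(path, block_size, blocks):
    """Read each listed block with pread; return the bytes read per block."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return [len(os.pread(fd, block_size, b * block_size)) for b in blocks]
    finally:
        os.close(fd)
```

Restricting the picks to even block indices is a simple way to guarantee that no read immediately follows its neighbouring block, at the cost of never touching odd-numbered blocks.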

IJ40987

When rename/remove operations are performed on dependent filesets that are linked inside AFM independent filesets, and these operations are replicated to the remote site, the locally removed/renamed inodes are not reclaimed. As a result, more inodes are held in use than necessary.

Symptom	Unexpected Behavior
Environment	Linux (AFM Gateway nodes)
Trigger	Remove/Rename performed on dependent fileset inodes when the dependent fileset is linked under an AFM independent fileset.
Workaround	None

5.1.4.1

AFM

IJ41004

Files are not re-validated in an AFM cascading relationship because of readdir optimizations. This happens if the home fileset is AFM enabled with COS backend.

Symptom	Unexpected Results
Environment	Linux
Trigger	AFM cascading relationship with AFM+COS fileset.
Workaround	None

5.1.4.1

AFM

IJ41031

Changing to a new callhome group server node with the `mmcallhome group change` command may cause the daily/weekly schedules to no longer function properly.

Symptom	Callhome information may not be uploaded to IBM according to the established daily and weekly schedules. mmhealth may show callhome in a degraded or failed state if uploads to IBM are not occurring.
Environment	Linux
Trigger	This issue may occur when using the mmcallhome command to change the callhome group server to a different node.
Workaround	The node class CALLHOME_SERVERS may be manually changed to address this issue. The command `mmchnodeclass CALLHOME_SERVERS replace -N <new_server_node>` may be used to update the node class to reflect the new callhome group server node.

5.1.4.1

Callhome

IJ41032

A directory is not re-validated under some conditions in AFM caching modes, so the directory attributes are not fetched from the home.

Symptom	Unexpected Results
Environment	Linux
Trigger	AFM caching mode with directory updates.
Workaround	None

5.1.4.1

AFM

IJ41040

Mutex contention could lead to slow write performance on AIX when multiple threads simultaneously try to flush the same file that contains many blocks.

Symptom	Performance Impact/Degradation
Environment	AIX/Power, Windows (x86_64)
Trigger	Multiple threads invoking sync on the same file at the same time.
Workaround	None

5.1.4.1

Core GPFS

IJ41042

AFM gateway asserts when replicating the Rmdir operation on a dependent fileset.

Symptom	Assert
Environment	Linux
Trigger	AFM caching with dependent filesets.
Workaround	None

5.1.4.1

AFM

IJ41053

During recovery/resync, AFM cannot find the correct path for files under a mapped directory and fails with error 2 because the mapped directory length was skipped.

Symptom	Operation queue gets dropped.
Environment	Linux
Trigger	Error 2 occurs, the queue is dropped, and the cache state becomes Needresync.
Workaround	None

5.1.4.1

AFM COS

IJ41072

"mmsdrrestore --ccr-repair" is not removing CCR tiebreaker disks from the cluster configuration when those CCR tiebreaker disks aren't available when this command is executed. This happens only when the CCR nodes file '/var/mmfs/ccr/ccr.nodes' is not available on the quorum nodes.

Symptom	Unexpected results/behavior
Environment	All
Trigger	'/var/mmfs/ccr/ccr.nodes' not available on the quorum nodes in conjunction with CCR tiebreaker disks not accessible on those quorum nodes.
Workaround	None

5.1.4.1

CCR Admin command

IJ40902

Lookup on hardlinks fails intermittently on AFM cache filesets. This is due to a race between multiple threads performing the lookup of the same hardlink from different directories.

Symptom	Unexpected results
Environment	Linux
Trigger	AFM caching with hardlinks.
Workaround	None

5.1.4.1

AFM

IJ41074

mmvdisk pdisk list --rg all --not-ok -L prints extraneous information when all pdisks are ok. Specifically, a recovery group separator is printed with no pdisk information after it. This might be confusing, as the user might expect blank output if all disks are ok.

Symptom	Unexpected results
Environment	Linux
Trigger	Running the mmvdisk pdisk list command on a healthy system.
Workaround	None

5.1.4.1

ESS, GNR

IJ39624

On latest Cygwin (versions ≥ 3.3), an attempt to uninstall GPFS on Windows might display a dialog box complaining about access denied on uninstall.lnk. The dialog box presents options to Abort, Retry or Ignore the error. Ignoring the error bypasses the issue and results in a successful uninstall.

Symptom	Upgrade/Install failure.
Environment	Windows (x86_64)
Trigger	Cygwin version ≥ 3.3
Workaround	When presented with the dialog box complaining about uninstall.lnk, click on "Ignore" and that should let the uninstall complete. Then from an elevated Cygwin terminal: cd /usr/lpp/mmfs/support; chmod 777 uninstall.lnk; rm uninstall.lnk

5.1.4.0

Install, Upgrade