HPE software update wipes 77TB from Japanese supercomputer
A Japanese university inadvertently wiped a colossal 77TB of research data from its supercomputer after a software update pushed by Hewlett Packard Enterprise (HPE) caused a script to go rogue and delete backups.
Kyoto University said 34 million files from 14 research groups had been deleted, and nearly a third of those groups will not get their data back after the incident, which it blamed squarely on the HPE supercomputing system.
A software update error meant the Cray/HPE system deleted almost all files older than 10 days held in large-capacity disc storage backup, rather than just log files. The university had initially feared that up to 100TB was permanently lost.
Hewlett Packard said in a letter published by Kyoto University on December 29, 2021 that it took “100% responsibility” for the issue – an approach that drew some wry merriment from Japanese Twitter users who noted that they looked forward to companies taking “85%” or “31.3%” responsibility for any issues in future.
The incident occurred between December 14 and 16, Kyoto University said.
Supercomputer files deleted: HPE “not aware of the side effects”
The incident happened after an update to a script used on the supercomputer that had been rolled out to “improve visibility and readability”, a letter of apology from HPE published by Kyoto University said.
HPE said: “The backup script includes a find command to delete log files older than 10 days. In addition to functional improvement of the script, the variable name passed to the find command for deletion was changed to improve visibility and readability.” (Google and DeepL translate, with a light edit by The Stack.)
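To make the cleanup step concrete, here is a minimal Python sketch of the same kind of job: deleting files older than 10 days from a log directory. It is not HPE's script, which has not been published in full; the function name `delete_old_logs` and its parameters are hypothetical. The point it illustrates is the guard at the top: if the directory value is empty or undefined, a path built from it can resolve to somewhere far broader than the log directory, so the function refuses to run rather than guess.

```python
import time
from pathlib import Path

MAX_AGE_DAYS = 10  # retention period quoted in HPE's letter


def delete_old_logs(log_dir, max_age_days=MAX_AGE_DAYS):
    """Delete regular files under log_dir older than max_age_days.

    Hypothetical helper for illustration; returns the list of deleted paths.
    """
    # Guard: an unset or empty directory must never widen the search.
    # (Path("") would otherwise resolve to the current working directory.)
    if log_dir is None or not str(log_dir).strip():
        raise ValueError("log_dir is empty or undefined; refusing to delete")

    root = Path(log_dir).resolve()
    if not root.is_dir():
        raise ValueError(f"{root} is not a directory")
    if root == Path(root.anchor):
        raise ValueError("refusing to clean the filesystem root")

    cutoff = time.time() - max_age_days * 86400

    # Collect candidates first, then delete, so the walk is not disturbed.
    candidates = [
        p for p in root.rglob("*")
        if not p.is_symlink() and p.is_file() and p.stat().st_mtime < cutoff
    ]
    for path in candidates:
        path.unlink()
    return candidates
```

In a shell script, the rough equivalent of this guard is to make the script abort on unset variables instead of silently expanding them to an empty string.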
The company added: “However, there was a lack of consideration in the release procedure of this modified script. We were not aware of the side effects of this behavior and released the [updated] script, overwriting [a bash script] while it was still running. This resulted in the reloading of the modified shell script in the middle of the execution, resulting in undefined variables. As a result, the original log files in /LARGE0 [backup disc storage] were deleted instead of the original process of deleting files saved in the log directory.”
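The release problem HPE describes stems from how bash handles scripts: it reads them from disk incrementally as it executes, so overwriting a script's contents in place can change what an already-running instance reads next. A common way to avoid this is to write the new version to a temporary file and rename it over the old one, so that a running process keeps reading the old file while new invocations pick up the new one. The sketch below shows that pattern in Python, assuming a POSIX filesystem; `install_script_atomically` is a hypothetical helper, not part of any HPE tooling.

```python
import os
import tempfile


def install_script_atomically(target_path, new_contents, mode=0o755):
    """Replace target_path with new_contents without editing it in place.

    Hypothetical helper for illustration, assuming a POSIX filesystem.
    """
    target_dir = os.path.dirname(os.path.abspath(target_path))

    # The temp file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".tmp-script-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk
        os.chmod(tmp_path, mode)

        # Atomic swap: the old file's contents are never modified, so a
        # process that already has it open continues to see the old script.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The design choice here is to never modify the bytes of a file that might be executing; the rename only changes which file the name points to.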
Kyoto University is one of Japan’s premier research institutions, with globally recognised work on chemistry, immunotherapy, materials science and more. It was not immediately clear which four research groups had permanently lost their data, which will have been expensive and time-consuming to prepare and run.