AWS continues to tackle Redshift issues with new cluster resizing fix
AWS says it has modernised the architecture of its widely used Redshift database to avoid certain resizing efforts taking “clusters” (or groups of computing resources) offline for many hours.
The fix applies to Redshift classic resize, which is used to resize a cluster when customers need to change the instance type or move to a configuration that cannot be supported by “elastic resize.”
(Elastic resize launched in 2018, letting customers add nodes to get better performance or storage for demanding workloads, or remove nodes to save cost. It was designed to work with minimal disruption to ongoing read and write queries, but certain changes, as above, still require the slower classic resize.)
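For readers who want to see where the classic/elastic choice sits in practice, here is a minimal sketch in Python. It assumes the AWS SDK for Python (boto3) and its Redshift `resize_cluster` call, which accepts a boolean `Classic` flag. The helper names, cluster identifier, and region are hypothetical and do not come from AWS's announcement.

```python
def build_resize_request(cluster_id, node_type, number_of_nodes, classic):
    """Assemble keyword arguments for a Redshift ResizeCluster call.

    Simplification (an assumption of this sketch): only multi-node
    targets with at least two nodes are handled.
    """
    if number_of_nodes < 2:
        raise ValueError("this sketch only covers multi-node clusters")
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "multi-node",
        "NodeType": node_type,
        "NumberOfNodes": number_of_nodes,
        # True requests a classic resize; False requests an elastic resize.
        "Classic": classic,
    }


def submit_resize(request, region_name="us-east-1"):
    """Send the request to AWS. Needs boto3 and valid credentials."""
    import boto3  # third-party; imported lazily so the builder runs without it

    client = boto3.client("redshift", region_name=region_name)
    return client.resize_cluster(**request)


# Hypothetical example: change node type, which calls for classic resize.
request = build_resize_request("example-cluster", "ra3.4xlarge", 4, classic=True)
```

Keeping the request builder separate from the network call makes it possible to inspect or log the parameters before anything is sent to AWS.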
“Previously [classic resize] can take the cluster offline for many hours during resize, but now the cluster can typically be available to process queries in minutes,” AWS said on July 14, adding that “clusters can also [now] be resized when restoring from a snapshot and [but] in those cases there could be restrictions.”
Further changes include the ability, via either the API or the Command Line Interface, to “restore an encrypted cluster from an unencrypted snapshot or change the encryption key.” For example, customers can “encrypt an unencrypted cluster with an AWS KMS [Key Management Service] key faster by specifying an AWS KMS key ID when modifying a cluster. You can also restore an AWS KMS-encrypted cluster from an unencrypted snapshot” (only on RA3 nodes).
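As a rough illustration of “specifying an AWS KMS key ID when modifying a cluster,” the sketch below uses the boto3 Redshift `modify_cluster` call with its `Encrypted` and `KmsKeyId` parameters. The helper names, cluster identifier, key alias, and region are hypothetical placeholders, and whether a given cluster qualifies for the faster path is not something this code can check.

```python
def build_encrypt_request(cluster_id, kms_key_id):
    """Assemble keyword arguments for a Redshift ModifyCluster call
    that turns on encryption with a specific KMS key."""
    if not kms_key_id:
        raise ValueError("a KMS key ID is required")
    return {
        "ClusterIdentifier": cluster_id,
        "Encrypted": True,
        "KmsKeyId": kms_key_id,
    }


def submit_encrypt(request, region_name="us-east-1"):
    """Send the request to AWS. Needs boto3 and valid credentials."""
    import boto3  # third-party; imported lazily so the builder runs without it

    client = boto3.client("redshift", region_name=region_name)
    return client.modify_cluster(**request)


# Hypothetical cluster name and key alias, for illustration only.
encrypt_request = build_encrypt_request("example-cluster", "alias/example-key")
```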
The move comes after numerous early customers abandoned Redshift in favour of other cloud-native data warehouses over latency issues and other criticisms.
AWS has made notable efforts in recent years to improve Redshift performance and UX. This week’s cluster resizing changes follow a 2020 update that doubled “cold query” performance: the speed at which queries are processed when they need to be compiled. That update also introduced an unlimited cache for compiled objects, to increase cache hits when mission-critical queries are submitted to Redshift.
In April 2021 AWS also released AQUA, a new hardware-accelerated cache available at no extra cost on Redshift RA3 ra3.4xl and ra3.16xl node types. AQUA puts AWS Nitro chips (adapted to speed up data encryption and compression) and FPGAs next to fast SSD storage, so that compression, encryption, and tasks like scans, aggregates, and filtering happen next to storage, for a claimed 10-fold improvement in query performance.