aws/aws-parallelcluster
AWS ParallelCluster is an AWS-supported open source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
Python · Apache-2.0
Issues
- 14
spot jobs not restarted after spot machine preempted.
#6641 opened by gwolski - 2
Feature Request: Allow Configuration of TreeWidth
#6366 opened by isc-lee - 6
- 6
terraform-provider-aws-parallelcluster fails on parallelcluster 3.11.0 with login nodes enabled
#6489 opened by kondakovm - 3
parallelcluster GUI does not allow viewing of minor versions, but yet complains about incompatible major version.
#6501 opened by gwolski - 5
No S3 permissions for LoginNodes CustomActions
#6507 opened by michaelmayer2 - 1
Feature request for restarting slurmctld upon failure
#6538 opened by gwolski - 4
- 4
Slurmqueue naming validation errors
#6548 opened by QuintenSchrevens - 22
PCluster 3.10.1 and 3.11.0 Slurm compute daemon node configuration differs from hardware
#6449 opened by stefan-maxar - 12
3.11.0 start up time longer than 3.9.1
#6479 opened by gwolski - 2
Unable to bootstrap cluster stacks using custom AMIs built from the ParallelCluster-blessed base ubuntu2204 image
#6552 opened by rmarable-flaretx - 2
- 1
problematic /etc/profile.d/zippy_efa.sh
#6642 opened by ssyed85 - 1
Feature Request: Dynamically update MinCount parameter
#6587 opened by Waqiah - 10
- 2
I tried to test this out today
#6555 opened by Perfect10NickTailor - 4
pcluster image build fails with RHEL 8 and 9
#6567 opened by mclouds2020 - 2
append domain-name in /etc/dhcp/dhclient.conf has extra '.' in appended domainname
#6594 opened by gwolski - 2
ParallelCluster 3.10.1 fails to setup accounting for slurm cluster (port 6819 unreachable)
#6398 opened by ElDeveloper - 6
(3.11.x) Job submission failure with Amazon Linux 2023
#6571 opened by hgreebe - 4
- 1
NFS - Disable V2 and V3
#6622 opened by maestro7879 - 0
Does `qos` configuration works on a AWS Pcluster?
#6621 opened by enlznep - 15
3.11.1 slurmctld core dumps with error message: double free or corruption (!prev)
#6529 opened by gwolski - 1
3.11.1 pcluster list-official-images shows two images for all OSes and architectures.
#6564 opened by gwolski - 3
wiki instructions for Option 2 fix of (3.8.0 ‐ 3.9.3) ParallelCluster Build Image Failing during Installation of Minitar Ruby Gem Dependency aren't quite right.
#6530 opened by gwolski - 3
(3.8.0 - 3.9.3) ParallelCluster Build Image Failing during Installation of Minitar Ruby Gem Dependency
#6405 opened by himani2411 - 5
- 3
v 3.11.0 cloudformation cannot delete the stack for a custom pcluster build_image run.
#6478 opened by gwolski - 0
(3.9.1 - latest) Speculative Return Stack Overflow (SRSO) mitigations introducing potential performance impact on some AMD processors
#6496 opened by hanwen-pcluste - 2
(3.11.0) Job submission failure caused by race condition in Pyxis configuration
#6459 opened by gmarciani - 2
srun --cpus-per-task=1 causing job to be run twice.
#6482 opened by gwolski - 4
Provisioning cluster with a cloudformation custom resource FAILS on warnings.
#6407 opened by snemir2 - 2
Cluster with an external Slurmdbd accounting
#6362 opened by mclouds2020 - 2
Question: Slurm MinCount is this min nodes running or min nodes available to take jobs?
#6417 opened by francisreyes-tfs - 3
Request: increase verbosity of pcluster cli when `Importing CDK` times out
#6451 opened by MDBeudekerCN - 1
[3.11] elasticloadbalancing:DescribeTags permission is missing when run a cluster with login nodes
#6483 opened by kondakovm - 2
parallelcluster 3.9.x failing to build with custom image, even though it worked before and it works with 3.11.0.
#6477 opened by gwolski - 2
Enhancement Request: Custom Slurm GRES Types
#6388 opened by joehellmersNOAA - 0
(3.9.0-current) Cluster creation fails on Rocky 9.4
#6442 opened by gmarciani - 2
Image Building on Pcluster 3.9.3/3.9.2
#6390 opened by QuintenSchrevens - 1
[Feature request] - Support for Intel MKL
#6403 opened by Waqiah - 4
NFS Mount Failure on Compute Nodes in ParallelCluster 3.10.1 with Slurm Scheduler
#6396 opened by aishwaryyasarkar - 0
(3.9.0-3.10.1) Cluster update intermittently fails because some compute nodes don’t execute update procedure
#6412 opened by hanwen-pcluste - 2
NVidia driver installation
#6416 opened by jagga13 - 4
- 2
Pcluster build-image of Rocky 8.9 creating broken AMI
#6397 opened by jagga13 - 0
Maybe some bugs I have met
#6393 opened by wbadyx - 0