Program crashes on login node with message
When running commands or editing files on the login node, users may
notice that their processes end abruptly with the error message
Processes with certain names, such as matlab, are automatically killed
on the login node because they may consume excessive computational
resources. Unfortunately, this also means that a benign process, such
as editing a file with the word matlab as part of its name, could also
be killed.
Solution: Request an interactive session on a compute node, and then
run the application or command there.
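To see why a harmless editor session can be caught, here is a minimal sketch of name-based matching. The exact rule used on the login node is not described in this guide, so the substring check, the `BLOCKED_NAMES` list, and the `would_be_killed` function are all hypothetical and serve only as illustration.

```python
# Hypothetical blocklist: the real list of names is not given in this guide.
BLOCKED_NAMES = ["matlab"]


def would_be_killed(command_line, blocked=BLOCKED_NAMES):
    """Return True if any blocked name appears anywhere in the command line.

    This is an assumed matching rule (plain substring search), not the
    cluster's documented behaviour.
    """
    return any(name in command_line for name in blocked)


if __name__ == "__main__":
    # The application itself matches, as intended.
    print(would_be_killed("matlab -nodisplay -r run_model"))
    # A benign editor session also matches, because of the file name.
    print(would_be_killed("vim matlab_notes.txt"))
    # An unrelated file name does not match.
    print(would_be_killed("vim notes.txt"))
```

Under this assumed rule, renaming the file would avoid the match, but running the command in an interactive session on a compute node sidesteps the problem entirely.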
Home or scratch directories are sluggish or unresponsive
The /home and /scratch directories can become slow or unresponsive
when one or more users read or write large amounts of data to them.
When this happens, all users are affected, because these filesystems
are shared by all nodes of the cluster.
To avoid this issue, keep in mind the following:
Never use the /home directory as the working directory for jobs that
read/write data. If too many jobs read/write data to the /home
directory, it can render the cluster unusable for all users. Copy any
input data to one of the /scratch directories and use that /scratch
directory as the working directory for jobs. Periodically move
important data back to the /home directory.
Try to use /local_scratch whenever possible. Unlike the /scratch
directories, which are shared by all nodes, each node has its own
/local_scratch directory. It is much faster to read/write data to
/local_scratch, and doing so will not affect other users. See the
example [here](https://www.palmetto.clemson.edu/palmetto/userguide_howto_choose_right_filesystem.html).
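The advice above amounts to a stage-in, compute, stage-out pattern: copy the inputs to scratch once, do all heavy reading and writing there, then copy the results back once. The following is a minimal sketch of that pattern, not the cluster's official workflow. The names `run_in_scratch` and `count_lines` are hypothetical, and it assumes the scratch root (for example /local_scratch) is writable by the job.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(inputs, scratch_root, results_dir, work):
    """Stage inputs into a private scratch directory, run `work` there,
    then copy the files `work` wrote back to `results_dir`."""
    results_dir = Path(results_dir)
    # A unique subdirectory avoids clashes with other jobs on the same node.
    job_dir = Path(tempfile.mkdtemp(prefix="job_", dir=Path(scratch_root)))
    try:
        in_dir = job_dir / "input"
        out_dir = job_dir / "output"
        in_dir.mkdir()
        out_dir.mkdir()
        for src in inputs:
            shutil.copy2(src, in_dir)  # single bulk read from shared storage
        work(in_dir, out_dir)  # all heavy I/O happens inside scratch
        results_dir.mkdir(parents=True, exist_ok=True)
        for item in out_dir.iterdir():
            if item.is_file():
                shutil.copy2(item, results_dir)  # single bulk write back
    finally:
        # Remove the scratch copy even if `work` failed.
        shutil.rmtree(job_dir, ignore_errors=True)
    return results_dir


def count_lines(in_dir, out_dir):
    """Example workload: write the line count of each input file."""
    for src in in_dir.iterdir():
        n = len(src.read_text().splitlines())
        (out_dir / (src.name + ".count")).write_text(str(n))
```

In a real job, `scratch_root` would be /local_scratch (or one of the /scratch directories) and `results_dir` the place where important results are kept. Because the scratch copy is deleted at the end, anything not written to the output directory is lost.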