Info

As scientific fields such as Machine Learning, Deep Learning, Advanced Data Processing, Data Science, and other AI subfields evolve at a fast pace, educational practices must evolve with them. The GPU4EDU project aims to provide new generations of students with access to modern facilities that allow them to gain up-to-date knowledge and hands-on experience in the field of AI. Concretely, the project brings the technical equipment at Tilburg University, Tilburg School of Humanities and Digital Sciences (TSHD), to a level that matches the requirements of current and future AI education: it expands the school's hardware fleet with high-end, multi-GPU servers that are accessible remotely and securely. These servers are dedicated to the educational needs of students in courses taught within the Department of Cognitive Science and Artificial Intelligence (CSAI).

Deep Learning (DL) is a subfield of artificial intelligence that focuses on modeling non-linear problems using artificial neural networks. DL exploits complex neural network architectures that require substantial compute power. For this workload, a graphics processing unit (GPU) is more suitable than a central processing unit (CPU), largely because a GPU has far more cores than a CPU, along with a substantial volume of dedicated memory. Currently, researchers who work on or with DL models rely heavily on GPU machines; typically these are shared servers rather than personal workstations dedicated solely to DL work. Until now, the hardware available for education did not include GPU servers, and students had to resort to Google Cloud, other limited platforms, or paid solutions. The GPU4EDU project has secured a limited number of machines for student use and is a first step towards a larger educational infrastructure.

Tutorial

  1. Account: log in to aurometalsaurus.uvt.nl

    $ ssh [USERNAME]@aurometalsaurus.uvt.nl

where [USERNAME] should be replaced with your u-number.
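
For example, if your u-number were u123456 (a made-up placeholder), the command would be:

    $ ssh u123456@aurometalsaurus.uvt.nl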

  2. Access a node:

    $ srun --nodes=1 --pty /bin/bash -l
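
If you want a GPU available while testing interactively, you can request one explicitly; this assumes the --gres=gpu configuration shown later in this tutorial also applies to interactive jobs:

    $ srun --nodes=1 --gres=gpu:1 --pty /bin/bash -l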

  3. Set up an Anaconda environment

    1. Check whether conda is available:

      $ which conda

      If conda is not found, do the following:

      $ nano ~/.bashrc

      scroll down to the end of the file and add the following lines:

      # >>> conda initialize >>>
      # !! Contents within this block are managed by 'conda init' !!
      __conda_setup="$('/usr/local/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
      if [ $? -eq 0 ]; then
          eval "$__conda_setup"
      else
          if [ -f "/usr/local/anaconda3/etc/profile.d/conda.sh" ]; then
              . "/usr/local/anaconda3/etc/profile.d/conda.sh"
          else
              export PATH="/usr/local/anaconda3/bin:$PATH"
          fi
      fi
      unset __conda_setup
      # <<< conda initialize <<<
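
      Save the file, then reload your shell configuration so that the change takes effect:

      $ source ~/.bashrc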
    2. Create an environment:

      $ conda create -n [ENVNAME]
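
      For example, to create an environment with a specific Python version (the environment name and version here are placeholders):

      $ conda create -n sentiment python=3.10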

    3. Install required software:

      • Navigate to https://anaconda.org/search and search for the tools you require, then copy-paste the installation command

      • You can use conda or pip
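
      • For example, a typical DL setup installs PyTorch from its conda channel (the exact command may change over time, so verify it on anaconda.org):

        $ conda install pytorch -c pytorch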

    4. Do some small tests to ensure everything works. You can run code on the machine that you are connected to in order to verify that all software works.
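
      For example, assuming you installed PyTorch and are on a node with a GPU allocated, a quick sanity check is:

      $ python -c "import torch; print(torch.cuda.is_available())"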

  4. Create your job

    • Once you are done with testing, you can submit your job.

    • Create a shell script with a name you associate with your project, say `sentiment_analysis_1.sh`. A template is here.

    • Add the following lines at the top of your script:

      #!/bin/bash
      #SBATCH -p GPU # partition (queue)
      #SBATCH -N 1 # number of nodes
      #SBATCH -t 0-36:00 # time (D-HH:MM)
      #SBATCH -o slurm.%N.%j.out # STDOUT
      #SBATCH -e slurm.%N.%j.err # STDERR
      #SBATCH --gres=gpu:1

      These are settings for SLURM. For more information on SLURM and tutorials, see the SLURM documentation.

    • If the script you create with the above header cannot access your conda environment, add the following after the header and before any of your own commands:

      if [ -f "/usr/local/anaconda3/etc/profile.d/conda.sh" ]; then
          . "/usr/local/anaconda3/etc/profile.d/conda.sh"
      else
          export PATH="/usr/local/anaconda3/bin:$PATH"
      fi
    • After that, add a command to activate the conda environment you created:

      source activate [ENVNAME]

    • Then, add the commands that will invoke your project’s execution, for example navigating to the directory where your scripts need to be executed.

    • Always double check whether the path is correct.
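
    • Putting the pieces together, a complete job script might look like the sketch below; the environment name, directory, and Python script are placeholders for your own project:

      #!/bin/bash
      #SBATCH -p GPU # partition (queue)
      #SBATCH -N 1 # number of nodes
      #SBATCH -t 0-36:00 # time (D-HH:MM)
      #SBATCH -o slurm.%N.%j.out # STDOUT
      #SBATCH -e slurm.%N.%j.err # STDERR
      #SBATCH --gres=gpu:1

      # Make conda available inside the batch job
      if [ -f "/usr/local/anaconda3/etc/profile.d/conda.sh" ]; then
          . "/usr/local/anaconda3/etc/profile.d/conda.sh"
      else
          export PATH="/usr/local/anaconda3/bin:$PATH"
      fi

      # Activate the environment created earlier
      source activate sentiment

      # Navigate to the project directory and run the code (placeholder paths)
      cd ~/projects/sentiment_analysis
      python sentiment_analysis_1.py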

  5. Submit your job

    • Once you are done with preparing your script, submit it:

      $ sbatch [MYSCRIPT]

      where [MYSCRIPT] can be sentiment_analysis_1.sh as per the above example.

    • While the script is running, it will generate two files, slurm.[NODE].[JOBNUMBER].err and slurm.[NODE].[JOBNUMBER].out, where [NODE] and [JOBNUMBER] are the name of the node running your script and the number of the submitted job, respectively (for example, slurm.cerulean.118.err and slurm.cerulean.118.out). The .err file contains logging information such as warnings and errors; the .out file contains the actual output that you have asked your script to generate.

    • Watch out! Graphics (e.g., plots) will not be displayed, but they can be saved to files.
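
    • You can keep an eye on your job with standard SLURM commands; in this sketch, replace [USERNAME] and the job number with your own values:

      $ squeue -u [USERNAME]            # list your pending and running jobs
      $ tail -f slurm.cerulean.118.out  # follow your job's output as it is written
      $ scancel [JOBNUMBER]             # cancel a job if needed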

  6. Copy data: you can copy large or small files via scp or with FileZilla (download the client for your OS).
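
    For example, to copy a local data file to the server and to fetch a result file back (the file names and paths are placeholders):

      $ scp data.csv [USERNAME]@aurometalsaurus.uvt.nl:~/
      $ scp [USERNAME]@aurometalsaurus.uvt.nl:~/results.txt .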

  7. Enjoy #GPU4EDU!

Practical AI seminars