Assignment 01
In this assignment, you will learn how to run deep learning experiments using the ElastiCluster-ClusterJob computing model. In particular, each of you will build your own personal SLURM cluster on Google Compute Engine (GCE) using elasticluster and then run massive computational experiments using clusterjob.
Please follow the steps below to set up your cluster and run experiments. This document covers only the details of setting up your cluster and testing that it works properly with GPUs. Once these steps are completed, you should conduct the experiments assigned to you on Canvas. The details of the experiments are available only via the Stanford Canvas website, to students who are taking this course for credit.
Acknowledgements
- We would like to thank the Google Cloud Platform Education Grants Team for their generosity and kindness in providing the Stats285 course with a cloud computing grant.
- We would like to thank the ElastiCluster team, especially Dr. Riccardo Murri, for their help and collaboration on this project.
FAQ
Please visit the frequently asked questions before you submit a question on our Google group.
ElastiCluster-ClusterJob Computing Model
- Claim Your Google Cloud Credit
- Implement ElastiCluster-ClusterJob Model
- Run Your Experiment and Submit CJ Package to Canvas
Part-1: Claim Your Google Cloud Credit
- Claim your $300 Google Compute Credit. You will also get $300 of free credit from Google Cloud as a first-time user by setting up your Billing Account. However, this free $300 allows only very limited, CPU-only computation.
Part-2: Implement ElastiCluster-ClusterJob Model
Follow the instructions in Painless Computing Models for Ambitious Data Science to implement the ElastiCluster-ClusterJob model.
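Since part of the setup is testing that the cluster works properly with GPUs, a quick check on a compute node can be sketched as follows. This is only a minimal illustration and not part of the official instructions: it assumes the NVIDIA driver tools (`nvidia-smi`) are installed on the node, and the function names `parse_gpu_lines` and `list_gpus` are hypothetical helpers chosen for this example.

```python
import shutil
import subprocess

# Standard nvidia-smi query: one CSV line per GPU, without a header row.
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_lines(text):
    """Turn nvidia-smi CSV output into a list of (name, memory) tuples."""
    gpus = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, _, memory = line.partition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus():
    """Return the GPUs visible on this node, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []  # assumption: no nvidia-smi on PATH means no usable GPU
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []  # the tool is present but could not talk to the driver
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus:
        for index, (name, memory) in enumerate(gpus):
            print(f"GPU {index}: {name} ({memory})")
    else:
        print("No GPU detected on this node")
```

Running the script on a compute node (rather than on the frontend, which may have no GPU attached) should list one line per GPU. If it reports that no GPU was detected, the node's driver installation is the first thing to check.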
Part-3: Run Your Experiment and Submit CJ Package to Canvas
Please run the deep learning experiment assigned to you and submit the reproducible CJ package on Canvas.