I am using a software package called DJFFSTB, and it takes 3 hours to run on my desktop. How do I get it running on, say, the Google cloud? I need other people to be able to use it as well, so how do I do that? And how do I determine whether it is cost-effective?
This example uses Google Cloud Platform (GCP) terminology, but it applies equally well to any cloud. Deciding which cloud to try is a matter of existing inclination (if you have any) or of exploring the factors that differentiate the clouds. The bottom line is that you really can’t go wrong. I’m going to provide an answer that includes notes on how you would bootstrap the answer yourself, which is primarily about formulating the right search string.
To get into this common ‘evaluation’ task we’re going to need two generic cloud terms: Instance and Image. An Instance is a Virtual Machine (VM), the kind you rent by the hour (on Google, billing is prorated by the second, with a one-minute minimum). Back in your lab you already have a desktop machine with some specifications, so with those in mind the first thing to do is type ‘gcp instance types’ into your browser search bar. A page listing the available machine types comes up. You’ll notice that on Google cloud the VM or instance service is called Compute Engine; every cloud has its own terminology for its services. There are many options here. Look through the page to identify a machine type that is perhaps a bit more powerful than your desktop.
Let’s suppose you like the C2D compute-optimized instance, so the next question is ‘How much is this instance going to cost?’ Again search: ‘c2d compute engine cost’ to get to a pricing page. There we find the C2D standard machine types table with a number of options. Suppose the c2d-standard-32 is a bit more powerful than your desktop. It costs $1.45 per hour on demand, which is the most expensive way to go. You can explore Spot instances ($0.35 per hour) and other options, but for now this is a reasonable price for the feasibility test we have in mind.
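To make that price concrete, here is a minimal sketch of the arithmetic for a single test run. The two hourly rates are the ones quoted above; the assumption that the cloud run takes the same 3 hours as the desktop run is ours, and the figures ignore disk and network charges.

```python
# Hourly rates quoted in the text for the c2d-standard-32 machine type.
ON_DEMAND_PER_HOUR = 1.45
SPOT_PER_HOUR = 0.35

# Assumption: the cloud run takes as long as the desktop run.
RUN_HOURS = 3.0


def run_cost(hours: float, rate_per_hour: float) -> float:
    """Dollar cost of keeping one instance running for `hours`."""
    return round(hours * rate_per_hour, 2)


on_demand_cost = run_cost(RUN_HOURS, ON_DEMAND_PER_HOUR)
spot_cost = run_cost(RUN_HOURS, SPOT_PER_HOUR)

print(f"on demand: ${on_demand_cost:.2f} per run")
print(f"spot:      ${spot_cost:.2f} per run")
```

Under these assumptions a single on-demand test run costs a few dollars, which is why the on-demand price is fine for a feasibility test even though it is the most expensive option.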
Now you want access to the Google cloud to try out a test run of the DJFFSTB software. Search ‘google cloud free trial’ to find the sign-up page, where you can get $300 in credits to use for 90 days. This looks great, and you can start a paid account later on, but be aware: try-it-out accounts sometimes restrict which instance types you can use, so you’ll have to jump in and see how that goes.
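As a rough check on how far the trial goes, the sketch below divides the $300 credit by the on-demand rate quoted earlier. It assumes the trial allows this machine type, that each run takes 3 hours, and that nothing else (disk, network, idle time) draws on the credit, so treat the result as an upper bound.

```python
import math

TRIAL_CREDIT = 300.00       # dollars, from the free trial offer
ON_DEMAND_PER_HOUR = 1.45   # dollars per hour, quoted earlier for c2d-standard-32
RUN_HOURS = 3.0             # assumption: same duration as the desktop run

# Instance hours the credit would cover at the on-demand rate.
hours_covered = TRIAL_CREDIT / ON_DEMAND_PER_HOUR

# Whole test runs that fit into those hours.
runs_covered = math.floor(hours_covered / RUN_HOURS)

print(f"about {hours_covered:.0f} instance hours, or {runs_covered} full runs")
```

The 90-day expiry is a separate limit: unused credit does not carry over, so the practical bound may be time rather than money.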
Once you have your trial credentials, sign in to the GCP console and look for the Compute Engine service. A wizard walks you through the steps of getting your instance running, and you log in directly through the console or using ssh. You can move data and/or code in using sftp; for ssh and sftp you will need the IP address of the instance you are given. Also make sure you stop the instance at the end of the day so you don’t pay for it overnight when it is not doing anything. That’s stop, not terminate (the GCP console calls this Delete). Terminating will blow away the instance; stopping just turns it off, although you continue to pay a small amount for the stopped instance’s disk storage.
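If you prefer a terminal to the console, the end-of-day stop and next-morning start can be sketched as follows. This assumes the Google Cloud CLI (gcloud) is installed and signed in; the instance name and zone are hypothetical placeholders. The script only builds and prints the commands, so nothing is changed until you run them yourself.

```python
import shlex

INSTANCE = "djffstb-test"   # hypothetical instance name
ZONE = "us-central1-a"      # hypothetical zone; use the one your instance is in


def instance_command(action: str, instance: str, zone: str) -> list[str]:
    """Build a `gcloud compute instances <action>` command as an argument list."""
    if action not in ("start", "stop"):
        # Deliberately no "delete" here: that would remove the instance.
        raise ValueError(f"unsupported action: {action}")
    return ["gcloud", "compute", "instances", action, instance, "--zone", zone]


stop_cmd = instance_command("stop", INSTANCE, ZONE)
start_cmd = instance_command("start", INSTANCE, ZONE)

print(shlex.join(stop_cmd))    # run this at the end of the day
print(shlex.join(start_cmd))   # run this when you come back
```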
Building your compute environment on the GCP Compute Engine instance is analogous to building it on a new computer that you just purchased. One detail to address is whether DJFFSTB is commercial software that requires a license to run. If so, the vendor will typically have instructions on how to use it on a cloud machine.
Now suppose you configure this (Compute Engine) instance and your test of DJFFSTB works and even takes a little less time than on your desktop. Now you wonder how you’d go about making this compute environment available to other people. This is where the second term, Image, comes into play. You create an image of your working instance. This is like making a massive zip file out of the entire operating system, including your code and data. The image goes into a storage location, and once it is there you can start up another machine from that image. If you like, you could even start up seven more machines from that one image. They can be more or less powerful than the instance you started with.
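The image step can also be sketched with the Google Cloud CLI. The example below assumes gcloud is installed, that the working instance’s boot disk is named `djffstb-test`, and that the zone, image name, and worker names shown are acceptable; all of these are hypothetical placeholders. It is generally safest to stop the instance before imaging its disk. The script only prints the commands.

```python
import shlex

ZONE = "us-central1-a"          # hypothetical zone
SOURCE_DISK = "djffstb-test"    # hypothetical: boot disk of the working instance
IMAGE = "djffstb-image"         # hypothetical name for the new image
MACHINE_TYPE = "c2d-standard-32"


def create_image_command(image: str, source_disk: str, zone: str) -> list[str]:
    """Command that captures a disk as a reusable image."""
    return ["gcloud", "compute", "images", "create", image,
            "--source-disk", source_disk, "--source-disk-zone", zone]


def launch_command(name: str, image: str, zone: str, machine_type: str) -> list[str]:
    """Command that starts a new instance from an image in the same project."""
    return ["gcloud", "compute", "instances", "create", name,
            "--zone", zone, "--machine-type", machine_type, "--image", image]


commands = [create_image_command(IMAGE, SOURCE_DISK, ZONE)]
# Seven more machines from the one image, as in the text.
for i in range(1, 8):
    commands.append(launch_command(f"djffstb-worker-{i}", IMAGE, ZONE, MACHINE_TYPE))

for cmd in commands:
    print(shlex.join(cmd))
```

Changing `MACHINE_TYPE` per worker is how you would start machines that are more or less powerful than the original.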
Images are shareable. You can grant an image public read permission (for example) so that anyone can use it as the basis for starting up their own VM. This is nice because they do not have to go through the configuration steps you did. They presumably access your image from their own GCP account, so they are paying for the instance they start from it. You will want to choose a collaborator to test this out so you can iron out any wrinkles. There are other permission options as well, so you can be selective about who can use your image.
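A selective share with one collaborator might look like the sketch below. It assumes the Google Cloud CLI and uses the `roles/compute.imageUser` role; the image name, project IDs, email address, and zone are hypothetical placeholders. The first command is run by you, the second by the collaborator in their own project. The script only prints the commands.

```python
import shlex

IMAGE = "djffstb-image"                       # hypothetical image name
OWNER_PROJECT = "my-lab-project"              # hypothetical: project that holds the image
COLLABORATOR = "user:colleague@example.edu"   # hypothetical collaborator account


def share_image_command(image: str, member: str) -> list[str]:
    """Owner side: grant one member permission to use the image."""
    return ["gcloud", "compute", "images", "add-iam-policy-binding", image,
            "--member", member, "--role", "roles/compute.imageUser"]


def launch_shared_command(name: str, image: str, image_project: str, zone: str) -> list[str]:
    """Collaborator side: start an instance from an image owned by another project."""
    return ["gcloud", "compute", "instances", "create", name, "--zone", zone,
            "--image", image, "--image-project", image_project]


share_cmd = share_image_command(IMAGE, COLLABORATOR)
launch_cmd = launch_shared_command("djffstb-copy", IMAGE, OWNER_PROJECT, "us-central1-a")

print(shlex.join(share_cmd))
print(shlex.join(launch_cmd))
```

Because the collaborator creates the instance in their own project, the running cost lands on their account, not yours.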
This is the conceptual starting point. There are more options and terms to learn in dealing with data storage and data access.
The cloud console or portal should be able to tell you how much you have spent on your virtual machine / instance / compute engine. You can use this and a little more research into Spot instances to get a sense of what it will cost to run your DJFFSTB models at scale. That’s the next stage in evaluating whether the cloud will work for your research program. Be aware that each cloud provider has its own approach to research credit grants, so it can be further enlightening to explore those programs and options to see what is available to help you get going.
We like to advocate for taking advantage of cost-mitigation measures, such as credit programs, when a research team is getting started on the cloud. They remove one big barrier to getting things sorted out on the technical side. At the same time we discourage treating cloud credit programs as a de facto source of ongoing funding. Supposing your research computing really works and benefits from using the cloud, you will ideally integrate cloud costs into grant proposals, analogous to budgeting for compute hardware.
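One way to frame the cost-effectiveness question for a budget is a break-even estimate against buying hardware. The sketch below uses the hourly rates quoted earlier; the workstation price and lifetime are hypothetical numbers chosen only for illustration, and the comparison ignores power, administration, storage, and idle time on both sides.

```python
ON_DEMAND_PER_HOUR = 1.45   # dollars per hour, quoted earlier
SPOT_PER_HOUR = 0.35        # dollars per hour, quoted earlier
RUN_HOURS = 3.0             # assumption: same duration as the desktop run

HARDWARE_PRICE = 6000.0     # hypothetical workstation price in dollars
LIFETIME_YEARS = 4.0        # hypothetical useful lifetime


def break_even_runs_per_year(hardware_price: float, lifetime_years: float,
                             run_hours: float, rate_per_hour: float) -> float:
    """Runs per year at which cloud spending equals the yearly hardware cost."""
    yearly_hardware_cost = hardware_price / lifetime_years
    cost_per_run = run_hours * rate_per_hour
    return yearly_hardware_cost / cost_per_run


on_demand_break_even = break_even_runs_per_year(
    HARDWARE_PRICE, LIFETIME_YEARS, RUN_HOURS, ON_DEMAND_PER_HOUR)
spot_break_even = break_even_runs_per_year(
    HARDWARE_PRICE, LIFETIME_YEARS, RUN_HOURS, SPOT_PER_HOUR)

print(f"on demand: cloud is cheaper below ~{on_demand_break_even:.0f} runs per year")
print(f"spot:      cloud is cheaper below ~{spot_break_even:.0f} runs per year")
```

Substituting the spend reported by your cloud console for the quoted rates gives a figure you can defend in a proposal.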
This answer covers the conceptual entry point into evaluating cloud viability for research computing. More information is available at the CloudBank portal and elsewhere.