Why is my laptop faster than the Amazon Cloud?

I have a Mac laptop with a 2.2 GHz Intel Core i7, which I believe has four cores.
I just ran my code on an Amazon EC2 Cluster Compute Eight Extra Large instance (cc2.8xlarge). If the AWS web pages are to be believed, such an instance is equivalent to renting a computer with an Intel Xeon E5-2670 chip. This chip is typically clocked at 2.6 GHz and has 8 real cores (16 virtual ones).
My parfor loops definitely run faster in the cloud. However, several pieces of my code that are not parallelized run faster on my laptop than on the instance (0.6 vs. 1.2 seconds, 3.2 vs. 5.5 seconds, 4.9 vs. 7.7 seconds). How can my laptop possibly be so much faster than the cloud?
(I believe I am giving CPU times. The wall times in the cloud appear to be substantially longer by factors of two or three.)
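Since the question hinges on CPU time versus wall time, here is a minimal sketch of measuring both around the same piece of work. The matrix `A` and its size are hypothetical stand-ins for the real computation. Note that `cputime` adds up the time used by all of MATLAB's threads, so for multithreaded operations it can exceed the elapsed time reported by `toc`.

```matlab
% Sketch: measure CPU time and wall-clock time for the same operation.
% A and its size are hypothetical; substitute the code you want to time.
A = rand(1500);

cpuStart  = cputime;            % CPU seconds used by MATLAB so far (all threads)
wallStart = tic;                % start a wall-clock timer

B = A*A;                        % the work being timed

wallSec = toc(wallStart);       % elapsed wall-clock seconds
cpuSec  = cputime - cpuStart;   % CPU seconds consumed, summed over threads

fprintf('wall time: %.3f s, CPU time: %.3f s\n', wallSec, cpuSec);
```

If the wall time is much larger than the CPU time, the extra time is being spent outside the computation itself, for example waiting on job startup or data transfer.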

2 Comments

Could you provide some example code illustrating the slow-down that you are seeing?
I think something like the following should illustrate what I'm seeing. In plain English, I'm multiplying lots of different big matrices together. The following example is not what I'm actually running, and it can easily be optimized. The point is just that, as currently written, it executes in 1.2 seconds on my computer, while I anticipate that on Amazon it will take about 2 seconds of CPU time. (At the moment, I can't even get a batch job to execute, which is painful because it means I just wasted five bucks on the license and the instance to try to run a silly test code.)
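% Benchmark: repeatedly multiply a square matrix by a large reshaped 3-D array.
tic
Nx = 300;
Ny = 300;
Nz = 20;
Nloops = 20;
x = 1:Nx;
y = 1:Ny;
z = 1:Nz;
[XXX,YYY,ZZZ] = ndgrid(x,y,z);    % 3-D grids, each Nx-by-Ny-by-Nz
[XX,YY] = ndgrid(x,x);            % 2-D grids, each Nx-by-Nx
MMM = sin(XXX).*cos(YYY)+ZZZ.^2;
NN = XX.^2 + YY.^2;               % Nx-by-Nx matrix
MM = reshape(MMM,Nx,Ny*Nz);       % flatten to Nx-by-(Ny*Nz) for the product
PP = zeros(Nloops,Nx,Ny,Nz);      % preallocate the results
for j = 1:Nloops
    PPP = NN*MM;                  % large matrix product
    PP(j,:,:,:) = reshape(PPP,Nx,Ny,Nz);
end
toc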


 Accepted Answer

Remember that all MDCS workers run in single computational thread mode. This means that where functions are intrinsically multi-threaded by MATLAB, you might expect to see this sort of effect. You might try comparing EC2 workers against the 'local' workers on your laptop which are also in single computational thread mode.
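One way to see this effect without involving EC2 at all is to restrict the desktop session to a single computational thread and time the same operation both ways. The sketch below is only an illustration: the matrices `A` and `B` and their size are hypothetical, and the actual speed difference depends on the machine and the operation.

```matlab
% Sketch: time a large matrix product with MATLAB's default threading,
% then with a single computational thread (the mode workers run in).
A = rand(1000);                     % hypothetical test matrices
B = rand(1000);

nDefault = maxNumCompThreads;       % number of threads currently in use
tic; C1 = A*B; tMulti = toc;

nPrev = maxNumCompThreads(1);       % switch to one thread; returns the previous setting
tic; C2 = A*B; tSingle = toc;

maxNumCompThreads(nPrev);           % restore the earlier setting

fprintf('%d thread(s): %.3f s, 1 thread: %.3f s\n', nDefault, tMulti, tSingle);
```

On a multi-core laptop the single-thread timing is the fairer one to compare against the per-worker times seen on the cluster.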

1 Comment

I see. Before, I was running the script on my laptop by first opening a matlabpool and then typing the name of the script. If I instead submit the script as a batch job to my laptop, it is substantially slower: the CPU time is 1.8 seconds, or about what I would anticipate the cloud taking. (The wall time is again enormous, over twenty seconds.)
So if I use my computer as the head node, MATLAB will automatically multi-thread certain commands. If instead I send a batch job, multi-threading is turned off? That's annoying, but I guess it makes sense in a way.


More Answers (1)

I think I may have an answer to my own question. The 2.6 GHz number I quoted for a Xeon E5-2670 is probably per physical core. The virtual cores presumably run at half that, which is almost a factor of two slower than the four physical cores on my laptop.
So I need to figure out a way, if it's possible, to let an 8 core machine actually be an 8 core machine in the Amazon cloud.
