MATLAB seg fault on distributed computing cluster
My department has a computing cluster with MATLAB R2009a and the Distributed Computing Server installed. However, no one has taken the time to get remote submission working, which is what I am trying to do.
My PC has MATLAB R2011a installed. I have tried both the R2009a and the R2011a submission scripts, but both result in the same seg fault on the cluster nodes when I run the distributed job validation tool. The seg fault appears in the task log file, for example .../Job21/Task1.log:
Executing: /fsys2/projects/cluster/mathworks_r2009a/bin/worker
which: no shopt in (/u/ihincks/bin:/usr/local/bin:/usr/bin:/usr/X11R6/bin:/bin:/usr/games:/opt/gnome/bin:/opt/kde3/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.)
< M A T L A B (R) >
Copyright 1984-2009 The MathWorks, Inc.
Version 7.8.0.347 (R2009a) 64-bit (glnxa64)
February 12, 2009
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
About to construct the storage object using constructor "makeFileStorageObject" and location "/u/ihincks/remotematlab/ihincks/"
Saved wrapper =
matlabroot: '/usr/local/matlab'
separator: '/'
sentinel: '@'
function_handle: [1x1 struct]
Struct =
function: 'distcomp.simplejob'
type: 'classsimple'
file: ''
class: 'distcomp.filestorage'
------------------------------------------------------------------------
Segmentation violation detected at Tue Jul 19 09:39:25 2011
------------------------------------------------------------------------
Configuration:
MATLAB Version: 7.8.0.347 (R2009a)
MATLAB License: 308754
Operating System: Linux 2.6.16.46-0.12-smp #1 SMP Thu May 17 14:00:09 UTC 2007 x86_64
GNU C Library: 2.4 development
Window System: No active display
Current Visual: None
Processor ID: x86 Family 6 Model 7 Stepping 6, GenuineIntel
Virtual Machine: Java 1.6.0_04-b12 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
Default Encoding: UTF-8
Fault Count: 1
Register State:
rax = 00002ac5a1fd4dc0 rbx = 0000000000000000
rcx = 0000000001487390 rdx = 0000000000000021
rbp = 00000000407c43d0 rsi = 0000000000000000
rdi = 0000000002d0e2d0 rsp = 00000000407c43c0
r8 = feff29c3ff646b6f r9 = 00000000407c38f0
r10 = 00000000008bc600 r11 = 00002ac5a0e7c7b0
r12 = 0000000002d0e2d0 r13 = 0000000001487390
r14 = 00002ac5a2219120 r15 = 00002aaac41fb3b8
rip = 00002ac5a1fd530e flg = 0000000000010206
Stack Trace:
[0] libmwm_dispatcher.so:fillFunctionHandle(function_handle_tag*, mdMxarrayFunctionHandle*)(0x2ac5a13078a0, 0x02d0e2d0 ", 0x0118b6d0, 0x2aaac41f8df0) + 30 bytes
[1] libmwm_dispatcher.so:mdFunctionHandleFromStruct(0x2ac5a10b3460, 100, 4, 0x2ac5a342c980) + 101 bytes
[2] libmx.so:_HandleArrayForStream(miStreamRec_tag*, miItem_tag*, miStreamCommandType, int)(0x2ac50000000e, 776, 0, 0) + 1689 bytes
[3] libmx.so:miGetCurrentItem(14, 0, 0, 0) + 341 bytes
[4] libmat.so:matGetValueAtOffset(MATFile_tag*, char*, unsigned long)(0x407c5760 ", 0x2ac5a10b38e4, 128, 0) + 58 bytes
[5] libmat.so:matGetVariable5(MATFile_tag*, char const*)(0x407c57a0, 0x407c5788, 0x01487cf8 "/u/ihincks/remotematlab/ihincks/..", 0x02d23910) + 57 bytes
And so on; I don't want to clutter the post too much. Let me know if you want to see more output from anything, or hear any more details.
Has anyone come across this? Can anyone suggest something I could try? I don't have root permissions on the cluster, and the people who maintain it are more or less useless.
Thanks!
Answers (1)
Edric Ellis
on 19 Jul 2011
Parallel Computing Toolbox is only designed to work with the same release of MATLAB Distributed Computing Server, as per the documentation. Could you try installing R2009a on your machine and using that? I suspect it's not enough simply to use the R2009a submission scripts - it looks like the files that are being created by R2011a cannot be loaded correctly on the R2009a workers.
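A quick way to confirm a release mismatch like this is to read the worker's release out of the task log and compare it with the release on the submitting machine. The following is only a minimal sketch in Python, not part of any MathWorks tooling: the function names `worker_release` and `releases_match` are hypothetical, and the pattern assumes the log contains a version line in the form shown in the question's Task1.log.

```python
import re

# Assumed format: the version line as it appears in the log above, e.g.
# "Version 7.8.0.347 (R2009a) 64-bit (glnxa64)" or
# "MATLAB Version: 7.8.0.347 (R2009a)".
RELEASE_PATTERN = re.compile(r"Version:?\s+[\d.]+\s+\((R\d{4}[ab])\)")


def worker_release(log_text):
    """Return the release tag (such as 'R2009a') found in a task log, or None."""
    match = RELEASE_PATTERN.search(log_text)
    return match.group(1) if match else None


def releases_match(client_release, log_text):
    """True only when the log names a release and it equals the client's."""
    found = worker_release(log_text)
    return found is not None and found == client_release


# Excerpt copied from the Task1.log shown in the question.
sample_log = """\
< M A T L A B (R) >
Copyright 1984-2009 The MathWorks, Inc.
Version 7.8.0.347 (R2009a) 64-bit (glnxa64)
February 12, 2009
"""

print(worker_release(sample_log))            # R2009a
print(releases_match("R2011a", sample_log))  # False
```

Here the client release "R2011a" is taken from the question; in practice it would be whatever release is installed on the machine doing the submission.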
5 Comments
Ian
on 19 Jul 2011
Jason Ross
on 19 Jul 2011
You can't mix and match toolboxes from different releases. The code within the toolbox can be dependent on other parts of MATLAB that changed between releases and won't be compatible.
Ian
on 19 Jul 2011
Edric Ellis
on 20 Jul 2011
It should be fine to have multiple releases of MATLAB installed simultaneously on your machine (we at MathWorks do that all the time). Do you get the seg-fault when you attempt to validate the 'local' configuration?
Ian
on 20 Jul 2011