CUDArt doesn't work with Julia 0.6 #76

Open
Cody-G opened this issue Mar 8, 2018 · 4 comments

@Cody-G
Contributor

Cody-G commented Mar 8, 2018

After a moderate amount of effort I still can't get CUDArt to work with Julia 0.6. It still works on 0.5 on cannon, with an out-of-date set of packages. The problems seem to stem from CUDArt's dependence on CUDAdrv and CUDAapi: those two packages have evolved a lot, but according to the README, CUDArt itself is no longer actively developed. Also see the last comments here. (We use the standalone branch of CUDArt in cannon's 0.5 packages, which doesn't require those packages.) I can see three options for getting this working on 0.6. @timholy which do you think is best?

  1. Switch back to CUDArt standalone (requires updating that branch for 0.6 and also updating BlockRegistration).
  2. Fix CUDArt to work with current CUDAdrv and CUDAapi packages.
  3. Drop dependence on CUDArt altogether and use CUDAdrv instead.

My feeling is that option 3 is best practice, but it may be a lot of work.

@Cody-G
Contributor Author

Cody-G commented Mar 8, 2018

In case it's useful, here's the segfault I get with the current releases of CUDArt, CUDAdrv, and CUDAapi on 0.6. (I also tried various combinations of master branches and sometimes received different errors.)

ERROR: LoadError: 
signal (11): Segmentation fault
while loading no file, in expression starting on line 0
unknown function (ip: 0x7fddc7f87129)
unknown function (ip: 0x7fddc7f953ee)
unknown function (ip: 0x7fddc7eb550f)
cuModuleUnload at /usr/lib/x86_64-linux-gnu/libcuda.so (unknown line)
macro expansion at /home/cody/.julia/v0.6/CUDAdrv/src/base.jl:143 [inlined]
unsafe_unload! at /home/cody/.julia/v0.6/CUDAdrv/src/module.jl:63
unknown function (ip: 0x7fddcd0b1ed2)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
jl_apply at /home/cody/src/julia_06/src/julia.h:1424 [inlined]
run_finalizer at /home/cody/src/julia_06/src/gc.c:111
jl_gc_run_finalizers_in_list at /home/cody/src/julia_06/src/gc.c:200
run_finalizers at /home/cody/src/julia_06/src/gc.c:234
jl_gc_enable_finalizers at /home/cody/src/julia_06/src/gc.c:245
jl_mutex_unlock at /home/cody/src/julia_06/src/./julia_threads.h:586 [inlined]
jl_typeinf_end at /home/cody/src/julia_06/src/gf.c:2401
typeinf_code at ./inference.jl:2584
unknown function (ip: 0x7fdde91c41bd)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
typeinf_ext at ./inference.jl:2622
unknown function (ip: 0x7fdde91a2da2)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
jl_apply at /home/cody/src/julia_06/src/julia.h:1424 [inlined]
jl_apply_with_saved_exception_state at /home/cody/src/julia_06/src/rtutils.c:257
jl_type_infer at /home/cody/src/julia_06/src/gf.c:262
jl_compile_for_dispatch at /home/cody/src/julia_06/src/gf.c:1661
jl_compile_method_internal at /home/cody/src/julia_06/src/julia_internal.h:307 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:354 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
#showerror#478 at ./replutil.jl:222
unknown function (ip: 0x7fddcd0b1da9)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
jl_apply at /home/cody/src/julia_06/src/julia.h:1424 [inlined]
jl_invoke at /home/cody/src/julia_06/src/gf.c:51
showerror at ./replutil.jl:221
unknown function (ip: 0x7fddcd0b1a7d)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
display_error at ./client.jl:137
unknown function (ip: 0x7fddcd0af08d)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
do_call at /home/cody/src/julia_06/src/interpreter.c:75
eval at /home/cody/src/julia_06/src/interpreter.c:242
eval_body at /home/cody/src/julia_06/src/interpreter.c:539
jl_toplevel_eval_body at /home/cody/src/julia_06/src/interpreter.c:511
jl_toplevel_eval_flex at /home/cody/src/julia_06/src/toplevel.c:571
jl_toplevel_eval_in at /home/cody/src/julia_06/src/builtins.c:496
eval at ./boot.jl:235
unknown function (ip: 0x7fdde92f240f)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
print_response at ./REPL.jl:137
unknown function (ip: 0x7fddcd0aeafd)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
print_response at ./REPL.jl:129
unknown function (ip: 0x7fddcd0ae7ed)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
do_respond at ./REPL.jl:646
unknown function (ip: 0x7fddcd045eb1)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
do_call at /home/cody/src/julia_06/src/interpreter.c:75
eval at /home/cody/src/julia_06/src/interpreter.c:242
eval_body at /home/cody/src/julia_06/src/interpreter.c:539
jl_toplevel_eval_body at /home/cody/src/julia_06/src/interpreter.c:511
jl_toplevel_eval_flex at /home/cody/src/julia_06/src/toplevel.c:571
jl_toplevel_eval_in at /home/cody/src/julia_06/src/builtins.c:496
eval at ./boot.jl:235
unknown function (ip: 0x7fdde92f240f)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
run_interface at ./LineEdit.jl:1583
unknown function (ip: 0x7fdde936c1cf)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
run_frontend at ./REPL.jl:945
run_repl at ./REPL.jl:180
unknown function (ip: 0x7fddcd041952)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
_start at ./client.jl:413
unknown function (ip: 0x7fdde9345cd8)
jl_call_fptr_internal at /home/cody/src/julia_06/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/cody/src/julia_06/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/cody/src/julia_06/src/gf.c:1926
jl_apply at /home/cody/src/julia_06/ui/../src/julia.h:1424 [inlined]
true_main at /home/cody/src/julia_06/ui/repl.c:127
main at /home/cody/src/julia_06/ui/repl.c:264
__libc_start_main at /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291
unknown function (ip: 0x401668)
Allocations: 19319693 (Pool: 19316851; Big: 2842); GC: 41
Segmentation fault (core dumped)

@Cody-G
Contributor Author

Cody-G commented Mar 10, 2018

We could take option 3 a step further and use the CuArrays package to implement the mismatch calculation directly. This currently requires a small workaround for our GPUs (see https://github.com/JuliaGPU/CUDAnative.jl/issues/165). If we could use CuArrays without losing efficiency I think it would be a big win. Then we should be able to use nearly the same code for CPU and GPU mismatch calculation.
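
As a rough sketch of what "nearly the same code" could look like: the function below is written against `AbstractArray` and uses only broadcasting and a reduction, so in principle the same method could be called with CPU arrays or GPU arrays. The name `ssd` is hypothetical and this is only a plain sum of squared differences, not the actual mismatch computation in BlockRegistration. The GPU call in the comment assumes a working CuArrays installation and is untested here.

```julia
# Hypothetical helper (not part of BlockRegistration): sum of squared
# differences between two images of equal size. It is written against
# AbstractArray so that it does not depend on where the data lives.
function ssd(fixed::AbstractArray, moving::AbstractArray)
    size(fixed) == size(moving) ||
        throw(DimensionMismatch("fixed and moving must have the same size"))
    # Broadcasted subtraction followed by a reduction over abs2.
    return sum(abs2, fixed .- moving)
end

fixed  = Float32[1 2; 3 4]
moving = Float32[1 2; 3 6]

ssd(fixed, moving)   # CPU arrays

# Assumption: with CuArrays loaded, the same generic method would be used
# for GPU arrays, e.g. ssd(CuArray(fixed), CuArray(moving)).
```

Whether this keeps the efficiency of the hand-written CUDArt kernels would still need to be benchmarked on our GPUs.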

@Cody-G
Contributor Author

Cody-G commented Aug 23, 2018

I finally updated our code to work with newer versions of CUDArt and CUDAdrv on Julia 0.6. My work is in the branch https://github.com/HolyLab/BlockRegistration/tree/cjg/fix_cuda. I think this would be a good base on which to build 0.7 support, but currently we can't merge with master because I had to remove a Femtocleaner commit that introduced a bug (see #89). Note that one unrelated test still fails, as described in #88. In a separate branch, julia0.6, I'm putting this work as well as updates to REQUIRE that pin all packages very carefully for 0.6. Let me know the preferred merge procedure, if any.

@timholy
Member

timholy commented Aug 25, 2018

We can re-run femtocleaner, so I'm fine with bypassing that commit. @kdw503 is starting with smaller repositories than this one, so it may be a while before we get here anyway.
