I am trying to build two different PyTorch C++/CUDA extensions on Windows. One is Facebook's “SparseConvNet” and the other is “pointgroup_ops”, from a different research group. It seems both have only been built and tested on Linux machines, and I need to build them on Windows. After fixing a few compilation issues, I have reached a stage where each of them fails with a single mysterious-looking link error.
For the first library, the error is:
sparseconvnet_cuda.obj : error LNK2001: unresolved external symbol "public: long * __cdecl at::Tensor::data_ptr<long>(void)const " (??$data_ptr@J@Tensor@at@@QEBAPEAJXZ)
build\lib.win-amd64-3.7\sparseconvnet\SCN.cp37-win_amd64.pyd : fatal error LNK1120: 1 unresolved externals
For the second library, the error is:
pointgroup_ops.obj : error LNK2001: unresolved external symbol "public: long * __cdecl at::Tensor::data_ptr<long>(void)const " (??$data_ptr@J@Tensor@at@@QEBAPEAJXZ)
build\lib.win-amd64-3.7\PG_OP.cp37-win_amd64.pyd : fatal error LNK1120: 1 unresolved externals
The two errors are essentially identical. Any insights about the cause or fix? The code is the same code that builds on Linux, so it can’t be that something is genuinely undefined. ‘QEBAPEAJXZ’ is not a string that occurs anywhere in the code.
I did an online search for the strange ‘QEBAPEAJXZ’ string, and it seems people have run into it when building other libraries as well.
Here are two for example: https://github.com/sshaoshuai/PointRCNN/issues/75
https://zhuanlan.zhihu.com/p/142676847
However, I did not find a solution that worked. Any help in resolving this would be appreciated, by me and probably by others who have run into the same error.
Thanks.
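For the benefit of others running into this: thanks to a poster on the NVIDIA CUDA developer forum, I was able to solve the mysterious link error. The cause is the use of the “long” type, which is 32 bits wide on 64-bit Windows but 64 bits wide on 64-bit Linux. On Linux, “int64_t” is the same type as “long”, so data_ptr<long>() resolves to the data_ptr<int64_t>() that PyTorch provides. On Windows, “int64_t” is “long long”, so data_ptr<long>() is a different template instantiation that PyTorch does not define, and the linker cannot find it. (The odd string is just part of the MSVC-mangled name of that symbol, which is why it never appears in the source.) Replacing all occurrences of “long” in the code with “int64_t” got rid of the link error. As that knowledgeable poster pointed out, it is a really bad idea to use “long” in code that is meant to be cross-platform.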
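To make the size difference concrete, here is a minimal sketch in plain C++ that compiles without the PyTorch headers. The names `long_is_64_bit` and `sum_indices` are made up for illustration and do not come from either extension; the actual change in the extension code is only shown in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative only: true on 64-bit Linux (LP64), false on 64-bit Windows
// with MSVC (LLP64), where long stays at 32 bits.
constexpr bool long_is_64_bit = (sizeof(long) == 8);

// int64_t is exactly 64 bits on every platform that provides it.
static_assert(sizeof(std::int64_t) == 8, "int64_t must be 8 bytes");

// In the extension sources, the fix is a change of this shape:
//   before:  long*    idx = tensor.data_ptr<long>();     // link error on Windows
//   after:   int64_t* idx = tensor.data_ptr<int64_t>();  // links on both platforms

// Hypothetical helper standing in for extension code that walks an index
// buffer. Taking int64_t* (not long*) keeps the element width fixed.
std::int64_t sum_indices(const std::int64_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    std::int64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

If the same buffer were read through a `long*` on Windows, each element would be treated as 4 bytes, so even if the link error were papered over, the indices would be read incorrectly. Using `int64_t` throughout avoids both problems.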