RuntimeError: Tensors must have same number of dimensions: got 2 and 3 · Issue #597

@likethesky @Celebio @colesbury @pdollar @minqi
Hi, I get this error at the line referenced below when I use multiple points in an image to extract segments.
The input points and labels I used:
points = tensor([[ 711, 455], [1578, 611], [2019, 640], [ 412, 739], [1820, 810]], device='cuda:0')
label = tensor([1, 1, 1, 1, 1], device='cuda:0')

The prediction call:
masks, _, _ = predictor.predict_torch(point_coords=points, point_labels=label, multimask_output=False)
The error is raised at:
segment-anything/segment_anything/modeling/prompt_encoder.py Line 84 6fdee8f

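I think the problem is that the predict_torch(...) function expects its inputs to have a batch dimension, i.e. point coordinates of shape BxNx2 and point labels of shape BxN, whereas your tensors have shapes Nx2 and N.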
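The mismatch can be reproduced without the model. The sketch below is only an illustration: it assumes the failing line concatenates the points with a 3-D padding tensor along dim=1, and padding_point here is a stand-in built for the example rather than the library's own code.

```python
import torch

# Five (x, y) points with no batch dimension: shape (5, 2).
points = torch.tensor(
    [[711, 455], [1578, 611], [2019, 640], [412, 739], [1820, 810]],
    dtype=torch.float32,
)

# Assumed stand-in for a padding tensor with a batch dimension: shape (5, 1, 2).
padding_point = torch.zeros((points.shape[0], 1, 2))

# Concatenating a 2-D tensor with a 3-D tensor raises a RuntimeError.
try:
    torch.cat([points, padding_point], dim=1)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# With a leading batch dimension, both tensors are 3-D and the call succeeds.
batched = torch.cat([points.unsqueeze(0), torch.zeros((1, 1, 2))], dim=1)
print(tuple(batched.shape))
```

Once the points carry a batch dimension, both operands have three dimensions and the concatenation goes through.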

One way to fix the problem is to add in the extra dimension that the function expects. You can do this using the unsqueeze(...) method on the tensors:

import torch

x = 0  # or 1, if the 5 points/labels are meant to be separate prompts
points = torch.tensor([[711, 455], [1578, 611], [2019, 640], [412, 739], [1820, 810]], device='cuda:0').unsqueeze(x)
label = torch.tensor([1, 1, 1, 1, 1], device='cuda:0').unsqueeze(x)

If you use unsqueeze(1), the 5 points/labels are treated as 5 separate prompts, and you'll get 5 masks as the output. If you use unsqueeze(0) instead, the 5 points/labels are all treated as belonging to a single segmentation prompt, and you'll get 1 mask as the output.
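To make the difference concrete, here is a small shape check. It runs on the CPU and does not call the predictor; the variable names are only illustrative.

```python
import torch

points = torch.tensor([[711, 455], [1578, 611], [2019, 640], [412, 739], [1820, 810]])
labels = torch.tensor([1, 1, 1, 1, 1])

# unsqueeze(0): one prompt made of 5 points.
single_points = points.unsqueeze(0)
single_labels = labels.unsqueeze(0)

# unsqueeze(1): five prompts of 1 point each.
separate_points = points.unsqueeze(1)
separate_labels = labels.unsqueeze(1)

print(tuple(single_points.shape), tuple(single_labels.shape))      # (1, 5, 2) (1, 5)
print(tuple(separate_points.shape), tuple(separate_labels.shape))  # (5, 1, 2) (5, 1)
```

The first dimension is the one read as the number of prompts, which is why the two variants yield 1 and 5 masks respectively.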

Another solution, if you only want 1 mask, is to use the predict(...) function (i.e. not the _torch variant), which may be simpler. Note that it applies some extra pre-processing to the points, which you may not need if you have already applied it to your existing point coordinates. It would look something like:

import numpy as np  # Need this to make np.array inputs!
points = [[711, 455], [1578, 611], [2019, 640], [412, 739], [1820, 810]]
label = [1, 1, 1, 1, 1]
masks, _, _ = predictor.predict(
    point_coords=np.array(points),
    point_labels=np.array(label),
    multimask_output=False)
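As a rough illustration of the extra pre-processing mentioned above, the sketch below rescales pixel coordinates from the original image frame to a frame whose longest side is 1024. This is an assumption about what the pre-processing amounts to, not a copy of the library code; scale_points, the 1024 target, and the 1500x2250 image size are all made up for the example.

```python
import numpy as np

def scale_points(points, original_size, target_length=1024):
    """Rescale (x, y) points to a frame whose longest side is target_length.

    Hypothetical helper for illustration only; original_size is (height, width).
    """
    old_h, old_w = original_size
    scale = target_length / max(old_h, old_w)
    new_h = int(old_h * scale + 0.5)
    new_w = int(old_w * scale + 0.5)
    scaled = np.array(points, dtype=np.float64)  # copy, so the input is untouched
    scaled[:, 0] *= new_w / old_w  # x coordinates
    scaled[:, 1] *= new_h / old_h  # y coordinates
    return scaled

points = np.array([[711, 455], [1578, 611], [2019, 640], [412, 739], [1820, 810]])
scaled = scale_points(points, original_size=(1500, 2250))  # assumed image size
print(scaled.shape)
```

If the coordinates passed to predict_torch(...) are still in the original image frame, a step along these lines would presumably be needed first, whereas predict(...) takes care of it for you.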