I have two loss variables, A and B.
When I do A.backward() everything works fine, but when I do B.backward() I get the following error:
TypeError: backward() takes 2 positional arguments but 3 were given

I suspect the difference lies in the history of the two variables, but I cannot find it.
Any ideas or directions would be appreciated.

I have an architecture that consists largely of regular Modules and Functions, but it also uses two custom Functions, each associated with a different loss (their code is included below).

The loss functions themselves are also non-trivial, but they are just chains of standard torch operations and are probably fine, so for brevity I will only include the code for the custom Functions.

The custom Function for loss A simply adds some noise to the input:

    @staticmethod
    # stdev is the standard deviation of the added noise
    def forward(ctx, input, stdev):
        normal_sample = torch.normal(torch.zeros(input.size()), torch.zeros(input.size()) + stdev).cuda()
        # re-center the means to the input
        output = input + normal_sample
        ctx.stdev = stdev
        ctx.mark_dirty(input)
        ctx.save_for_backward(input, output)
        return output
    @staticmethod
    def backward(ctx, grad_output):
        input, output = ctx.saved_variables
        stdev = ctx.stdev
        tensor_output = output.data
        tensor_input = input.data
        tensor_output.normal_(0, stdev[0][0])
        # re-center the means to the input
        tensor_output.add_(tensor_input)
        del ctx.stdev
        return Variable(tensor_output), None
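
For context, these staticmethod-style Functions are called through .apply rather than by invoking forward directly. Below is a stripped-down, self-contained sketch of the call pattern for the loss A Function (the AddNoise name is made up, and I have simplified it to a scalar stdev with no .cuda() so it stands alone):

    import torch
    from torch.autograd import Variable, Function

    class AddNoise(Function):
        @staticmethod
        def forward(ctx, input, stdev):
            # same idea as above: sample noise with the given stdev and add it to the input
            noise = torch.normal(torch.zeros(input.size()), torch.zeros(input.size()) + stdev)
            return input + noise

        @staticmethod
        def backward(ctx, grad_output):
            # the noise does not depend on the input, so the gradient passes straight through;
            # the stdev argument gets no gradient
            return grad_output, None

    x = Variable(torch.randn(4, 3), requires_grad=True)
    noisy = AddNoise.apply(x, 0.1)      # custom Functions are invoked via .apply
    noisy.sum().backward()              # backward receives exactly one grad_output here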

The custom Function for loss B uses its input to weight a multinomial sampling step and returns the resulting indices:

    @staticmethod
    def forward(ctx, input):
        one = torch.ones(1).cuda()
        if len(input.size()) == 1:
            input = torch.unsqueeze(input, 0)
        output = torch.zeros(input.size())
        if input.is_cuda:
            output = output.cuda()
        # sample from categorical with p = input
        _index = torch.multinomial(input + constants.epsilon, 1, False)
        output.scatter_(1, _index, torch.unsqueeze(one.repeat(_index.size()[0]),1))
        ctx.mark_dirty(input)
        ctx.save_for_backward(input, output)
        return _index.float(), output
    @staticmethod
    def backward(ctx, grad_output):
        input, output = ctx.saved_variables
        gradInput = torch.zeros(input.size()).cuda()
        gradInput.copy_(output)
        gradInput.div_(input)
        gradInput.mul_(grad_output)
        return gradInput
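
For reference, here is a minimal, self-contained example of a Function with one input and two outputs, just to show the shapes involved; backward is handed one gradient per forward output and must return one gradient per forward input (all names below are made up):

    import torch
    from torch.autograd import Variable, Function

    class TwoOutputs(Function):
        @staticmethod
        def forward(ctx, input):
            # two outputs from forward ...
            return input * 2, input * 3

        @staticmethod
        def backward(ctx, grad_double, grad_triple):
            # ... so two gradient arguments arrive here, one per output,
            # and a single gradient is returned for the single input
            return grad_double * 2 + grad_triple * 3

    x = Variable(torch.randn(5), requires_grad=True)
    a, b = TwoOutputs.apply(x)
    (a.sum() + b.sum()).backward()      # x.grad is now 2 + 3 = 5 everywhere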
              

This might be a dumb question, but I have a similar scenario and want to return multiple outputs from my custom forward function, similar to what mordith was doing. One of these needs to be differentiated, but the others do not. For example:

    def forward(ctx, input):
        return differentiated_output, non_differentiated_output_for_diagnostics, another_one
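
Something like the sketch below is what I have in mind; I am assuming ctx.mark_non_differentiable is the right way to flag the outputs that are only diagnostics (all names here are placeholders):

    import torch
    from torch.autograd import Variable, Function

    class WithDiagnostics(Function):
        @staticmethod
        def forward(ctx, input):
            main = input * 2                    # the output that needs gradients
            diagnostics = input.abs()           # extra output, no gradient needed
            ctx.mark_non_differentiable(diagnostics)
            return main, diagnostics

        @staticmethod
        def backward(ctx, grad_main, grad_diagnostics):
            # grad_diagnostics is ignored; only the main path
            # contributes a gradient for the input
            return grad_main * 2

    x = Variable(torch.randn(5), requires_grad=True)
    main, diag = WithDiagnostics.apply(x)
    main.sum().backward()                       # only the differentiated output is used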