Given groups=1, weight of size [10, 1, 5, 5], expected input[2, 3, 28, 28] to have 1 channels, but got 3 channels instead

I am trying to train a CNN on MNIST but test it on my own handwritten digits. To do that I wrote the code below, but I get the error shown in the title of this question:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torchvision import datasets, transforms
from torchvision.datasets import ImageFolder

batch_size = 64
train_dataset = datasets.MNIST(root='./data/',
                               train=True,
                               transform=transforms.ToTensor(),
                               download=True)
test_dataset = ImageFolder('my_digit_images/', transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        #print(self.conv1.weight.shape)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv3 = nn.Conv2d(20, 20, kernel_size=3)
        #print(self.conv2.weight.shape)
        self.mp = nn.MaxPool2d(2)
        self.fc = nn.Linear(320, 10)
    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.conv1(x))
        #print(x.shape)
        x = F.relu(self.mp(self.conv2(x)))
        x = F.relu(self.mp(self.conv3(x)))
        #print("2.", x.shape)
        # x = F.relu(self.mp(self.conv3(x)))
        x = x.view(in_size, -1)  # flatten the tensor
        #print("3.", x.shape)
        x = self.fc(x)
        return F.log_softmax(x, dim=1)
model = Net()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = Variable(data), Variable(target)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
def test():
    model.eval()
    test_loss = 0
    correct = 0
    for data, target in test_loader:
        data, target = Variable(data, volatile=True), Variable(target)
        output = model(data)
        test_loss += F.nll_loss(output, target, size_average=False).data
        pred = output.data.max(1, keepdim=True)[1]
        correct += pred.eq(target.data.view_as(pred)).cpu().sum()
    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
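
For completeness, a minimal driver loop for these functions (assumed here; the original post does not show how train() and test() are called):

# Assumed driver loop; the epoch count is arbitrary.
for epoch in range(1, 10):
    train(epoch)
test()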

The MNIST dataset contains black-and-white, single-channel images, while yours are most likely 3-channel RGB. Either re-encode your images or preprocess them, for example:

img = img[:,0:1,:,:]

You can do this with a custom transform added after transforms.ToTensor().
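
For instance, here is a sketch of that custom-transform idea (this pipeline is an assumption, not part of the original answer). Note that inside a per-image transform the tensor is [C, H, W] rather than [N, C, H, W], so the slice has no batch dimension:

from torchvision import transforms
from torchvision.datasets import ImageFolder

# Resize to MNIST size, convert to a tensor, then keep only the first
# channel so the input matches nn.Conv2d(1, 10, kernel_size=5).
single_channel = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.ToTensor(),                          # [3, 28, 28] for RGB images
    transforms.Lambda(lambda img: img[0:1, :, :]),  # [1, 28, 28]
])

test_dataset = ImageFolder('my_digit_images/', transform=single_channel)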

The images used for training and testing should follow the same distribution. Since MNIST data is grayscale by default and you presumably did not change the number of channels during training, the model expects single-channel input at test time as well.

The following code is an example of how to do this with a transform pipeline. In the order defined below, it:

  • Converts the image to a single channel (grayscale),
  • Resizes the image to the default MNIST size (28×28),
  • Converts the image to a tensor,
  • Normalizes the tensor with the same mean and std as during training (assuming you used the standard MNIST values):
test_dataset = ImageFolder('my_digit_images/',
                           transform=transforms.Compose([
                               transforms.Grayscale(num_output_channels=1),
                               transforms.Resize((28, 28)),
                               transforms.ToTensor(),
                               transforms.Normalize((0.1307,), (0.3081,))]))
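
As a quick sanity check (a sketch, assuming the loaders defined in the question), both loaders should now yield batches with a single channel:

# Both loaders should now produce [N, 1, 28, 28] batches,
# matching the in_channels=1 of self.conv1.
train_images, _ = next(iter(train_loader))
test_images, _ = next(iter(test_loader))
print(train_images.shape)  # e.g. torch.Size([64, 1, 28, 28])
print(test_images.shape)   # should also show 1 channel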
