
In addition to letting users customize data augmentation operations, MindSpore also provides automatic data augmentation, which applies augmentation to images automatically according to specific policies.

The following introduces two kinds of automatic augmentation: probability-based and callback-parameter-based.

Probability-Based Data Augmentation

MindSpore provides a set of probability-based automatic augmentation APIs. Users can randomly select and combine augmentation operations, which makes augmentation more flexible.

RandomApply

The RandomApply operation receives a list of augmentation operations and, with a given probability (0.5 by default), executes all operations in the list in sequence; otherwise it executes none of them.

In the following code example, the RandomApply interface is called to execute the RandomCrop and RandomColorAdjust operations in sequence with a probability of 0.5.

import mindspore.dataset.vision as vision
from mindspore.dataset.transforms import RandomApply
transforms_list = [vision.RandomCrop(512), vision.RandomColorAdjust()]
rand_apply = RandomApply(transforms_list)
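
The wrapper above only defines the random behavior; it still has to be applied through a dataset pipeline. Below is a minimal usage sketch, assuming a small in-memory dataset of randomly generated images (the data and the "image" column name are illustrative, not part of the original example):

import numpy as np
import mindspore.dataset as ds

# hypothetical data: four 600x600 RGB "images"
images = np.random.randint(0, 255, (4, 600, 600, 3), dtype=np.uint8)
demo_dataset = ds.NumpySlicesDataset({"image": images}, shuffle=False)

# with probability 0.5 an image is cropped to 512x512 and color-jittered,
# otherwise it passes through unchanged
demo_dataset = demo_dataset.map(operations=rand_apply, input_columns=["image"])

for item in demo_dataset.create_dict_iterator(output_numpy=True, num_epochs=1):
    print(item["image"].shape)   # (512, 512, 3) or (600, 600, 3)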

RandomChoice

The RandomChoice operation receives a list of augmentation operations, transforms, and randomly selects one operation from it to execute.

In the following code example, the RandomChoice operation is called to choose, with equal probability, between CenterCrop and RandomCrop.

import mindspore.dataset.vision as vision
from mindspore.dataset.transforms import RandomChoice
transforms_list = [vision.CenterCrop(512), vision.RandomCrop(512)]
rand_choice = RandomChoice(transforms_list)

RandomSelectSubpolicy

The RandomSelectSubpolicy operation receives a preset policy list that contains a set of sub-policies. Each sub-policy consists of several augmentation operations executed in sequence, each with its own execution probability.

For each image, one sub-policy is first selected with equal probability, and then the operations in that sub-policy are executed in order according to their probabilities.

In the following code example, two sub-policies are preset:

  • Sub-policy 1 contains the RandomRotation and RandomVerticalFlip operations, with probabilities of 0.5 and 1.0 respectively.

  • Sub-policy 2 contains the RandomRotation and RandomColorAdjust operations, with probabilities of 1.0 and 0.2 respectively.

import mindspore.dataset.vision as vision
from mindspore.dataset.vision import RandomSelectSubpolicy

policy_list = [
    # sub-policy 1: (transform, probability)
    [(vision.RandomRotation((45, 45)), 0.5),
     (vision.RandomVerticalFlip(), 1.0)],
    # sub-policy 2: (transform, probability)
    [(vision.RandomRotation((90, 90)), 1.0),
     (vision.RandomColorAdjust(), 0.2)]
]
policy = RandomSelectSubpolicy(policy_list)
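
The resulting policy is applied to the image column of a dataset in the same way as the wrappers above; a minimal sketch, again using hypothetical in-memory images:

import numpy as np
import mindspore.dataset as ds

images = np.random.randint(0, 255, (4, 224, 224, 3), dtype=np.uint8)
demo_dataset = ds.NumpySlicesDataset({"image": images}, shuffle=False)

# one sub-policy is drawn per image, then its operations run with their probabilities
demo_dataset = demo_dataset.map(operations=policy, input_columns=["image"])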
    

Data Augmentation Based on Callback Parameters

MindSpore's sync_wait interface supports dynamically adjusting the augmentation policy during training, at the batch or epoch granularity of the training data; users can set a blocking condition to trigger a specific augmentation behavior.

sync_wait blocks the entire data processing pipeline until sync_update triggers the user-defined callback function, so the two must be used together. They are described as follows:

  • sync_wait(condition_name, num_batch=1, callback=None): adds a blocking condition condition_name to the dataset; the specified callback function is executed when sync_update is called.

  • sync_update(condition_name, num_batch=None, data=None): releases the block for the given condition_name and triggers the specified callback function with data.

The following demonstrates the usage of callback-parameter-based automatic augmentation.

First, define an Augment class, where preprocess is a user-defined augmentation function and update is the callback function for updating the augmentation policy.

import numpy as np
import mindspore.dataset as ds

class Augment:
    def __init__(self):
        self.ep_num = 0
        self.step_num = 0

    def preprocess(self, input_):
        return np.array((input_ + self.step_num ** self.ep_num - 1),)

    def update(self, data):
        self.ep_num = data['ep_num']
        self.step_num = data['step_num']

arr = list(range(1, 4))
dataset = ds.NumpySlicesDataset(arr, shuffle=False)

aug = Augment()
dataset = dataset.sync_wait(condition_name="policy", callback=aug.update)
dataset = dataset.map(operations=[aug.preprocess])

epochs = 5
itr = dataset.create_tuple_iterator(num_epochs=epochs)
step_num = 0
for ep_num in range(epochs):
    for data in itr:
        print("epoch: {}, step: {}, data: {}".format(ep_num, step_num, data))
        step_num += 1
        dataset.sync_update(condition_name="policy",
                            data={'ep_num': ep_num, 'step_num': step_num})
epoch: 0, step: 0, data: [Tensor(shape=[], dtype=Int64, value= 1)]
epoch: 0, step: 1, data: [Tensor(shape=[], dtype=Int64, value= 2)]
epoch: 0, step: 2, data: [Tensor(shape=[], dtype=Int64, value= 3)]
epoch: 1, step: 3, data: [Tensor(shape=[], dtype=Int64, value= 1)]
epoch: 1, step: 4, data: [Tensor(shape=[], dtype=Int64, value= 5)]
epoch: 1, step: 5, data: [Tensor(shape=[], dtype=Int64, value= 7)]
epoch: 2, step: 6, data: [Tensor(shape=[], dtype=Int64, value= 6)]
epoch: 2, step: 7, data: [Tensor(shape=[], dtype=Int64, value= 50)]
epoch: 2, step: 8, data: [Tensor(shape=[], dtype=Int64, value= 66)]
epoch: 3, step: 9, data: [Tensor(shape=[], dtype=Int64, value= 81)]
epoch: 3, step: 10, data: [Tensor(shape=[], dtype=Int64, value= 1001)]
epoch: 3, step: 11, data: [Tensor(shape=[], dtype=Int64, value= 1333)]
epoch: 4, step: 12, data: [Tensor(shape=[], dtype=Int64, value= 1728)]
epoch: 4, step: 13, data: [Tensor(shape=[], dtype=Int64, value= 28562)]
epoch: 4, step: 14, data: [Tensor(shape=[], dtype=Int64, value= 38418)]
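
The values above follow from the fact that each batch is processed with the (ep_num, step_num) pair delivered by the most recent sync_update call, which lags one batch behind the training loop. The small standalone check below (not part of the original example) reproduces the printed data column:

# reproduce the data column: each input is transformed with the parameters
# sent by the sync_update call of the previous batch
ep, step = 0, 0              # values in effect before the first sync_update
global_step = 0
for ep_num in range(5):
    for input_ in [1, 2, 3]:
        print(input_ + step ** ep - 1, end=" ")
        global_step += 1
        ep, step = ep_num, global_step   # what sync_update sends after this batch
# prints: 1 2 3 1 5 7 6 50 66 81 1001 1333 1728 28562 38418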
    

Automatic Data Augmentation on ImageNet

The following takes implementing AutoAugment on the ImageNet dataset as an example.

The AutoAugment policy for ImageNet contains 25 sub-policies, each consisting of two transforms. For each image in a batch, one sub-policy is randomly selected, and each transform in that sub-policy is executed with its predefined probability.

Users can implement AutoAugment with the RandomSelectSubpolicy interface of the mindspore.dataset.vision module. The standard augmentation pipeline for ImageNet classification training consists of the following steps (a sketch of this standard pipeline follows the list):

  • RandomCropDecodeResize: randomly crop, then decode and resize.

  • RandomHorizontalFlip: randomly flip in the horizontal direction.

  • Normalize: normalize the image.

  • HWC2CHW: change the image channel layout from HWC to CHW.
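
For reference, a minimal sketch of this standard pipeline is given below; the crop size and the mean/std values are common ResNet-50 settings assumed for illustration, not taken from the original text:

import mindspore.dataset as ds
import mindspore.dataset.vision as vision

def create_standard_dataset(dataset_path, batch_size=32):
    # keep images as raw bytes so that RandomCropDecodeResize can decode them itself
    dataset = ds.ImageFolderDataset(dataset_path, num_parallel_workers=8, decode=False)
    trans = [
        vision.RandomCropDecodeResize(224, scale=(0.08, 1.0), ratio=(0.75, 1.333)),
        vision.RandomHorizontalFlip(prob=0.5),
        vision.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
                         std=[0.229 * 255, 0.224 * 255, 0.225 * 255]),
        vision.HWC2CHW()
    ]
    dataset = dataset.map(operations=trans, input_columns="image")
    return dataset.batch(batch_size, drop_remainder=True)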

Define the mapping from MindSpore operations to AutoAugment operations:

import matplotlib.pyplot as plt
from download import download

import mindspore as ms
import mindspore.dataset as ds
import mindspore.dataset.vision as vision
import mindspore.dataset.transforms as transforms
from mindspore.dataset.vision import RandomSelectSubpolicy

# define Auto Augmentation operations
PARAMETER_MAX = 10

def float_parameter(level, maxval):
    return float(level) * maxval / PARAMETER_MAX

def int_parameter(level, maxval):
    return int(level * maxval / PARAMETER_MAX)

def shear_x(level):
    transforms_list = []
    v = float_parameter(level, 0.3)
    transforms_list.append(vision.RandomAffine(degrees=0, shear=(-v, -v)))
    transforms_list.append(vision.RandomAffine(degrees=0, shear=(v, v)))
    return transforms.RandomChoice(transforms_list)

def shear_y(level):
    transforms_list = []
    v = float_parameter(level, 0.3)
    transforms_list.append(vision.RandomAffine(degrees=0, shear=(0, 0, -v, -v)))
    transforms_list.append(vision.RandomAffine(degrees=0, shear=(0, 0, v, v)))
    return transforms.RandomChoice(transforms_list)

def translate_x(level):
    transforms_list = []
    v = float_parameter(level, 150 / 331)
    transforms_list.append(vision.RandomAffine(degrees=0, translate=(-v, -v)))
    transforms_list.append(vision.RandomAffine(degrees=0, translate=(v, v)))
    return transforms.RandomChoice(transforms_list)

def translate_y(level):
    transforms_list = []
    v = float_parameter(level, 150 / 331)
    transforms_list.append(vision.RandomAffine(degrees=0, translate=(0, 0, -v, -v)))
    transforms_list.append(vision.RandomAffine(degrees=0, translate=(0, 0, v, v)))
    return transforms.RandomChoice(transforms_list)

def color_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return vision.RandomColor(degrees=(v, v))

def rotate_impl(level):
    transforms_list = []
    v = int_parameter(level, 30)
    transforms_list.append(vision.RandomRotation(degrees=(-v, -v)))
    transforms_list.append(vision.RandomRotation(degrees=(v, v)))
    return transforms.RandomChoice(transforms_list)

def solarize_impl(level):
    level = int_parameter(level, 256)
    v = 256 - level
    return vision.RandomSolarize(threshold=(0, v))

def posterize_impl(level):
    level = int_parameter(level, 4)
    v = 4 - level
    return vision.RandomPosterize(bits=(v, v))

def contrast_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return vision.RandomColorAdjust(contrast=(v, v))

def autocontrast_impl(level):
    return vision.AutoContrast()

def sharpness_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return vision.RandomSharpness(degrees=(v, v))

def brightness_impl(level):
    v = float_parameter(level, 1.8) + 0.1
    return vision.RandomColorAdjust(brightness=(v, v))
# define the Auto Augmentation policy
imagenet_policy = [
    [(posterize_impl(8), 0.4), (rotate_impl(9), 0.6)],
    [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
    [(vision.Equalize(), 0.8), (vision.Equalize(), 0.6)],
    [(posterize_impl(7), 0.6), (posterize_impl(6), 0.6)],
    [(vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
    [(vision.Equalize(), 0.4), (rotate_impl(8), 0.8)],
    [(solarize_impl(3), 0.6), (vision.Equalize(), 0.6)],
    [(posterize_impl(5), 0.8), (vision.Equalize(), 1.0)],
    [(rotate_impl(3), 0.2), (solarize_impl(8), 0.6)],
    [(vision.Equalize(), 0.6), (posterize_impl(6), 0.4)],
    [(rotate_impl(8), 0.8), (color_impl(0), 0.4)],
    [(rotate_impl(9), 0.4), (vision.Equalize(), 0.6)],
    [(vision.Equalize(), 0.0), (vision.Equalize(), 0.8)],
    [(vision.Invert(), 0.6), (vision.Equalize(), 1.0)],
    [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
    [(rotate_impl(8), 0.8), (color_impl(2), 1.0)],
    [(color_impl(8), 0.8), (solarize_impl(7), 0.8)],
    [(sharpness_impl(7), 0.4), (vision.Invert(), 0.6)],
    [(shear_x(5), 0.6), (vision.Equalize(), 1.0)],
    [(color_impl(0), 0.4), (vision.Equalize(), 0.6)],
    [(vision.Equalize(), 0.4), (solarize_impl(4), 0.2)],
    [(solarize_impl(5), 0.6), (autocontrast_impl(5), 0.6)],
    [(vision.Invert(), 0.6), (vision.Equalize(), 1.0)],
    [(color_impl(4), 0.6), (contrast_impl(8), 1.0)],
    [(vision.Equalize(), 0.8), (vision.Equalize(), 0.6)],
]
def create_dataset(dataset_path, train, repeat_num=1,
                   batch_size=32, shuffle=False, num_samples=5):
    # create a train or eval imagenet2012 dataset for ResNet-50
    dataset = ds.ImageFolderDataset(dataset_path, num_parallel_workers=8,
                                    shuffle=shuffle, decode=True)
    image_size = 224
    # define map operations
    if train:
        trans = RandomSelectSubpolicy(imagenet_policy)
    else:
        trans = [vision.Resize(256),
                 vision.CenterCrop(image_size)]
    type_cast_op = transforms.TypeCast(ms.int32)
    # map images and labels
    dataset = dataset.map(operations=[vision.Resize(256), vision.CenterCrop(image_size)], input_columns="image")
    dataset = dataset.map(operations=trans, input_columns="image")
    dataset = dataset.map(operations=type_cast_op, input_columns="label")
    # apply the batch and repeat operations
    dataset = dataset.batch(batch_size, drop_remainder=True)
    dataset = dataset.repeat(repeat_num)
    return dataset
# Define the path to the image folder directory.
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/ImageNetSimilar.tar.gz"
download(url, "./", kind="tar.gz", replace=True)

dataset = create_dataset(dataset_path="ImageNetSimilar",
                         train=True,
                         batch_size=5,
                         shuffle=False)

epochs = 5
columns = 5
rows = 5
fig = plt.figure(figsize=(8, 8))
itr = dataset.create_dict_iterator(num_epochs=epochs)

for ep_num in range(epochs):
    for data in itr:
        for index in range(rows):
            # place each epoch's batch of five images on its own row of the 5x5 grid
            fig.add_subplot(rows, columns, ep_num * columns + index + 1)
            plt.imshow(data['image'].asnumpy()[index])
plt.show()