Sync data from external storage

Integrate popular cloud and external storage systems with Label Studio to collect new items uploaded to the buckets, containers, databases, or directories and return the annotation results so that you can use them in your machine learning pipelines.

Set up the following cloud and other storage systems with Label Studio:

  • Amazon S3
  • Google Cloud Storage
  • Microsoft Azure Blob storage
  • Redis database
  • Local storage

    When working with an external cloud storage connection, keep the following in mind:

  • Label Studio doesn’t import the data stored in the bucket, but instead creates references to the objects. Therefore, you must have full access control on the data to be synced and shown on the labeling screen.
  • Sync operations with external buckets go only one way. A sync either creates tasks from objects in the bucket (source storage) or pushes annotations to the output bucket (target storage). Changing objects on the bucket side doesn’t guarantee consistent results.
  • We recommend using a separate bucket folder for each Label Studio project.
  • For more troubleshooting information, see Troubleshooting Import, Export, & Storage in the HumanSignal support center.

    How external storage connections and sync work

    You can add source storage connections to sync data from an external source to a Label Studio project, and add target storage connections to sync annotations from Label Studio to external storage. Each source and target storage setup is project-specific. You can connect multiple buckets, containers, databases, or directories as source or target storage for a project.

    Source storage

    Label Studio does not automatically sync data from source storage. If you upload new data to a connected cloud storage bucket, sync the storage connection using the UI to add the new labeling tasks to Label Studio without restarting. You can also use the API to set up or sync storage connections. See Label Studio API and locate the relevant storage connection type.
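As a rough illustration of triggering a sync through the API rather than the UI, the sketch below builds and sends a sync request using only the Python standard library. It assumes the endpoint has the form `/api/storages/<type>/<id>/sync` and that the server accepts an `Authorization: Token <key>` header; confirm both against the Label Studio API reference for your version. The base URL, token, storage type, and storage ID are placeholder values, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_sync_request(base_url, token, storage_type, storage_id):
    """Build (but do not send) a POST request that asks Label Studio to sync
    one source storage connection.

    Assumption: the endpoint path and token header follow the pattern
    described in the lead-in; verify them in the API reference.
    """
    url = f"{base_url.rstrip('/')}/api/storages/{storage_type}/{storage_id}/sync"
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


def sync_storage(base_url, token, storage_type, storage_id, timeout=30):
    """Send the sync request and return the decoded JSON response."""
    request = build_sync_request(base_url, token, storage_type, storage_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder values: replace with your own host, access token,
    # storage type (for example "s3"), and storage connection ID.
    request = build_sync_request("http://localhost:8080", "YOUR_TOKEN", "s3", 1)
    print(request.get_method(), request.full_url)
```

Calling `sync_storage(...)` after uploading new objects to the bucket would play the same role as clicking the sync button in the UI, assuming the endpoint behaves as described.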

    Task data synced from cloud storage is not stored in Label Studio. Instead, the data is accessed using a URL. You can also secure access to cloud storage using cloud storage credentials. For details, see Secure access to cloud storage.

    Source storage permissions