A session token is required only when you use temporary security credentials. See Requesting temporary security credentials in the AWS Identity and Access Management documentation.
Assign the following role policy to an account you set up to retrieve source tasks and store annotations in S3, replacing <your_bucket_name> with your bucket name:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::<your_bucket_name>",
        "arn:aws:s3:::<your_bucket_name>/*"
      ]
    }
  ]
}
note
"s3:PutObject" is only needed for target storage connections, and "s3:DeleteObject" is only needed for target storage connections in Label Studio Enterprise where you want annotations deleted in Label Studio to also be deleted in the target S3 bucket.
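The policy and the note above can be combined into a small generator. The following is a minimal Python sketch, not part of Label Studio: the function name build_s3_policy and its two flags are hypothetical, and it only assembles the JSON document so that s3:PutObject and s3:DeleteObject are included when the note says they are needed.

```python
import json


def build_s3_policy(bucket_name, target_storage=False, sync_deletes=False):
    """Return an IAM policy dict for a Label Studio S3 connection.

    bucket_name:    name of the S3 bucket (replaces <your_bucket_name>).
    target_storage: include s3:PutObject, needed only for target storage.
    sync_deletes:   include s3:DeleteObject, needed only for Label Studio
                    Enterprise target storage that mirrors deletions.
    """
    actions = ["s3:ListBucket", "s3:GetObject"]
    if target_storage:
        actions.append("s3:PutObject")
        if sync_deletes:
            actions.append("s3:DeleteObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # "my-labeling-bucket" is a placeholder bucket name.
    print(json.dumps(build_s3_policy("my-labeling-bucket"), indent=2))
```

Calling build_s3_policy with both flags set to True yields the same four actions as the full policy shown above.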
Set up cross-origin resource sharing (CORS) access to your bucket, using a policy that allows GET access from the same host name as your Label Studio deployment. See Configuring cross-origin resource sharing (CORS) in the Amazon S3 User Guide. Use or modify the following example:
[
  {
    "AllowedHeaders": ["*"],
    "AllowedMethods": ["GET"],
    "AllowedOrigins": ["*"],
    "ExposeHeaders": [
      "x-amz-server-side-encryption",
      "x-amz-request-id",
      "x-amz-id-2"
    ],
    "MaxAgeSeconds": 3000
  }
]
After you configure access to your S3 bucket, do the following to set up Amazon S3 as a data source connection:
- Open Label Studio in your web browser.
- For a specific project, open Settings > Cloud Storage.
- Click Add Source Storage.
- In the dialog box that appears, select Amazon S3 as the storage type.
- In the Storage Title field, type a name for the storage to appear in the Label Studio UI.
- Specify the name of the S3 bucket, and if relevant, the bucket prefix to specify an internal folder or container.
- Adjust the remaining parameters:
- In the File Filter Regex field, specify a regular expression to filter bucket objects. Use .* to collect all objects.
- In the Region Name field, specify the AWS region name. For example, us-east-1.
- (Optional) In the S3 Endpoint field, specify an S3 endpoint if you want to override the URL created by S3 to access your bucket.
- In the Access Key ID field, specify the access key ID of the temporary security credentials for an AWS account with access to your S3 bucket.
- In the Secret Access Key field, specify the secret key of the temporary security credentials for an AWS account with access to your S3 bucket.
- In the Session Token field, specify a session token of the temporary security credentials for an AWS account with access to your S3 bucket.
- (Optional) Enable Treat every bucket object as a source file if your bucket contains BLOB storage files such as JPG, MP3, or similar file types. This setting creates a URL for each bucket object to use for labeling. Leave this option disabled if you have multiple JSON files in the bucket with one task per JSON file.
- (Optional) Enable Recursive scan to perform recursive scans over the bucket contents if you have nested folders in your S3 bucket.
- Choose whether to disable Use pre-signed URLs.
- If this option is on, all s3://… links are resolved on the fly and converted to https URLs.
- If this option is off, all s3://… objects are preloaded into Label Studio tasks as base64 codes. This is not recommended, because the task payload becomes very large and the UI slows down. It also requires GET permissions on your storage.
- Adjust the counter for how many minutes the pre-signed URLs are valid.
- Click Add Storage.
After adding the storage, click Sync to collect tasks from the bucket, or make an API call to sync import storage.
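The same settings can be gathered into a request body if you prefer to create the source storage through the API. The following Python sketch only builds the dictionary. The key names (regex_filter, use_blob_urls, presign_ttl, and so on) are assumptions modeled on the UI fields above, and the function name is hypothetical, so check them against the Label Studio API reference before use.

```python
import json


def s3_import_storage_payload(project_id, title, bucket, credentials,
                              prefix="", regex_filter=".*",
                              region_name="us-east-1", s3_endpoint=None,
                              use_blob_urls=True, recursive_scan=False,
                              presign=True, presign_ttl=1):
    """Collect the source storage settings from the steps above in one dict.

    credentials is a dict with "access_key_id", "secret_access_key" and,
    for temporary security credentials only, "session_token".
    All payload key names are assumptions, not a confirmed API schema.
    """
    if presign and presign_ttl < 1:
        raise ValueError("presign_ttl is in minutes and must be at least 1")
    payload = {
        "project": project_id,
        "title": title,
        "bucket": bucket,
        "prefix": prefix,
        "regex_filter": regex_filter,      # File Filter Regex
        "region_name": region_name,        # Region Name
        "use_blob_urls": use_blob_urls,    # Treat every bucket object as a source file
        "recursive_scan": recursive_scan,  # Recursive scan
        "presign": presign,                # Use pre-signed URLs
        "presign_ttl": presign_ttl,        # validity of pre-signed URLs, in minutes
        "aws_access_key_id": credentials["access_key_id"],
        "aws_secret_access_key": credentials["secret_access_key"],
    }
    # The session token is only sent when temporary credentials are used.
    if credentials.get("session_token"):
        payload["aws_session_token"] = credentials["session_token"]
    # The endpoint is optional and only overrides the default S3 URL.
    if s3_endpoint:
        payload["s3_endpoint"] = s3_endpoint
    return payload


if __name__ == "__main__":
    # All values below are placeholders.
    creds = {"access_key_id": "AKIA...", "secret_access_key": "..."}
    body = s3_import_storage_payload(1, "Raw images", "my-labeling-bucket", creds)
    print(json.dumps(body, indent=2))
```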
After you configure access to your S3 bucket, do the following to set up Amazon S3 as a target storage connection:
- Open Label Studio in your web browser.
- For a specific project, open Settings > Cloud Storage.
- Click Add Target Storage.
- In the dialog box that appears, select Amazon S3 as the storage type.
- In the Storage Title field, type a name for the storage to appear in the Label Studio UI.
- Specify the name of the S3 bucket, and if relevant, the bucket prefix to specify an internal folder or container.
- Adjust the remaining parameters:
- In the Region Name field, specify the AWS region name. For example, us-east-1.
- (Optional) In the S3 Endpoint field, specify an S3 endpoint if you want to override the URL created by S3 to access your bucket.
- In the Access Key ID field, specify the access key ID of the temporary security credentials for an AWS account with access to your S3 bucket.
- In the Secret Access Key field, specify the secret key of the temporary security credentials for an AWS account with access to your S3 bucket.
- In the Session Token field, specify a session token of the temporary security credentials for an AWS account with access to your S3 bucket.
- Click Add Storage.
After adding the storage, click Sync to export annotations to the bucket, or make an API call to sync export storage.
- Create new import storage then sync the import storage.
- See Create export storage and after annotating, sync the export storage.
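The create-then-sync sequence in the two items above could look like the following Python sketch. It builds the HTTP requests without sending them. The endpoint paths, the "Token" authorization scheme, and the helper names are assumptions for illustration; confirm them in the Label Studio API reference for your version.

```python
import json
import urllib.request

# Assumed endpoint paths for S3 import and export storage.
IMPORT_PATH = "/api/storages/s3"
EXPORT_PATH = "/api/storages/export/s3"


def sync_path(storage_path, storage_id):
    """Return the assumed sync endpoint for an existing storage."""
    return f"{storage_path}/{storage_id}/sync"


def storage_request(base_url, token, path, payload=None):
    """Build (but do not send) a POST request to a Label Studio API path."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder host, token, and storage ID.
    base, token = "http://localhost:8080", "YOUR_API_TOKEN"
    create = storage_request(base, token, IMPORT_PATH,
                             {"project": 1, "bucket": "my-labeling-bucket"})
    # The storage ID would come from the response to the create request.
    sync = storage_request(base, token, sync_path(IMPORT_PATH, 42))
    for req in (create, sync):
        print(req.get_method(), req.full_url)
    # To send a request: urllib.request.urlopen(create)
```

For target storage, the same helper would be called with EXPORT_PATH after annotating.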
For details about how Label Studio secures access to cloud storage, see Secure access to cloud storage.
To connect your GCS bucket with Label Studio, set up the following:
- Enable programmatic access to your bucket. See Cloud Storage Client Libraries in the Google Cloud Storage documentation for how to set up access to your GCS bucket.
- Set up authentication to your bucket. Your account must have the Service Account Token Creator and Storage Object Viewer roles and storage.buckets.get access permission. See Setting up authentication and IAM permissions for Cloud Storage in the Google Cloud Storage documentation.
- If you’re using a service account to authorize access to the Google Cloud Platform, make sure to activate it. See gcloud auth activate-service-account in the Google Cloud SDK: Command Line Interface documentation.
- Set up cross-origin resource sharing (CORS) access to your bucket, using a policy that allows GET access from the same host name as your Label Studio deployment. See Configuring cross-origin resource sharing (CORS) in the Google Cloud User Guide. Use or modify the following example:
echo '[
   {
      "origin": ["*"],
      "method": ["GET"],
      "responseHeader": ["Content-Type","Access-Control-Allow-Origin"],
      "maxAgeSeconds": 3600
   }
]' > cors-config.json
Replace YOUR_BUCKET_NAME with your actual bucket name in the following command to update CORS for your bucket:
gsutil cors set cors-config.json gs://YOUR_BUCKET_NAME
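The example above allows every origin. If you want to restrict GET access to the host name of your Label Studio deployment, as the CORS step describes, the file can be generated instead of typed by hand. This is a minimal Python sketch: the function names and the example host https://label-studio.example.com are hypothetical, and the keys are the ones used in the example above.

```python
import json


def gcs_cors_config(origins, max_age_seconds=3600):
    """Return the CORS rule list in the shape used by cors-config.json."""
    if not origins:
        raise ValueError("at least one origin is required")
    return [
        {
            "origin": list(origins),
            "method": ["GET"],
            "responseHeader": ["Content-Type", "Access-Control-Allow-Origin"],
            "maxAgeSeconds": max_age_seconds,
        }
    ]


def write_cors_config(path, origins):
    """Write the CORS rules to a file that gsutil cors set can read."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(gcs_cors_config(origins), fh, indent=2)


if __name__ == "__main__":
    # Placeholder host name for the Label Studio deployment.
    write_cors_config("cors-config.json", ["https://label-studio.example.com"])
```

The resulting cors-config.json can then be applied with the gsutil command shown above.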
After adding the storage, click Sync to collect tasks from the bucket, or make an API call to sync import storage.
- Create new import storage then sync the import storage.
- See Create export storage and after annotating, sync the export storage.
Connect your Microsoft Azure Blob storage container with Label Studio. For details about how Label Studio secures access to cloud storage, see Secure access to cloud storage.
- Choose whether to disable Use pre-signed URLs, or shared access signatures. If your tasks contain azure-blob://… links, they must be pre-signed in order to be displayed in the browser.
- Adjust the counter for how many minutes the shared access signatures are valid.
- Click Add Storage.
- Repeat these steps for Target Storage to sync completed data annotations to a container.
After adding the storage, click Sync to collect tasks from the container, or make an API call to sync import storage.
- Create new import storage then sync the import storage.
- See Create export storage and after annotating, sync the export storage.
You can also store your tasks and annotations in a Redis database. You must store the tasks and annotations in different databases. You might want to use a Redis database if you find that relying on a file-based cloud storage connection is slow for your datasets.
Currently, this configuration is only supported if you host the Redis database in the default mode, with the default IP address.
Label Studio does not manage the Redis database for you. See the Redis Quick Start for details about hosting and managing your own Redis database. Because Redis is an in-memory database, data saved in Redis does not persist. To make sure you don’t lose data, set up Redis persistence or use another method to persist the data, such as using Redis in the cloud with Microsoft Azure or Amazon AWS.
After adding the storage, click Sync to collect tasks from the database, or make an API call to sync import storage.
- Create new import storage then sync the import storage.
- See Create export storage and after annotating, sync the export storage.
See Set environment variables for more about using environment variables.
See Run Label Studio on Docker and use local storage.
note
If you are using Windows, ensure that you use backslashes when entering your Absolute local path.
- (Optional) In the File Filter Regex field, specify a regular expression to filter bucket objects. Use .* to collect all objects.
- (Optional) Toggle Treat every bucket object as a source file.
- Enable this option if you want to create Label Studio tasks from media files automatically, such as JPG, MP3, or similar file types. Use this option for labeling configurations with one source tag.
- Disable this option if you want to import tasks in Label Studio JSON format directly from your storage. Use this option for complex labeling configurations with HyperText or multiple source tags.
- Click Add Storage.
- Repeat these steps for Add Target Storage to use a local file directory for exporting.
After adding the storage, click Sync to collect tasks from the bucket, or make an API call to sync import storage.
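To preview which objects a File Filter Regex would keep before you sync, you can test the pattern locally. The sketch below assumes the pattern is matched from the start of each object key (Python's re.match); the matching rule Label Studio applies may differ, and the key names are invented for illustration.

```python
import re


def filter_keys(keys, pattern=".*"):
    """Return the object keys that a File Filter Regex would keep.

    Assumes the pattern is applied with re.match, that is, anchored at the
    start of the key; the actual matching rule may differ.
    """
    regex = re.compile(pattern)
    return [key for key in keys if regex.match(key)]


if __name__ == "__main__":
    # Hypothetical object keys.
    keys = ["images/1.jpg", "images/2.png", "audio/1.wav", "tasks/batch.json"]
    print(filter_keys(keys))                     # .* keeps every key
    print(filter_keys(keys, r".*\.(jpg|png)$"))  # keeps the two image keys
    print(filter_keys(keys, r"audio/.*"))        # keeps only audio/1.wav
```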
- All file paths must begin with /data/local-files/?d=.
- In the following example, the first directory is dataset1. For instance, if you have mixed data types in tasks, including
  - audio files 1.wav, 2.wav within an audio folder and
  - image files 1.jpg, 2.jpg within an images folder,
  construct the paths as follows:
[
  {
    "id": 1,
    "data": {
      "audio": "/data/local-files/?d=dataset1/audio/1.wav",
      "image": "/data/local-files/?d=dataset1/images/1.jpg"
    }
  },
  {
    "id": 2,
    "data": {
      "audio": "/data/local-files/?d=dataset1/audio/2.wav",
      "image": "/data/local-files/?d=dataset1/images/2.jpg"
    }
  }
]
There are several ways to add your custom tasks: the API, the web interface, or another storage. The simplest is to use the Import button on the Data Manager page. Drag and drop your JSON file into the window, then click the blue Import button.
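For larger datasets, a task file like the one in the example above can be generated rather than written by hand. This Python sketch assumes the same layout (an audio folder and an images folder under dataset1, with matching file names); the helper names are hypothetical.

```python
import json

# Every local storage file reference must start with this prefix.
LOCAL_FILES_PREFIX = "/data/local-files/?d="


def local_file_url(*parts):
    """Join path parts into a local storage file reference."""
    return LOCAL_FILES_PREFIX + "/".join(p.strip("/") for p in parts)


def build_tasks(dataset, stems):
    """Create one task per file stem, pairing an audio and an image file."""
    tasks = []
    for task_id, stem in enumerate(stems, start=1):
        tasks.append({
            "id": task_id,
            "data": {
                "audio": local_file_url(dataset, "audio", f"{stem}.wav"),
                "image": local_file_url(dataset, "images", f"{stem}.jpg"),
            },
        })
    return tasks


if __name__ == "__main__":
    # Write the two example tasks to a file that can be imported.
    with open("tasks.json", "w", encoding="utf-8") as fh:
        json.dump(build_tasks("dataset1", ["1", "2"]), fh, indent=2)
```

The resulting tasks.json can be dragged into the Import window on the Data Manager page.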
- Create new import storage then sync the import storage.
- See Create export storage and after annotating, sync the export storage.
See Run Label Studio on Docker and use local storage.
See Troubleshooting Import, Export, and Storage in the HumanSignal support center.