Data Driven Series: A Comprehensive Guide to Oracle Data Labelling Service: Part 1

Author: Phil Godfrey

What is Data Labelling?

Data Labelling is the process of “identifying properties (labels) of documents, text, and images, and annotating them (labelling)”.

What are some examples?

Almost anything can be labelled, whether it takes the form of text, documents, or images. For example:

The topic of a news article
The sentiment of a tweet
Objects identified within an image
and many more
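
As a rough illustration of the examples above, a labelled record can be thought of as a piece of data paired with one or more labels. The `LabelledRecord` class, its field names, and the sample values below are hypothetical and chosen purely for illustration; they are not the Data Labelling service's own schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LabelledRecord:
    """Hypothetical container pairing a piece of data with its labels."""
    content: str  # the text itself, or a path to an image or document
    labels: List[str] = field(default_factory=list)


# One record per example in the list above (illustrative values only).
records = [
    LabelledRecord("Central bank raises interest rates", ["finance"]),
    LabelledRecord("Loving the new release!", ["positive"]),
    LabelledRecord("street_scene.jpg", ["car", "pedestrian"]),
]

for record in records:
    print(f"{record.content!r} -> {record.labels}")
```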

Data Labelling Service

The OCI Data Labelling service is an OCI-native service that allows customers and business users to label their data. It includes built-in functionality to:

Create and browse datasets
View data records (text, images)
Apply labels for the purposes of building AI/ML models.

The service also provides interactive user interfaces designed to aid the labelling process, including a tool for drawing the bounding boxes used for object detection within images.
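
To make the idea of a bounding box concrete, the sketch below converts a box drawn in pixels into four corner points expressed as fractions of the image size, which is a common way for labelling tools to store boxes independently of resolution. This post does not describe the service's own annotation format, so the function name `normalise_box` and the corner ordering are assumptions made for illustration.

```python
def normalise_box(left, top, width, height, image_width, image_height):
    """Return the four corners of a pixel box as (x, y) fractions of the image size.

    Corners are listed clockwise from the top-left (an illustrative choice).
    """
    x0 = left / image_width
    y0 = top / image_height
    x1 = (left + width) / image_width
    y1 = (top + height) / image_height
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


# A 100x200 pixel box starting at (50, 100) inside a 200x400 pixel image.
corners = normalise_box(50, 100, 100, 200, 200, 400)
print(corners)
```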

Prerequisites

Administrators with appropriate privileges must create:

A compartment to be used by the Data Labelling service
Object storage buckets to store the data labelling outputs
IAM policies for the Data Labelling service
A dynamic group to control access

Example policy statements for the dynamic group are included below:

allow dynamic-group <group_name> to read buckets in compartment <compartment_name>

allow dynamic-group <group_name> to read objects in compartment <compartment_name>

allow dynamic-group <group_name> to manage objects in compartment <compartment_name> where any {request.permission='OBJECT_CREATE'}
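
Because the three statements differ only in the dynamic group and compartment names, they can be generated from templates. The sketch below shows one way to do that. The function name `build_policy_statements` and the sample names `dls-dynamic-group` and `dls-compartment` are assumptions made for illustration, and the generated text should still be reviewed before it is added to an IAM policy.

```python
def build_policy_statements(group_name, compartment_name):
    """Fill the example policy templates with a dynamic group and compartment name."""
    templates = [
        "allow dynamic-group {group} to read buckets in compartment {compartment}",
        "allow dynamic-group {group} to read objects in compartment {compartment}",
        # Doubled braces produce the literal braces required by the policy syntax.
        "allow dynamic-group {group} to manage objects in compartment {compartment} "
        "where any {{request.permission='OBJECT_CREATE'}}",
    ]
    return [t.format(group=group_name, compartment=compartment_name) for t in templates]


# Hypothetical names, used only to show the output shape.
for statement in build_policy_statements("dls-dynamic-group", "dls-compartment"):
    print(statement)
```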

Accessing the Data Labelling Service

In OCI, navigate to Analytics & AI, and under the Machine Learning subheading, you will find Data Labeling.

Once you click here, provided the necessary prerequisites are in place, you should see the overview of the Data Labelling service.

This page has lots of useful resources, ranging from video tutorials that introduce the service in simple, easy-to-consume terms to more detailed release notes and documentation.

On the left-hand side, you have access to three areas:

Overview – the landing page described above
Datasets – the area where datasets are stored
Work Requests – where you can view any work requests initiated by the service

Datasets

Datasets are a critical component of the Data Labelling service. A dataset is the data you want to label, whether that’s a set of documents, images, or text.

There are two options available when working with datasets. You can either:

Create a dataset from scratch
Import a previously annotated dataset

Supported file formats

Dataset Type	Supported File Types
Images	JPEG, JPG and PNG
Text	CSV, TEXT and TXT
Documents	PDF, TIF, TIFF, JPEG, JPG and PNG
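
Before uploading files, it can be handy to check them against the table above. The sketch below does so by file extension only. The helper name `is_supported` is an assumption, the extension sets are copied from the table as written, and the check says nothing about whether a file's contents are actually valid.

```python
from pathlib import Path

# Extensions copied from the supported file formats table above.
SUPPORTED_EXTENSIONS = {
    "images": {"jpeg", "jpg", "png"},
    "text": {"csv", "text", "txt"},
    "documents": {"pdf", "tif", "tiff", "jpeg", "jpg", "png"},
}


def is_supported(filename, dataset_type):
    """Return True if the file's extension is listed for the given dataset type."""
    extension = Path(filename).suffix.lstrip(".").lower()
    return extension in SUPPORTED_EXTENSIONS[dataset_type.lower()]


print(is_supported("photo.JPG", "Images"))
print(is_supported("report.pdf", "Images"))
```

The lookup is case-insensitive for both the extension and the dataset type, and an unrecognised dataset type raises a `KeyError` rather than silently returning `False`.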

Join us for the next blog in this series, where we will create a dataset and begin to work with the Data Labelling service.

The Oracle Alchemist: Turning Data into Insights
