Author: Philip Godfrey
Oracle Analytics Cloud is a powerful
tool that enables organizations to make data-driven decisions by providing
advanced analytics capabilities.
You can work with data that already exists in a database, but another common way
to get started with Oracle Analytics Cloud is to load your own dataset into the
platform.
In this blog post, I will walk you
through the process of creating a Data Flow in Oracle Analytics Cloud and applying
Machine Learning to generate an output from your data.
Before we jump in, it is worth considering why you would use Data Flows.
Key Benefits of Data Flows
There are many benefits we could list here, but these are some of the main ones:
- Simplified Data Integration: the ability to integrate data from multiple sources, allowing you to easily combine and transform data.
- Streamlined Data Management: enables you to manage your data in a single place, reducing the complexity of managing multiple data sources.
- Faster Insights: automating data integration and transformation shortens the time it takes to get insights from your data, enabling timely business decisions.
- Increased Data Consistency: reproducibility helps keep your data consistent across systems and applications, reducing errors and improving data quality.
Creating a Data Flow in OAC
Select the dataset to begin your data flow. In my case, this will be a Disneyland Review dataset we created in the previous blog.
Once this has been selected, you will be presented with an overview of the dataset, and you can confirm which columns you would like to include as part of the Data Flow.
As you can see down the left-hand side
of the page, there are several options you can include within your data flow,
making it fully flexible to your requirements.
Adding steps to a Data Flow
As the name suggests, a Data Flow is a sequence of steps used to generate an
output. To add a step, you can either use the left-hand side menu or click the
+ button after the existing step.
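To make the idea of a Data Flow concrete, here is a minimal Python sketch of the same concept: an ordered list of steps where the output of one step feeds the next. This is only an analogy and not how Oracle Analytics Cloud is implemented or scripted; the function names, the Review_ID and Branch columns, and the sample row are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


def select_columns(columns: List[str]) -> Step:
    """Return a step that keeps only the named columns."""
    def step(rows: List[Row]) -> List[Row]:
        return [{c: row[c] for c in columns} for row in rows]
    return step


def add_constant_column(name: str, value: str) -> Step:
    """Return a step that appends a new column to every row."""
    def step(rows: List[Row]) -> List[Row]:
        return [{**row, name: value} for row in rows]
    return step


def run_flow(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply each step in order; the output of one feeds the next."""
    for step in steps:
        rows = step(rows)
    return rows


# Hypothetical sample data; only Review_Text is a column named in this post.
reviews = [
    {"Review_ID": "1", "Review_Text": "Great day out", "Branch": "Paris"},
]
flow = [
    select_columns(["Review_ID", "Review_Text"]),
    add_constant_column("source", "demo"),
]
result = run_flow(reviews, flow)
```

Each step builds new rows rather than editing the input, so the starting data stays as it was.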
If we click the + button, we are
presented with a window where we can select another step. In this case, we’ll
apply some Machine Learning and add in Analyze Sentiment.
Analyze Sentiment
Once the step has been added to the Data Flow, we need to select the column
whose sentiment we want to analyze.
In the Disney dataset we’re working
with, this will be Review_Text.
As the screenshot above suggests, this step will create a column, named emotion, and will append this to our Disney Review Dataset.
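To show what appending an emotion column means in practice, here is a toy Python sketch. It is not the model Oracle Analytics Cloud uses: the keyword lists, the Positive / Negative / Neutral labels, and the function names are assumptions made up for illustration, and real sentiment analysis is far more sophisticated than counting words.

```python
from typing import Dict, List

# Toy word lists, invented for this illustration only.
POSITIVE = {"great", "amazing", "love", "fun"}
NEGATIVE = {"terrible", "awful", "crowded", "expensive"}


def toy_sentiment(text: str) -> str:
    """Label text Positive, Negative, or Neutral by counting toy keywords."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


def add_emotion_column(
    rows: List[Dict[str, str]], text_column: str = "Review_Text"
) -> List[Dict[str, str]]:
    """Return new rows with an 'emotion' column appended; inputs are not modified."""
    return [{**row, "emotion": toy_sentiment(row[text_column])} for row in rows]


# Hypothetical review texts, not taken from the real dataset.
reviews = [
    {"Review_Text": "Great rides, we love it!"},
    {"Review_Text": "Crowded and expensive."},
    {"Review_Text": "We visited in May."},
]
scored = add_emotion_column(reviews)
```

Every original column is kept, and each row simply gains one extra field named emotion.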
Save Data
The final step in our Data Flow is to
save the output, by adding the step Save Data.
As before, we can drag and drop this in, or add it in using the + button.
This will generate a new dataset, which we’ve named Disney Review Sentiment.
It is created in addition to our original dataset, which is not touched by
this process.
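The same "write a new dataset, leave the original alone" behaviour can be sketched in Python with CSV files. This is an analogy only, not what Oracle Analytics Cloud does internally; the file names, the helper function, and the hard-coded emotion value are hypothetical.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def save_as_new_dataset(rows: List[Dict[str, str]], target: Path) -> None:
    """Write rows to a new CSV file; refuse to overwrite an existing file."""
    fieldnames = list(rows[0].keys())
    # Mode "x" fails if the target already exists, so a source file cannot be clobbered.
    with target.open("x", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical source file standing in for the original dataset.
workdir = Path(tempfile.mkdtemp())
source = workdir / "disney_review.csv"
source.write_text("Review_Text\nGreat rides\n", encoding="utf-8")

with source.open(newline="", encoding="utf-8") as handle:
    rows = list(csv.DictReader(handle))

# Stand-in for the sentiment step: append a fixed emotion value to each row.
enriched = [{**row, "emotion": "Positive"} for row in rows]
output = workdir / "disney_review_sentiment.csv"
save_as_new_dataset(enriched, output)
```

The source file is only ever read, and the enriched rows land in a separate file.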
We can also set default aggregations on our newly created dataset and choose
how to treat each column (attribute / measure), which we covered in the last blog.
Save Data Flow
As that was the last step in our Data Flow, we now want to save the Data Flow
itself. This is very useful: if I want to run it again, I won’t need to start
from scratch.
All we need to do is provide a Data
Flow name and click OK.
You will then find the Data Flow has been created in the Data Flow tab.

Look out
for the next blog in the series which digs a little deeper on the Data
Preparation step, Analyze Sentiment, to understand emotion within text data.







