Author: Philip Godfrey
Oracle Analytics Cloud is a powerful
tool that enables organizations to make data-driven decisions by providing
advanced analytics capabilities.
You can work with data that already
exists in a database, but another common way to get started with Oracle
Analytics Cloud is to load your own dataset into the platform.
In this blog post, I will walk you
through the process of loading a CSV dataset into Oracle Analytics Cloud and
utilizing enrichments to enhance the value of your data.
Loading Data into OAC
Select your
dataset, which could be a local CSV or XLSX file, for example. In my case, this
will be a Disneyland Review document that has been sourced from openly
available data.
Once this has been selected, you will
be presented with an overview of the dataset, and you can confirm the data is
as you would expect.
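If you would like to sanity-check a CSV before uploading it, a quick local preview can help. The following is a minimal sketch using only the Python standard library. The column name Review_ID and the sample rows are my own assumptions made up for illustration; they are not taken from the real file.

```python
import csv
import io

# Made-up sample rows standing in for the real file (assumption).
# For a real file, use: open("your_file.csv", newline="", encoding="utf-8")
SAMPLE_CSV = """Review_ID,Rating,Year_Month,Reviewer_Location,Review_Text
1,4,2019-4,Australia,Great day out
2,5,2019-5,United States,Loved the parades
3,2,2018-12,United Kingdom,Too crowded
"""


def preview(handle, limit=5):
    """Return the header and the first `limit` rows of a CSV file-like object."""
    reader = csv.DictReader(handle)
    rows = []
    for index, row in enumerate(reader):
        if index >= limit:
            break
        rows.append(row)
    return reader.fieldnames, rows


columns, rows = preview(io.StringIO(SAMPLE_CSV), limit=2)
print(columns)
for row in rows:
    print(row)
```

This only confirms the header and a few rows look sensible; the overview screen in OAC remains the place to confirm how the platform itself has read the file.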
From this quick view, I can see that Review
ID and Rating are Measures, while the remaining fields (Year_Month,
Reviewer_Location and Review_Text) are Attributes.
What are the key differences between Attributes and Measures?
- Attributes are
typically categorical in nature, while measures are numerical.
- Attributes are used to describe the "what", while measures are used to measure the "how much" or "how many".
As Rating is a numerical value
on a fixed scale (1 to 5) and represents a quantitative measure, we will
keep it as a Measure.
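To make the Attribute versus Measure distinction concrete, here is a rough Python sketch of a naive rule: treat a column as a Measure when every non-empty value parses as a number, otherwise as an Attribute. This is only an illustration and not how OAC decides; the function names, the Review_ID column name and the sample rows are all assumptions of mine.

```python
def is_number(text):
    """Return True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def classify_columns(rows):
    """Label each column "Measure" if all non-empty values are numeric, else "Attribute"."""
    roles = {}
    for name in rows[0]:
        values = [row[name] for row in rows if row[name] != ""]
        numeric = bool(values) and all(is_number(value) for value in values)
        roles[name] = "Measure" if numeric else "Attribute"
    return roles


# Made-up sample rows for illustration (assumption, not the real data).
sample_rows = [
    {"Review_ID": "1", "Rating": "4", "Year_Month": "2019-4",
     "Reviewer_Location": "Australia", "Review_Text": "Great day out"},
    {"Review_ID": "2", "Rating": "5", "Year_Month": "2019-5",
     "Reviewer_Location": "United States", "Review_Text": "Loved the parades"},
]

for column, role in classify_columns(sample_rows).items():
    print(f"{column}: {role}")
```

Notice that a purely numeric rule like this also labels an identifier such as Review_ID as a Measure, which is why it is worth reviewing each column's role rather than accepting the default.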
Enriching your dataset with Oracle Recommendations
Oracle
Analytics provides additional advanced analytic capabilities with the Recommendations
feature.
This
provides a list of built-in data recommendations that can enrich our dataset.
These can include breaking dates out into days, months, quarters or weekends,
or adding geospatial features such as longitude and latitude coordinates.
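OAC applies these enrichments for you in the UI, so no code is needed. Purely to illustrate what a date breakout produces, here is a small Python sketch that splits a year and month value into year, month and quarter. The "YYYY-M" format and the function name are assumptions of mine, and this is not OAC's implementation.

```python
def split_year_month(value):
    """Split a 'YYYY-M' string (assumed format) into year, month and quarter."""
    year_text, month_text = value.split("-")
    year, month = int(year_text), int(month_text)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {value!r}")
    # Months 1-3 fall in quarter 1, 4-6 in quarter 2, and so on.
    quarter = (month - 1) // 3 + 1
    return {"year": year, "month": month, "quarter": quarter}


print(split_year_month("2019-4"))   # {'year': 2019, 'month': 4, 'quarter': 2}
print(split_year_month("2018-12"))  # {'year': 2018, 'month': 12, 'quarter': 4}
```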
In our
example, we will apply the recommendation to enrich Reviewer Location with capital, which
adds context that may be useful when presenting the analytics.
Some best practices to keep in mind when loading
and enriching your dataset in OAC:
- Use consistent column
names and data types across all datasets.
- Define clear data
quality rules to ensure data accuracy.
- Use data profiling to
identify missing values and anomalies.
- Apply data transformation and aggregation functions carefully to avoid losing valuable information.
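As a rough illustration of the profiling tip above, the following Python sketch counts missing values per column before upload. The list of markers treated as missing, the Review_ID column name and the sample rows are all assumptions made up for this example; adjust them to match your own data.

```python
# Values treated as "missing" after trimming and lower-casing (assumption).
MISSING_MARKERS = ("", "na", "n/a", "null", "missing")


def missing_counts(rows, markers=MISSING_MARKERS):
    """Count, per column, how many rows hold a missing-looking value."""
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            if value is None or value.strip().lower() in markers:
                counts[name] += 1
    return counts


# Made-up sample rows for illustration (assumption, not the real data).
sample_rows = [
    {"Review_ID": "1", "Rating": "4", "Year_Month": "2019-4",
     "Reviewer_Location": "Australia", "Review_Text": "Great day out"},
    {"Review_ID": "2", "Rating": "5", "Year_Month": "missing",
     "Reviewer_Location": "United States", "Review_Text": "Loved the parades"},
    {"Review_ID": "3", "Rating": "2", "Year_Month": "2018-12",
     "Reviewer_Location": "", "Review_Text": "Too crowded"},
]

for column, count in missing_counts(sample_rows).items():
    print(f"{column}: {count} missing")
```

A check like this is a quick first pass; it complements, rather than replaces, the profiling you can do once the dataset is loaded.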
Now that we’re
happy with the dataset, we’ll give it a name and a description and then save
it.
Look out
for the next blog in the series to see how we utilize this dataset in a Data Flow
and apply a data preparation step, Analyze Sentiment.