Earth Observation in Oracle Cloud: Part 2

 Author: Philip Godfrey

 

What is Earth Observation data?

Earth Observation (EO) data is defined by the EU Science Hub as data that is “used to monitor and assess the status of, and changes in, the natural and manmade environment”.

With human civilization having an increasingly powerful influence on the Earth system, now seemed like the perfect time to explore what can be done with EO data in Oracle Cloud.

How is Earth Observation data captured?

Earth Observation is the process of gathering observations of the Earth's surface and atmosphere via remote sensing instruments. The data is typically in the form of digital imagery.

There are many ways to gather this type of information, using various remote sensing platforms. With Earth Observation we instantly think of space, but satellites are not the only option: the data can also be captured from drones or aircraft.

Using Earth Observation data in Oracle Cloud

In this blog series, we will explore a range of Oracle technologies and utilise a number of Oracle platforms:

  • Our journey begins in the Autonomous Data Warehouse (ADW), to store the data

  • We then move on to Oracle Data Science, to explore the data and apply Machine Learning to our Earth Observation data

  • We come back to ADW, to store the results in the database

  • To enable the business to see the results, we can present them in Oracle Analytics Cloud (OAC)

The first blog focused on loading the data into ADW and creating a machine learning model in Oracle Data Science. If you missed it, you can read it here.

This latest blog will focus on understanding model performance and saving the model for future use.

 

Machine Learning Model

In the last blog we created the Machine Learning model. As it’s a supervised Machine Learning model, which learns from historic data, we need to split our data into train, validation, and test datasets.

We will use these datasets to help evaluate the performance of our model.
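As a rough illustration of the split described above, here is a minimal sketch in plain Python. The function name `train_val_test_split`, the 70/15/15 proportions, and the sample file names are assumptions made for this example, not the code from the first blog.

```python
import random


def train_val_test_split(items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle items and split them into train, validation and test lists."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to less than 1")

    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical (image file, label) pairs standing in for the real dataset.
samples = [
    (f"image_{i}.jpg", "PermanentCrop" if i % 2 else "HerbaceousVegetation")
    for i in range(100)
]

train, val, test = train_val_test_split(samples)
print(len(train), len(val), len(test))
```

The model only ever learns from the train set. The validation set is used to compare versions of the model while tuning, and the test set is held back for a final check. This sketch splits at random; with uneven class sizes, a stratified split that keeps class proportions equal in each set may be a better choice.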

Confusion Matrix

Another way to understand how our model is performing is through a confusion matrix. A confusion matrix is “a technique for summarizing the performance of a classification algorithm”.

The 97% accuracy score suggests it’s performing very well, but what the score doesn’t tell us is what the model might be getting wrong. The confusion matrix can help us understand this, and we can see if there are any class imbalances in the dataset.

 

We can see that in a number of cases the model predicted HerbaceousVegetation when the image was actually PermanentCrop, so we may want to look at some of these images to check their quality.
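To make the idea concrete, the sketch below builds a small confusion matrix by hand. The labels reuse the two class names mentioned above, but the true and predicted values are made-up illustration data, and the helper names `confusion_matrix` and `accuracy` are assumptions for this example rather than a specific library's API.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix where rows are true labels and columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(true, pred)] for pred in labels] for true in labels]


def accuracy(matrix):
    """Share of predictions on the diagonal, i.e. predicted label equals true label."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


labels = ["HerbaceousVegetation", "PermanentCrop"]

# Made-up example data: one PermanentCrop image is predicted as HerbaceousVegetation.
y_true = ["PermanentCrop", "PermanentCrop", "HerbaceousVegetation",
          "HerbaceousVegetation", "PermanentCrop"]
y_pred = ["PermanentCrop", "HerbaceousVegetation", "HerbaceousVegetation",
          "HerbaceousVegetation", "PermanentCrop"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>22}: {row}")
print("accuracy:", accuracy(matrix))
```

Each off-diagonal cell counts one kind of mistake, so a large value there points to a pair of classes the model confuses. The row totals show how many examples each class has, which is where class imbalance becomes visible.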

Saving models

As Machine Learning models can take a long time to train, it’s great to know we’re able to load previously created models and re-use them later – with one line of code!
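The exact call depends on the framework used to build the model, which is not shown here. As a generic sketch, assuming the trained model is an ordinary Python object that can be pickled, saving and loading could look like this. The `model` dictionary, the helper names, and the file name `eo_model.pkl` are hypothetical stand-ins.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialise a model object to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously saved model object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a trained model: class names plus learned weights.
model = {
    "labels": ["HerbaceousVegetation", "PermanentCrop"],
    "weights": [[0.2, -0.1], [0.4, 0.3]],
}

model_path = Path(tempfile.gettempdir()) / "eo_model.pkl"
save_model(model, model_path)

# Later, possibly in a new session, the model comes back with a single call.
restored = load_model(model_path)
print(restored["labels"])
```

Only load pickle files from sources you trust, as unpickling can execute arbitrary code. Many machine learning frameworks also provide their own save and load functions, which are usually the better choice when available.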

Ways to improve model performance

Machine Learning models are very rarely complete in one iteration. You may work through several versions until you find one that suits the need without overfitting. Some tips to help improve model performance are:

  • Feature Engineering

      Sometimes, the key lies in the data itself. Adding new features or transforming existing ones can help the model capture more relevant patterns.

  • Cross-Validation

      Make sure you’re evaluating your model correctly by using cross-validation. It helps you detect overfitting and gives a better estimate of your model's true performance.

  • Data Augmentation

      If you’re working with images (like we are in this example) or text, then augmenting your data, such as rotating images or adding noise, can help your model learn more robust patterns.
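Two of the tips above, cross-validation and data augmentation, can be sketched in a few lines of plain Python. The function names below are assumptions made for this illustration, and images are represented as small lists of pixel rows to keep the example dependency-free. A real project would more likely use an image or machine learning library for both steps.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def rotate_90(image):
    """Rotate an image (list of pixel rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror an image left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the original image, its three rotations, and a mirrored copy."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate_90(variants[-1]))
    variants.append(flip_horizontal(image))
    return variants


# Tiny 2x2 "image" used purely for illustration.
image = [[1, 2],
         [3, 4]]
print(rotate_90(image))
print(len(augment(image)), "variants per image")

for fold, (train_idx, val_idx) in enumerate(k_fold_indices(10, 3)):
    print(f"fold {fold}: train={train_idx} validation={val_idx}")
```

Every sample is used for validation exactly once, so the average score across folds is less dependent on one lucky or unlucky split. If the two techniques are combined, augmented copies should be generated from the training indices of each fold only, so that variants of the same image never appear in both training and validation data. Shuffling the samples before splitting is also advisable when the data is ordered by class.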


There are countless strategies to enhance the performance of machine learning models, and it's important to remain open to experimentation. Don't shy away from testing new hypotheses or revisiting and refining your models over time. 

As new data becomes available and methodologies evolve, periodic reassessment can lead to significant improvements. Embracing an iterative approach allows you to adapt to changing circumstances and optimize your models for better accuracy and effectiveness.
