In the agriculture sector, identifying and counting the fruit on trees plays an important role in crop estimation. Renting and leasing trees is becoming popular: a tree owner leases the tree every year before the harvest at a price based on the estimated fruit yield. Manually counting fruit, the common practice, is time-consuming and labor-intensive. It’s one of the hardest but most important tasks for getting better results from a crop management system. An estimate of the amount of fruit and flowers helps farmers make better decisions, not only on leasing prices but also on cultivation practices and plant disease prevention.

This is where an automated machine learning (ML) solution for computer vision (CV) can help farmers. Amazon Rekognition Custom Labels is a fully managed computer vision service that allows developers to build custom models to classify and identify objects in images that are specific and unique to your business.

Rekognition Custom Labels doesn’t require you to have any prior computer vision expertise. You can get started by simply uploading tens of images instead of thousands. If the images are already labeled, you can begin training a model in just a few clicks. If not, you can label them directly within the Rekognition Custom Labels console, or use Amazon SageMaker Ground Truth to label them. Rekognition Custom Labels uses transfer learning to automatically inspect the training data, select the right model framework and algorithm, optimize the hyperparameters, and train the model. When you’re satisfied with the model accuracy, you can start hosting the trained model with just one click.

In this post, we showcase how you can build an end-to-end solution using Rekognition Custom Labels to detect and count fruit to measure agriculture yield.

Solution overview

We create a custom model to detect fruit using the following steps:

  1. Label a dataset with images containing fruit using Amazon SageMaker Ground Truth.
  2. Create a project in Rekognition Custom Labels.
  3. Import your labeled dataset.
  4. Train the model.
  5. Test the new custom model using the automatically generated API endpoint.

Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end model development and inference process.

Prerequisites

To create an agriculture yield measuring model, you first need to prepare a dataset to train the model with. For this post, our dataset is composed of images of fruit. The following images show some examples.

We sourced our images from our own garden. You can download the image files from the GitHub repo.

For this post, we only use a handful of images to showcase the fruit yield use case. You can experiment further with more images.

To prepare your dataset, complete the following steps:

  1. Create an Amazon Simple Storage Service (Amazon S3) bucket.
  2. Create two folders inside this bucket, called raw_data and test_data, to store images for labeling and model testing.
  3. Choose Upload to upload the images to their respective folders from the GitHub repo.

The uploaded images aren’t labeled. You label the images in the following step.
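
The bucket preparation steps above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: it assumes a boto3 S3 client is passed in, and the function names, the local folder layout, and the list of image extensions are illustrative choices.

```python
from pathlib import Path

# Assumption: which file extensions count as images is our own choice.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def plan_uploads(local_dir, prefix):
    """Return (local_path, s3_key) pairs for every image directly under local_dir."""
    pairs = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            pairs.append((str(path), f"{prefix}/{path.name}"))
    return pairs


def upload_images(s3_client, bucket, local_dir, prefix):
    """Upload the planned files with a boto3 S3 client and return the keys used."""
    keys = []
    for local_path, key in plan_uploads(local_dir, prefix):
        s3_client.upload_file(local_path, bucket, key)
        keys.append(key)
    return keys


# Example usage (needs boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.create_bucket(Bucket="your-bucket-name")  # outside us-east-1, also pass CreateBucketConfiguration
#   upload_images(s3, "your-bucket-name", "images/raw_data", "raw_data")
#   upload_images(s3, "your-bucket-name", "images/test_data", "test_data")
```

Keeping the key planning separate from the upload call makes it easy to review which objects will be created before anything is sent to Amazon S3.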

Label your dataset using Ground Truth

To train the ML model, you need labeled images. Ground Truth provides an easy process to label the images. The labeling task is performed by a human workforce; in this post, you create a private workforce. You can use Amazon Mechanical Turk for labeling at scale.

Create a labeling workforce

Let’s first create our labeling workforce. Complete the following steps:

  1. On the SageMaker console, under Ground Truth in the navigation pane, choose Labeling workforces.
  2. On the Private tab, choose Create private team.
  3. For Team name, enter a name for your workforce (for this post, labeling-team).
  4. Choose Create private team.
  5. Choose Invite new workers.
  6. In the Add workers by email address section, enter the email addresses of your workers. For this post, enter your own email address.
  7. Choose Invite new workers.

You have created a labeling workforce, which you use in the next step while creating a labeling job.

Create a Ground Truth labeling job

To create your labeling job, complete the following steps:

  1. On the SageMaker console, under Ground Truth, choose Labeling jobs.
  2. Choose Create labeling job.
  3. For Job name, enter fruits-detection.
  4. Select I want to specify a label attribute name different from the labeling job name.
  5. For Label attribute name, enter Labels.
  6. For Input data setup, select Automated data setup.
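  7. For S3 location for input datasets, enter the S3 location of the images, using the bucket you created earlier (s3://{your-bucket-name}/raw-data/images/). Adjust the path to match the folder where you actually uploaded the images, for example the raw_data folder created earlier.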
  8. For S3 location for output datasets, select Specify a new location and enter the output location for annotated data (s3://{your-bucket-name}/annotated-data/).
  9. For Data type, choose Image.
  10. Choose Complete data setup.
    This creates the image manifest file and updates the S3 input location path. Wait for the message “Input data connection successful.”
  11. Expand Additional configuration.
  12. Confirm that Full dataset is selected.
    This is used to specify whether you want to provide all the images to the labeling job or a subset of images based on filters or random sampling.
  13. For Task category, choose Image because this is a task for image annotation.
  14. Because this is an object detection use case, for Task selection, select Bounding box.
  15. Leave the other options as default and choose Next.
  16. Choose Next.
    Now you specify your workers and configure the labeling tool.
  17. For Worker types, select Private. For this post, you use an internal workforce to annotate the images. You also have the option to select a public contractual workforce (Amazon Mechanical Turk) or a partner workforce (Vendor managed), depending on your use case.
  18. For Private teams, choose the team you created earlier.
  19. Leave the other options as default and scroll down to Bounding box labeling tool. It’s essential to provide clear instructions here in the labeling tool for the private labeling team. These instructions act as a guide for annotators while labeling. Good instructions are concise, so we recommend limiting the verbal or textual instructions to two sentences and focusing on visual instructions. In the case of image classification, we recommend providing one labeled image in each of the classes as part of the instructions.
  20. Add two labels: fruit and no_fruit.
  21. Enter detailed instructions for the workers in the Description field. For example: You need to label fruits in the provided image. Please ensure that you select label 'fruit' and draw the box around the fruit just to fit the fruit for better quality of label data. You also need to label other areas which look similar to fruit but are not fruit with label 'no_fruit'. You can also optionally provide examples of good and bad labeling images. You need to make sure that these images are publicly accessible.
  22. Choose Create to create the labeling job.

After the job is successfully created, the next step is to label the input images.
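
For reference, the image manifest that Automated data setup generates in the job creation steps above is a JSON Lines file in which each line points at one image through a source-ref field. The following sketch builds the same kind of file by hand; the function name and the image keys are placeholders, not values from the post.

```python
import json


def build_input_manifest(bucket, keys):
    """Return JSON Lines text with one {"source-ref": ...} object per image key."""
    lines = [json.dumps({"source-ref": f"s3://{bucket}/{key}"}) for key in keys]
    return "".join(line + "\n" for line in lines)


# Placeholder bucket and keys, for illustration only.
manifest = build_input_manifest(
    "your-bucket-name",
    ["raw_data/1.jpeg", "raw_data/2.jpeg"],
)
print(manifest, end="")
```

You would only need something like this if you choose manual data setup instead of letting the console create the manifest for you.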

Start the labeling job

Once you have successfully created the job, the status of the job is InProgress. This means that the job is created and the private workforce is notified via email regarding the task assigned to them. Because you have assigned the task to yourself, you should receive an email with instructions to log in to the Ground Truth Labeling project.

  1. Open the email and choose the link provided.
  2. Enter the user name and password provided in the email.
    You may have to change the temporary password provided in the email to a new password after login.
  3. After you log in, select your job and choose Start working.
    You can use the provided tools to zoom in, zoom out, move, and draw bounding boxes in the images.
  4. Choose your label (fruit or no_fruit) and then draw a bounding box in the image to annotate it.
  5. When you’re finished, choose Submit.

Now you have correctly labeled images that will be used by the ML model for training.
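
Before training, it can be useful to check how many boxes were drawn for each label. The sketch below counts bounding boxes per class in a Ground Truth object detection output manifest. It assumes the label attribute name Labels chosen earlier; the function name and the sample record are made up for illustration.

```python
import json
from collections import Counter


def count_labels(manifest_text, attribute="Labels"):
    """Count bounding boxes per class name across all lines of an output manifest."""
    counts = Counter()
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # The metadata entry maps numeric class IDs (as strings) to label names.
        class_map = record[f"{attribute}-metadata"]["class-map"]
        for box in record[attribute]["annotations"]:
            counts[class_map[str(box["class_id"])]] += 1
    return counts


# Made-up sample record in the shape of a bounding box output manifest line.
sample = json.dumps({
    "source-ref": "s3://your-bucket-name/raw_data/1.jpeg",
    "Labels": {
        "image_size": [{"width": 640, "height": 480, "depth": 3}],
        "annotations": [
            {"class_id": 0, "left": 20, "top": 10, "width": 60, "height": 50},
            {"class_id": 0, "left": 200, "top": 90, "width": 55, "height": 52},
            {"class_id": 1, "left": 400, "top": 300, "width": 40, "height": 40},
        ],
    },
    "Labels-metadata": {"class-map": {"0": "fruit", "1": "no_fruit"}},
})

print(dict(count_labels(sample)))
```

A heavily skewed count, or a label that never appears, is usually a sign that the labeling instructions need another look before you train.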

Create your Amazon Rekognition project

To create your agriculture yield measuring project, complete the following steps:

  1. On the Amazon Rekognition console, choose Custom Labels.
  2. Choose Get Started.
  3. For Project name, enter fruits_yield.
  4. Choose Create project.

You can also create a project on the Projects page. You can access the Projects page via the navigation pane. The next step is to provide images as input.
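
If you prefer the API to the console, project creation can be sketched as follows. This is an illustration under assumptions rather than a step from the post: the function name is ours, and a boto3 Rekognition client is expected to be passed in.

```python
def get_or_create_project(client, name="fruits_yield"):
    """Return the ARN of the named Custom Labels project, creating it if needed."""
    existing = client.describe_projects(ProjectNames=[name])["ProjectDescriptions"]
    if existing:
        return existing[0]["ProjectArn"]
    return client.create_project(ProjectName=name)["ProjectArn"]


# Example usage (needs boto3 and AWS credentials):
#   import boto3
#   project_arn = get_or_create_project(boto3.client("rekognition"))
```

Looking the project up first keeps the helper safe to rerun, because creating a project whose name is already in use fails.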

Import your dataset

To create your agriculture yield measuring model, you first need to import a dataset to train the model with. For this post, our dataset is already labeled using Ground Truth.

  1. For Import images, select Import images labeled by SageMaker Ground Truth.
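  2. For Manifest file location, enter the S3 location of the output manifest file that the labeling job produced (s3://{your-bucket-name}/fruits_image/annotated_data/fruits-labels/manifests/output/output.manifest). Ground Truth writes this file under the output location and job name you chose, so adjust the prefix and job folder to match your labeling job.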
  3. Choose Create Dataset.

You can see your labeled dataset.

Now you have an input dataset that the ML model can train on.

Train your model

After you label your images, you’re ready to train your model.

  1. Choose Train model.
  2. For Choose project, choose your project fruits_yield.
  3. Choose Train Model.

Wait for the training to complete. Now you can start testing the performance for this trained model.
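
Training can also be started through the API. The sketch below is one possible approach, not the console flow described above: it points create_project_version directly at a Ground Truth manifest and asks the service to split off a test set automatically. It assumes a project that has no datasets attached, which the API documentation requires for this style of call; the function names, version name, and output prefix are placeholders.

```python
def build_training_request(project_arn, version_name, bucket, manifest_key, output_prefix):
    """Assemble the keyword arguments for create_project_version."""
    manifest = {"S3Object": {"Bucket": bucket, "Name": manifest_key}}
    return {
        "ProjectArn": project_arn,
        "VersionName": version_name,
        "OutputConfig": {"S3Bucket": bucket, "S3KeyPrefix": output_prefix},
        "TrainingData": {"Assets": [{"GroundTruthManifest": manifest}]},
        # Let the service carve a test set out of the training data.
        "TestingData": {"AutoCreate": True},
    }


def train_model(client, request):
    """Start training, block until it finishes, and return the model ARN."""
    model_arn = client.create_project_version(**request)["ProjectVersionArn"]
    waiter = client.get_waiter("project_version_training_completed")
    waiter.wait(ProjectArn=request["ProjectArn"], VersionNames=[request["VersionName"]])
    return model_arn


# Example usage (needs boto3 and AWS credentials; values are placeholders):
#   import boto3
#   request = build_training_request(
#       project_arn, "v1", "your-bucket-name",
#       "path/to/output.manifest", "training-output")
#   model_arn = train_model(boto3.client("rekognition"), request)
```

Training typically runs for a long time, so in practice you may prefer to start the job and check its status later instead of blocking on the waiter.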

Test your model

Your agriculture yield measuring model is now trained. Before you can test it, you need to start it and wait for it to reach the Running state. To test the model, complete the following steps:

Step 1: Start the model

On your model details page, on the Use model tab, choose Start.
Rekognition Custom Labels also provides the API calls for starting, using, and stopping your model.
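
As a rough illustration of those API calls, the following sketch starts a model, waits until it is running, and stops it again. The helper names are ours, a boto3 Rekognition client is assumed to be passed in, and one inference unit is an assumed default rather than a recommendation from the post.

```python
def start_model(client, project_arn, model_arn, version_name, min_inference_units=1):
    """Start the model and block until it reaches the RUNNING state."""
    client.start_project_version(
        ProjectVersionArn=model_arn,
        MinInferenceUnits=min_inference_units,
    )
    waiter = client.get_waiter("project_version_running")
    waiter.wait(ProjectArn=project_arn, VersionNames=[version_name])


def stop_model(client, model_arn):
    """Ask the service to stop the model and return the reported status."""
    return client.stop_project_version(ProjectVersionArn=model_arn)["Status"]


# Example usage (needs boto3 and AWS credentials; ARNs are placeholders):
#   import boto3
#   rekognition = boto3.client("rekognition")
#   start_model(rekognition, project_arn, model_arn, "v1")
#   ...run inference...
#   stop_model(rekognition, model_arn)
```

A running model is billed for the time it stays up, so pairing every start with a stop is a sensible habit for experiments.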

Step 2: Test the model

When the model is in the Running state, you can use the sample testing script analyzeImage.py to count the amount of fruit in an image.

  1. Download this script from the GitHub repo.
  2. Edit this file to replace the parameter bucket with your bucket name and model with your Amazon Rekognition model ARN.

We use the parameters photo and min_confidence as input for this Python script.

You can run this script locally on a machine with AWS credentials configured, for example through the AWS Command Line Interface (AWS CLI), or in AWS CloudShell. In our example, we ran the script via the CloudShell console. Note that CloudShell is free to use.

Make sure to install the required dependencies, if they aren’t already installed, using the command pip3 install boto3 Pillow.

  1. Upload the file analyzeImage.py to CloudShell using the Actions menu.

The following screenshot shows the output, which detected two fruits in the input image. We supplied 15.jpeg as the photo argument and 85 as the min_confidence value.

The following example shows image 15.jpeg with two bounding boxes.

You can run the same script with other images and experiment by changing the confidence score further.
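
To show the idea behind the counting step, here is a minimal sketch of how fruit could be counted with the DetectCustomLabels API. It is not the analyzeImage.py script from the repo: the function names are ours, a boto3 Rekognition client is assumed to be passed in, and the default of 85 simply mirrors the min_confidence value used above.

```python
def count_fruit(client, model_arn, bucket, photo, min_confidence=85, label="fruit"):
    """Count detections of the given label in an image stored in Amazon S3."""
    response = client.detect_custom_labels(
        ProjectVersionArn=model_arn,
        Image={"S3Object": {"Bucket": bucket, "Name": photo}},
        MinConfidence=min_confidence,
    )
    return sum(1 for item in response["CustomLabels"] if item["Name"] == label)


def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a ratio-based BoundingBox into pixel (left, top, right, bottom)."""
    left = bounding_box["Left"] * image_width
    top = bounding_box["Top"] * image_height
    width = bounding_box["Width"] * image_width
    height = bounding_box["Height"] * image_height
    return (round(left), round(top), round(left + width), round(top + height))


# Example usage (needs boto3, AWS credentials, and a running model):
#   import boto3
#   n = count_fruit(boto3.client("rekognition"), model_arn,
#                   "your-bucket-name", "test_data/15.jpeg")
```

The service reports box coordinates as fractions of the image size, which is why a conversion like to_pixel_box is needed before drawing boxes with an imaging library such as Pillow.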

Step 3: Stop the model

When you’re done, remember to stop the model to avoid incurring unnecessary charges. On your model details page, on the Use model tab, choose Stop.

Clean up

To avoid incurring unnecessary charges, delete the resources used in this walkthrough when not in use. We need to delete the Amazon Rekognition project and the S3 bucket.

Delete the Amazon Rekognition project

To delete the Amazon Rekognition project, complete the following steps:

  1. On the Amazon Rekognition console, choose Use Custom Labels.
  2. Choose Get started.
  3. In the navigation pane, choose Projects.
  4. On the Projects page, select the project that you want to delete.
    1. Choose Delete.
      The Delete project dialog box appears.
  5. If the project has no associated models:
    1. Enter delete to delete the project.
    2. Choose Delete to delete the project.
  6. If the project has associated models or datasets:
    1. Enter delete to confirm that you want to delete the model and datasets.
    2. Choose either Delete associated models, Delete associated datasets, or Delete associated datasets and models, depending on whether the project has datasets, models, or both.

    Model deletion might take a while to complete. Note that the Amazon Rekognition console can’t delete models that are in training or running. Try again after stopping any running models that are listed, and wait until the models listed as training are complete. If you close the dialog box during model deletion, the models are still deleted. Later, you can delete the project by repeating this procedure.

  7. Enter delete to confirm that you want to delete the project.
  8. Choose Delete to delete the project.
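
The same cleanup can be approached through the API. The sketch below only covers the model deletion part and mirrors the console restriction described above by skipping models that are busy. The helper name and the exact set of statuses to skip are our assumptions, and only the first page of results is examined.

```python
# Assumption: models in these states are left alone, as in the console.
SKIP_STATUSES = {"TRAINING_IN_PROGRESS", "STARTING", "RUNNING", "STOPPING", "DELETING"}


def delete_idle_models(client, project_arn):
    """Request deletion of idle models; return (deleted_arns, skipped_arns)."""
    deleted, skipped = [], []
    response = client.describe_project_versions(ProjectArn=project_arn)
    for version in response["ProjectVersionDescriptions"]:
        arn = version["ProjectVersionArn"]
        if version["Status"] in SKIP_STATUSES:
            skipped.append(arn)
        else:
            client.delete_project_version(ProjectVersionArn=arn)
            deleted.append(arn)
    return deleted, skipped


# Example usage (needs boto3 and AWS credentials):
#   import boto3
#   rekognition = boto3.client("rekognition")
#   deleted, skipped = delete_idle_models(rekognition, project_arn)
#   # Once no models or datasets remain:
#   # rekognition.delete_project(ProjectArn=project_arn)
```

Model deletion is asynchronous, so the final delete_project call may need to wait until the earlier deletions have finished.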

Delete your S3 bucket

You first need to empty the bucket and then delete it.

  1. On the Amazon S3 console, choose Buckets.
  2. Select the bucket that you want to empty, then choose Empty.
  3. Confirm that you want to empty the bucket by entering the bucket name into the text field, then choose Empty.
  4. Choose Delete.
  5. Confirm that you want to delete the bucket by entering the bucket name into the text field, then choose Delete bucket.
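
The empty-then-delete order above can be expressed in a few lines with the boto3 resource interface. This is a sketch under assumptions: the function name is ours, it expects a boto3 Bucket resource, and it assumes versioning was never enabled on the bucket.

```python
def empty_and_delete(bucket):
    """Delete every object in the bucket, then delete the bucket itself."""
    # A bucket must be empty before it can be deleted.
    bucket.objects.all().delete()
    # Assumption: no versioning. A versioned bucket would also need
    # bucket.object_versions.delete() before the next call succeeds.
    bucket.delete()


# Example usage (needs boto3 and AWS credentials; the name is a placeholder):
#   import boto3
#   empty_and_delete(boto3.resource("s3").Bucket("your-bucket-name"))
```

Both calls are irreversible, so double-check the bucket name before running anything like this.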

Conclusion

In this post, we showed you how to create an object detection model with Rekognition Custom Labels. This feature makes it easy to train a custom model that can detect an object class without needing to specify other objects or losing accuracy in its results.

For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?


About the authors

Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.

Sameer Goel is a Sr. Solutions Architect in the Netherlands, who drives customer success by building prototypes on cutting-edge initiatives. Prior to joining AWS, Sameer graduated with a master’s degree from Boston, with a concentration in data science. He enjoys building and experimenting with AI/ML projects on Raspberry Pi. You can find him on LinkedIn.

Read more about this on: AWS