Machine Learning For African Lovegrass Weed Detection In Drone Images – Shared Services Weeds Portal
Hosted on the AWS cloud
Overview and Background
This initiative draws from the original vision of Neil Murdoch, a Biosecurity Officer at Snowy Monaro Council, to improve the processes for acquisition of drone images covering the near region so that farmers and other local landscape stakeholders could benefit from the best of modern technologies to improve land management practices.
Towards advancing his vision, Neil sought technical assistance from 2pi Software to support the creation of the following capabilities in the Amazon Web Services (hereafter AWS) Cloud Platform environment :-
- A Cloud Upload facility to allow drone images captured by council workers and local community members to be fed directly to an AWS S3 cloud storage via an easy to use web interface
- Usage of AWS SageMaker – a service that enables developers to create, train, and deploy machine-learning models in the cloud
- Cycling of images through a manual labelling step, in which a square area containing a target invasive species (weed) is marked with reasonable visual precision, with the aim of training a neural network within the AWS SageMaker environment
- Evaluation of suitable low-cost drone packages that could be used by members of the local community to encourage drone image footage from the near region
After some initial work with Orange Hawkweed in 2019, it was agreed between all the major stakeholders to switch the focus of investigations to African Lovegrass in line with greater public discourse about the increasing threat that it appears to pose to agricultural productivity, particularly across the Monaro region of NSW.
Based on this work, a subsequent project to enable this weed image capture and platform to be extended and scaled up for use by multiple councils was discussed and supported by NSW Local Land Services, courtesy of the involvement of Invasive Species Coordinator Megan Wyllie.
The resulting digital platform, the Shared Services Weeds Portal (SSWP), represents the next evolution of this programme, and the solution has been expanded and promoted on an ongoing basis since 2020.
Using the cloud to break barriers to Machine Learning for biosecurity and general applications
Machine Learning (ML) is transforming the world, but the range of tools and skills needed to build the complex processes that drive outcomes in this field is not always easily acquired. To address this challenge for Regional Council Biosecurity teams, a multidisciplinary group of stakeholders has created a cloud-based platform that links large-volume data upload and storage to an end-to-end, multi-stage machine learning workflow that can deliver high-accuracy determinations. The platform is known as the Shared Services Weeds Portal (SSWP).
The collaborating stakeholder group is made up of representatives from Snowy Monaro Regional Council, Local Land Services (NSW South East), University of Sydney, and 2pi Software. To date the project has focussed efforts on African Lovegrass identification and geo-location tagging in drone-acquired landscape footage collected across the Monaro region of NSW.
It is widely acknowledged that African Lovegrass proliferation has been an enormous challenge for the Monaro region of NSW over many decades and the existing mechanisms to minimise and control the extent of spread have had only limited success.
Neil Murdoch, Biosecurity Officer at Snowy Monaro Regional Council, commented: “Artificial Intelligence is transformative technology that is impacting virtually every industry, and will do so for the foreseeable future. The construction of the SSWP will enable us to move quickly as new machine learning techniques evolve, and “future proof” our service in the biosecurity sphere.”
AI solves real world problems – weed detection/remediation
Machine learning is an application of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. The technology is particularly useful when coupled with an aerial platform such as a UAV (drone) to gather the input data, and will be used to better:
- Detect the presence of an invasive species
- Quantify the extent of an infestation
- Plan treatment
African Lovegrass detection – a difficult proposition
The application of machine learning in a cropping environment has been proven, particularly for the detection of broad-leaved weeds. However, identifying “grassy” weeds from a drone flying in a rangeland setting is inherently more difficult. Progress in this space is still at the developmental stage, with some encouraging early success. Simply put, we “train” a machine by inputting images (lots of them) that contain the target species. We identify the target for the machine until it learns to do so for itself. Each image is geotagged, enabling the precise location of a weed, or weed cluster, to be identified for mapping or treatment.
AI has an insatiable appetite for data – the ‘Shared Services Weeds Portal’ (SSWP) offers a clear solution
A crucial but resource hungry element of the machine learning process is the provision of sufficient images for “training” data. Many thousands of images are required to be collected and stored as input data for each target species, and the lack of quality data could present a barrier to success. The SSWP significantly advances development of a solution to overcome this hurdle.
Key challenge #1 – allow weeds experts and stakeholders to collaborate by offering data upload and long-term storage in a central accessible location
The SSWP provides a central image/data repository as a core function and this capability can be accessed via a user friendly interface enabling uploading and storage of large files needed for training. Intrinsic to the portal’s capacity to take us forward into the future of machine learning is that it will allow a range of stakeholders to contribute to the image repository thereby encompassing a wide range of problem resolution needs in many geographies across Australia.
Currently the portal is available to an initial test group comprising Local Councils, Land Services group, and a landholder test group, but this will be expanded in the near future.
Ultimately, being cloud-based, the image library is highly scalable and need not be restricted to weeds. Any object or species can be included in the archive by simply opening a new folder.
The great added benefit of this approach is that data collected as part of this process remains available as a resource to be used in other weed identification and remediation activities – where previously science groups were not enabled to pool data for multiple purposes, adopting a cloud-based approach opens up a whole myriad of new possibilities. And equally, as technology improves year on year, the ability to reprocess historical data with new algorithmic enhancements is another exciting benefit facilitated by the SSWP solution.
Key challenge #2 – poor connectivity in remote/rural areas
The SSWP also has capacity to accept files uploaded by landholders, and in a future phase of development we envisage a system whereby a landholder will upload aerial images (possibly taken with their own drone), and receive back image & GPS references of infestations.
A unique feature of the SSWP is its inbuilt tolerance to “line dropouts” of the internet connection, which are frequently experienced in rural areas and would otherwise hamper a landholder’s ability to upload data.
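One common way to tolerate dropouts is to send a file in chunks and remember which chunks have already arrived, so that an interrupted upload resumes rather than restarts. The sketch below illustrates that general idea only; it is not the SSWP implementation. The names `split_into_chunks`, `upload_with_resume` and `FlakySender` are hypothetical, and `FlakySender` merely simulates an unreliable connection.

```python
import time
from typing import Callable, Dict, List, Set


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Split a payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_with_resume(
    chunks: List[bytes],
    send_chunk: Callable[[int, bytes], None],
    completed: Set[int],
    max_retries: int = 5,
    backoff_seconds: float = 0.0,
) -> Set[int]:
    """Send every chunk not yet in `completed`, retrying on connection errors.

    `completed` is updated in place, so a caller can persist it and pass it
    back in after a dropout to skip chunks that already arrived.
    """
    for index, chunk in enumerate(chunks):
        if index in completed:
            continue  # already uploaded in an earlier attempt
        for attempt in range(max_retries):
            try:
                send_chunk(index, chunk)
                completed.add(index)
                break
            except ConnectionError:
                # Wait a little longer after each failed attempt.
                time.sleep(backoff_seconds * (2 ** attempt))
        else:
            # Retries exhausted: stop here, keeping the progress made so far.
            return completed
    return completed


class FlakySender:
    """Hypothetical stand-in for a network call: every Nth call fails."""

    def __init__(self, fail_every: int) -> None:
        self.fail_every = fail_every
        self.calls = 0
        self.received: Dict[int, bytes] = {}

    def __call__(self, index: int, chunk: bytes) -> None:
        self.calls += 1
        if self.calls % self.fail_every == 0:
            raise ConnectionError("simulated line dropout")
        self.received[index] = chunk
```

In a real deployment the `send_chunk` callable would wrap whatever upload API is in use, and the `completed` set would be stored on the uploading device between sessions.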
Key challenge #3 – criticality of meta-data and other data-sets
Where raw data such as aerial drone footage is uploaded to the SSWP, the solution mandates the entry of accompanying information in a series of targeted form fields, enabling granular categorisation of the feed data for use in later stages of the end-to-end AI workflow. These criteria are highly valuable to a data scientist charged with identifying the most suitable further processing steps needed to develop an effective trained neural network, the core mechanism underpinning AI.
All uploaded information is date and time-stamped and attributed to a registered user of the platform for future traceability. Additionally SSWP is equipped with features to extract meta-data automatically embedded within the uploaded files by virtue of their compliance to a standardised file format – for example Exif data represents an additional cache of meta-data that is useful to the data scientist to aid in effective neural net model training.
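As an illustration of why embedded Exif data is useful, the Exif standard stores GPS latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference letter. The sketch below converts such values to signed decimal degrees, the form usually needed for mapping. The function name `dms_to_decimal` and the sample coordinates are illustrative assumptions, and reading the tags out of an image file (which needs an Exif library) is deliberately left out.

```python
from fractions import Fraction
from typing import Sequence, Tuple

Rational = Tuple[int, int]  # (numerator, denominator), as stored in Exif


def dms_to_decimal(dms: Sequence[Rational], ref: str) -> float:
    """Convert Exif degrees/minutes/seconds rationals to signed decimal degrees.

    `ref` is the Exif hemisphere letter: "N" or "S" for latitude,
    "E" or "W" for longitude. South and West become negative values.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)


# Illustrative values only, not taken from any SSWP image.
latitude = dms_to_decimal(((36, 1), (14, 1), (0, 1)), "S")
longitude = dms_to_decimal(((149, 1), (7, 1), (30, 1)), "E")
```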
A ‘seasonal gate’ approach to neural network training for the ‘Monaro Wet Spring’ test case
Given the considerable costs and resources involved in data acquisition, upload, labelling and training, the SSWP team decided at the outset to focus on a narrowly-defined ‘seasonal gate’ within which to focus the AI targeted efforts during the initial test phases of the programme.
This additionally made sense as a first proof of concept approach, as it was deemed overly ambitious to attempt to build an all-of-year functional AI tool given the wide diversity of weather conditions and landscape colouration that is commonly experienced within the target geography (Monaro region of NSW) over a typical 12-month period. In the cycle of work undertaken in 2020, the drone footage was captured during a four-week period from early to late October.
In taking this ‘seasonal gate’ approach, the Neural Net capability only needs to be highly effective within that narrowly defined set of feed imagery – its general purpose capabilities across the rest of the year are effectively ignored (year-round capability will be developed in future phases of the overall SSWP programme). A healthy amount of variation (e.g. drone parameters, paddock variations, land surface variations, prevailing weather conditions) within this narrowed range minimises the risk of what is commonly known in AI as ‘overfit’ (a model trained so closely on particular signals in its training data that it performs poorly on new data).
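A seasonal gate can be thought of as a simple date filter applied to the image library before training. The sketch below shows one way this might look; the record layout (a dictionary with `name` and `captured` keys), the function names and the gate dates are all assumptions for illustration, since the text above states only that the 2020 footage was captured from early to late October.

```python
from datetime import date
from typing import Dict, Iterable, List


def in_seasonal_gate(captured: date, gate_start: date, gate_end: date) -> bool:
    """Return True if the capture date falls inside the gate (inclusive)."""
    if gate_start > gate_end:
        raise ValueError("gate_start must not be after gate_end")
    return gate_start <= captured <= gate_end


def select_gate_images(
    images: Iterable[Dict], gate_start: date, gate_end: date
) -> List[Dict]:
    """Keep only the image records captured inside the seasonal gate."""
    return [
        image
        for image in images
        if in_seasonal_gate(image["captured"], gate_start, gate_end)
    ]


# Hypothetical records and gate boundaries, for illustration only.
library = [
    {"name": "paddock_a.jpg", "captured": date(2020, 9, 28)},
    {"name": "paddock_b.jpg", "captured": date(2020, 10, 12)},
    {"name": "paddock_c.jpg", "captured": date(2020, 11, 2)},
]
training_set = select_gate_images(library, date(2020, 10, 1), date(2020, 10, 31))
```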
SSWP – 5 Step AI workflow in the cloud – ‘Upload-View-Label-Train-Infer’ – includes screenshots
Upload (Permissions-based with meta-data)
Security governing Users and access restrictions is enabled in the SSWP :
Via an easy-to-use web browser interface, upload image files ‘in bulk’ to the cloud :
Add clear distinguishing meta-data for later search or grouping
View (Browse and explore images and meta data including Exif data)
Enable visual inspection of images within the web browser
View meta-data including Exif Data
Label (for training and/or data validation)
Work task Assignments (Labelling jobs)
Labelling jobs can be assigned to nominated operators :-
Labelling Images – Bounding Box or Classification (the currently enabled option)
The nominated operators can expertly confirm or denote location of weeds using the supported approaches – these include
- A Bounding Box canvas-drawing tool mechanism to demarcate accurately locations of weed ‘clusters’ in the supplied images
- A Classification approach to confirm presence or not (i.e. yes or no) in image sets (generally of a smaller dimension than the original source imagery, as generated via a ‘splice’ pre-processing step). Currently this method is enabled for the SSWP.
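The ‘splice’ step mentioned above can be pictured as cutting each large drone image into a grid of smaller tiles, each of which is then labelled yes or no. The sketch below computes the crop boxes for such a grid. It is a minimal illustration under assumed names (`tile_boxes`, `Box`) and does not reflect the SSWP's actual tile size or tooling; the boxes it yields could be passed to any image library's crop function.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int) -> Iterator[Box]:
    """Yield crop boxes that cover the image in row-major order.

    Tiles on the right and bottom edges are smaller when the image size
    is not an exact multiple of the tile size.
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, min(left + tile, width), min(top + tile, height))


# A 5 x 4 pixel image cut into 2 x 2 tiles gives six boxes.
boxes = list(tile_boxes(5, 4, 2))
```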
Train – build a neural network specialising in ALG detection
Set up and configure the Training model to develop the capability to identify the presence of African Lovegrass in Drone images. Developing and refining models over many cycles and iterations to optimise the detection capability is the great challenge of Machine Learning/Artificial Intelligence. The AWS SageMaker config screens assist greatly in this process :-
Infer – submit candidate data to evaluate the model’s effectiveness
Pass new images ‘through’ the trained model for assessment and generation of a resulting confidence metric regarding the presence or absence of a target weed :-
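To make the idea of a confidence metric concrete, the sketch below turns per-image confidence scores into yes/no decisions using a threshold, then measures how often those decisions agree with human labels. The function names, the 0.5 threshold and the sample scores are assumptions for illustration; they are not outputs of the SSWP model.

```python
from typing import List, Sequence


def classify(confidences: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark an image as 'weed present' when its confidence meets the threshold."""
    return [score >= threshold for score in confidences]


def accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of images where the prediction matches the human label."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical scores for four image tiles and their human-assigned labels.
scores = [0.9, 0.2, 0.6, 0.4]
labels = [True, False, False, False]
decisions = classify(scores, threshold=0.5)
score = accuracy(decisions, labels)
```

Raising the threshold trades missed detections for fewer false alarms, so the value chosen would depend on how the results are to be used in the field.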
SSWP – Pre and Post-processing steps (includes Colour-Picking)
To support the critical work of the overseeing data scientist a number of pre and post data processing steps can be enabled :-
- Colour-picking – focussing on particular colour characteristics within the data as a means of highlighting areas of particular interest or unambiguously identifiable plantlife. Available in SSWP since October 2021.
- Splicing – commonly splitting large images into smaller ones to reduce the area-size of detection
- Segmentation – demarcating areas to exclude from consideration (e.g. miscellaneous objects within the landscape) – typically identified by colour or pixel concentration
- Orthorectification – removing the effects of image perspective (tilt) and relief (terrain) to create a planimetrically correct image with a constant scale, in which features are represented in their ‘true’ geo-locations
- Grayscaling – removing colour from images to accentuate image characteristics that may aid the neural network model training process
NOTE : Not all steps are currently enabled for the SSWP. Further development of the SSWP solution will include addition of a greater range of discrete data processing steps/modules.
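As a rough illustration of colour-picking, the sketch below marks the pixels whose hue falls inside a chosen band, ignoring washed-out pixels with low saturation. It uses Python's standard `colorsys` module. The function name `hue_mask`, the hue band and the saturation cut-off are assumptions, not the settings used in the SSWP.

```python
import colorsys
from typing import List, Tuple

RGB = Tuple[int, int, int]  # 8-bit red, green, blue values


def hue_mask(
    pixels: List[List[RGB]],
    hue_min: float,
    hue_max: float,
    min_saturation: float = 0.2,
) -> List[List[bool]]:
    """Return a True/False grid marking pixels inside the hue band.

    Hue limits are given in degrees (0 to 360). Pixels with saturation
    below `min_saturation` (greys and near-whites) are never selected.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for red, green, blue in row:
            hue, saturation, _value = colorsys.rgb_to_hsv(
                red / 255, green / 255, blue / 255
            )
            selected = hue_min <= hue * 360 <= hue_max and saturation >= min_saturation
            mask_row.append(selected)
        mask.append(mask_row)
    return mask


# A tiny 1 x 3 image: pure green, pure red, mid grey.
sample = [[(0, 255, 0), (255, 0, 0), (128, 128, 128)]]
green_only = hue_mask(sample, hue_min=80, hue_max=160)
```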
The Tech Stack
The SSWP solution in its current form takes full advantage of the AWS Machine Learning capability combined with custom developed software to operate in this compute intensive environment.
AWS Cloud Services including SageMaker
Reducing the ‘undifferentiated heavy lifting of AI/ML’ on the AWS cloud platform
The Amazon Web Services (AWS) Cloud Platform, featuring the SageMaker Machine Learning capability, is at the heart of the solution. The SageMaker pipeline facilitates neural network training and other highly advanced data management and analysis support functions. From a user perspective, the seamless and efficient ‘Upload-View-Label-Train-Infer’ workflow removes the ‘heavy lifting’ of more traditional, commonly on-premise, Machine Learning tool-chains that can divert focus from the primary goal of sophisticated neural net enablement.
Modern Software Engineering Techniques in the cloud to enable ‘Industrial Grade’ AI workflow
The SSWP takes full advantage of modern software development practices to offer an enterprise-grade, scaled-up platform that supports the complex data science processes involved in developing effective Machine Learning solutions to real-world problems. Key architectural and technological approaches adopted in developing the SSWP include :
- Adherence to DevOps/Site Reliability Engineering principles and practice
- Agile Methodology (including Scrum) governance of development cycles
- MicroServices – modular decoupled granular API services (AWS Lambda)
- Continuous Integration/Continuous Deployment (of both application and infrastructure) – this allows feature updates and fixes to be applied faster to production software systems
- Infrastructure as Code (using the powerful AWS CDK toolkit)
Use of Go(lang) for performance optimisation in AWS Lambda (serverless capability)
While the serverless capabilities represented by AWS Lambda can be implemented using a number of popular programming languages, the 2pi Software team opted to use the emerging Go programming language for compute-intensive processing steps in the workflow. Based on the team's experience to date, Go appears to perform very well in the AWS Lambda environment.
Extensive use of a range of powerful AWS Services
The AWS environment greatly facilitates fast moving software development for secure scale environments with mature integration and inter-operability between cooperating sub-systems. The SSWP takes advantage of a number of AWS features and services including :-
- SageMaker and Ground Truth Machine Learning capability suite
- Storage and Caching – S3, EBS, CloudFront
- Monitoring and Events – CloudTrail, CloudWatch
- VPC (Virtual Private Cloud) design and management
- High Availability Databases via AWS Relational Database Service – RDS/Aurora
- Lambda functions (serverless computing) and API Gateway
- Complex Account and Sub-Account management using AWS Control Tower
Cloud Architecture Diagram referencing AWS Services used
Custom components built using free and Open Source software toolsets (includes Cmfive)
The globally popular LAMP stack (Linux, Apache, MySQL, PHP) provides the core software building blocks enabling the SSWP user role and permissions features, in addition to the bulk file upload capability and asynchronous invocation of AWS Lambda functions.
Also underpinning the solution is the free and open source Cmfive PHP framework. More information about Cmfive can be found on its website at http://cmfive.com. Features of this framework include :
- User and group management and role based permission management
- Task and workflow management
- Modular architecture for easy extensibility and separation of concerns
- RESTful API integration with other software
The critical role of the Data Scientist
Although the successful applications of Artificial Intelligence have in recent years made this class of technology one of the most exciting and impactful of the modern era, the process of building these solutions is arduous and complex. While the SSWP solves key steps in the enablement and data provision aspects of AI, without expert data science guidance and the attendant exhaustively iterative trial and error cycles, training of neural networks can, in some circumstances, prove insurmountably difficult. Key areas where the skill of the data scientist is most needed in AI projects of the nature attempted as part of the SSWP programme include :-
- Feed data quality assurance
- Blending of data preprocessing steps to amplify underlying signals within data
- Labelling strategies and resourcing of suitable personnel to complete the manual processes involved
- Strategies to take advantage of existing established trained neural networks (so-called ‘Transfer Learning’)
- Selection of an optimal AI core algorithm (there are many varying mathematical underpinning techniques available) and key configurational settings
- Evaluation and tuning of HyperParameter settings governing Neural Network Training cycles
- Integration of data signals from other sources – so called Fusion Learning
SSWP current status and near future roadmap
The best of modern cloud technology, using the globally popular AWS service, has been utilised to create a Minimum Viable Product (MVP) SSWP system. The work to date validates the core proposition that a nationally accessible, feature-rich, cloud-centralised AI data and processing solution can be created and supported.
Machine Learning (ML) requires large data-sets, which can take time to build up. To optimise the ability of this technological approach to support detection and eradication of ALG in the Monaro region of NSW (and landscapes sharing similar bio-characteristics), an initial narrowed focus on the Wet Spring season has been piloted. Furthermore, as a practical step to ultimately aid material eradication, the training of the underlying neural network focusses on ALG clusters, rather than individual specific plants. A target recognition accuracy of 80%, for two of the so-called ‘seasonal gates’ (referred to earlier), is sought by December 2023 and it is anticipated that this will progressively improve every year into the future.
Once the hoped-for success metrics are established, Natural Resource Management (NRM) groups Australia-wide will be contacted to provide suitable data and to be involved in subsequent annual detection and eradication efforts.
Progressively over a nominal 5 year period, with a willing producer group and other stakeholder communities who embrace this new approach, the resulting ALG cluster location data (ostensibly GPS coordinates and local maps/visualisation) can be made publicly available (compliant with any applicable fair use legislative provisions) for usage by parties acting to eradicate ALG cluster occurrences.
Additionally, it is widely accepted that Machine Learning approaches can be enhanced materially by integrating inference conclusions with signals/indications from other indirectly related data sources. It is planned that partner teams will in the future contribute to the further development of the SSWP solution by making satellite imagery data sets available as an additional detection pathway for African Lovegrass to supplement the early Drone-acquired image datasets.
The call for more data – enable many channels for aggregate data collection via the SSWP
Historically, large-scale AI and ML projects have been initiated in academic, government and/or larger corporate groups. Generally speaking, many such projects are undertaken with a view to achieving a target outcome for the sponsoring organisation – the teams involved provision computing infrastructure and algorithmic development to address their own immediate needs. As such, it could be stated that even in cases where groups may seek common goals, there does not appear to be a strong precedent for sharing of acquired data and AI/ML know-how.
To address this perceived deficit, it is proposed that SSWP can become a centralised storage repository for many groups seeking to explore AI/ML techniques and capability to detect, and ultimately remediate, weed infestations in primarily agricultural landscapes.
It is planned that a community-linked function of the SSWP will be to collaborate with other groups acquiring bio-security data, and imagery in particular, via popular sources such as Satellite feeds, Drone imagery (as in the case of the initial SSWP project), Ground images from handheld cameras and mobile devices, or cross-terrain vehicles equipped with imaging devices. Data from a wide range of sources and groups can potentially be readily stored in the SSWP cloud environment – it is, in essence, an infinitely expandable storage facility. To increase the utility of the SSWP solution, and attract participation from the widest possible audience, the platform will be promoted as a central repository for many stakeholder groups.
As ML/AI capability remains an emerging field, many of the trained algorithms and weed detection facilities are in the early stages of development – theoretically, it may advantage the process of developing these capabilities if data is opportunistically amassed in anticipation of future solutions that may be established. The SSWP can potentially act as a central location for aggregating, from multiple sources, a large pool of data available for as-yet-undeveloped future neural network models. As such, the SSWP could play a significant role in helping to develop a collaborative, community-wide solution to invasive species control.
Special Thanks to Lachlan Ingram and Jed Brown (University of Sydney)
Advisory assistance and data collection contributions were made to this project by Dr Lachlan Ingram and PhD student Jed Brown. Contact details are as follows :-
School of Life and Environmental Sciences (SOLES)
The University of Sydney
Dr Lachlan Ingram
Senior Lecturer in Sustainable Grassland Management
Sydney Institute of Agriculture
School of Life and Environmental Sciences
The University of Sydney
About 2pi Software
Rural/Remote Australian digital capability
2pi Software is a software engineering company based in the Bega Valley with ambitions to develop products and service markets Australia-wide.
2pi Software is staffed with people who passionately pursue excellence in the software engineering craft and consistently apply the intellectual rigour required to build long-lasting, supportable and well-documented systems.
The 2pi Software vision includes the following goals and initiatives :-
- Establishing software development services as a viable long-term business activity in the region
- Promoting greater participation in software development as a career option for young people through regular ‘coding’ events, and liaison with local schools
- Advocacy of Intellectual Property asset creation as a potential key driver of job creation in our region
- Active organisation and involvement in entrepreneurial events such as the Annual StartUp Camp
- Continuous communication with local policy-makers about the possibilities of ICT as an economic growth enabler for the region
- Frequent networking with other professionals in the region to increase awareness of the potential impact that technological innovation can have in their respective domains, so-called Sector Seeding
- Maintaining strong business links to like-minded groups in nearby centres Cooma and Canberra
Community engagement for upskilling local people
The 2pi Software office is located on Carp St, the main street of Bega town, and the company management and staff live and work amongst the almost 60 farming families that currently make up the producers in the local Dairy industry.
Since the company’s inception, the 2pi Software team have promoted a community goal of creating 1000 tech sector jobs in our region by 2030.
To build a tech sector, the company has been at the heart of IntoIT Sapphire Coast (www.intoitsapphirecoast.com.au).
In 2014, 2pi Software funded the creation of Bega’s first ever Digital Co-Working Space called ‘CoWS Near The Coast’ (visit cowsnearthecoast.com.au). The site was honoured in February 2015 to have been visited by the then NSW Governor (now Governor General) General David Hurley who was encouraging of the initiatives underway.
In May 2016 the 2pi Software team were the lead organisers behind Regional Innovation Week in the Bega Valley (begainnovation.com.au) a programme featuring 10 events celebrating a number of aspects of creativity, technology and entrepreneurship.
Coding was also a big part of Innovation Week, and 2pi Software has played a big role in running hackathons since November 2014, and continues to do so via the long-running Bega monthly Code-night.
The company actively takes part in National Science Week and has been instrumental in organising demonstrations of scientific applications of drones/quadcopters, robotics-building workshops, synthesiser-making, 3D printing, virtual reality and coding.
In 2018, members of the 2pi Software team ran the highly popular Bega’s AgTech Days programme (begaagtech.com.au) featuring a keynote address by Barry Irvin, Executive Chair of Bega Cheese, as well as invaluable showcasing of the relevance of tech to the local farming community.
In 2019 and 2020 2pi Software delivered a Federal Government sponsored Regional Employment Trials programme in the Bega Valley helping a number of Job Seekers learn Job Ready technology. A Department Newsroom website article celebrated the inaugural kick-off Hackathon activity here – https://www.employment.gov.au/newsroom/regional-employment-program-rethinking-local-recruitment
In 2020, the 2pi Software team delivered one of the first ever AgTech ‘Micro-Credential’ Courses to be run in Australia from Bega. Details here – https://agtechmicro-credential2020.eventbrite.com/