PeekingDuck: A Computer Vision Framework
AI Singapore’s Computer Vision (CV) Hub has worked on several industry projects over the last few years. We found that while CV problems vary widely, many of these projects share repeated code, such as code for reading from a video or image. In some projects, even pre-trained CV models for object detection and pose estimation can be reused. To save development time on future projects, we decided to build a framework that simplifies CV inference workloads. We have benefited greatly from this framework, and today we are releasing it as an open-source project so that you can reap the benefits as well.
We call this framework “PeekingDuck”. The name is a play on these words: “Peeking” in a nod to CV; and “Duck” from duck typing as we are using Python, a dynamically typed language. PeekingDuck is pip-installable and can be run from the command line, or imported into your Python code or Jupyter notebooks.
How PeekingDuck Works
PeekingDuck is a modular framework, with nodes as its building blocks. There are currently five categories of nodes: input, model, dabble, draw and output.
Different nodes can be bundled together to form a pipeline, where the output of one node becomes the input to another. For example, four nodes can form a simple object detection pipeline. The input.live node reads from a webcam and produces “img”, or image, and passes it to the object detection model.yolo node, which predicts “bboxes”, or bounding boxes.
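To make the idea of data flowing between nodes concrete, here is a minimal Python sketch. It is not PeekingDuck's actual API: the names `fake_input`, `fake_detector` and `run_pipeline` are hypothetical stand-ins, and we assume each node simply reads from and adds to a shared dictionary keyed by names such as “img” and “bboxes”.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins for illustration; these are not PeekingDuck's real classes.
Node = Callable[[Dict[str, Any]], Dict[str, Any]]


def fake_input(data: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for an input node: produces an 'img' (here, a 2x2 grid of pixels)."""
    return {"img": [[0, 0], [0, 0]]}


def fake_detector(data: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a model node: reads 'img' and produces 'bboxes'."""
    assert "img" in data, "a model node needs an image to work on"
    # One made-up box in normalised (x1, y1, x2, y2) form.
    return {"bboxes": [(0.1, 0.2, 0.5, 0.9)]}


def run_pipeline(nodes: List[Node]) -> Dict[str, Any]:
    """Run nodes in order, merging each node's outputs into a shared data pool."""
    data: Dict[str, Any] = {}
    for node in nodes:
        data.update(node(data))
    return data


result = run_pipeline([fake_input, fake_detector])
print(sorted(result))  # ['bboxes', 'img']
```

Because each node only declares what it reads and what it writes, swapping one detector for another would not require changes to the nodes around it.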
Models are the essence of CV. Thus, we’ve bundled a few pre-trained object detection and pose estimation model nodes with PeekingDuck, which can be used right away. We included these first as they can be used to tackle a wide range of CV problems, and we will be adding more model nodes to PeekingDuck over time.
Solving Real-World Problems
At CV Hub, we are focused on extending CV models to solve real-world problems. This is where PeekingDuck really shines – we combine different model, dabble and draw nodes to solve use cases such as:
- Social distancing, which was deployed on HP Inc’s factory floors in 2020 to ensure the safety of its employees
- Zone counting, which has many applications such as assessing crowd density and retail analytics
- Group size checking, which helps ensure that group limits of social gatherings are adhered to
- Object counting, which can be used to count “objects” such as humans, vehicles, animals
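As an illustration of how a use case can be built on top of a model's output, the sketch below counts detections inside a rectangular zone. It is only a sketch under our own assumptions, not PeekingDuck's implementation: the function `count_in_zone` is hypothetical, bounding boxes are assumed to be normalised (x1, y1, x2, y2) tuples, and a box is treated as inside the zone when its bottom-centre point falls within it.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed normalised to 0..1


def count_in_zone(bboxes: List[BBox], zone: BBox) -> int:
    """Count boxes whose bottom-centre point lies inside a rectangular zone."""
    zx1, zy1, zx2, zy2 = zone
    count = 0
    for x1, y1, x2, y2 in bboxes:
        cx = (x1 + x2) / 2  # horizontal centre of the box
        foot_y = y2  # bottom edge approximates where a person stands
        if zx1 <= cx <= zx2 and zy1 <= foot_y <= zy2:
            count += 1
    return count


# Three made-up detections and a zone covering the left half of the frame.
people = [(0.10, 0.20, 0.20, 0.60), (0.70, 0.30, 0.80, 0.90), (0.40, 0.50, 0.50, 0.95)]
left_half = (0.0, 0.0, 0.5, 1.0)
print(count_in_zone(people, left_half))  # 2
```

The same pattern, a small function that turns bounding boxes into a number, would also cover simple object counting, where the zone is the whole frame.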
We recognise that many CV problems are unique, and customisation is required. For example, you may need to take a snapshot of a video frame, and post it to your API endpoint; perhaps you have a model trained on a custom dataset, and would like to use PeekingDuck’s input, draw, and output nodes. PeekingDuck addresses this by allowing you to create your own custom nodes, and use them in conjunction with our existing nodes.
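The snapshot example could look roughly like the sketch below. This is an assumption-laden illustration rather than PeekingDuck's real custom node interface: the `Node` base class and `SnapshotNode` are hypothetical, and the call to your API endpoint is replaced by a callback that you supply.

```python
from typing import Any, Callable, Dict, List


class Node:
    """Hypothetical minimal node interface, not PeekingDuck's actual base class."""

    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError


class SnapshotNode(Node):
    """Custom node: hands every `every_n`-th frame to a user-supplied callback."""

    def __init__(self, every_n: int, on_snapshot: Callable[[Any], None]) -> None:
        self.every_n = every_n
        self.on_snapshot = on_snapshot  # e.g. a function that posts to your API
        self.frame_count = 0

    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        self.frame_count += 1
        if self.frame_count % self.every_n == 0:
            self.on_snapshot(inputs["img"])
        return {}  # adds nothing to the pipeline; it only has a side effect


# Collect snapshots in a list instead of sending them over the network.
sent: List[Any] = []
node = SnapshotNode(every_n=3, on_snapshot=sent.append)
for frame_id in range(1, 7):
    node.run({"img": f"frame-{frame_id}"})
print(sent)  # ['frame-3', 'frame-6']
```

Passing the upload step in as a callback keeps the node easy to test offline; in a real deployment the callback would wrap whatever HTTP client you already use.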
Moving Forward
As CV continues to develop, we are committed to maintaining and updating PeekingDuck so that it stays relevant. We will also continue to add new features; in fact, we are already working on new model nodes and use cases to be released in a few months. You are welcome to use our Community page to suggest potential problems that could be solved by CV, and we will consider building nodes to solve them, if viable.
As a final case study, we recently embarked on a project with Genesis Gym to use CV to provide coaching feedback for exercises. We paired custom models and coaching heuristics with existing PeekingDuck nodes, and deployed the solution on Google Cloud, all within a challenging time frame. We found that using PeekingDuck greatly cut down our development time and made our code easier to debug, and we hope that it will help you as well.
Find Out More
To find out more about PeekingDuck and start using it, check out our documentation below:
- GitHub repository: https://github.com/aimakerspace/PeekingDuck
- Read the Docs page: https://peekingduck.readthedocs.io/en/stable/
Update: We hosted a Zoom webinar on August 26 to share and answer questions on PeekingDuck. You can view the recording below.