Computer vision has developed rapidly in recent years and is becoming more widespread, both in engineering and industry and in everyday consumer applications.
The goal of this article is to give researchers, developers and enthusiasts involved in computer vision an overview of some of the most popular and widely used software libraries and tools in this field.
Introduction to computer vision and machine learning
Before focusing on current computer vision libraries and platforms, we provide definitions of computer vision and machine learning. We then clarify how the two areas differ and how they are related, because in recent research and development these concepts are often mentioned together while their meaning and relationship are rarely explained well.
There are many definitions of computer vision [1, 2, 3]. We define it as follows:
Definition 1. Computer vision is the ability of a computer system to gather information from an image or a video stream and to understand its content. Typical results of that ability are object recognition and object detection.
Similarly, there are multiple definitions of machine learning [e.g. 4, 5]. At Perelik Soft we use the following one:
Definition 2. Machine learning is a technique that enables computer systems to make decisions without explicit rule-based programming, by learning automatically from data points. The result of applying machine learning is a system capable of making predictions.
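To make the contrast in Definition 2 more concrete, here is a minimal sketch in plain Python. It compares a hand-written rule with a decision threshold that is learned from labelled data points. The brightness values, the labels and all function names are invented for illustration. Real systems learn far richer models than a single threshold.

```python
# Hypothetical toy data: mean brightness of an image (0-255) and its label.
# The numbers are invented for illustration only.
training_data = [
    (30, "night"), (45, "night"), (60, "night"),
    (150, "day"), (180, "day"), (210, "day"),
]


def rule_based_classifier(brightness):
    """Rule-based programming: the developer picks the threshold by hand."""
    return "day" if brightness > 128 else "night"


def learn_threshold(samples):
    """Learning from data: use the midpoint between the two class means."""
    night = [value for value, label in samples if label == "night"]
    day = [value for value, label in samples if label == "day"]
    return (sum(night) / len(night) + sum(day) / len(day)) / 2


def make_learned_classifier(samples):
    """Return a classifier whose threshold comes from the samples."""
    threshold = learn_threshold(samples)
    return lambda brightness: "day" if brightness > threshold else "night"


learned_threshold = learn_threshold(training_data)
learned_classifier = make_learned_classifier(training_data)

print(learned_threshold)
print(rule_based_classifier(120), learned_classifier(120))
```

With this toy data the two classifiers disagree on a brightness of 120: the fixed rule says "night", while the learned threshold, which adapts to the data, says "day".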
In modern technology, machine learning has become an integral part of computer vision systems and algorithms. Using machine learning approaches, the research community has developed and applied algorithms that successfully recognise a very large percentage of the objects in an image while achieving fast inference times.
Machine learning approaches require training a model on data, which can be a very resource-intensive task. Whereas years ago hardware resources were a significant limitation, today powerful computing devices such as GPUs allow systems to be trained on very large amounts of data. This advance in hardware has significantly expanded the applications of computer vision combined with machine learning.
Some of the main application areas of computer vision are manufacturing, robotics, automotive, healthcare, social networks, mobile/smart technologies, space exploration, security and agriculture.
It is not always necessary to reinvent the wheel in software projects. A significant set of libraries and platforms for developing computer vision applications already exists. With their help, established algorithms and models can be used for computer vision and machine learning, both individually and in combination. In the following sections we look at some of the most popular and applicable libraries and platforms used in modern R&D in computer vision through machine learning.
Software platforms and libraries
The data and statistics provided here are as of December 2020. If you are reading this article at a later date, some of the data, e.g. community statistics, might have changed.
- OpenCV – Official site: https://opencv.org/.
Origins: It was created by Intel and originally released in 2000. It is an open source library under the BSD license (versions from 4.5.0 onward use the Apache 2 license).
Compatibility: It has C++, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS.
Integration and compatibility with other libs: OpenCV can import models trained with Keras and TensorFlow, and it can use CUDA for GPU acceleration.
Pretrained Models and Algos: OpenCV provides support for state-of-the-art, pre-trained neural networks, including ResNet, Inception, SqueezeNet, and more, all of which are capable of performing automatic image classification.
Companies using OpenCV: Google, Yahoo, Microsoft, Intel, IBM, Sony, Honda, Toyota, Ocado Technology, Evergreenteam stack, Goopy and many others, more than 6000 companies.
Community: More than 50k stars on GitHub. StackOverflow – more than 60k questions.
- SimpleCV – Official site: http://simplecv.org/.
Origins: It was developed by the engineers at Sight Machine, and it is an open source library licensed under the BSD license.
Compatibility: Written in Python, runs on Mac, Windows, and Ubuntu Linux.
Integration and compatibility with other libs: Python packages of OpenCV
Pretrained Models and Algos: Basic machine learning and image processing algos
Companies using SimpleCV: TechDynasty, ML Hub.
Community: 2440 stars on GitHub. StackOverflow – 184 questions.
- Accord.NET – Official site: http://accord-framework.net/.
Origins: Developed by César Roberto de Souza and originally released in 2010 as open source under the terms of the GNU Lesser General Public License.
Compatibility: Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile. Written in C# and compatible with .NET.
Integration and compatibility with other libs: –
Pretrained Models and Algos: a large number of image processing, machine learning and vision samples and algos
Companies using Accord.NET: Robert Gordon University, New Innovations Inc., Banque Nationale, Info Origin Inc., The University of Auckland and others, about 120.
Community: More than 4k stars on GitHub. StackOverflow – 288 questions. Available as a NuGet package.
- TensorFlow – Official site: https://www.tensorflow.org/.
Origins: It was created by the Google Brain team and initially released on November 9, 2015 as open source under the Apache License 2.0.
Integration and compatibility with other libs: Keras, CUDA, OpenCV.
Pretrained Models and Algos: large set of pre-trained ML models for CV – ResNet, RetinaNet, Mask R-CNN and more.
Companies using TensorFlow: Intel, DeepMind, Coca-Cola, USAA, NVIDIA, Qualcomm, Apple, Capital One, Twitter, Uber, and many others, more than 8000 companies.
Community: More than 150k stars on GitHub. StackOverflow – more than 60k questions.
- Keras – a high-level API written in Python for faster and more convenient work with TensorFlow. https://keras.io/about/
- CUDA – Official site: https://developer.nvidia.com/cuda-zone.
Origins: A parallel computing platform created by Nvidia and released in 2007. It is free to use under the CUDA EULA license terms – https://docs.nvidia.com/cuda/eula/index.html.
Compatibility: supported OS – Linux, Windows, MacOS. Developers can program in various languages like C, C++, Fortran, MATLAB, Python, etc.
Integration and compatibility with other libs: Some libraries and collections include GPU4Vision, OpenVIDIA for popular computer vision algorithms on CUDA, MinGPU which is a minimum GPU library for Computer Vision, etc.
Pretrained Models and Algos: supports training on some of the most popular object detection architectures, such as YOLOv3, FasterRCNN, SSD/DSSD, and RetinaNet, as well as popular classification networks such as ResNet, DarkNet, and MobileNet.
Companies using CUDA: Lockheed Martin, Honeywell, Intuit, Raytheon, Apple, ANSYS Inc., Trivago, Abeja, Haptik, Cruise and many others, more than 6000.
Community: StackOverflow – more than 12k questions.
For detailed statistics about the adoption of the mentioned libraries and platforms, see the company search sources listed at the end of this article.
There is a wide range of libraries and tools for working with computer vision. The list above is certainly not complete: there are a number of other libraries and tools with similar applications.
Of course, there is no single best library or platform. Each of those mentioned above has its advantages and disadvantages for different tasks and projects, and different development teams might find one platform or another more convenient to work with.
If you are new to computer vision, we recommend starting with SimpleCV or OpenCV. These libraries provide ready-to-use basic routines for image processing and basic object recognition.
If you want to experiment with machine learning, we recommend starting with TensorFlow and Keras. With these platforms you can get started with ML quite quickly.
If you are a .NET developer, you might want to try Accord.NET. It offers a lot of ready-to-use programs for image processing, computer vision and machine learning. Keep in mind, however, that the project has been archived.
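To give an impression of how little code a first experiment needs, here is a minimal sketch of a tiny image classifier built with the Keras API that ships with TensorFlow. It assumes the `tensorflow` and `numpy` packages are installed. The random "images", the layer sizes and the single training epoch are placeholders chosen only so that the example runs quickly; they will not produce a useful model.

```python
import numpy as np
import tensorflow as tf

# Random stand-in data: 32 grayscale "images" of 8x8 pixels, 2 classes.
rng = np.random.default_rng(0)
images = rng.random((32, 8, 8, 1)).astype("float32")
labels = rng.integers(0, 2, size=32)

# A very small convolutional network for 2-class classification.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 8, 1)),
    tf.keras.layers.Conv2D(4, kernel_size=3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly, then predict class probabilities for three images.
model.fit(images, labels, epochs=1, batch_size=8, verbose=0)
probabilities = model.predict(images[:3], verbose=0)
print(probabilities.shape)
```

Replacing the random arrays with a real labelled dataset and training for more epochs is the natural next step.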
Finally, if you want to build a large project for real-time object recognition in a complex environment, we recommend using multiple libraries. For this kind of project, the combination OpenCV+TensorFlow+CUDA can lead to a solid architecture: OpenCV handles the camera video stream, TensorFlow handles model training and inference, and CUDA provides parallel, GPU-accelerated processing.
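The division of responsibilities in such an architecture can be sketched in plain Python. The code below does not call OpenCV, TensorFlow or CUDA. Every function in it is a hypothetical stand-in that marks where the real component would plug in, so treat it as an outline of the data flow under those assumptions and not as a working detector.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# A frame is modelled as rows of pixel values; a real project would use
# the NumPy arrays that OpenCV returns.
Frame = List[List[float]]


def run_pipeline(
    frames: Iterable[Frame],
    preprocess: Callable[[Frame], Frame],
    predict: Callable[[Frame], str],
) -> Iterator[Tuple[int, str]]:
    """Pull frames from a source, preprocess each one, yield a prediction."""
    for index, frame in enumerate(frames):
        yield index, predict(preprocess(frame))


# Hypothetical stand-ins so the sketch runs without a camera or a GPU.

def fake_camera() -> Iterator[Frame]:
    """Plays the role of the OpenCV video stream."""
    yield [[0.0, 0.0], [0.0, 0.0]]          # an all-black frame
    yield [[255.0, 255.0], [255.0, 255.0]]  # an all-white frame


def scale_to_unit_range(frame: Frame) -> Frame:
    """Plays the role of OpenCV preprocessing (resize, colour conversion)."""
    return [[pixel / 255.0 for pixel in row] for row in frame]


def brightness_model(frame: Frame) -> str:
    """Plays the role of a trained TensorFlow model running on the GPU."""
    pixels = [pixel for row in frame for pixel in row]
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


results = list(
    run_pipeline(fake_camera(), scale_to_unit_range, brightness_model)
)
print(results)
```

Keeping the frame source, the preprocessing and the model behind plain function interfaces makes it possible to swap each stand-in for the real library call, or to test the pipeline without hardware.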
- Szeliski, Richard. Computer Vision: Algorithms and Applications. https://szeliski.org/Book/
- Redmon, Joseph, Santosh Divvala, Ross Girshick, and Ali Farhadi. “You only look once: Unified, real-time object detection.” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779-788. 2016.
- Companies Search source: https://enlyft.com/tech/ , https://stackshare.io/feed, https://discovery.hgdata.com/
— — —
We put a lot of effort in the content creation in our blog. Multiple information sources are used, we do our own analysis and always double check what we have written down. However, it is still possible that factual or other mistakes occur. If you choose to use what is written on our blog in your own business or personal activities, you do so at your own risk. Be aware that Perelik Soft Ltd. is not liable for any direct or indirect damages you may suffer regarding the use of the content of our blog.