PyTorch and TensorFlow are two of the most popular technologies in the field of AI programming today. Both are higher level libraries/frameworks that make development more efficient by providing out-of-the-box code modules and tools. They are probably the most compared libraries in the field of machine learning and deep learning.
However, although both are open source technologies with approximately the same scope of use, there are some differences between them. In this post, we highlight some of these differences, look at how each is used in industry and research, and give an idea of which technology to recommend and when.
What is Machine Learning (ML)?
Machine Learning, along with deep learning, is one of the subcategories of artificial intelligence. Typically, machine learning is described as the ability of a machine to “artificially” generate knowledge from experience. This means that after completing so-called learning phases, in which they are fed collected data, algorithms learn to respond appropriately to specific and changing conditions. The aim is to solve complex tasks in a way that resembles the human approach to problem solving.
For example, AI pioneer Arthur Samuel described machine learning as “the field of research that gives computers the ability to learn without being explicitly programmed”. For a more detailed explanation of machine learning, we recommend this article published by Sara Brown and MIT.
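To make the idea of a learning phase concrete, here is a minimal sketch in plain Python, with no ML library involved. It assumes a toy dataset generated by the rule y = 2x + 1 and a hypothetical helper called fit_line; the rule itself is never written into the model, which has to recover it from the examples by gradient descent. Frameworks such as PyTorch and TensorFlow automate and scale up this kind of loop.

```python
# Toy "learning phase": fit y = w*x + b to example data by gradient descent.
# fit_line is a hypothetical helper for illustration, not part of any library.

def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Return (w, b) that minimise the mean squared error on the samples."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# "Experience": samples that happen to follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
prediction = w * 10.0 + b  # response to an input the model has never seen
print(f"learned w={w:.3f}, b={b:.3f}, prediction for x=10: {prediction:.3f}")
```

The learned values end up very close to w = 2 and b = 1, so the model responds sensibly to an input that was not in the training data.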
What are ML libraries like PyTorch and TensorFlow?
As Machine Learning has evolved, higher order frameworks that can be compared to the better known JavaScript libraries and frameworks like React, Angular and Vue have also appeared to make the development process more efficient. While Machine Learning developers still work with vanilla programming languages like Python and C++, an increasing amount of development is now based on libraries and frameworks.
The advantage of using an ML framework or library is that you do not have to implement the basics or core algorithms yourself to develop applications involving Machine Learning. These are provided out-of-the-box as modules or components of code that can be mixed, matched and customised to build custom software applications.
Libraries make it easier for less experienced developers to program in the field of Machine Learning and increase the productivity of more experienced developers. ML frameworks such as TensorFlow or PyTorch are also the ones most often used in commercial and academic research and development.
Machine learning is an emerging field in information technology and is considered one of the key technologies that will shape the next generation of technology. Consequently, many companies and organisations have shown great interest in working and researching in this field. As a result, many ML frameworks have been developed, some by companies, such as Microsoft’s Cognitive Toolkit (CNTK), and others by institutions, such as Theano from the University of Montreal.
However, the two most popular machine learning frameworks are PyTorch and TensorFlow. Both are sometimes referred to as frameworks and sometimes as libraries. The difference between a library and a framework lies in the level of freedom developers have in combining the out-of-the-box modules and components. It is a theoretical debate that does not add much value here, which is why we won’t get into it. You can learn more about the theoretical difference between a library and framework here.
What is TensorFlow?
TensorFlow is an ML framework developed by the Google Brain Team that can be used for a range of artificial intelligence tasks. Released by Google under the Apache License 2.0, the open-source technology was first published in 2015, and the updated version TensorFlow 2.0 followed in 2019.
The computing operations performed in TensorFlow are executed on so-called tensors (multilinear mappings), from which TensorFlow takes its name. These operations are carried out by artificial neural networks, whose structure is often compared to the natural neural networks that power human problem solving.
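In practice, a tensor is handled as an n-dimensional array of numbers, and a neural network layer is an operation that maps one tensor to another. The sketch below illustrates this in plain Python with nested lists. It deliberately does not use the TensorFlow API, and the helper names tensor_shape and dense are hypothetical, chosen only for this example.

```python
# Tensors as nested lists, and one neural-network layer applied to a tensor.
# Plain-Python illustration only; real frameworks provide optimised equivalents.

def tensor_shape(data):
    """Return the shape of a regularly nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)

def dense(inputs, weights, biases):
    """One layer: outputs[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]

scalar = 5.0                                    # rank 0, shape ()
vector = [1.0, 2.0, 3.0]                        # rank 1, shape (3,)
matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # rank 2, shape (3, 2)

# Three input values flow through the layer and come out as two values.
output = dense(vector, matrix, [0.5, -0.5])

print(tensor_shape(scalar), tensor_shape(vector), tensor_shape(matrix))
print(output)
```

The rank of a tensor is simply the length of its shape: a single number has rank 0, a vector rank 1 and a matrix rank 2. A network is built by chaining layers like this one, so that data "flows" from tensor to tensor.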
TensorFlow is used across many areas of artificial intelligence. For example, numerous Google applications, such as speech recognition, Gmail, Google Photos and the Google search engine, make use of TensorFlow. Google uses neural networks based on TensorFlow to improve its products or to add new features.
TensorFlow runs on all major operating systems, i.e. Microsoft Windows, macOS, Linux, iOS and Android, and supports a variety of programming languages, a list that can be extended with third-party libraries. For example, TensorFlow can be used from C++, JavaScript and Java, although Python is probably the language most commonly used with TensorFlow and in machine learning in general.
What is PyTorch?
PyTorch, an open source technology released under the BSD licence, is an ML library developed by Facebook’s AI Research Lab (FAIR). Developed using the Python and C++ programming languages and the CUDA API, the ML technology was first released in September 2016 and has since been used for many machine learning applications.
PyTorch, as the name suggests, is primarily designed for use in Python, although it also has a C++ interface. The name combines Python with Torch, another open-source machine learning library on which PyTorch is fundamentally based.
As in TensorFlow, tensors can be processed and artificial neural networks can be created in PyTorch. The library is typically used together with proven Python libraries such as SciPy and Cython, but above all NumPy. Since 2019, the use of PyTorch is no longer limited to the operating systems Windows, macOS and Linux; with the introduction of PyTorch Mobile, the Android and iOS platforms are now also supported.
PyTorch is developing rapidly (as is TensorFlow) due to the increasing interest in Machine Learning applications. The strength of PyTorch’s community is helping to future-proof the tool and is encouraging more and more companies and other organisations to integrate it into their tech stacks.
For example, the non-profit organisation OpenAI, which deals almost exclusively with research into artificial intelligence, has announced that it will use PyTorch as the standardised framework for the implementation of Deep Learning (a subfield of ML) projects.
PyTorch vs TensorFlow: Features and Functions
PyTorch vs TensorFlow: strategic considerations for your company
For sustainable software projects, the choice of the right tech stack is crucial. Successful companies also plan their software solutions for the long term, which means choosing the right technologies for the company from both a technical and strategic point of view based on considerations such as the availability of developers with knowledge of the stack.
In the fast-changing IT industry, new technologies are constantly being released, which further increases the number of usable tech stacks. However, companies have to be conservative about jumping on the ‘shiny new thing’ even if it looks technically promising. Using tools and technologies that are not yet clearly future-proofed can create major headaches down the line if they become redundant or difficult to hire for.
So let’s look at some non-technical considerations for choosing between the two ML libraries, PyTorch and TensorFlow. The annual Stack Overflow Developer Survey, which is widely recognised in the programming community, suggests that TensorFlow is probably the most popular pure ML library, with almost 70% more users than PyTorch.
In 2021, about 16.5% of the developers surveyed used TensorFlow and almost 10% used Torch/PyTorch. Even though the gap is still relatively wide, it has closed considerably over the past few years. In the 2018 survey, TensorFlow still had almost five times as many users as PyTorch. Three years later it was not even twice as many. Both technologies are growing significantly in popularity, which is also due to the increasing use of artificial intelligence and Machine Learning. In any case, PyTorch’s current usage growth rate is higher than TensorFlow’s, which may well mean the gap narrows further over the coming years.
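The 2021 comparison can be checked with a quick calculation. The sketch below uses the 16.5% figure quoted above and, as an assumption, rounds “almost 10%” up to exactly 10.0, so the result is a lower bound on TensorFlow’s lead. The function name relative_lead is hypothetical.

```python
# Rough check of the 2021 survey comparison quoted in the text.

def relative_lead(leader_share, follower_share):
    """Return (ratio, percent_more) for two usage shares given in percent."""
    ratio = leader_share / follower_share
    return ratio, (ratio - 1) * 100

tensorflow_2021 = 16.5  # % of respondents, as quoted in the text
pytorch_2021 = 10.0     # the text says "almost 10%"; 10.0 is a rounded assumption

ratio, percent_more = relative_lead(tensorflow_2021, pytorch_2021)
print(f"TensorFlow: {ratio:.2f}x the users of PyTorch ({percent_more:.0f}% more)")
```

With these inputs the ratio comes out at about 1.65, i.e. at least 65% more users. Because the real PyTorch share was slightly below 10%, this is consistent with “almost 70% more” and with “not even twice as many”.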
TensorFlow will almost certainly remain a heavily used framework and may maintain an edge over PyTorch in general usage. However, use of PyTorch is expected to surpass that of TensorFlow in research. Besides OpenAI, other organisations involved in the further development of AI technologies have also spoken out in favour of PyTorch.
The overview below shows the growth in scientific papers that mention PyTorch alone, in contrast to TensorFlow. The papers are grouped by some of the largest research conferences in the IT industry.
[Figure: mentions of PyTorch versus TensorFlow in published papers, in three panels: computer vision conferences (CVPR, ICCV, ECCV), natural language processing (NLP) conferences (NAACL, ACL, EMNLP) and general ML conferences (ICML, ICLR, NeurIPS). Source: The Gradient]
As the graph shows, PyTorch’s popularity among researchers has skyrocketed. Before 2017, PyTorch was hardly mentioned alone in scientific publications at the research conferences shown here. By 2019, however, a large number of papers mentioned PyTorch alone, and almost 80% of papers published at the ACL or NAACL named PyTorch without mentioning TensorFlow.
Conclusion
TensorFlow’s ‘first mover advantage’, combined with the fact it is also a great solution, means it is currently the leading ML library in business and industry by usage. PyTorch, despite its rapid growth in business and industry, is still lagging behind its rival framework/library in frequency of use. But the fact that PyTorch has established itself in research and development suggests that more companies could migrate to PyTorch in future.
The fact is, PyTorch and TensorFlow are both very good options in Machine Learning, and which technology is the optimal choice depends on the specifics of the project.
In general, PyTorch is recommended if you are just starting out in Machine Learning, as its simplicity offers beginners a gentler learning curve. Besides simplicity, performance and the API are PyTorch strengths frequently cited by developers, especially in research.
For generally good performance and stability of the production environment, TensorFlow is often recommended. However, it is unusual to encounter use cases where TensorFlow cannot do something that PyTorch can and vice versa. As such, the choice may often come down to available developer resources, long term strategic considerations and personal preferences.