July 8, 2020

In Python, we t[Rust]?

With competition increasing daily and new languages appearing regularly, developers need to stay current to keep pace with the state of the art.

This raises a big question: will Rust unseat Python, the current undisputed king?

To dig deeper, we compare the features of these two popular languages and outline the differences between them, giving developers a use-case-wise analysis of when to use each.

What is Python? Why is it so popular?

Python is a widely used general-purpose programming language often praised for its simplicity and readability. Guido van Rossum first released it in 1991. It was named the fastest-growing major programming language in the Stack Overflow Developer Survey 2019. Most data science libraries, including PyTorch and TensorFlow, offer widely used Python interfaces.

Python is supported by a large community and, as a result, is well documented. It also offers mature libraries for many use cases beyond data science. Its straightforward syntax makes code highly readable. Python has first-class support for object-oriented programming, letting developers apply the principles of modularity, inheritance, and polymorphism with ease.
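As a minimal sketch of what inheritance and polymorphism look like in practice, consider the snippet below. The class names (`Model`, `Doubler`, `Squarer`) are hypothetical and chosen only for illustration; they do not come from any library.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical base class, used only for illustration."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def predict(self, x: float) -> float:
        """Each subclass supplies its own prediction rule."""

    def describe(self, x: float) -> str:
        # Shared behaviour: works for any subclass without knowing its type.
        return f"{self.name}: {self.predict(x)}"


class Doubler(Model):
    def predict(self, x: float) -> float:
        return 2 * x


class Squarer(Model):
    def predict(self, x: float) -> float:
        return x * x


# Polymorphism: the same call dispatches to a different predict() per object.
models = [Doubler("doubler"), Squarer("squarer")]
outputs = [m.describe(3.0) for m in models]
print(outputs)  # ['doubler: 6.0', 'squarer: 9.0']
```

Both subclasses inherit `describe` from the base class and only override `predict`, which is the kind of reuse the paragraph above refers to.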

However, Python suffers from the curse of simplicity.

Python has been observed to run slower than comparable languages for some use cases [1]. Although Python is “fast enough” for most applications, Rust may be the better fit for use cases that demand aggressive optimization and short computation times.

And what is Rust? Why the buzz?

Rust is a multi-paradigm programming language focused on performance and safety, including concurrency that is checked at compile time. Although it is syntactically similar to C++, it guarantees memory safety without using garbage collection, which gives the developer tighter control over memory management.

Rust’s strength lies in its guaranteed memory safety, which leads to better reliability. This reliability is often essential in critical systems. Rust is fast and competitive with idiomatic C and C++ [2]. Its core principle of zero-cost abstractions means that high-level constructs carry no hidden performance penalty, and the language needs only a minimal runtime. Because Rust is a systems programming language, it also gives the developer a higher degree of low-level control.

Rust has been gaining traction in the developer community: it has been voted the ‘most loved’ programming language in Stack Overflow’s developer surveys every year since 2016.

However, it is a relatively new programming language, which first appeared in 2010. Its novelty and smaller developer community mean that it has significantly fewer libraries than Python.

For the data science community, this currently means that much of the code for more complex, state-of-the-art architectures has to be written from scratch. Moreover, because of its roots in systems programming, Rust is hard, though not impossible, to learn without a background in low-level programming.

Which language should you learn?

The table below shows a feature-based comparison containing pros and cons for each language, illustrating that both languages are strong candidates for their respective use-cases.

Rust has the potential to perform better in low-level system-oriented use cases.

Python remains the stronger choice as an overall easy-to-use language.

If the developer wants to focus solely on the high-level abstractions necessary for data science applications and numerical analysis, then Julia is another great candidate, albeit with a newer yet growing developer community.

Feature-based Comparison. + represents the Advantages, - represents the Disadvantages.

CrowdStrike performed a head-to-head performance analysis of the two languages on a data science use case: computing the entropy of byte sequences. Python with a Rust backend outperformed the commonly used NumPy-based method by more than 4x in execution time.
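To make the benchmarked task concrete, here is a minimal pure-Python sketch of Shannon entropy over a byte sequence. It is not CrowdStrike's code, and it is neither the NumPy-based method nor the Rust backend mentioned above; the function name `shannon_entropy` is an assumption made for illustration, and only the standard library is used.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte.

    Illustrative baseline only; the name and signature are assumptions.
    """
    if not data:
        return 0.0
    total = len(data)
    # Count how often each byte value (0-255) occurs.
    counts = Counter(data)
    # H = sum over byte values of p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"aaaa"))            # 0.0: a single repeated byte
    print(shannon_entropy(b"abab"))            # 1.0: two equally likely bytes
    print(shannon_entropy(bytes(range(256))))  # 8.0: all byte values equally likely
```

The per-byte counting loop is the hot path here, which is the sort of work that a compiled backend can take over while the caller stays in Python.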

Rust’s characteristics make it a potent candidate for being an efficient and reliable data science backend.

Rust can sit underneath high-level libraries that are exposed in easy-to-use languages like Python, preserving developer productivity.

Although Python holds a firm place in the data science community, Rust has a high potential to be used in the future as a more efficient backend for the Python libraries.

Conclusion

A data scientist who wants to pursue front-end data science, involving architectural research and implementation, should choose Python or, say, Julia.

However, if backend data science involving low-level code optimizations and parallelization is the goal, then Rust is a better choice.

References

  1. https://www.nicolas-hahn.com/python/go/rust/programming/2019/07/01/program-in-python-go-rust/

  2. https://github.com/kostya/benchmarks
