Bloomberg has open-sourced Memray, a memory profiler for Python code on Linux. The tool is designed to help developers reduce memory usage, find memory leaks, and identify allocation hotspots in their code.
The tool was published by Bloomberg’s Python Infrastructure team, which supports over 3,000 engineers at Bloomberg who write code for the business using the Python programming language.
It is fully open source (Apache 2.0-licensed) rather than some left-field copyleft obscurity, and has already attracted significant attention and appreciation from the Python community globally.
Memory profilers help you understand the memory allocation and garbage collection behavior of your applications over time: the leaks and memory churn that can lead to poor application performance.
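The underlying idea can be illustrated on a small scale with Python's standard-library `tracemalloc` module (a stdlib sketch of what a memory profiler does, not Memray itself): it records allocations so you can see which lines of code are holding on to memory.

```python
import tracemalloc

tracemalloc.start()

# Simulate a leak: a list that keeps growing and is never released.
leaked = []
for i in range(10_000):
    leaked.append(str(i) * 100)

# Snapshot the live allocations and group them by source line, largest first.
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)

tracemalloc.stop()
```

Tools like Memray go well beyond this, but the workflow is the same: record allocations while the program runs, then inspect which code paths are responsible for the memory that stays resident.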
Python core developer Yury Selivanov (Python is open source and run by a non-profit) was effusive in his praise: “Until now you never could have such a deep insight in how your app allocates memory.”
He added on Twitter: “The tool is a must for any long-running services implemented with Python. With memray you can generate flame charts of all allocations and trace absolutely everything.
“It’s sophisticated enough to peek into native code. So you can profile your numpy and pandas code with it. And it has a live mode. You can just run your code and see how it allocates memory as it runs.”
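That workflow maps onto Memray's command-line interface. A minimal sketch, assuming `memray` has been installed (e.g. via pip) and guarding for environments where it is absent; `my_script.py` is a hypothetical stand-in created inline:

```shell
# Create a tiny script to profile (illustrative stand-in).
printf 'x = [i for i in range(100000)]\n' > my_script.py

# The commands below assume memray is on PATH; skip gracefully if not.
if command -v memray >/dev/null 2>&1; then
    # Record the script's allocations to a capture file.
    memray run -o output.bin my_script.py

    # Render the capture as an interactive flame graph (writes an HTML file).
    memray flamegraph output.bin
fi

# The "live mode" Selivanov mentions is: memray run --live my_script.py
```

The capture-then-report split means the same recording can later be rendered as different specialized reports without re-running the program.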
Bloomberg, the privately held business and financial data specialist, has a technology function that spans over 6,500 people and runs one of the world’s largest private networks: 140+ sites of its own in over 100 countries, linked by a mix of its own fiber infrastructure and that of other carriers.
Tools serving financial services clients have no scope for application jitters: Bloomberg provides real-time access to 575+ exchange products from more than 365 trading venues around the globe, with 24/7 customer support in 17 different languages, and clients need low-latency data feeds.
It also ships many of its own Python-based tools, including BQuant Enterprise, a cloud-based platform for quantitative analysts and data scientists in the financial markets that lets users write Python functions to access Bloomberg’s data sets. Its own Quant research team recently used BQuant with an alternative data set of weather feeds to assess, for example, the impact of snowfall on US retailers’ performance or of cyclones on manufacturers.
Memray creators Matt Wozniski and Pablo Galindo Salgado listed the release’s key features:
- “The core of the library is in C++ and it binds against CPython using domain-specific interfaces
- Works with native threads that allocate outside the Python interpreter
- Gathers a huge amount of data to a file, so different specialized reports can later be generated
- Can be activated and deactivated at runtime on specific regions of code so your application doesn’t slow down in the areas where you are not using the profiler.
- Shows you the merged Python/C/C++ stack for every allocation, allowing you to understand how memory is allocated in native extensions.”
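The runtime activation described above means profiling can be scoped to a specific region of code. Memray exposes this through a `Tracker` context manager in its Python API; the same region-scoping pattern can be sketched with the stdlib's `tracemalloc` (used here as a stand-in so the snippet runs without Memray installed):

```python
import tracemalloc

def untraced_work():
    # Runs outside the profiled region: no tracking overhead.
    return sum(range(1000))

def traced_work():
    # The allocations we actually want to measure.
    return [str(i) * 50 for i in range(1000)]

untraced_work()

tracemalloc.start()          # activate tracking for this region only
result = traced_work()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()           # deactivate: the rest of the app runs at full speed

print(f"peak during traced region: {peak} bytes")
```

Scoping the profiler this way is what lets an application keep normal performance everywhere except the code under investigation.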
As Salgado noted: “A couple of years ago, our internal users at Bloomberg were unsatisfied with the available options for memory profilers and asked our team to provide or recommend a solution that could cover the specific problems that Bloomberg’s developers face. After doing some user studies, researching the current landscape of profilers, and analyzing the situation, we decided that there was room to develop a new kind of profiler that would leverage our expertise in Python, Linux, ELF, DWARF, and low-level systems programming tooling… [our] engineers work with a lot of Python code, but this Python code very often calls into native extensions. These extensions can be very different in nature, ranging from classic packages from the Python numeric ecosystem (like NumPy and pandas) to high-performance Bloomberg-specific networking middleware written in C++. The inner workings of these extensions have an enormous impact on different aspects of the lifecycle of Python programs at Bloomberg, including performance, memory usage, and allocation patterns.”
Wozniski added in a Bloomberg report: “Once upon a time, everyone used 32-bit processes, and if your interpreter tried to use more than 3GB of memory, the kernel would happily kill it and leave you with a core file to investigate. Those days are long gone. Today, on modern server hardware, it’s entirely possible to accidentally write a program that consumes 100GB of memory and works fine – except it makes everything else on the box slow.”