A vulnerability in the ubiquitous programming language Python that went unpatched for 15 years and which could be exploited with “a single line of simple code” percolated downstream into over 350,000 projects.
CVE-2007-4559 – a 15-year-old path traversal vulnerability with potential to allow an attacker to overwrite arbitrary files – was reported to the Python project in 2007 but was overlooked by the community.
Security researchers at endpoint security specialist Trellix who resurfaced the vulnerability have now executed a months-long automated effort to patch open-source projects known to use the vulnerable code.
Their efforts have resulted in the patching of a massive 61,895 open-source projects previously susceptible to the vulnerability, Trellix – a company born of the merger of FireEye and McAfee Enterprise – said this week.
Several hundred thousand other open-source projects, as well as proprietary software, remain vulnerable.
About Python vulnerability CVE-2007-4559
The Python vulnerability, CVE-2007-4559, is “incredibly easy to exploit,” Trellix’s security researchers said when they first resurfaced the vulnerability – which is in Python’s tarfile module – in September 2022.
“While the vulnerability was originally only marked as a 6.8 [CVSS] we were able to confirm that in most cases an attacker can gain code execution from the file write,” its researchers noted at the time. They demonstrated exploitation of Spyder IDE – an open-source scientific environment written for Python that can be run on Windows and macOS – and of Polemarch, an IT infrastructure management service running on Linux and Docker.
“For an attacker to take advantage of this vulnerability they need to add ‘..’ with the separator for the operating system (‘/’ or ‘\’) into the file name to escape the directory the file is supposed to be extracted to. Python’s tarfile module lets us do exactly this… [meaning hackers can] create their exploits with as little as the 6 lines of code,” Trellix's team wrote in its detailed write-up of the vulnerability and exploit.
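The traversal described in that quote can be illustrated defensively. The following is a minimal sketch rather than Trellix's code: the helper names `build_tar` and `unsafe_members` are hypothetical, and the check assumes that resolving each member name against the destination directory is enough to spot an escape. It builds a harmless in-memory archive whose member name starts with "../" and flags it without extracting anything.

```python
import io
import os
import tarfile
import tempfile


def build_tar(member_name, payload=b"demo"):
    """Build an in-memory tar archive holding one member with the given name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def unsafe_members(archive, dest):
    """Return names of members that would resolve outside dest if extracted.

    Hypothetical helper for illustration: it only inspects member names and
    does not look at symlink or hardlink targets.
    """
    dest_real = os.path.realpath(dest)
    escaped = []
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar.getmembers():
            # Resolve where this member would land relative to dest.
            target = os.path.realpath(os.path.join(dest_real, member.name))
            if os.path.commonpath([dest_real, target]) != dest_real:
                escaped.append(member.name)
    return escaped


if __name__ == "__main__":
    dest = tempfile.mkdtemp()
    print(unsafe_members(build_tar("../evil.txt"), dest))
    print(unsafe_members(build_tar("docs/readme.txt"), dest))
```

A caller could refuse to extract any archive for which `unsafe_members` returns a non-empty list. Because the sketch checks names only, it should be read as an illustration of the path check, not a complete defence.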
How did Trellix patch the Python vulnerability in so many projects?
Trellix's efforts to patch Python vulnerability CVE-2007-4559 in so many open-source projects deserve real community recognition. The company said in a blog post on January 23 that "our team took inspiration from Jonathan Leitschuh’s DEFCON 2022 talk on fixing vulnerabilities at scale. Our Advanced Research Center vulnerability team was able to automate most of the processes, except for quality control. We broke the process into two steps, the patching phase and the pull request phase, both of which were automated and simply needed to be executed."
(The company's researchers Kasimir Schulz and Charles McFarland led the project. The team scanned for repositories and files containing the vulnerability, forked each affected project, patched and tested the fork, then sent a pull request with the code change and an accompanying message to the original maintainers. From there it was up to project maintainers to review the pull request and decide whether to accept the changes.)
They added: "GitHub was a great partner in this process, and after receiving a list of repositories and files that contained the keyword, “import tarfile,” our team was able to compile a unique list of repositories to scan. We could not have executed this large-scale effort without quick delivery of actionable data from GitHub. Once the list was delivered, we cloned and scanned each repository using Creosote – a free tool we built for developers to check if their applications are vulnerable – to determine which repositories needed to be patched. If a repository was determined to contain the vulnerability, we patched the file and created a local patch diff containing the patched file so users can easily compare the two files, the original file, and some metadata about the repository."
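To give a rough feel for the scanning step, here is a simplified heuristic, assuming that any `.extract()` or `.extractall()` call in a file that imports `tarfile` is worth a manual look. It is not Creosote and does not reproduce Creosote's logic; the function name `find_extract_calls` is hypothetical.

```python
import ast


def find_extract_calls(source):
    """Return sorted (line, method) pairs for extract/extractall calls.

    Hypothetical heuristic: only reports calls when the source imports
    tarfile, and makes no attempt to decide whether member names are
    already validated before extraction.
    """
    tree = ast.parse(source)
    imports_tarfile = any(
        (isinstance(node, ast.Import)
         and any(alias.name == "tarfile" for alias in node.names))
        or (isinstance(node, ast.ImportFrom) and node.module == "tarfile")
        for node in ast.walk(tree)
    )
    if not imports_tarfile:
        return []
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in {"extract", "extractall"}):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import tarfile\n"
        "with tarfile.open('a.tar') as t:\n"
        "    t.extractall('out')\n"
    )
    print(find_extract_calls(sample))
```

A heuristic like this will produce false positives, for example where the code already validates member names, which is consistent with Trellix's note that quality control was the one step its team could not automate.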