At CyberCube, we pride ourselves on developing new solutions and exploring alternative options. One of our flagship products, Portfolio Manager, is now on v3.2 and as our client base grows, we are looking into how to scale this product. I have been tasked with helping identify solutions to achieve this goal.
Portfolio Manager has to handle thousands (if not hundreds of thousands) of companies, and execute thousands of simulations based on a few different scenarios. If we start multiplying the number of companies by the number of simulations and scenarios, we end up with billions of operations. All of those consume a lot of resources, such as CPU time and I/O (disk and network), so the processes that run them have to scale properly.
Scalability and performance are critical for both CyberCube and its clients because, at the end of the day, if it takes fewer resources and less time to run analyses, it reduces the cost and makes for a better customer experience.
A bit about me
Before diving into the steps I’ve taken to trial a new System Programming Language, let me give you a little background on myself.
Prior to joining CyberCube and working on Cloud solutions, I worked at a few different security companies. During these years, I focused on malware detection, which included some reverse engineering and pretty much hacking Windows. I also worked on anti-virus solutions as well as some network security. This was mostly on Windows, but in 2014 I worked on Mac OS X, where I wrote a couple of KEXTs (kernel extensions) and modified the XNU kernel to add some callbacks from the OS back to a daemon I also wrote.
All that work required me to write code using C and Assembly, and for close to 20 years I was pretty much centered on System Programming. I also used C# and Go at my previous job, and while those are really good programming languages, I always keep my eye out for new languages that might serve our needs better.
Before that, while I was still in France, my hobby was to write programs for MS-DOS. My friend Fred and I wrote some TSRs (Terminate and Stay Resident programs). We also wrote some “Demos”, which were essentially some CGA-based graphics with music, and programming the Soundblaster was, erm, a blast (bad pun intended). Of course, all of this was x86 Assembly.
Nowadays, I use NodeJS for the most part. Clearly not as fun, but using assembly to write Lambda functions would be quite the (fun) nightmare.
Experimenting with Rust
Out of curiosity, I started to learn Rust a few months ago. It’s a System Programming Language and is famous for making it really hard to write unsafe code. When you write code in most systems languages, you always risk null or dangling pointers, memory leaks, stack and buffer overflows, etc. With Rust, not so much. All in all, Rust sounded very interesting.
There were also a couple of things that I thought were particularly great:
- Rust is fast!
- There is no garbage collector (more on this later!)
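To make those two points a little more concrete, here is a minimal sketch (not taken from the proof of concept). The `Company`, `Buffer` and `find_company` names are hypothetical and only serve the illustration. It shows `Option` standing in for a nullable pointer, and a value being freed at the end of its scope rather than whenever a collector decides to run.

```rust
// Hypothetical types for illustration only; this is not CyberCube code.
struct Company {
    name: String,
}

// Returns None instead of a null pointer when nothing matches.
fn find_company<'a>(companies: &'a [Company], name: &str) -> Option<&'a Company> {
    companies.iter().find(|c| c.name == name)
}

struct Buffer {
    label: &'static str,
}

impl Drop for Buffer {
    // Runs when the value goes out of scope; no garbage collector is involved.
    fn drop(&mut self) {
        println!("freeing {}", self.label);
    }
}

fn main() {
    let companies = vec![
        Company { name: String::from("Acme") },
        Company { name: String::from("Globex") },
    ];

    // The compiler forces both the "found" and "not found" cases to be handled.
    match find_company(&companies, "Acme") {
        Some(c) => println!("found {}", c.name),
        None => println!("no match"),
    }

    {
        let _scratch = Buffer { label: "scratch buffer" };
        println!("working");
    } // `_scratch` is dropped here, before the next line runs.
    println!("done");
}
```

Running this prints "found Acme", "working", "freeing scratch buffer" and "done", in that order: the buffer is released at a point you can read straight off the source code.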
Trying to solve performance issues
At CyberCube, we have some processes that take a long time to complete. This is somewhat understandable since we have to do some tricky simulations.
Those processes run mostly on EC2 instances and in Containers. Because they are written in Java, the (evil) garbage collector kicks in every so often, and the process gets stuck for a few seconds.
I wanted to try something different and I wrote a small proof of concept using Rust.
The program is actually fairly simple. It reads and parses an XLSX file that contains data about companies, and then it queries our Elasticsearch (ES) database to find a match for each company.
Once the companies are found and validated, they are exported back to another storage, namely MySQL.
The current process takes between 20 and 30 minutes to match a portfolio that contains 100,000 companies.
I started small, with a 1,000-company portfolio, so that I could come up with something that worked. Once I got the program to work, I used a 200,000-company portfolio.
One big difference between the two programs is that mine does not export the data back to MySQL. Instead, I store it in an EFS directory.
Once I compiled the program, I copied it onto an EC2 instance in order to run it inside our VPC. I did this to avoid skewing the results: if I ran it from my machine, the queries to ES would have to go through the VPN, the VPC, the ES instance, and back. Granted, it’s not a huge overhead, but milliseconds do matter when you make that many queries.
Preliminary Results
When I ran the program, the timings were as follows:
- Time to read and parse the XLSX file (41 MB): ~3 seconds
- Time to query ES and match every company: roughly 160-180 seconds
  - That time depends on the ES query being made. Using the synonym modifier doubles it.
I haven’t yet looked at how long it will take to write the output onto EFS, but since it’s mounted onto the host file system and because it uses a 10G NIC, I would assume no longer than 30 seconds.
Not counting the output to EFS, I reduced the processing time from the current 20-30 minutes down to 2.5-3 minutes, so roughly a 20X improvement.
Note: I know you’re thinking that I can’t do math (and you would be right!), but the current 20-30 minutes is for a 100K portfolio, and I used a 200K portfolio. Therefore, it’s not 10X but 20X!
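For the skeptics, the arithmetic can be spelled out in a few lines. This is only a sketch, and it rests on one assumption that I did not measure: that the current process scales linearly with portfolio size, so a 200K portfolio would take twice as long as a 100K one. The `speedup` function is just a helper written for this illustration.

```rust
/// Speedup of the new run over the old one, assuming (not measured) that the
/// old process scales linearly with the number of companies.
fn speedup(old_minutes_per_100k: f64, companies: f64, new_secs: f64) -> f64 {
    let old_secs = old_minutes_per_100k * 60.0 * (companies / 100_000.0);
    old_secs / new_secs
}

fn main() {
    // Least favorable pairing: fastest old run (20 min) vs. slowest new run (3 min).
    let low = speedup(20.0, 200_000.0, 180.0);
    // Most favorable pairing: slowest old run (30 min) vs. fastest new run (2.5 min).
    let high = speedup(30.0, 200_000.0, 150.0);
    println!("speedup between {:.1}x and {:.1}x", low, high);
    // prints "speedup between 13.3x and 24.0x"
}
```

So "roughly 20X" sits inside the range you get from the figures above, somewhere between about 13X and 24X depending on which ends of the two ranges you compare.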
So, Rust is great?
Well, it certainly does make you wonder.
The performance is really good, but there’s no magic in this. First of all, reading and parsing an XLSX file is pretty quick, and making queries to ES using a connection pool in a multi-threaded environment makes for good performance.
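To give an idea of what "multi-threaded" means here, the sketch below fans a list of company names out over a few worker threads that all share one client. It is not the proof of concept itself. `EsClient` and `match_company` are stand-ins invented for this example (a real client would hold pooled HTTP connections and actually talk to ES), and only the standard library is used.

```rust
use std::thread;

// Stand-in for a pooled Elasticsearch client. Hypothetical: a real client
// would keep reusable HTTP connections and send actual queries.
struct EsClient;

impl EsClient {
    // Pretends to match a company: upper-cases the name, fails on empty names.
    fn match_company(&self, name: &str) -> Option<String> {
        if name.is_empty() {
            None
        } else {
            Some(name.to_uppercase())
        }
    }
}

// Splits the portfolio into chunks and matches each chunk on its own thread,
// sharing a single client across all workers.
fn match_all(client: &EsClient, names: &[String], workers: usize) -> Vec<Option<String>> {
    let workers = workers.max(1);
    let chunk_size = ((names.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = names
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|n| client.match_company(n))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the results aligned with the input.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}

fn main() {
    let client = EsClient;
    let names: Vec<String> = ["acme", "", "globex"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    let results = match_all(&client, &names, 2);
    let matched = results.iter().filter(|r| r.is_some()).count();
    println!("matched {} of {}", matched, names.len());
}
```

Scoped threads (`thread::scope`, available since Rust 1.63) let the workers borrow the client and the input slice directly. The compiler checks that nothing outlives the data it points to, which is a big part of why this kind of code is less scary to write in Rust than in C.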
That being said, this is just a proof of concept, and I’m not so sure that it should be productized for a couple of reasons.
- The Rust learning curve is steep, even for someone proficient in C. People who are used to Java, Python or NodeJS would have to spend a considerable amount of time learning it. I’m clearly a beginner and I’ve had my fair share of banging my head against my keyboard.
- Productizing a program that only one person understands is just a bad idea.
Rust at CyberCube?
Honestly, I would love to adopt Rust. My proof of concept shows that it would help with the performance of long-running, memory-hungry processes.
Right now, all our code is a combination of Java, JavaScript (NodeJS) and Python. This is very much the norm with regard to how SaaS companies build their software. It makes sense because those languages are mature, easy to use and most, if not all, developers know them.
That said, when I look at our programs that run on fairly large EC2 instances, it is clear to me that a lot of performance problems could be solved with a language that compiles ahead of time to native code, instead of an interpreted one (JavaScript and Python) or a hybrid such as Java (the .class files generated by the compiler contain bytecode that the JVM still has to interpret or JIT-compile at run time).
By increasing the performance of those programs, we can decrease the time they take to run, and we can also reduce the size of the virtual machines they run on, which means reducing the overall cost.
Rust in a wider SaaS context
It was quite fun to play around with Rust, and I definitely think that it’s a programming language that has its place even in a SaaS enterprise.
The performance aspect that it brings is undeniable, but the downside is that you need to be committed to it, meaning that you need a small team of developers and not a single person.
The engineering team at CyberCube is fortunately able to explore new languages like Rust in order to work more efficiently. Find out more about us and the career opportunities we have available on our Careers page.