In my last blog post, I shared the latest tool I have added to my arsenal: Apache Arrow (or rather, the libraries built on it). In this post, I will dig deeper into the topic and demonstrate some of the techniques I employed.
In the field of Data Engineering, the Apache Spark framework is one of the best-known and most powerful ways to extract and process data. It is well-trusted, and it is also very simple to use once the infrastructure is set up. Understandably, most engineers reach for it for every task. In many cases, however, it is overkill. And a very expensive one.
This is a (small) presentation I gave to my coworkers at CEPESC. At the time, most of the software there was written in Java, while our team (researchers from the Federal University of Brasilia) was developing in Python. I was advocating the use of Golang in some performance-critical areas of our system.
If you want to write LaTeX on your machine, VS Code is a great option! Installing all the necessary packages is a simple process. And with the power of Git, you can sync with web-based editors like Overleaf, gaining solid versioning and backup along the way.
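As a rough sketch of what that Git-based sync can look like: Overleaf exposes each project over Git, so you can clone it, edit locally in VS Code, and push changes back. The `<project-id>` placeholder below stands in for your own project's identifier (visible in its Overleaf URL), and Git access may require a paid Overleaf plan.

```shell
# Clone the Overleaf project locally (replace <project-id> with
# the identifier from your project's Overleaf URL).
git clone https://git.overleaf.com/<project-id> my-paper
cd my-paper

# Edit main.tex in VS Code, then push the changes back to Overleaf.
git add main.tex
git commit -m "Revise introduction"
git push origin master
```

Because it is plain Git, you also get history, diffs, and an extra backup of your paper for free.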