Fabian Kostadinov

Yes, but I'm a software engineer...

Being a software engineer is not an excuse to complain about your data science colleagues. You don’t have to become an expert in machine learning or statistical analysis, but it’s actually a lot of fun to dive a little deeper into some of these topics and learn more about, let’s say, categorisation algorithms. And it even looks sexy on your CV.

Just because Java is your first language does not mean it’s so much cooler/better/more stable than Python or R. Have you ever tried to implement advanced matrix algebra in Java? It’s a pain.

I know there is Apache Commons Math, which claims to do exactly that. But that still doesn’t change the fact that doing data science in Java is a pain.

Knowing how the k-means algorithm works does not yet make you a data scientist. There’s quite a difference between someone who understands the underlying math and someone who does not.
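To show how little code the mechanics of k-means actually need, here is a minimal sketch in plain Python. It assumes Lloyd’s algorithm with squared Euclidean distance and random initial centroids; the function name `kmeans` and its parameters are my own illustration, not taken from any library.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster equal-length numeric tuples with Lloyd's algorithm (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points picked at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: give each point the index of its nearest centroid.
        new_assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments
```

Calling `kmeans(points, k=2)` on a list of 2-D tuples returns the centroids and one cluster index per point. Being able to write this is the easy part. Knowing when squared Euclidean distance is the wrong choice, how to pick `k`, or why the result depends on the initial centroids is where the underlying math comes in.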

As a software engineer, you should actually care about your data scientist colleagues. There are plenty of areas where they can benefit from your help, for example checking code into Git.

Not everyone needs to understand how application servers work. They’re somewhat outdated anyway, I’d say. Not that they will disappear overnight, but do you really need an application container when you can have Docker containers?

Your data science colleagues might not know what a data lake is, or why they should care about one. Only once they’ve seen one will they come to love it. So, don’t count on their enthusiastic support during the inception or implementation phase.

You should learn Scala. (And while you’re at it, add a bit of Clojure too. It’s like taking LSD. It will open your eyes to a reality that’s truer than what you believed to be true.)

A data warehouse is not a data lake. An OLAP cube is not a data lake either. Hadoop is not a data lake. (It’s just a technology stack to build one.) In fact, nobody knows what a data lake is or why they’d need one until they’ve seen one.
