Data science jobs are often misunderstood, and they are sometimes ranked against one another, wrongly so, since they are quite different professions. To the uninitiated, also known as Muggles in the trade, the boundaries between them can look blurry.
Indeed, a common mistake is to use “Data Scientist” as a catch-all for every data profession, when in fact these are distinct roles.
The aim of this article is to lift the veil on the main data professions and to clarify what sets each one apart. At the end of the article, you’ll find a diagram illustrating the different data science jobs.
Data Analyst
As the name suggests, the Data Analyst is responsible for analyzing data in order to derive indicators that the company’s managers can use to make strategic decisions. This makes it a key role within a company. These indicators are presented as graphs or tables that decision-makers can understand.
Data Analysts use business intelligence software such as Qlik or Power BI. They also need to be comfortable with SQL and proficient in Python and R.
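To make the idea of an indicator more concrete, here is a minimal sketch of the kind of SQL-plus-Python work described above. It uses Python’s built-in sqlite3 module with an in-memory database; the sales table, its columns and the sample figures are all hypothetical and serve only as an illustration.

```python
import sqlite3


def revenue_by_region(rows):
    """Return (region, total revenue) pairs, highest revenue first.

    `rows` is a list of (region, amount) tuples. The table name and
    columns are hypothetical, chosen only for this illustration.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) AS revenue "
            "FROM sales GROUP BY region ORDER BY revenue DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data: one row per sale.
sample_sales = [
    ("North", 120.0),
    ("South", 80.0),
    ("North", 30.0),
    ("South", 45.0),
]

indicators = revenue_by_region(sample_sales)
for region, revenue in indicators:
    print(f"{region}: {revenue:.2f}")
```

In a real company, the same query would typically run against a data warehouse, and the result would feed a dashboard in a tool such as Qlik or Power BI rather than being printed.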
Data Engineer
This relatively little-known profession has long been confused with that of the Data Scientist. As a result, although more and more vacancies are being advertised today, few qualified candidates are available. The Data Engineer (a title which, incidentally, can scare Muggles) is the person in charge of collecting data.
The sources can be diverse: websites, applications, surveys, etc.
The Data Engineer is responsible for ensuring that the data is available, secure and quick to access, so that it can be exploited. Of course, the Data Engineer needs to be at ease with the enormous volumes of data potentially generated. Because this role sits so close to the other data professions, he or she will also need a grounding in Machine Learning or algorithms.
He or she will use technologies such as Hadoop and Spark. He or she will also need to be very comfortable with SQL and NoSQL database management systems, including graph databases such as Neo4j, as well as the Python and Scala programming languages.
Data Scientist
Less than 10 years ago, LinkedIn listed barely a hundred job offers for Data Scientists; today, there are several thousand, which shows how popular this profession has become. With a more scientific background (mathematics or science) than the Data Engineer or Data Analyst, the Data Scientist’s primary role is to analyze the data provided in greater depth, in order to draw conclusions and make predictions about future behavior. Unsurprisingly, this is a very important role within a company for future decision-making.
He or she will use R (widely used in statistics and research), Python and MATLAB, as well as libraries such as Scikit-Learn or PyTorch.
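As a rough illustration of what “making predictions” can mean, the sketch below fits a straight line to a few past observations using ordinary least squares and extrapolates one step ahead. It is written in plain Python so that it runs without any dependency; in practice a Data Scientist would more likely reach for a library such as Scikit-Learn. The monthly figures and the function name are made up for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the normalization factor cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: number of orders observed in months 1 to 4.
months = [1, 2, 3, 4]
orders = [10, 12, 14, 16]

slope, intercept = fit_line(months, orders)

# Extrapolate the fitted trend to the next month.
forecast_month_5 = slope * 5 + intercept
print(f"Forecast for month 5: {forecast_month_5:.1f}")
```

Real data is rarely this tidy, which is why much of the job consists of choosing a suitable model and checking how far its predictions can be trusted.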
Machine Learning Engineer
The Machine Learning Engineer role is an evolution of, or an offshoot from, the Data Scientist profession. Their role is to optimize and maintain the algorithms developed by Data Scientists, using the data prepared by Data Engineers. It is therefore a demanding job, requiring solid skills in development, algorithms and mathematics.
They must also be able to explain their work to managers and other non-experts.
ML Engineers need to master several programming languages, including Python, R, C, Scala, MATLAB and Java. They work primarily in cloud environments such as Azure, as well as on platforms such as RapidMiner.
Data Science Jobs - Conclusion
As you can imagine, these professions are increasingly sought after by companies, as we live in an era where data is omnipresent and its importance paramount. However, in order to carry out these activities, proper training is essential.