DATA SCIENCE: WHAT? WHERE? HOW?

Data science is the discipline that allows a company to explore and analyze raw data and transform it into valuable information that helps solve business problems. The idea is to create methods for recording, storing and analyzing data so that it can be exploited as a source of information and knowledge for the company.

The first step is to understand the problem. To do so, you will sometimes need to look at fields far removed from data science: be curious and ask yourself the right questions until you are familiar with the problem. After that, you will need stamina. The most important qualities for winning a data science competition can be summed up in two words: persistence and patience. Take the time needed to prepare the data, and find a good, competent and proactive team. You will also need to be quick and clever to deliver the best predictive model in the given time, and be able to challenge it. Sometimes the only trick is to think outside the box! To acquire these good practices, there is nothing like following our data technology watch on LinkedIn or directly on our blog.

To become a data scientist, you can follow one of several courses of study in business schools or engineering schools. You can also teach yourself through an online or hybrid course such as those offered by DataScientest, the B2B leader in data science training. Choosing DataScientest means benefiting from a secure, full SaaS online platform that is easy to deploy and that offers unique data science course content in a notebook format firmly oriented towards active teaching and learning by doing.

In order to be properly trained, an aspiring Data Scientist must have a good knowledge of Python and SQL. But why these two languages in particular?

Python is one of the fastest-growing programming languages. It has a wide variety of libraries useful for machine learning, data analysis, data visualization, API integrations and more. Moreover, it is one of the easiest programming languages to learn.

As far as SQL is concerned, it allows you to better understand, explore and use the data collected by your company. It is also the reference language for database management, both in terms of popularity and efficiency.
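As a rough illustration of how the two languages can work together, the sketch below uses Python's built-in sqlite3 module to run a SQL aggregation on a small in-memory table. The table name, columns and values are invented for the example and do not come from any real dataset.

```python
import sqlite3

# Hypothetical sales records, invented for illustration only.
rows = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 60.0),
    ("south", 40.0),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# SQL does the exploration: total amount per region, largest first.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

# Python takes over for the rest of the analysis.
for region, total in totals:
    print(f"{region}: {total:.2f}")
```

Here SQL handles the grouping and sorting, while Python is free to format, plot or model the result.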

That’s why our courses focus on learning and improving these two tools. Courses in R are also available.

JOBS AND DATA SCIENCE

DATA SCIENTIST

To claim the title of Data Scientist, you must meet certain prerequisites.
Required to work with numbers and large amounts of information, a data scientist must have excellent analytical skills and a solid foundation in statistics. Those with a master’s or doctoral degree in math and statistics or computer science are most likely to shine in this field.
Proficiency in analytical tools such as SAS, R or Python is also important.

University courses have been available for this purpose for some years. In addition, as upskilling has become a performance issue in many sectors of activity, courses on platforms such as DataScientest make it possible to acquire the essential skills required.
These basic prerequisites allow the Data Scientist to develop a technical approach to better understand, grasp and manage Big Data.

Data skills, from understanding data to exploiting and processing it within the company, are now sought after in all types of sectors, for a wide variety of assignments.

Data Scientists can work in large groups, for example in the banking, insurance and finance sectors, in auditing and consulting services, in large conventional industrial groups, or in the defense sector, for example to predict the behavior of terrorists.
They are also in strong demand from startups and new data processing software publishers.

Data centers, Internet service providers, hosting companies and infrastructure manufacturers are all potential recruiters for graduates of a data science course.

According to Glassdoor, in 2020 a data scientist earns an annual salary of over €44k. This salary varies greatly depending on experience. For the most senior profiles, it is around €55k.

Like data science itself, the professions related to it are evolving constantly and ever faster. For the moment, however, the main career progressions from this profession appear to be researcher or statistical engineer. Data Scientists can also move into roles such as Business Analyst or Software Engineer.

A Data Scientist is a multidisciplinary profile whose main mission is to extract useful information (insights) from raw data. The Data Scientist profession sits at the intersection of Data Analyst and Data Engineer, without neglecting business knowledge of the field in which he or she operates. A software developer can therefore become a data scientist more easily, but will still need additional training.

  1. Python Data Science Handbook
  2. R for Data Science
  3. Think Python
  4. An Introduction to Statistical Learning
  5. Understanding Machine Learning: From Theory to Algorithms

The course lasts 280 hours excluding projects, or 400 hours including projects.
85% of the course is e-learning (self-training) on our platform, and 15% consists of coaching sessions lasting from 2 to 3.5 hours, which punctuate each training sprint. These sessions, led by our data scientists, take place approximately every 30 hours of the course.

There are two different formats: bootcamp or intensive course, and continuous training.

Bootcamp format: the course lasts 11 weeks and 2 days at a rate of 37 hours per week

Continuous training format: the course will last 9 months and will involve 6 hours of work per week, excluding the project.

DATA ANALYST

To carry out this work, the data analyst must have specific skills, particularly in computer engineering. To exploit the raw data available to the company, he/she must master specific data processing tools such as Hadoop or Spark. Mastery of programming languages is crucial to make the data speak and transform it into insights.

The Data Analyst will also use various statistical tools and methods to identify trends that may affect recommendations on strategies to adopt. Marketing skills will be necessary to enable him/her to advise business leaders in this field. Rigor is essential to be able to correctly process the large amount of data available.

The job of Data Analyst is not static and will evolve at the same pace as data science. The digital explosion generates an ever-increasing amount of data for companies to process; they must manage this flow of information properly and put it to good use in their business.

Data analytics offers job opportunities in an increasingly wide range of industries. Large financial, commercial, marketing, industrial and medical companies are a few examples.

According to Glassdoor, in 2020 a data analyst earns an annual salary of more than €40k. This salary varies greatly depending on experience. For the most senior profiles, it is around €49k.

The Data Analyst course, which lasts 290 hours in total (220 hours excluding the project), is also offered in two formats:

Bootcamp format: 11 weeks and 2 days at a rate of 25 hours per week of involvement.

Continuous course format: The course will be spread over a period of 6 months, at a rate of 6 hours per week of involvement.

DATA ENGINEER

The Data Engineer and Data Scientist functions complement each other. The data engineer is a systems designer. He develops, tests and implements data architectures. He creates databases and organizes pipelines, i.e. data flows between sources and storage databases. The Data Engineer prepares the groundwork for the Data Scientist by implementing data architectures as close as possible to the company’s needs. The role of the Data Scientist is to exploit the data, make something out of it, learn from it, and make decisions based on it. His role is to make the data speak. He transforms raw data into useful information. To do this, he uses machine learning techniques. He detects patterns and builds data models.
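The idea of a pipeline, i.e. a data flow between a source and a storage database, can be sketched in a few lines. The example below is a minimal illustration using only Python's standard library: the CSV content, the payments table and the extract/transform/load function names are all invented for the example, and a real pipeline would typically rely on dedicated tooling.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (invented data).
RAW_CSV = """customer_id,amount
1,10.50
2,not_a_number
1,4.50
3,20.00
"""

def extract(text):
    """Read raw rows from the source (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Keep only rows whose fields parse as numbers."""
    clean = []
    for rec in records:
        try:
            clean.append((int(rec["customer_id"]), float(rec["amount"])))
        except ValueError:
            continue  # skip malformed rows
    return clean

def load(rows, conn):
    """Store the cleaned rows in the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the storage database
load(transform(extract(RAW_CSV)), conn)
row_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
print(row_count)
```

Keeping extraction, cleaning and loading in separate functions makes each step easy to test and lets the stored table stay readable for the analysts and scientists who use it afterwards.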

The data engineer must have a high level of expertise to successfully carry out his mission of developing data flows. He is a specialist in programming languages such as JavaScript, Scala and Python. He is also skilled in designing databases, which he creates using SQL and NoSQL technologies. The Data Engineer's output must be readable and easy to work with afterwards.

The Data Engineer designs systems that enable the processing of large volumes of data and their exploitation by Data Analysts and Scientists. He/she must ensure that the data pipelines deployed are secure and clear enough to be analyzed by Data Analysts and then transformed by Data Scientists who will apply algorithms to them.
The preferred sectors of activity for data engineers are the same as for data scientists. This ranges from large industrial groups (banks, insurance companies, etc.) to data centers, including startups and data processing software publishers.

According to Glassdoor, in 2020 a data engineer earns an annual salary of more than €43k. This salary varies greatly depending on experience. For the most senior profiles, it is around €50k.

The course lasts 280 hours excluding projects, or 400 hours including projects.
85% of the course is e-learning (self-training) on our platform, and 15% consists of coaching sessions lasting from 2 to 3.5 hours, which punctuate each training sprint. These sessions, led by our data scientists, take place approximately every 30 hours of the course.

It is therefore important to distinguish between two formats: the bootcamp or intensive course and the continuing education course.

Bootcamp format: the course lasts 11 weeks and 2 days, 37 hours per week.

Continuous training format: the course will last 9 months and will involve 6 hours of work per week, excluding the project.

DATASCIENTEST COURSES

With a prerequisite of Bac +3 in Mathematics or Bac +5 in Science, a 3-month bootcamp with DataScientest will give you all the skills you need to become a data scientist.

DataScientest Data Bootcamps are mostly held online on our course platform and are punctuated by coaching sessions in our Asnières-sur-Seine office or in your company’s premises.

The Data Analyst and Data Scientist bootcamps are 90% online and 10% face-to-face, while the Data Engineer bootcamp is 85% online and 15% face-to-face.

The prerequisites are different depending on the chosen course: 

  • Data Analyst: a Bac+3 level in Mathematics or Computer Science is generally required.
  • Data Scientist: a 5-year degree in mathematics and/or statistics and some programming skills are generally recommended.
  • Data Engineer: 5 years of higher education in computer science and 3 years of higher education in statistics are generally required.

Whether you have a basic knowledge of mathematics, statistics or computer science and are looking to retrain, or you are a developer wishing to increase your skills in data science, our courses are suited to many profiles. Moreover, our “à la carte” offer allows data specialists already in post to deepen their skills in specific topics.

Three bootcamps are available: Data Analyst, Data Scientist and Data Engineer.

Feel free to discuss this with our team by email at contact@datascientest.com or directly during the coaching sessions.

Our bootcamps are divided into several modules, each of which is evaluated by a certification exam. Each module is therefore certified individually.
A certification from the University Paris 1 Panthéon Sorbonne will be awarded to you at the end of the DataScientest course for Data Analyst, Data Scientist and Data Engineer.

Our course platform is available in French and English and our coaching sessions are also given in these two languages.

It is best to apply at least three weeks before the course begins to ensure that you can best manage your course. Nevertheless, do not hesitate to contact us, even in case of a tight schedule, and we will try our best to get you access to the course that suits you.

The project is not imposed: it is chosen and then defended by the users. It is therefore a vector for promoting intrapreneurship or entrepreneurship, depending on the context. Projects are then selected according to a grid that takes into account their scientific viability, access to data, and the interest of other participants and sponsors in the chosen issue. Indeed, an interesting and well-conducted project can be put into production as soon as the course is over.

In addition to the course dates published on our site, courses are regularly launched for intra-company cohorts. A minimum of 10 employees is required to start a cohort. Pricing is determined on a case-by-case basis, on request.

If the user is already familiar with concepts included in the course, he or she may request a change to the course from the support team or the cohort director. The request will then be reviewed by our team of data scientists. The aim is to remain consistent with the main objectives of the core curriculum.

Throughout the course, whether the chosen format is bootcamp or continuing education, live chat support is available every working day from 9am to 7pm. Our data scientists are available to answer your technical or pedagogical questions.

The Cohort Manager and the Daniel team are the first to detect any difficulties: they check the beneficiaries' connection times and identify any learning difficulties. If a learner is struggling, the Cohort Manager triggers the remediation process, which involves the entire monitoring team.
The remediation process is divided into 4 stages:

  • Identification of the difficulty
  • Solution envisaged to remedy the difficulty
  • Intervention
  • Result

During an interview, the monitoring officer will propose various solutions to the beneficiary depending on the problem. The monitoring officer will use every available lever to help the beneficiary resolve the situation and devote himself or herself to the course.

If the learner has a drop in motivation that is linked neither to learning difficulties nor to a particular personal situation, an interview with the Career Manager will be proposed. During this interview, the Career Manager will identify the reasons for this loss of motivation and will try to re-motivate the beneficiary by focusing on the job prospects at the end of the course. Two weeks after a solution has been applied, the follow-up team measures the results of its intervention with the beneficiary at risk of dropping out.

Our team, which is responsible for the creation of content and the correction of exams, masters all the ins and outs of the course in order to be able to answer your questions as accurately as possible.

In addition, the coaching sessions that punctuate each training sprint are an excellent way to get a broader overview of the course. Your cohort leader will be available to answer all your questions and ensure the most personalized follow-up possible.

Finally, working in cohorts creates a group dynamic similar to that of a classroom, and it is very important that the users of each cohort progress together along the way.

These different building blocks now ensure an average course completion rate of 100%.

Two types of recognition should be distinguished. Informally, DataScientest courses and certifications are very widely recognized by influential players in the data world, at least in France. Indeed, the overwhelming majority of the thirty or so groups that have benefited from our courses are CAC 40 companies, and the expertise of our content is now well known.

In addition, our diplomas are recognized by the University Paris 1 Panthéon Sorbonne. After a thorough audit of our content, tests and certification process, the prestigious university deemed DataScientest eligible for a Paris 1 Panthéon Sorbonne certification. This particular certification is available upon request and incurs a slight additional cost.

Finally, we have applied to the French Ministry of National Education and Competence for our diplomas to be recognized by the State (and, at the same time, to be eligible for the CPF). Given the lockdown and the recent reform of education, processing times have been considerably extended, but we are confident that we will obtain this recognition before the end of 2020.

FINANCING OF THE COURSES

Depending on your professional situation, several options are available to you. 

To help you see more clearly, we have listed the different possibilities in this article. 

In order to allow everyone to assume the costs of the course in the way that suits them best, we have set up a payment schedule that can be broken down as follows: €500 application fee at the time of registration, regardless of the course. The rest of the fees (variable depending on the course chosen) will be divided into 3 equal parts payable at the beginning of the first 3 months of the course.
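As a worked example of this payment schedule, the sketch below computes the four payments for a given total price. The total of €3,500 is an invented figure used purely for illustration (actual fees vary by course), and the function name is hypothetical; the rule that any rounding difference lands on the last instalment is also an assumption, not something stated above.

```python
from decimal import Decimal, ROUND_HALF_UP

APPLICATION_FEE = Decimal("500")  # from the text: paid at registration

def payment_schedule(total_fee):
    """Return [application fee, instalment 1, instalment 2, instalment 3].

    total_fee is the full course price in euros (hypothetical input;
    actual prices vary by course).
    """
    total = Decimal(str(total_fee))
    if total < APPLICATION_FEE:
        raise ValueError("total fee is lower than the application fee")
    remainder = total - APPLICATION_FEE
    instalment = (remainder / 3).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    # Assumption: any rounding difference goes on the last instalment,
    # so that the payments always add up to the total.
    last = remainder - 2 * instalment
    return [APPLICATION_FEE, instalment, instalment, last]

schedule = payment_schedule(3500)  # 3500 is an invented example price
print(schedule)
```

With a total of €3,500, the remaining €3,000 splits into three instalments of €1,000, one at the beginning of each of the first three months.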

If you are a job seeker, you can apply for funding through your advisor by using the AIF (Aide Individuelle à la Formation) scheme. Find our courses and all our sessions directly in your personal Pôle Emploi space and request a quote directly online.

Your quote will be processed by our teams, and your advisor will be notified as soon as it is sent. Concerning financing, your advisor will decide whether or not to grant you financing, based on several criteria including your motivation and the suitability of the course with your professional project. Our teams are at your disposal to help you in these steps.

Under the POEI, the course is financed entirely by Pôle Emploi, but a promise of employment prior to the course is required. In the current Covid-19 crisis, this seems particularly difficult to obtain, except for the “top profiles”. Do not hesitate to contact us: we can guide you, drawing on our expertise on this subject.

To know more about the financing schemes of the course, go here.

THE DATASCIENTEST METHOD

Starting from the observation that there was a lack of a purely B2B solution for data science courses, we created DataScientest more than 4 years ago.

Very quickly, we chose a hybrid format: 90% remote and 10% face-to-face.

The course takes place on a secure platform and is complemented by support, face-to-face coaching sessions and a big data project.

While most courses on the Internet are a combination of video lessons and quizzes, DataScientest has bet on the opposite approach.

Our active pedagogy revolves around our platform, which provides the learner with a ready-to-code environment requiring no installation.

This is made possible by hosting our GPU and CPU clusters on AWS servers. It allows us to deliver learning-by-doing course notebooks, in which the theory is built around the exercises that the learner is asked to solve.

The curricula are divided into sprints, themselves composed of modules.

For example, most Python curricula start with Sprint 1, Introduction to Python, which is composed of the following 4 modules: “Introduction to Python”, “Numpy for Data Science”, “Pandas for Data Science” and “Introduction to Scikit learn”.

Each sprint ends with an evaluation that is unlocked once all of its modules have been validated. The evaluation is timed and taken directly on the platform.
It is corrected by hand by our data scientists. Far from an automated and impersonal correction, they take into account the quality of the reasoning, the comments added to the code, and time management (a copy is saved every 5 minutes).

A real cornerstone of our course, the Big Data project is used for courses lasting more than 6 months. It is carried out in pairs or groups of three, and a coaching session is fully dedicated to its selection. Intended to be put into production at the end of the course, it is carried out with the company’s data, to which we do not, of course, have access.

The project will therefore have a double advantage: Not only will it provide the company with a real POC, but it will also be the best motivation vector for the learners who will immediately apply the theoretical notions acquired on the platform.

The project is not imposed: it is chosen and then defended by the users. It is therefore a vector for promoting intrapreneurship or entrepreneurship, depending on the context. Projects are then selected according to a grid that takes into account their scientific viability, access to data, and the interest of other participants and sponsors in the chosen issue. Indeed, an interesting and well-conducted project can be put into production as soon as the course is over.

Throughout the course, the user and the support team stay in permanent contact. As soon as the user falls behind, a manual reminder is sent by Slack message or e-mail. If there is no response or no progress, the support team tries to reach the user by phone to check on the situation. If we have no news from the user after 10 days, an e-mail may be sent to the manager. It is this personalized follow-up that ensures a 100% completion rate for all our courses.

The user must notify the support in case of absence longer than 7 working days. If this absence causes a delay, the user’s schedule can be readjusted. 

  • Every week, exhaustive reports with all types of quantitative indicators (hours, exercises, certifications) are sent to HR and business line managers.
  • Every 5 to 6 weeks, the DataScientest cohort director gets in touch with his contacts at the client group to provide individual follow-up information as well as the progress of each of the group’s projects.

We have made the strictest standards and restrictions our own. Our HTTPS platform, hosted on dedicated servers, gives us a very high level of security, so much so that we are now deployed in the French groups that are strictest on these matters.

Our partnership with the Sorbonne is built around short certifications. After an audit of the platform and the content of our modules, the prestigious university decided to grant us an additional certification. In practice, once the data scientist certifications have been obtained through the assessments, they are sent to the university, which then issues a paper diploma for a fee that remains the same regardless of the number of certifications per person.

The curricula are built in collaboration with the groups. They depend on a multitude of criteria, such as the needs of the group, the skills of the learners and the strategic choices made by the group (language, libraries, etc.).

The prerequisites obviously depend on the curriculum and the modules chosen. Generally speaking, we consider that a bachelor's degree or equivalent in mathematics or computer science is necessary, at least for the first modules introducing the languages.

Once again, it depends on the curriculum and the number of months chosen, but generally speaking, a minimum of 4 to 5 hours per week is necessary throughout the curriculum to complete it successfully and in the best conditions.

Firstly, we offer business-oriented content that combines theoretical skills development with practical business use cases. While DataCamp is a platform that was designed for students and adapted for business, the DataScientest platform was designed for companies, to increase the skills of their employees in data science. In addition, our architecture is similar to that of a Data Lab.

As far as our platform is concerned, our content is available in both English and French.
We provide live chat support (on working days, during working hours) staffed by the instructors who created our courses. This support allows us to ensure 100% completion of our courses!

In order to evaluate users, we have set up certification exams. These certifications are awarded by the University Paris 1 Panthéon Sorbonne. As for the exams, they have real value on the market because our platform is used as a recruitment tool by large groups such as Allianz or BCG.

E-LEARNING

The e-learning courses are delivered on our secure, full SaaS platform.
The chosen format is the Jupyter notebook, which means that the course does not require any prior installation. This means that you can start coding as soon as you receive your login and password.
If you have any questions, live chat support will be available via Slack to answer all your questions about the course.

If a technical problem occurs during your course, do not hesitate to contact help@datascientest.com who will try to answer your questions as soon as possible.

If you have any questions during your course, you can contact our support via Slack, who will try to guide you and answer all your questions.