Data Scientist or Data Engineer? Choose Your Path On Udacity
Written by Sue Gee   
Wednesday, 16 September 2020

There's no stopping the flood of data - 2.5 million terabytes are created every day, so storing, organizing and analyzing data is becoming more important than ever. Udacity has refreshed the Nanodegree Programs offered by its School of Data Science and they start on September 23.

Udacity's School of Data Science now has a total of thirteen Nanodegree Programs, eleven of which are flagged as being new programs. It also identifies paths to specific careers, with two divergent paths for developers who are interested in data, one leading to the role of Data Scientist and the other to Data Engineer.

Data Scientists can be thought of as those who make sense of the data, present the information it contains and contribute to making decisions based on it. Setting out a path to this career, Udacity reminds us:

There is a shortage of qualified Data Scientists in the workforce, and individuals with these skills are in high demand. Build skills in programming, data wrangling, machine learning, experiment design, and data visualization, and launch a career in data science.

Udacity's advanced-level Data Scientist Nanodegree is the final step on the path outlined if you want a career in this role. The learning starts with Programming for Data Science with Python. This nanodegree, which is estimated to take 3 months, is at beginner level. It covers the fundamentals of Python and gets you familiar with basic data programming tools, including SQL, and also version control with Git.

The intermediate step is the Data Analyst Nanodegree, which is a 4-month program in which you use Python, SQL, and statistics to uncover insights, communicate critical findings, and create data-driven solutions. Its modules and projects (titles in CAPS) are as follows:

  • Introduction to Data Analysis
    Learn the data analysis process of wrangling, exploring, analyzing, and communicating data. Work with data in Python, using libraries like NumPy and Pandas.
    EXPLORE WEATHER TRENDS
    INVESTIGATE A DATASET
  • Practical Statistics
    Learn how to apply inferential statistics and probability to real-world scenarios, such as analyzing A/B tests and building supervised learning models.
    ANALYZE EXPERIMENT RESULTS

  • Data Wrangling
    Learn the data wrangling process of gathering, assessing, and cleaning data. Learn to use Python to wrangle data programmatically and prepare it for analysis.
    WRANGLE AND ANALYZE DATA

  • Data Visualization with Python
    Learn to apply visualization principles to the data analysis process. Explore data visually at multiple levels to find insights and create a compelling story.
    COMMUNICATE DATA FINDINGS

Prior to embarking on the 4-month Data Scientist Nanodegree you also need a grounding in machine learning. The Data Scientist Nanodegree has these modules and projects, plus a final capstone project to put it all together:

  • Solving Data Science Problems
    Learn the data science process, including how to build effective data visualizations, and how to communicate with various stakeholders.
    WRITE A DATA SCIENCE BLOG POST
  • Software Engineering for Data Scientists
    Develop software engineering skills that are essential for data scientists, such as creating unit tests and building classes.
  • Data Engineering for Data Scientists
    Learn to work with data through the entire data science process, from running pipelines, transforming data, building models, and deploying solutions to the cloud.
    BUILD PIPELINES TO CLASSIFY MESSAGES WITH FIGURE EIGHT
  • Experiment Design and Recommendations
    Learn to design experiments and analyze A/B test results. Explore approaches for building recommendation systems.
    DESIGN A RECOMMENDATION ENGINE WITH IBM

The alternative career path that you might want to follow leads to the role of Data Engineer. According to Sam Nelson, Product Lead of Udacity's School of Data Science:

Data Engineers build the engines that help companies make sense of it all. They are crucial to any company's data strategy. Without the right infrastructure, you can collect data, but it just sits and takes up space.

The first step on this path is again the beginner-level nanodegree Programming for Data Science with Python. The second, at intermediate level, is the Data Engineer Nanodegree, which is designed to show you how to understand the data ecosystem, give you the right tools to navigate it and enable you to apply what you learn by completing hands-on, portfolio-ready projects. It is a 5-month program with the following modules and projects, plus a final capstone project to put it all together:

  • Data Modeling
    Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
    DATA MODELING WITH POSTGRES
    DATA MODELING WITH APACHE CASSANDRA
  • Cloud Data Warehouses
    Sharpen your data warehousing skills and deepen your understanding of data infrastructure. Create cloud-based data warehouses on Amazon Web Services (AWS).
    BUILD A CLOUD DATA WAREHOUSE
  • Spark and Data Lakes
    Understand the big data ecosystem and how to use Spark to work with massive datasets. Store big data in a data lake and query it with Spark.
    BUILD A DATA LAKE
  • Data Pipelines with Airflow
    Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
    DATA PIPELINES WITH AIRFLOW

The third step on the path, at advanced level, is the Data Streaming Nanodegree, which we reported on when it was originally launched in March 2020. Estimated to require 2 months, and with two courses and two projects, it is designed to teach you how to process data in real time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming.

More Information

Udacity School of Data Science

Data Visualization Nanodegree

Data Engineer Nanodegree

Data Streaming Nanodegree

Data Scientist Nanodegree

Data Analyst Nanodegree

Programming for Data Science with Python

Programming for Data Science with R

SQL Nanodegree 

Related Articles

Udacity Launches School of Data Science

Udacity Launches Data Scientist Nanodegree

Udacity Data Science Nanodegrees Restarting

New Udacity Nanodegree In Data Streaming

Beginner-Level SQL Nanodegree From Udacity

Data Scientist Best Paying Entry-Level Job Says Glassdoor 

What is a Data Scientist and How Do I Become One? 

Last Updated ( Wednesday, 03 November 2021 )