The Online Graduate Certificate from Penn Engineering Online is a for-credit credential that will produce an academic transcript and paper certificate. To earn a certificate, students can take a maximum of four (4) course units. Two of these four course units may be double-counted from a student's Penn Engineering graduate degree program.

Students may earn a maximum of two certificates. No course may be triple-counted, i.e., counted toward more than two credentials.

While most individuals will complete the Online Graduate Certificate program within one year, students may choose to extend their studies. In this case, all Certificate requirements must be met within a maximum of two years.

*Note: Degree students will receive first priority for course registration.

Data Science Online Graduate Certificate Courses (Current Curriculum)

CIS 5450 Big Data Analytics

In the new era of big data, we increasingly face the challenge of processing vast volumes of data. Given the limits of individual machines (compute power, memory, bandwidth), the solution is more and more often to process the data in parallel on many machines. This course focuses on the fundamentals of scaling computation to handle common data analytics tasks. You will learn about basic tasks in collecting, wrangling, and structuring data; programming models for performing certain kinds of computation in a scalable way across many compute nodes; common approaches to converting algorithms to such programming models; standard toolkits for data analysis consisting of a wide variety of primitives; and popular distributed frameworks for analytics tasks such as filtering, graph analysis, clustering, and classification.

Pre-Requisites

CIT 5910 Introduction to Software Development or equivalent programming experience; broad familiarity with probability and statistics, as well as programming in Python. Additional background in statistics, data analysis (e.g., in MATLAB or R), and machine learning is helpful (example: ESE 5420 Statistics for Data Science: An Applied Machine Learning Course).

CIS 5500 Database & Information Systems

Structured information is the lifeblood of commerce, government, and science today. This course provides an introduction to the broad field of information management systems, covering a range of topics relating to structured data, from data modeling to logical foundations and popular languages, to system implementations. We will study the relational data model; SQL; database design using the Entity-Relationship model and relational design theory; transactions and updates; efficient storage of data; indexes; query execution and query optimization; and “big data” and NoSQL systems.

Pre-Requisites

CIT 5910 Introduction to Software Development, CIT 5920 Mathematical Foundations of Computer Science | Knowledge of JavaScript & Web Development (HTML, CSS) is recommended. | Recommended Corequisite: CIT 5960 Algorithms & Computation

ESE 5410 Machine Learning for Data Science

Machine Learning for Data Science is a foundational course designed to equip students with the essential skills necessary for a career in data science and machine learning. This comprehensive course delves into the fundamentals of machine learning, addressing key concepts such as the curse of dimensionality, model selection and validation, regularization, bootstrap and uncertainty quantification. Students will gain hands-on experience with a variety of machine learning models including regression and classification trees, ensemble learning, boosting, support vector machines, neural networks, hierarchical clustering and K-means. The curriculum is structured to provide practical Python programming skills, which are crucial for succeeding in subsequent courses. By applying these techniques to real-world scenarios in finance, business and industry, the course ensures that students not only understand the theory behind machine learning but also how to apply it effectively in professional settings. This course is an indispensable part of the educational journey for aspiring data scientists, laying the groundwork for further studies and applications in the field.

Pre-Requisites

CIT 5920 Mathematical Foundations of Computer Science, Programming background, Basic Probability

ESE 5420 Statistics for Data Science

The course covers the methodological foundations of data science, emphasizing basic concepts in statistics and learning theory as well as modern methodologies. Topics include learning of distributions and their parameters; testing of multiple hypotheses; linear and nonlinear regression and prediction; classification; uncertainty quantification; model validation; clustering; dimensionality reduction; and probably approximately correct (PAC) learning. These theoretical concepts are complemented by exemplar applications, case studies (datasets), and programming exercises (in Python) drawn from electrical engineering, computer science, the life sciences, finance, and social networks.

Pre-Requisites

CIT 5920 Mathematical Foundations of Computer Science, Programming background, Basic Probability

DATS 5980 Data Science Capstone

This pilot course in the Data Science Program gives students an opportunity to work on an end-to-end, real-world data science project by leveraging their existing industry partners. Students will work with their Capstone mentors and the course instruction team to identify a data science problem, apply knowledge from previous courses to design a solution, and learn new skills and techniques to implement their proposed solution.

Weekly instructor office hours will be used to discuss questions around the components of the data science project lifecycle, consider common issues with projects, brainstorm ideas for addressing stumbling blocks, and seek and share feedback on project decisions and progress. The student will be guided jointly by the course instructor and by a Capstone mentor selected by the student in the area of the project. There will be a mid-course, optional, in-person gathering for students to share project progress and receive input from their peers and faculty.

Upon completing the course, students are expected to have gained essential skills to tackle real-world problems through a data science perspective. This course is specifically designed for Data Science students who have already identified a semester-long project that covers all aspects of the data science pipeline and have secured an industry mentor. We strongly advise you to enroll only if you have these two aspects arranged in advance. If for some reason you have not yet found a mentor, please contact Alexander Savoth (asavoth@seas.upenn.edu) before the course begins.

Pre-Capstone Materials

Pre-Requisites

Four MSE-DS Core Courses and Two MSE-DS Electives | Only available to MSE-DS Degree Students