MSE-DS Degree Requirements for students enrolled Spring 2025 and forward: To earn an MSE-DS Online degree, you’ll complete ten (10) course units: three (3) foundational course units, four (4) core course units, two (2) technical elective units, and one (1) open elective unit. All courses are fully online, and there are no required real-time sessions.
This course is a continuation of CIT 5930 and introduces students to fundamental concepts in computing systems. The course is divided into two parts. The first half introduces important concepts in modern operating systems: processes, scheduling, caching, and virtual memory. The second half provides an introduction to fundamental concepts in the design and implementation of networked systems, their protocols, and applications. The course uses the C programming language and will develop your knowledge of C system calls and libraries for process/thread creation and manipulation, synchronization, and network communication.
This course focuses primarily on the design and analysis of algorithms. It begins with sorting and searching algorithms and then investigates graph algorithms. In order to study graph algorithms, general algorithm design patterns like dynamic programming and greedy algorithms are introduced. A section of this course is also devoted to understanding NP-Completeness.
Pre-Requisites
CIT 5920 | Co-requisite: CIT 5940 (Taking concurrently is allowed but taking beforehand is preferred)
Offerings
1 Course Unit
Students may select either CIS 5150, or both EAS 5160 Mathematical Foundations for Machine Learning I: Probability (0.5 CU) and EAS TBD Mathematical Foundations for Machine Learning II: Algebra (0.5 CU, tentative title).
Fundamentals of Linear Algebra & Optimization (Math for Machine Learning)
There are hardly any machine learning problems whose solutions do not make use of linear algebra. This course presents tools from linear algebra and basic optimization that are used to solve various machine learning and computer science problems. It places emphasis on linear regression, data compression, support vector machines and more, which will provide a basis for further study in machine learning, computer vision, and data science. Both theoretical and algorithmic aspects will be discussed, and students will apply theory to real-world situations through MATLAB projects.
Pre-Requisites
Calculus (Chapters 8, 9, 10, and 48 of Schaum’s Outline of Calculus, fifth edition, by Frank Ayres and Elliott Mendelson). Suggested: Undergraduate course in linear algebra (helpful but not required); Chapters 1 through 3 of Schaum’s Outline of Linear Algebra, fourth edition, by Seymour Lipschutz and Marc Lipson.
Mathematical Foundations for Machine Learning I: Probability (0.5 CU)
This course introduces students to the mathematical foundations of the theory of probability. In addition to a host of classical domains, probability is one of the foundational elements of modern data science, machine learning, and artificial intelligence. The course begins with an exploration of combinatorial probabilities in the classical setting of games of chance, proceeds to the development of an axiomatic, fully mathematical theory of probability, and concludes with the discovery of the remarkable limit laws and the éminence grise of the classical theory, the central limit theorem. The topics covered include: discrete and continuous probability spaces, distributions, mass functions, densities; conditional probability; independence; the Bernoulli schema: the binomial, Poisson, and waiting time distributions; uniform, exponential, normal, and related densities; expectation, variance, moments; conditional expectation; inequalities, tail bounds, and limit laws. This material is presented in its lush and glorious historical context, the mathematical theory buttressed and made vivid by rich and beautiful applications drawn from the world around us. Students are assessed by weekly problem set assignments and a proctored exam.
In the new era of big data, we are increasingly faced with the challenges of processing vast volumes of data. Given the limits of individual machines (compute power, memory, bandwidth), increasingly the solution is to process the data in parallel on many machines. This course focuses on the fundamentals of scaling computation to handle common data analytics tasks. You will learn about basic tasks in collecting, wrangling, and structuring data; programming models for performing certain kinds of computation in a scalable way across many compute nodes; common approaches to converting algorithms to such programming models; standard toolkits for data analysis consisting of a wide variety of primitives; and popular distributed frameworks for analytics tasks such as filtering, graph analysis, clustering, and classification.
Pre-Requisites
CIT 5910 Introduction to Software Development or equivalent programming experience; broad familiarity with probability and statistics, as well as programming in Python; additional background in statistics, data analysis (e.g., in MATLAB or R), and machine learning is helpful (example: ESE 5420 Statistics for Data Science: An Applied Machine Learning Course)
Structured information is the lifeblood of commerce, government, and science today. This course provides an introduction to the broad field of information management systems, covering a range of topics relating to structured data, from data modeling to logical foundations and popular languages, to system implementations. We will study the relational data model; SQL; database design using the Entity-Relationship model and relational design theory; transactions and updates; efficient storage of data; indexes; query execution and query optimization; and “big data” and NoSQL systems.
Pre-Requisites
CIT 5910 Introduction to Software Development, CIT 5920 Mathematical Foundations of Computer Science | Knowledge of JavaScript & Web Development (HTML, CSS) is recommended. | Recommended Corequisite: CIT 5960 Algorithms & Computation
Machine Learning for Data Science is a foundational course designed to equip students with the essential skills necessary for a career in data science and machine learning. This comprehensive course delves into the fundamentals of machine learning, addressing key concepts such as the curse of dimensionality, model selection and validation, regularization, bootstrap and uncertainty quantification. Students will gain hands-on experience with a variety of machine learning models including regression and classification trees, ensemble learning, boosting, support vector machines, neural networks, hierarchical clustering and K-means. The curriculum is structured to provide practical Python programming skills, which are crucial for succeeding in subsequent courses. By applying these techniques to real-world scenarios in finance, business and industry, the course ensures that students not only understand the theory behind machine learning but also how to apply it effectively in professional settings. This course is an indispensable part of the educational journey for aspiring data scientists, laying the groundwork for further studies and applications in the field.
Pre-Requisites
CIT 5920 Mathematical Foundations of Computer Science, Programming background, Basic Probability
The course covers the methodological foundations of data science, emphasizing basic concepts in statistics and learning theory as well as modern methodologies. Topics include learning of distributions and their parameters; testing of multiple hypotheses; linear and nonlinear regression and prediction; classification; uncertainty quantification; model validation; clustering; dimensionality reduction; and probably approximately correct (PAC) learning. These theoretical concepts are complemented by exemplar applications, case studies (datasets), and programming exercises (in Python) drawn from electrical engineering, computer science, the life sciences, finance, and social networks.
Pre-Requisites
CIT 5920 Mathematical Foundations of Computer Science, Programming background, Basic Probability
This course investigates algorithms to implement resource-limited knowledge-based agents which sense and act in the world. Topics include: search, machine learning, probabilistic reasoning, natural language processing, knowledge representation and logic. After a brief introduction to the language, programming assignments will be in Python. This course must be taken in the first semester of the program.
This course provides an overview of the field of natural language processing. The goal of the field is to build technologies that will allow machines to understand human languages. Applications include machine translation, automatic summarization, question answering systems, and dialog systems. NLP is used in technologies like Amazon Alexa and Google Translate.
Pre-Requisites
CIT 5910 Introduction to Software Development, CIT 5920 Mathematical Foundations of Computer Science, and CIT 5940 Data Structures & Software Design. Recommended: CIT 5960
In this course, we will explore massively parallel programming, specifically on graphics processing units (GPUs), with immediate application to machine learning (ML) and artificial intelligence (AI). We’ll first outline computational aspects of ML and connect parallel programming to common components of deep learning. You will gain proficiency in GPU programming basics through hands-on projects with industry best practices and tools, eventually building up to implementing components of modern neural models.
After completing the course, you will have knowledge of:
– parallel programming concepts (hardware, software, and networks) working together to accelerate performance;
– modern distributed ML computation in a GPU datacenter setting as it relates to large-scale neural networks;
– machine and deep learning workloads from a computational perspective to build more efficient systems;
– using tools like profilers and debuggers to accelerate performance and solve programming challenges.
Please note that upon registration, this course requires a $300 Computing Resources Fee in addition to the Online Services Fee.
Pre-Requisites
Required: CIT 5930, CIT 5940, CIT 5950, CIT 5960; course projects require knowledge of C/C++. Recommended: ESE 5460
This is an introductory course in computer vision and computational photography. It explores four topics: 1) image feature detection, 2) image morphing, 3) image stitching, and 4) deep learning related to images. The course is intended to provide hands-on experience working with images and pixels. The world is becoming image-centric. Cameras are now found everywhere: in our cell phones, automobiles, and even surgical tools. In addition, computer vision technology has led to innovations in areas such as movie production, medical diagnosis, biometrics, and digital libraries. This course is suited for students with any engineering background who have a basic understanding of linear algebra and programming, along with plenty of imagination.
Pre-Requisites
CIT 5910 Introduction to Software Development, CIT 5920 Mathematical Foundations of Computer Science, CIT 5930 Introduction to Computer Systems and CIT 5940 Data Structures & Software Design. Students may take CIT 5950 Computer Systems Programming and/or CIT 5960 Algorithms & Computation concurrently with this elective.
This pilot course in the Data Science Program provides students an opportunity to work on an end-to-end real-world data science project by leveraging students’ existing industry partners. Students will work with their Capstone mentors and the course instruction team to identify a data science problem, apply knowledge from previous courses to design a solution and learn new skills and techniques to implement their proposed solution.
Weekly instructor office hours will be used to discuss questions around the components of the data science project lifecycle, consider common issues with projects, brainstorm ideas for addressing stumbling blocks, and seek and share feedback on project decisions and progress. The student will be guided jointly by the course instructor and by a Capstone mentor selected by the student in the area of the project. There will be a mid-course, optional, in-person gathering for students to share project progress and receive input from their peers and faculty.
Upon completing the course, students are expected to have gained essential skills to tackle real-world problems through a data science perspective. This course is specifically designed for Data Science students who have already identified a semester-long project that covers all aspects of the data science pipeline and have secured an industry mentor. We strongly advise you to enroll only if you have these two aspects arranged in advance. If for some reason you have not yet found a mentor, please contact Alexander Savoth (asavoth@seas.upenn.edu) before the course begins.
This 0.5 CU course is an excellent introduction for those who want to learn about the mechanics of data, performing data analysis to gain insights, applying data science techniques to make predictions, and applying data analytics to answer questions and to address interesting business problems. Students will learn how to interpret and frame business problems to be addressed by analytics. The course will also cover different elements of the data analytics process, including data wrangling and cleaning, data exploration and descriptive analytics, data modeling, machine learning, predictive analytics, data visualization and the presentation of analysis and insights using data storytelling. While we will touch upon essential theoretical and technical concepts, our primary focus in this course will be on the practical application of data skills.
This 0.5 CU course provides a comprehensive introduction to the field of imaging informatics, with a focus on radiology as the clinical imaging domain. Students will learn about the importance of informatics to the clinical practice of radiology, the unique types of data encountered, relevant data and transactional standards, the growing role of artificial intelligence in radiology, and the challenges faced by imaging informaticists around the globe. This course is geared to any student interested in imaging informatics, and does not require prior training or experience in medicine or medical imaging. Homework assignments include synthesizing reading content and preparing written responses, managing radiology data through coding, and using generative AI to explore health literacy. Unlike other offerings in the course catalog, Imaging Informatics provides a distinctive blend of informatics and radiology, focusing on practical applications and hands-on experience in managing and interpreting medical imaging data, with less focus on intensive coding and technical skill development.
This 0.5 CU course provides a comprehensive introduction to medical image analysis. Students will learn the basics of computer vision with an emphasis on the special challenges of automated medical image analysis for clinical healthcare and medical research. Students will be required to visually assess images and to work with key machine learning technology to interpret data from actual medical image scans. The course is appropriate for students without prior medical or imaging training.
Deep networks are at the heart of modern approaches in computer vision, natural language processing, and robotics. Design of these networks requires a combination of intuition, theoretical foundation, and empirical experience; this course discusses general principles of deep learning that cut across these three. It develops insight into popular empirical practices with a focus on the training of deep networks, and builds the theoretical skills needed to develop new ideas in deep learning and to deploy deep networks in real-world applications. A fair degree of mathematical and programming proficiency is necessary to complete the coursework.
MCIT Online Students must have completed 4 of their core courses and CIS 5150 or ESE 5420 | MSE-DS Online Students must have completed 5 courses including CIS 5150 or ESE 5420.
Engineers design and build the world we live in. From algorithms to bridges, cars to drones, every day we entrust our safety and prosperity to the decisions that engineers make. It is unsurprising that ethics is part of the foundation on which the modern engineering profession is built. But sometimes ethical engineering decisions nonetheless harm us more than they help us. Where this happens, engineers may face legal liability. Such liability is a legal question, not an engineering question. This course introduces students both to traditional concepts of engineering ethics and to the legal and policy background against which the ethics of engineering decisions are ultimately evaluated. Particular attention is paid to questions that arise in the context of new technologies such as artificial intelligence; case studies involving artificial intelligence and similar technology are considered throughout.
This course is organized in three units. It begins with a broad consideration of the role of engineers and ethics in society and the role that the law plays in formalizing a society’s ethical intuitions. It then considers the different ways that engineers and the law understand risk and how those differing understandings of risk affect product design and professional liability. It concludes by surveying legal topics of particular interest to engineering professions such as intellectual property, privacy and security, regulation, and antitrust. Contemporary challenges, such as ethical issues posed by artificial intelligence and the challenges of regulating firms with significant market power, are considered throughout.
Students enrolled in this class will be asked to read a range of materials, including excerpts from legal memos and judicial opinions, philosophical texts, and engineering studies. Assessments will include regular short writing assignments.
Pre-Requisites
MCIT: CIT 5910 and CIT 5920
Free Electives
Choose 1 Course Unit
Any online EAS/CIS/ESE/DATS course.
Note: For MSE-DS Online students, the CIT 5910, CIT 5920, CIT 5930, and CIT 5940 prerequisites are waived.