Data science is currently one of the fastest-growing fields in the world. Employers want to make evidence-based decisions, which require data manipulation and analysis. Many high schools have not yet made the transition toward data science or are not equipped to handle it. UMBC's Data Science Academy curriculum was created to spark an interest in high school students who want to learn more about this expanding field. The Machine Learning and Statistics II course is for high school students who would like a head start on college-level material presented in a digestible way. The course dives deeper into statistical inference through methods such as two-sample comparisons of means and proportions, analysis of variance, and chi-square tests, using R and real data. It also delves into straightforward machine learning algorithms such as regression, K-Nearest Neighbors and Decision Trees, working in R with data sets that students find interesting.

What will be covered in Statistics II

  • Simulation Based Inference 
  • Machine Learning Algorithms (Regression, K-Nearest Neighbors, Decision Trees)
  • Confusion Matrix

The data science program is an overview of topics discussed in introductory data science and statistics courses. It was designed to expose students to a wide variety of topics that will prepare them for an AP Statistics course, elements of a high school programming course and a college-level data science course. The program gives students an edge with critical skills that colleges and employers see as essential in almost every math, science, computer and engineering field. In Data Science I, students develop the ability to mine, manipulate and visualize data, and learn how to extract data from websites such as Wikipedia, GitHub and other data science websites.

Students can choose from a wide range of data sets based on the topics they themselves find interesting. Past classes have used data sets that include:

This course runs for a full week, 3 hours each day. Learn More:

Open to all levels of experience.

Live Online Synchronous (LOS): Students are in class continuously for the duration of the class, with an instructor present for presentations, assignments and group work.

Students will need a computer and must be able to download R and RStudio onto their machines. Here is a link to make the download easy: How to download R and RStudio

Why take Data Science and Statistics: This program is especially important for students who need to see an application in order to comprehend a concept. There are many hands-on examples, e.g. building a web application to visualize complex datasets or scraping data from Wikipedia. These courses will help students who find math or programming intimidating or difficult, because data science is a gateway to these subjects. With everyone playing catch-up, students who complete this data science program will be more competitive than their peers for college and advanced high school courses. Data science connects to any field (e.g. biology, journalism, economics), which makes it a practical skill for any student to acquire while in high school.

Taught by Immanuel Williams, a UMBC graduate, college professor and data scientist. Williams is a passionate data enthusiast who constantly looks for ways to connect data to the real world. He developed this curriculum to motivate and challenge high school students with problems and projects that connect to topics they find interesting.

Program Design: Students must take the prerequisite course, Data Science I, before advancing to either the full data science track (Data Science II and Data Science III) or the statistics track (Statistics I and Statistics II).

Data Science I is the cornerstone of the Data Science Academy program because it gives students not only the basics of R and RStudio, but also the skills to advance to the subsequent courses in the program. Students will be able to see more connections to the real world through the lens of data science and statistics skills.

What the full program covers

  • Extraction, Manipulation, and Visualization of Data
  • Principles of SQL within R, Develop Web Applications, and Storytelling with Data
  • Iterations, Advanced R Functions, Cloud Computing with AWS
  • Sampling, Probability & Counting, Principles of Inference
  • Simulation Based Inference, Principles of Machine Learning Algorithms, Confusion Matrix
