
Proposed Syllabus for ISM 326 – Data provision for Applied AI by Bill McHenry

(Subject: Data Analytics / Authored by: Liping Liu on 12/6/2024 5:00:00 AM)

Course Description: While enterprise databases store and supply data for operations, AI-based analytical applications require greater volume, variety, and velocity of data provision. This course prepares students to understand and work with a variety of data models and storage approaches; to prepare, extract, transform, and combine data in data pipelines; and to consider how to ensure data quality, scale pipelines, and manage data using principles of data governance.

Format: Face-to-face. Assignments will include problem sets that use software to solve problems in business scenarios, at least one presentation of results to the class, and two exams (possibly three if the sizes of the modules are adjusted). It may also be possible to flip the order of Units 1 and 2.

Course Backdrop:

A key role played by information systems professionals is to ensure that enterprises and organizations support business and organizational functions with information systems that are aligned with their strategic, tactical, and operational needs. Declining storage costs and recognition of the inherent value of data have given rise to a new appreciation of data as a resource in and of itself, which can then be used to support the full range of analytics-oriented IS and applied AI applications.

 

Specific Units and Modules:

UNIT 1: Data models and systems that support analytics as opposed to operations

  • Moving from transaction-oriented DBMS that prioritize ACID principles and minimal redundancy (primary topic of ISM 324) to the star schema and its extensions for data warehouses and data marts
    • At least one star schema exercise to emphasize how it works
  • Types of data warehouses and data marts and the DBMS that support them, including NoSQL approaches such as columnar databases, graph databases, etc.; data lakes, the “data fabric”, spreadsheets and other local storage; on-prem vs. the cloud; digital services platforms
    • Case study: Goodyear’s transformation from an on-prem environment using Teradata to a cloud-based environment using Snowflake (video from BAIS 2023: https://www.uakron.edu/cba/bais/2023)
  • The end result: actionable data that can be presented using visualization tools such as Tableau and Power BI, including exploring data using 2D and 3D visualization and animation
    • Small module on Tableau skills resulting in actionable visualizations
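
To make the star schema idea in Unit 1 concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The table names (dim_product, dim_date, fact_sales), the columns, and the sample rows are all hypothetical and chosen only for illustration; the course itself may use different tools and data. The sketch shows one fact table surrounded by two dimension tables and an analytical query that joins and aggregates across them.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two dimension tables (all names are illustrative)."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            product_name TEXT,
            category TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            full_date TEXT,
            year INTEGER,
            quarter INTEGER
        );
        -- The fact table holds measures plus a foreign key to each dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            units_sold INTEGER,
            revenue REAL
        );
    """)
    # Made-up sample data, not taken from any real source.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
        (1, "All-season tire", "Tires"),
        (2, "Winter tire", "Tires"),
        (3, "Wiper blade", "Accessories"),
    ])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)", [
        (20240115, "2024-01-15", 2024, 1),
        (20240420, "2024-04-20", 2024, 2),
    ])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
        (1, 20240115, 10, 1000.0),
        (2, 20240115, 5, 750.0),
        (3, 20240115, 20, 200.0),
        (1, 20240420, 8, 800.0),
        (3, 20240420, 10, 100.0),
    ])


def revenue_by_category_and_quarter(conn):
    """Typical star-schema query: join the fact to its dimensions, then aggregate."""
    return conn.execute("""
        SELECT p.category, d.quarter, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY p.category, d.quarter
        ORDER BY p.category, d.quarter
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = revenue_by_category_and_quarter(conn)
for category, quarter, revenue in rows:
    print(category, quarter, revenue)
```

The descriptive attributes (category, quarter) live only in the dimension tables, so the fact table stays narrow and the same measures can be sliced along any dimension.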

MIDTERM EXAM

UNIT 2: The data pipeline

  • Enterprise data flows: ETL, ELT, moving data around, data wrangling, handling IoT and other sources of big data, bringing “cold data” and unstructured data into forms that can be used for analysis
  • Data preparation and quality: controlling the gateway into the data platform and data warehouse, data provenance
    • Understanding various data representation/storage approaches (depending on time and the requirements of other courses, these may include XML, HTML, CSV, Excel, SAS, JSON, graphical model, document-oriented model, key-value pair model, relational, object-relational model, and map-reduce)
    • Using the Alteryx low-code/no-code approach to create data transformation flows that result in data that can be analyzed
    • Assignments for this unit lead students to Alteryx micro-certifications in Data Preparation, Data Manipulation, and Data Transformation
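
The extract, transform, and load steps listed in Unit 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the course uses Alteryx for this work, and the sample CSV and JSON text, the field names, and the function names below are all hypothetical. The sketch extracts from two formats, cleans and de-duplicates the records, combines them, and loads the result into an in-memory list standing in for a warehouse table.

```python
import csv
import io
import json

# Hypothetical source data: one CSV feed with dirty rows, one JSON lookup.
ORDERS_CSV = """order_id,customer_id,amount
1001,C1, 250.00
1002,C2,
1003,C1,99.50
1003,C1,99.50
"""

CUSTOMERS_JSON = '[{"id": "C1", "name": "Acme Corp"}, {"id": "C2", "name": "Globex"}]'


def extract_orders(text):
    """Extract: parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_customers(text):
    """Extract: parse JSON text into an id -> name lookup."""
    return {c["id"]: c["name"] for c in json.loads(text)}


def transform(orders, customers):
    """Transform: drop rows with no amount, drop duplicate order ids,
    cast types, and combine each order with its customer name."""
    seen = set()
    clean = []
    for row in orders:
        order_id = row["order_id"].strip()
        amount = (row["amount"] or "").strip()
        if not amount or order_id in seen:
            continue  # quality gate: missing measure or duplicate record
        seen.add(order_id)
        clean.append({
            "order_id": int(order_id),
            "customer": customers.get(row["customer_id"].strip(), "UNKNOWN"),
            "amount": float(amount),
        })
    return clean


def load(rows, target):
    """Load: append the cleaned rows to the target and report the row count."""
    target.extend(rows)
    return len(rows)


warehouse = []  # stands in for a warehouse table
loaded = load(
    transform(extract_orders(ORDERS_CSV), extract_customers(CUSTOMERS_JSON)),
    warehouse,
)
print(loaded, warehouse)
```

In an ELT variant the raw rows would be loaded first and the cleaning would run inside the target platform; the individual steps are the same, only their order changes.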

 

UNIT 3: Scaling the data pipeline, governance

  • Introduction to MLOps: data pipelines, model management and retuning
  • Introduction to governance issues
    • Setting policies around data, providing data audits, and other functions related to data governance
    • Maintaining master data
      • Assignment involving slowly changing dimension (SCD) types 1, 2, and 3 with Alteryx and Tableau
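
The three slowly changing dimension types named in the assignment differ in how much history they keep. The following is a minimal Python sketch, not the Alteryx workflow used in the course; the dimension layout (sk, customer_id, city, valid_from, valid_to, is_current) and the function names are hypothetical. Type 1 overwrites the attribute, type 2 expires the current row and appends a new version, and type 3 keeps only the previous value in an extra column.

```python
from datetime import date


def scd_type1(rows, key, attr, new_value):
    """Type 1: overwrite the attribute in place; no history is kept."""
    for row in rows:
        if row["customer_id"] == key:
            row[attr] = new_value


def scd_type2(rows, key, attr, new_value, effective):
    """Type 2: expire the current row and append a new current version."""
    for row in rows:
        if row["customer_id"] == key and row["is_current"]:
            if row[attr] == new_value:
                return  # nothing changed, so no new version is needed
            row["valid_to"] = effective
            row["is_current"] = False
            rows.append({
                **row,
                "sk": max(r["sk"] for r in rows) + 1,  # new surrogate key
                attr: new_value,
                "valid_from": effective,
                "valid_to": None,
                "is_current": True,
            })
            return


def scd_type3(rows, key, attr, new_value):
    """Type 3: keep only the prior value, in a 'previous_<attr>' column."""
    for row in rows:
        if row["customer_id"] == key:
            row["previous_" + attr] = row[attr]
            row[attr] = new_value


def new_dim():
    """Return a fresh one-row customer dimension (made-up sample data)."""
    return [{
        "sk": 1,
        "customer_id": "C1",
        "city": "Akron",
        "valid_from": date(2023, 1, 1),
        "valid_to": None,
        "is_current": True,
    }]


t1 = new_dim()
scd_type1(t1, "C1", "city", "Cleveland")

t2 = new_dim()
scd_type2(t2, "C1", "city", "Cleveland", date(2024, 6, 1))

t3 = new_dim()
scd_type3(t3, "C1", "city", "Cleveland")
```

After these calls, t1 still has one row with the new city, t2 has two rows (the expired original and the current version), and t3 has one row carrying both the new city and previous_city.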

 

FINAL EXAM

Textbooks

For Alteryx and Tableau, there are ample materials available from the SparkED program from Alteryx (https://www.alteryx.com/sparked) and from the help and training materials available from Tableau (https://www.tableau.com/support/help).

Selected readings may be taken from:

Christopher Adamson, Star Schema: The Complete Reference, McGraw Hill, 2010, ISBN 978-0-07-174432-4. We have access through the library: https://library.uakron.edu:443/record=b7179992~S24

Ann Jackson, Luke Stanke. Tableau strategies: solving real, practical problems with data analytics, 2021. https://library.uakron.edu:443/record=b7213820~S24

Kirk Munroe. Data Modeling With Tableau: a practical guide to help data analysts build data models using tableau prep and tableau desktop. PACKT Publishing Limited, 2023. https://library.uakron.edu:443/record=b7476816~S24

Michael Kaufmann, Andreas Meier. SQL and NoSQL databases: modeling, languages, security and architectures for big data management. Cham: Springer, 2023. https://library.uakron.edu:443/record=b7575876~S24

Alejandro Vaisman, Esteban Zimányi. Data warehouse systems: design and implementation. Second edition. Berlin: Springer, 2022. https://library.uakron.edu:443/record=b7398983~S24

Treveil, M., Dreyfus-Schmidt, L., Lefevre, K., Omont, N. Introducing MLOps: how to scale machine learning in the enterprise. O’Reilly Media, Inc., 2021. https://olc1.ohiolink.edu:443/record=b42532269

 

 

 

APPENDIX: Old Contents of ISM 425

  • Data models that support analytics as opposed to operations
    • Moving from transaction-oriented DBMS that prioritize ACID principles and minimal redundancy (primary topic of ISM 324) to the star schema and its extensions for data warehouses and data marts
  • Types of data warehouses and data marts and DBMS that support them
    • In the past I covered many NoSQL approaches; more recently I have focused only on columnar DBMS such as Snowflake
  • Visualization of data in the star schema using Tableau
    • In the first half of the class I take the students through a series of design exercises that teach them the various elements of the star schema in detail. They either use data I provide or create data that fits their schemas, and they implement star schemas that can then be analyzed using Tableau
  • ETL and ELT
    • I teach Alteryx to create data transformation flows that result in data that can be analyzed using methods from the first half of the class
    • Many students opt to get a basic certification in Alteryx, including micro-certifications in Data Preparation, Data Manipulation, and Data Transformation
    • This is the students' only exposure to a low-code/no-code approach, and it teaches them to think differently about how algorithms are implemented
  • I have dropped data mining from the class, and I would say that the emphasis on decision support has waned. Every assignment concludes with “decisions” and judgments that students have to make, but this has not received as much emphasis recently.
