Workshop Descriptions

Introduction to Linux

This workshop introduces researchers to the Linux command-line interface.

Introduction to Research Computing on Palmetto Cluster

This workshop introduces participants to the Palmetto Cluster, Clemson University's largest high-performance computing resource. It covers the cluster's structure, basic usage, and how to submit computational tasks.

Introduction to Programming in Python

This workshop introduces Python to those who have little or no programming experience. It consists of three parts:

  • Python I: Introduction to Python and core programming concepts (no prior programming experience required)
  • Python II: Introduction to NumPy, Matplotlib, and Anaconda environments (prerequisite: Python I)
  • Python III: Introduction to data analysis using pandas (prerequisite: Python I; recommended: Python II)

Introduction to Hadoop on Palmetto

This workshop introduces participants to the Hadoop ecosystem as it can be deployed on Palmetto. We will cover Hadoop's architecture, how to deploy it on Palmetto, importing and exporting big data, basic usage, and how to submit scalable data analysis jobs. The workshop uses JupyterHub and Jupyter notebooks. An understanding of the Linux command line and some Python experience are required.

Introduction to Big Data Analytics using Spark/Python

This workshop teaches how to use Apache Spark and Python to perform large-scale, in-memory data analytics. Participants will learn the overall conceptual design of Spark and see the advantages of Spark over traditional Hadoop MapReduce. They will also learn to develop Spark programs in Python and to use Spark-specific capabilities such as SQLContext and DataFrame to assist with data analytics.

Introduction to R Programming

This workshop introduces the R language for data analytics, using RStudio on a PC and Jupyter notebooks on Palmetto. Topics include a basic understanding of R, installing additional R packages, an introduction to data manipulation, an introduction to visualization, and several best practices for using R. No prior knowledge of R or of programming in general is required.