This workshop gives an overview of Hadoop, an open-source software framework for large-scale data processing, and of the Hadoop Distributed File System (HDFS). Pig, a high-level data processing language, will be used to perform data analysis exercises. Please bring your own laptop; a virtual machine with a single-node Hadoop installation will be provided. You can download an image of the virtual machine at:
https://drive.google.com/file/d/0B156vzUuXrDEeUViV1lQSjNsNVk/view?usp=sharing