Data Science

Prerequisites:
- Basic statistics
- Basic computer science
- Basic programming

Course structure: 19 lectures + 17 labs = 36 sessions = 2 sessions a week x 18 weeks

Program:

1. Introduction to Data Science. (1+0)

2. Information. What to do with it. (1+0)

2.1 The data scientist's job & the analytics process
2.2 Sample tasks. Predictions. Consumer behavior. Social data analysis. Knowledge aggregation. (Typical use cases are described: bank scoring, recommendation services, fraud detection.) Discussion of business requirements.

3. Information. Where and how to get it. Data Mining. (3+3)

3.1 Databases. SQL. DB connectors basics.
- Lab – Connect to MySQL and build a simple report
3.2 Unstructured data. Logs. Regular expressions.
- Lab – Given a set of logs, filter the necessary information using regular expressions
3.3 Internet structure and Protocols. Crawlers. Parsers.
- Lab – Crawl Twitter records using a simple parser & the Twitter API
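
To illustrate the kind of task in the 3.2 lab, here is a minimal sketch of filtering log lines with a regular expression. The log format, the sample lines and the names LOG_PATTERN, SAMPLE_LOGS and filter_by_level are assumptions made for illustration only; they are not part of the course material.

```python
import re

# Hypothetical log format (an assumption): "<date> <time> <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<message>.*)$"
)


def filter_by_level(lines, level="ERROR"):
    """Return (date, time, message) tuples for lines logged at the given level."""
    result = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        # Lines that do not follow the assumed format are skipped silently.
        if match and match.group("level") == level:
            result.append(
                (match.group("date"), match.group("time"), match.group("message"))
            )
    return result


# Made-up sample data, for illustration only.
SAMPLE_LOGS = [
    "2014-03-01 10:15:02 INFO User 42 logged in",
    "2014-03-01 10:15:07 ERROR Payment service timeout",
    "malformed line without a timestamp",
    "2014-03-01 10:16:30 ERROR Database connection lost",
]

if __name__ == "__main__":
    for date, time, message in filter_by_level(SAMPLE_LOGS):
        print(date, time, message)
```

Named groups keep the pattern readable and make it straightforward to pull out further fields if the real logs contain them.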

4. Statistics & ML part. (6+6)

4.1 Statistics for business. (Lab = real example, language = R)
- Hypothesis testing, measures, dimension reduction
- Linear & Logistic regression & EM & K-means
- Time series & HMM
- SVM & boosting
- Collaborative filtering & Anomaly detection
- LDA & NLP
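
As a small taste of the linear regression topic, here is a sketch of fitting a straight line by ordinary least squares. The labs themselves use R; this sketch is written in Python purely for illustration, and the function name fit_line and the sample numbers are assumptions, not course data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one predictor)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up points that lie exactly on y = 2x + 1.
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept)
```

The slope is the ratio of the covariance of x and y to the variance of x, which is the closed-form solution for a single predictor.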

5. Big Data & Advanced techniques. (7+7)

5.1 Big data basic tools. (Lab = VM)
- Hadoop. HDFS/MR.
- Hive, Pig
- Hive & Cassandra
- New technologies: Hadoop + SQL – Impala, HAWQ
- Document-oriented: MongoDB
- Redis & in-memory storage
- Graph: Neo4j
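
To make the MapReduce (MR) idea from 5.1 concrete, here is a sketch that imitates the map, shuffle and reduce steps of a word count in a single Python process. It does not use Hadoop or any Hadoop API; the function names and the sample documents are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence over an iterable of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Made-up input, for illustration only.
    print(word_count(["big data tools", "Big data basics"]))
```

In a real cluster the map and reduce steps run in parallel on different machines and the framework performs the shuffle; the sketch only shows how the data flows between the steps.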

6. Data visualization. (1+1)