Semester Project
This page collects resources to help you with the semester project.
Dataset
Several public datasets are available online:
- SNAP from Stanford (recommended): http://snap.stanford.edu/data/index.html
- Arizona State University: http://socialcomputing.asu.edu/pages/datasets
- UCI dataset: https://archive.ics.uci.edu/ml/datasets.html
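
Many graph datasets, including a number of those on SNAP, are distributed as plain-text edge lists: one edge per line, two whitespace-separated node IDs, and optional comment lines starting with "#". The sketch below reads such a file using only the standard library and counts node degrees. It assumes integer node IDs and treats edges as undirected when counting degrees; check the description of the dataset you pick, since formats differ. The sample data and function names are made up for illustration.

```python
import io
from collections import defaultdict


def read_edge_list(lines):
    """Parse an iterable of text lines into a list of (src, dst) integer pairs."""
    edges = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()  # handles both tabs and spaces
        edges.append((int(parts[0]), int(parts[1])))
    return edges


def degree_counts(edges):
    """Count how many edges touch each node (edges treated as undirected)."""
    degrees = defaultdict(int)
    for src, dst in edges:
        degrees[src] += 1
        degrees[dst] += 1
    return dict(degrees)


# Toy edge list for illustration only; not taken from any real dataset.
SAMPLE = "# toy graph\n0\t1\n0\t2\n1\t2\n"

edges = read_edge_list(io.StringIO(SAMPLE))
degrees = degree_counts(edges)
print(edges)
print(degrees)
```

To use it on a downloaded file, pass an open file object instead of the sample, e.g. `with open("my_edges.txt") as f: edges = read_edge_list(f)`, where `my_edges.txt` is a placeholder name.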
 
Datasets we have at CISPA:
- Location datasets from Instagram (https://yangzhangalmo.github.io/papers/CCS17.pdf)
- Hashtag datasets (https://yangzhangalmo.github.io/papers/WWW18.pdf)
 
You can also collect datasets yourself, for example:
- Twitter's API with Tweepy
 
Some papers
Attribute Inference:
- paper: https://cs.stanford.edu/~jure/pubs/node2vec-kdd16.pdf
- video: https://www.youtube.com/watch?v=1_QH5BEP5BM
 
Unicity of Location:
- paper: https://www.nature.com/articles/srep01376
- video: https://www.youtube.com/watch?v=DPqreIYe0UU
- video: https://www.youtube.com/watch?v=kLMKYORwjTk
 
Membership Inference of Machine Learning:
- paper: http://www.comp.nus.edu.sg/~reza/files/Shokri-SP2017.pdf
- video: https://www.youtube.com/watch?v=rDm1n2gceJY
 
Stealing Machine Learning Models:
- paper: https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf
- video: https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/tramer
 
walk2friends:
- paper: https://yangzhangalmo.github.io/papers/CCS17.pdf
- video: https://www.youtube.com/watch?v=GSQ9JPuvU6U
 
Tagvisor:
- paper: https://yangzhangalmo.github.io/papers/WWW18.pdf
 
Programming
We highly recommend using Anaconda for your project. Anaconda is a data science distribution of Python that contains almost all the packages you need.
Anaconda can be downloaded here: https://www.anaconda.com/download
The packages we need are:
pandas, numpy, scipy, and scikit-learn
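
To confirm that your environment has these packages, a quick check like the following can help. It is a minimal sketch using only the standard library; the function name `check_packages` is made up for illustration. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# Map each package name to the module name used in "import ...".
# scikit-learn is the only one whose import name differs.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
}


def check_packages(required=REQUIRED):
    """Return a dict mapping each package name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in required.items()
    }


if __name__ == "__main__":
    for name, available in check_packages().items():
        print(f"{name}: {'installed' if available else 'MISSING'}")
```

If a package is reported as missing inside an Anaconda environment, `conda install <package-name>` is the usual way to add it.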
There are many videos on YouTube that introduce pandas. Pick any of them for a quick overview,
e.g., https://www.youtube.com/watch?v=-NR-ynQg0YM
