Semester Project
This page lists resources to help you carry out the semester project.
Dataset
Several collections of public datasets are available online:
- SNAP from Stanford (recommended): http://snap.stanford.edu/data/index.html
- Arizona State University: http://socialcomputing.asu.edu/pages/datasets
- UCI dataset: https://archive.ics.uci.edu/ml/datasets.html
Datasets we have at CISPA:
- Location datasets from Instagram (https://yangzhangalmo.github.io/papers/CCS17.pdf)
- Hashtag datasets (https://yangzhangalmo.github.io/papers/WWW18.pdf)
You can also collect datasets yourself:
- Twitter's API with Tweepy
Some papers
Attribute Inference:
- paper: https://cs.stanford.edu/~jure/pubs/node2vec-kdd16.pdf
- video: https://www.youtube.com/watch?v=1_QH5BEP5BM
Unicity of Location:
- paper: https://www.nature.com/articles/srep01376
- video: https://www.youtube.com/watch?v=DPqreIYe0UU
- video: https://www.youtube.com/watch?v=kLMKYORwjTk
Membership Inference of Machine Learning:
- paper: http://www.comp.nus.edu.sg/~reza/files/Shokri-SP2017.pdf
- video: https://www.youtube.com/watch?v=rDm1n2gceJY
Stealing Machine Learning Models:
- paper: https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_tramer.pdf
- video: https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/tramer
walk2friends:
- paper: https://yangzhangalmo.github.io/papers/CCS17.pdf
- video: https://www.youtube.com/watch?v=GSQ9JPuvU6U
Tagvisor:
- paper: https://yangzhangalmo.github.io/papers/WWW18.pdf
Programming
We highly recommend using Anaconda for your project. Anaconda is a data science distribution of Python that contains almost all the packages you need.
Anaconda can be downloaded here: https://www.anaconda.com/download
The packages we need are:
pandas, numpy, scipy, and scikit-learn
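To confirm that these packages are available in your environment, a small check along the following lines may help. It is only a sketch: the helper name `missing_packages` and the `REQUIRED` mapping are made up for illustration. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# Maps the package name used when installing to the module name used in `import`.
# scikit-learn is the one case here where the two differ.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, it can usually be installed from within Anaconda with `conda install` followed by the package name.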
There are many videos on YouTube that introduce pandas. Pick any of them for a quick overview,
e.g., https://www.youtube.com/watch?v=-NR-ynQg0YM
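As a first taste of pandas, the sketch below loads a small edge list and counts how many edges touch each node. The sample data, the column names `src` and `dst`, and the helper functions are all hypothetical. The layout (comment lines starting with `#`, followed by tab-separated node pairs) is an assumption modeled on typical graph edge-list files, so check the format of the dataset you actually download.

```python
import io

import pandas as pd

# Hypothetical sample data: one '#' comment line, then tab-separated node pairs.
SAMPLE = "# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n2\t3\n"


def load_edges(source):
    """Read an edge list into a DataFrame with columns 'src' and 'dst'."""
    return pd.read_csv(source, sep="\t", comment="#", names=["src", "dst"])


def degree_counts(edges):
    """Count edges per node, treating the graph as undirected."""
    endpoints = pd.concat([edges["src"], edges["dst"]])
    return endpoints.value_counts().sort_index()


# For a real file, pass its path instead of the in-memory sample.
edges = load_edges(io.StringIO(SAMPLE))
degrees = degree_counts(edges)

print(edges)
print(degrees)
```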